Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Greg Smith
On Wed, 20 Jun 2007, Bruce Momjian wrote: I don't expect this patch to be perfect when it is applied. I do expect it to be a best effort, and it will get continual real-world testing during beta and we can continue to improve this. This is completely fair. Consider my suggestions something that

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Greg Smith
On Wed, 20 Jun 2007, Heikki Linnakangas wrote: You mean the shift and "flattening" of the graph to the right in the delivery response time distribution graph? Right, that's what ends up happening during the problematic cases. To pick numbers out of the air, instead of 1% of the transactions

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Heikki Linnakangas
Joshua D. Drake wrote: The only comment I have is that it could be useful to be able to turn this feature off via GUC. Other than that, I think it is great. Yeah, you can do that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Joshua D. Drake
Bruce Momjian wrote: Greg Smith wrote: I don't expect this patch to be perfect when it is applied. I do expect it to be a best effort, and it will get continual real-world testing during beta and we can continue to improve this. Right now, we know we have a serious issue with checkpoint I/O, an

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Heikki Linnakangas
Greg Smith wrote: While it shows up in the 90% figure, what happens is most obvious in the response time distribution graphs. Someone who is currently getting a run like #295 right now: http://community.enterprisedb.com/ldc/295/rt.html Might be really unhappy if they turn on LDC expecting to

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Bruce Momjian
Greg Smith wrote: > I think it does a better job of showing how LDC can shift the top > percentile around under heavy load, even though there are runs where it's > a clear improvement. Since there is so much variability in results when > you get into this territory, you really need to run a lot

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Greg Smith
On Wed, 20 Jun 2007, Heikki Linnakangas wrote: Another series with 150 warehouses is more interesting. At that # of warehouses, the data disks are 100% busy according to iostat. The 90th percentile response times are somewhat higher with LDC, though the variability in both the baseline and LDC

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-20 Thread Heikki Linnakangas
I've uploaded the latest test results to the results page at http://community.enterprisedb.com/ldc/ The test results on the index page are not in a completely logical order, sorry about that. I ran a series of tests with 115 warehouses, and no surprises there. LDC smooths the checkpoints nic

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-18 Thread Greg Smith
On Mon, 18 Jun 2007, Simon Riggs wrote: Smoother checkpoints mean smaller resource queues when a burst coincides with a checkpoint, so anybody with throughput-maximised or bursty apps should want longer, smooth checkpoints. True as long as two conditions hold: 1) Buffers needed to fill alloc

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-18 Thread Simon Riggs
On Sun, 2007-06-17 at 01:36 -0400, Greg Smith wrote: > The last project I was working on, any checkpoint that caused a > transaction to slip for more than 5 seconds would cause a data loss. One > of the defenses against that happening is that you have a wicked fast > transaction rate to clear

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-16 Thread Heikki Linnakangas
Josh Berkus wrote: Where is the most current version of this patch? I want to test it on TPCE, but there seem to be 4-5 different versions floating around, and the patch tracker hasn't been updated. It would be the ldc-justwrites-2.patch: http://archives.postgresql.org/pgsql-patches/2007-06/

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-16 Thread Greg Smith
On Fri, 15 Jun 2007, Gregory Stark wrote: But what you're concerned about is not OLTP performance at all. It's an OLTP system most of the time that periodically gets unexpectedly high volume. The TPC-E OLTP test suite actually has a MarketFeed component in it that has similar properties

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-16 Thread Josh Berkus
All, Where is the most current version of this patch? I want to test it on TPCE, but there seem to be 4-5 different versions floating around, and the patch tracker hasn't been updated. -- Josh Berkus PostgreSQL @ Sun San Francisco

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-15 Thread PFC
On Fri, 15 Jun 2007 22:28:34 +0200, Gregory Maxwell <[EMAIL PROTECTED]> wrote: On 6/15/07, Gregory Stark <[EMAIL PROTECTED]> wrote: While in theory spreading out the writes could have a detrimental effect I think we should wait until we see actual numbers. I have a pretty strong suspicion t

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-15 Thread Gregory Maxwell
On 6/15/07, Gregory Stark <[EMAIL PROTECTED]> wrote: While in theory spreading out the writes could have a detrimental effect I think we should wait until we see actual numbers. I have a pretty strong suspicion that the effect would be pretty minimal. We're still doing the same amount of i/o tota

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-15 Thread Gregory Stark
"Greg Smith" <[EMAIL PROTECTED]> writes: > On Fri, 15 Jun 2007, Gregory Stark wrote: > >> If I understand it right Greg Smith's concern is that in a busier system >> where even *with* the load distributed checkpoint the i/o bandwidth demand >> during t he checkpoint was *still* being pushed over 1

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-15 Thread Heikki Linnakangas
Gregory Stark wrote: "Heikki Linnakangas" <[EMAIL PROTECTED]> writes: Now that the checkpoints are spread out more, the response times are very smooth. So obviously the reason the results are so dramatic is that the checkpoints used to push the i/o bandwidth demand up over 100%. By spreading i

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-15 Thread Greg Smith
On Fri, 15 Jun 2007, Gregory Stark wrote: If I understand it right Greg Smith's concern is that in a busier system where even *with* the load distributed checkpoint the i/o bandwidth demand during the checkpoint was *still* being pushed over 100% then spreading out the load would only exacerb

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-15 Thread Gregory Stark
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > I ran another series of tests, with a less aggressive bgwriter_delay setting, > which also affects the minimum rate of the writes in the WIP patch I used. > > Now that the checkpoints are spread out more, the response times are very > smooth. So

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-15 Thread Heikki Linnakangas
Heikki Linnakangas wrote: Here's results from a batch of test runs with LDC. This patch only spreads out the writes, fsyncs work as before. This patch also includes the optimization that we don't write buffers that were dirtied after starting the checkpoint. http://community.enterprisedb.com/

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-14 Thread ITAGAKI Takahiro
Heikki Linnakangas <[EMAIL PROTECTED]> wrote: > Here's results from a batch of test runs with LDC. This patch only > spreads out the writes, fsyncs work as before. I saw similar results in my tests. Spreading only writes is enough for OLTP at least on Linux with middle-or-high-grade storage sy

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-13 Thread Josh Berkus
Greg, > However TPC-E has even more stringent requirements: I'll see if I can get our TPCE people to test this, but I'd say that the existing patch is already good enough to be worth accepting based on the TPCC results. However, I would like to see some community testing on oddball workloads (

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-13 Thread Gregory Stark
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > The response time graphs show that the patch reduces the max (new-order) > response times during checkpoints from ~40-60 s to ~15-20 s. I think that's the headline number here. The worst-case response time is reduced from about 60s to about 17s

[HACKERS] Load Distributed Checkpoints test results

2007-06-13 Thread Heikki Linnakangas
Here's results from a batch of test runs with LDC. This patch only spreads out the writes, fsyncs work as before. This patch also includes the optimization that we don't write buffers that were dirtied after starting the checkpoint. http://community.enterprisedb.com/ldc/ See tests 276-280. 28