On 2016-05-12 10:49:06 -0400, Robert Haas wrote:
> On Thu, May 12, 2016 at 8:39 AM, Ashutosh Sharma <ashu.coe...@gmail.com> 
> wrote:
> > Please find the test results for the following set of combinations taken at
> > 128 client counts:
> >
> > 1) Unpatched master, default *_flush_after :  TPS = 10925.882396
> >
> > 2) Unpatched master, *_flush_after=0 :  TPS = 18613.343529
> >
> > 3) That line removed with #if 0, default *_flush_after :  TPS = 9856.809278
> >
> > 4) That line removed with #if 0, *_flush_after=0 :  TPS = 18158.648023
> 
> I'm getting increasingly unhappy about the checkpoint flush control.
> I saw major regressions on my parallel COPY test, too:

Yes, I'm concerned too.

The workload in this thread is a bit of an "artificial" workload (all
data is constantly updated, doesn't fit into shared_buffers, fits into
the OS page cache), and it only measures throughput, not latency.  But I
agree that that's way too large a regression to accept, and that there's
a significant number of machines with way undersized shared_buffers
settings.
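
For anyone wanting to reproduce that comparison: as I understand it, the
"*_flush_after=0" runs simply zero all of the flush-control GUCs, i.e.
something like the following in postgresql.conf, while the "default" runs
leave them at the stock (nonzero, on Linux) defaults:

    # "*_flush_after=0" case: disable controlled writeback entirely
    checkpoint_flush_after = 0
    bgwriter_flush_after = 0
    backend_flush_after = 0
    wal_writer_flush_after = 0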


> http://www.postgresql.org/message-id/ca+tgmoyouqf9cgcpgygngzqhcy-gcckryaqqtdu8kfe4n6h...@mail.gmail.com
> 
> That was a completely different machine (POWER7 instead of Intel,
> lousy disks instead of good ones) and a completely different workload.
> Considering these results, I think there's now plenty of evidence to
> suggest that this feature is going to be horrible for a large number
> of users.  A 45% regression on pgbench is horrible.

I asked you over there whether you could benchmark with just different
values for backend_flush_after... I chose the current value because it
gives the best latency / most consistent throughput numbers, but 128kB
isn't a large window.  I suspect we might need to disable backend-guided
flushing if that's not sufficient :(
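
Concretely, a quick sweep could look something like this (just a sketch;
rerun the same benchmark between settings and adjust the values as you see
fit; the GUC takes a size in 8kB blocks, or a value with a unit suffix):

    -- try a few window sizes, then 0 to disable backend flushing
    ALTER SYSTEM SET backend_flush_after = '256kB';
    SELECT pg_reload_conf();  -- user-settable GUC, new sessions pick it up
    -- ... run the benchmark, then repeat with e.g. '512kB', '1MB', and 0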


> > Here, That line points to "AddWaitEventToSet(FeBeWaitSet,
> > WL_POSTMASTER_DEATH, -1, NULL, NULL); in pq_init()."
> 
> Given the above results, it's not clear whether that is making things
> better or worse.

Yeah, me neither. I think it's doubtful that you'd see a performance
difference due to the original ac1d7945f866b1928c2554c0f80fd52d7f977772,
independent of the WaitEventSet stuff, at these throughput rates.
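
For context, the FeBeWaitSet setup in pq_init() that that line is part of
looks roughly like this (quoting from memory, so details may be off):

    /* rough sketch of the tail end of pq_init() */
    FeBeWaitSet = CreateWaitEventSet(TopMemoryContext, 3);
    AddWaitEventToSet(FeBeWaitSet, WL_SOCKET_WRITEABLE, MyProcPort->sock,
                      NULL, NULL);
    AddWaitEventToSet(FeBeWaitSet, WL_LATCH_SET, -1, MyLatch, NULL);
    /* the line that was #if 0'ed out in the test above */
    AddWaitEventToSet(FeBeWaitSet, WL_POSTMASTER_DEATH, -1, NULL, NULL);

That registration happens once at backend startup, not in the per-message
path.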

Greetings,

Andres Freund

