On Tue, Mar 29, 2016 at 6:02 PM, Tomas Vondra
> And why not? I mean, why should it be acceptable to slow down?
My point was that over 80% of execution time was spent in the
HashAggregate, which outputs tuples to the sort. That, and the huge
i5/Xeon inconsistency (in the extent to which this is regressed --
it's not at all, or it's regressed a lot) makes me suspicious that
there is something else going on. Possibly involving the scheduling of
> That may be easily due to differences between the CPUs and configuration.
> For example the Xeon uses a way older CPU with different amounts of CPU
> cache, and it's also a multi-socket system. And so on.
We're talking about a huge relative difference with that HashAggregate
plan, though. I don't think that those relative differences are
explained by differing CPU characteristics. But I guess we'll find out
>> If there is ever a regression, it is only really sensible to talk
>> about it while looking at trace_sort output (and, I guess, the query
>> plan). I've asked Tomas for trace_sort output in all relevant cases.
>> There is no point in "flying blind" and speculating what the problem
>> was from a distance.
> The updated benchmarks are currently running. I'm out of office until
> Friday, and I'd like to process the results over the weekend. FWIW I'll have
> results for these cases:
> 1) unpatched (a414d96a)
> 2) patched, default settings
> 3) patched, replacement_sort_mem=64
> Also, I'll have trace_sort=on output for all the queries, so we can
> investigate further.
Thanks! That will tell us a lot more.
> Yeah. That was one of the goals of the benchmark, to come up with some
> tuning recommendations. On some systems significantly increasing memory GUCs
> may not be possible, though - say, on very small systems with very limited
> amounts of RAM.
Fortunately, such systems will probably mostly use external sorts for
CREATE INDEX cases, and there seems to be very little if any downside
there, at least according to your similarly, varied tests of CREATE
>> I don't think they are representative. Greg Stark characterized the
>> regressions as being fairly limited, mostly at the very low end. And
>> that was *before* all the memory fragmentation stuff made that
>> better. I haven't done any analysis of how much better that made the
>> problem *across the board* yet, but for int4 cases I could make 1MB
>> work_mem queries faster with gigabytes of data on my laptop. I
>> believe I tested various datum sort cases there, like "select
>> count(distinct(foo)) from bar"; those are a very pure test of the
> Well, I'd guess those conclusions may be a bit subjective.
I think that the conclusion that we should do something or not do
something based on this information is subjective. OTOH, whether and
to what extent these tests are representative of real user workloads
seems much less subjective. This is not a criticism of the test cases
you came up with, which rightly emphasized possibly regressed cases. I
think everyone already understood that the picture was very positive
at the high end, in memory rich environments.
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription: