On Tue, Mar 22, 2016 at 2:27 PM, Tomas Vondra
> Each query was executed 5x for each work_mem value (between 8MB and 1GB),
> and then a median of the runs was computed and that's what's on the
> "comparison". This compares a414d96ad2b without (master) and with the
> patches applied (patched). The last set of columns is simply a "speedup"
> where "<1.0" means the patched code is faster, while >1.0 means it's slower.
> Values below 0.9 or 1.1 are using green or red background, to make the most
> significant improvements or regressions clearly visible.
> For the smaller data set (1M rows), things works pretty fine. There are
> pretty much no red cells (so no significant regressions), but quite a few
> green ones (with duration reduced by up to 50%). There are some results in
> the 1.0-1.05 range, but considering how short the queries are, I don't think
> this is a problem. Overall the total duration was reduced by ~20%, which is
> For the 10M data sets, total speedup is also almost ~20%, and the speedups
> for most queries are also very nice (often ~50%).
To be clear, you seem to mean that ~50% of the runtime of the query
was removed. In other words, the quicksort version is twice as fast.
> But the number of
> regressions is considerably higher - there's a small number of queries that
> got significantly slower for multiple data sets, particularly for smaller
> work_mem values.
No time to fully consider these benchmarks right now, but: Did you
make sure to set replacement_sort_mem very low so that it was never
used when patched? And, was this on the latest version of the patch,
where memory contexts were reset (i.e. the version that got committed
recently)? You said something about memory batching, so ISTM that you
should set that to '64', to make sure you don't get one longer run.
That might mess with merging.
Note that the master branch has the memory batching patch as of a few
days back, so it that's the problem at the low end, then that's bad.
But I don't think it is: I think that the regressions at the low end
are about abbreviated keys, particularly the numeric cases. There is a
huge gulf in the cost of those comparisons (abbreviated vs
authoritative), and it is legitimately a weakness of the patch that it
reduces the number in play. I think it's still well worth it, but it
is a downside. There is no reason why the authoritative numeric
comparator has to allocate memory, but right now that case isn't
I find it weird that the patch is exactly the same as master in a lot
of cases. ISTM that with a case where you use 1GB of memory to sort 1
million rows, you're so close to an internal sort that it hardly
matters (master will not need a merge step at all, most likely). The
patch works best with sorts that take tens of seconds, and I don't
think I see any here, nor any high memory tests where RS flops. Now, I
think you focused on regressions because that was what was
interesting, which is good. I just want to put that in context.
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription: