On Sun, Sep 10, 2017 at 5:59 PM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:
> OK, so 1MB, 4MB, 8MB, 32MB?
Right.

> Ah, so you suggest doing all the tests on current master, by only
> tweaking the replacement_sort_tuples value? I've been testing master vs.
> your patch, but I guess setting replacement_sort_tuples=0 should have
> the same effect.

I think that there may be a very slight advantage to RS-free performance
with my patch actually applied, because it reduces the number of
instructions in the heap maintenance (merging) routines. This is due to
my removing HEAPCOMPARE()/tupindex handling. However, it's probably a
low single digit percentage difference -- probably not enough to be
worth taking into account. I assume that just setting
replacement_sort_tuples to zero is more convenient for you (that's what
I did).

To be clear, you'll still need to set replacement_sort_tuples high when
testing RS, to make sure that we really use it for at least the first
run when we're expected to. (There is no easy way to have testing
mechanically verify that we really do end up with only one run with RS,
but I assume that such paranoia is unneeded.)

> I probably won't eliminate the random/DESC data sets, though. At least
> not from the two smaller data sets - I want to do a bit of benchmarking
> on Heikki's polyphase merge removal patch, and for that patch those
> data sets are still relevant. Also, it's useful to have a subset of
> results where we know we don't expect any change.

Okay. The DESC dataset is going to make my patch look good, because it
won't change anything for merging (same number of runs in the end), but
sorting the first run will be slower with RS.

> Meh, more data is probably better. And with the reduced work_mem values
> and skipping of random/DESC data sets it should complete much faster.

Note that my own test case had a much higher number of tuples than even
10 million -- it had 100 million tuples. I think that if any of your
test cases bring about a new insight, it will not be due to the number
of distinct tuples.
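[Editor's note: the testing setup discussed above could be sketched roughly as follows. The table and GUC values are illustrative, not the actual benchmark harness; trace_sort output assumes a build with TRACE_SORT defined, which is the default in these releases.]

```sql
-- Hypothetical test data: random and descending-order (DESC) datasets.
CREATE TABLE test_random AS
  SELECT (random() * 1e9)::int AS a
  FROM generate_series(1, 10000000);
CREATE TABLE test_desc AS
  SELECT a FROM generate_series(10000000, 1, -1) AS a;

-- Disable replacement selection entirely (quicksort-built runs only):
SET replacement_sort_tuples = 0;
-- ...or, when testing RS, set it high enough that the first run is
-- certain to use replacement selection:
-- SET replacement_sort_tuples = 1000000000;

-- One of the work_mem values under test:
SET work_mem = '4MB';

-- trace_sort logs run-building/merging details, which is the easiest
-- way to see how many runs were produced and whether RS was used:
SET client_min_messages = log;
SET trace_sort = on;

SELECT * FROM test_random ORDER BY a OFFSET 100000000;  -- discard output
```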
But, if the extra time it takes doesn't matter to you, then it doesn't
matter to me either.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers