On Thu, Apr 7, 2016 at 1:17 PM, Peter Geoghegan <p...@heroku.com> wrote:
>> I certainly agree that GUCs that aren't easy to tune are bad. I'm
>> wondering whether the fact that this one is hard to tune is something
>> that can be fixed. The comments about "padding" - a term I don't
>> like, because it to me implies a deliberate attempt to game the
>> benchmark when in reality wanting to sort a wide row is entirely
>> reasonable - make me wonder if this should be based on a number of
>> tuples rather than an amount of memory. If considering the row width
>> makes us get the wrong answer, then let's not do that.
>
> That's a good point. While I don't think it will make it easy to tune
> the GUC, it will make it easier. Although, I think that it should
> probably still be GUC_UNIT_KB. That should just be something that my
> useselection() function compares to the overall size of memtuples
> alone when we must initially spill, not the value of
> work_mem/maintenance_work_mem. The degree of padding isn't entirely
> irrelevant, because not all comparisons will be resolved at the
> stup.datum1 level, but it's still clearly an improvement to not have
> wide tuples mess with things.
>
> Would you like me to revise the patch along those lines? Or, do you
> prefer units of tuples? Tuples are basically equivalent, but make it
> way less obvious what the relationship with CPU cache might be. If I
> revise the patch along these lines, I should also reduce the default
> replacement_sort_mem to produce roughly equivalent behavior for
> non-padded cases.

I prefer units of tuples, with the GUC itself therefore being
unitless. I suggest we call the parameter replacement_sort_threshold
and document that (1) the ideal value may depend on the amount of CPU
cache available to running processes, with more cache implying higher
values; and (2) the ideal value may depend somewhat on the input data,
with more correlation implying higher values. And then pick some
value that you think is likely to work well for most people and call
it good.

If you could prepare a new patch with those changes and also making
the changes requested in my other email, I will try to commit that
before the deadline.

Thanks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
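
The difference between a memory-based and a tuple-count-based threshold, as discussed above, can be sketched in C. This is an illustration only, not code from the patch: replacement_sort_threshold is the name proposed above, while the two helper functions, the byte limit, the direction of the comparisons, and every numeric value are assumptions made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Proposed unitless GUC, counted in tuples. The value here is an
 * arbitrary placeholder, not a suggested default.
 */
int replacement_sort_threshold = 150000;

/* Hypothetical byte limit, present only to contrast the two rules. */
size_t hypothetical_mem_limit_bytes = 16 * 1024 * 1024;

/*
 * Memory-based rule (assumed form): the answer changes with tuple
 * width, so wide rows can flip the decision for the same row count.
 */
bool
use_selection_by_memory(size_t ntuples, size_t tuple_width_bytes)
{
    return ntuples * tuple_width_bytes <= hypothetical_mem_limit_bytes;
}

/*
 * Tuple-based rule (assumed form): only the number of in-memory
 * tuples matters; row width is deliberately ignored.
 */
bool
use_selection_by_tuples(size_t ntuples)
{
    return ntuples <= (size_t) replacement_sort_threshold;
}
```

With these placeholder values, 100000 tuples of 32 bytes pass the memory-based check and 100000 tuples of 1024 bytes fail it, whereas the tuple-based check gives the same answer for both inputs.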