Tomas Vondra <tomas.von...@2ndquadrant.com> writes:
> On 09/11/2015 07:16 PM, Robert Haas wrote:
>> On Fri, Sep 11, 2015 at 1:12 PM, Tomas Vondra
>> <tomas.von...@2ndquadrant.com> wrote:
>>> I'm arguing for fixing the existing bug, and then addressing the case of
>>> over-estimation separately, with proper analysis.
>> Well, this is part of how we're looking at it differently. I think the
>> bug is "we're passing a value to palloc that is too large, so
>> sometimes it fails" and the way to fix that is to properly limit the
>> value. You are clearly defining the bug a bit differently.

> Yes, I see it differently.

> I don't quite understand why limiting the value is more "proper" than
> using a function that can handle the actual value.

> The proposed bugfix addresses the issue in the most straightforward way,
> without introducing additional considerations about possible
> over-estimations (which the current code completely ignores, so this is
> a new thing). I think bugfixes should not introduce such changes to
> behavior (albeit internal), especially not without any numbers.

This thread seems to have stalled out...

After re-reading it, I'm going to agree with Robert that we should clamp
the initial pointer-array size to work with palloc (ie, 512MB worth of
pointers, which please recall represents at least 10X that much in
hashtable entries, probably more).  The argument that allowing it to be
larger would be a performance win seems entirely unsupported by any
evidence, while the risks of allowing arbitrarily large allocations
based on planner estimates seem pretty obvious to me.  And the argument
that such overestimates are a bug that we can easily fix is based on
even less evidence; in fact, I would dismiss that out of hand as hubris.

Now there is a separate issue about whether we should allow hashtable
resizes to exceed that limit.  There I would vote yes, as long as the
resize is based on arrival of actual data and not on estimates
(following Robert's point that the existing uses of repalloc_huge are
driven by actual need).

So I don't like any of the proposed patches exactly as-is.

BTW, just looking at the code in question, it seems to be desperately in
need of editorial review.
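For concreteness, the kind of clamp being discussed could look roughly
like the sketch below.  This is not the actual nodeHash.c code: the
MaxAllocSize value matches PostgreSQL's memutils.h, but HashJoinTuple is
simplified to a bare pointer and clamp_nbuckets() is a hypothetical
helper, shown only to illustrate keeping the bucket array within a
single palloc chunk while preserving the power-of-2 bucket count.

```c
#include <stddef.h>

/*
 * MaxAllocSize is palloc's limit on a single allocation request
 * (this value matches PostgreSQL's memutils.h).  HashJoinTuple is a
 * simplified stand-in for the real bucket-pointer element type.
 */
#define MaxAllocSize ((size_t) 0x3fffffff)  /* 1 gigabyte - 1 */

typedef void *HashJoinTuple;

/*
 * Hypothetical helper: derive a bucket count from a planner estimate,
 * keeping it a power of 2 and small enough that the bucket-pointer
 * array (nbuckets * sizeof(HashJoinTuple)) fits under MaxAllocSize,
 * so the initial palloc cannot fail on the size check.
 */
static size_t
clamp_nbuckets(double estimated_buckets)
{
    size_t max_pointers = MaxAllocSize / sizeof(HashJoinTuple);
    size_t nbuckets = 1;

    /* double until we reach the estimate or would exceed the limit */
    while (nbuckets * 2 <= max_pointers &&
           (double) nbuckets < estimated_buckets)
        nbuckets *= 2;

    return nbuckets;
}
```

With 8-byte pointers this tops out at 2^26 buckets, i.e. a 512MB
pointer array, which is the ceiling described above; growing past it
would then be the job of a repalloc_huge-based resize driven by actual
input data.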
A couple of examples:

* Some of the code goes to the trouble of maintaining a
log2_nbuckets_optimal value, but ExecHashIncreaseNumBuckets doesn't know
about that and recomputes the value.

* ExecHashIncreaseNumBuckets starts out with a run-time test on
something that its sole caller has just Assert()'d not to be the case,
and which really ought to be impossible with or without that Assert.

* ExecHashTableInsert believes it can only increase nbuckets_optimal if
we're still in the first batch, but MultiExecHash() will call
ExecHashIncreaseNumBuckets at the end of input acquisition whether we've
created more than one batch or not.  Meanwhile,
ExecHashIncreaseNumBatches thinks it can change the number of buckets in
any case.  Maybe that's all okay, but at the very least those tests
ought to look like they'd heard of each other.  And of those three
places, having the safety check where it is seems like the least
reasonable choice: tracking an "optimal" number of buckets seems like an
activity that should not be constrained by whether we have any hope of
being able to apply the number.

So I'm not having a lot of faith that there aren't other bugs in the
immediate area.

			regards, tom lane

-- 
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers