On Tue, Oct 16, 2012 at 4:47 PM, Peter Geoghegan <pe...@2ndquadrant.com> wrote: > The same basic strategy for sizing the tuplesort memtuples array in > also exists in tuplestore. I wonder if we should repeat this there? I > suppose that that could follow later.
I think it'd be a good idea to either adjust tuplestore as well or explain in the comments there why the algorithm is different, because it has a comment which references this comment, and now the logic won't be in sync in spite of a comment implying that it is. I didn't initially understand why the proposed heuristic proposed is safe. It's clear that we have to keep LACKMEM() from becoming true, or we'll bail out of this code with elog(ERROR). It will become true if we increase the size of the array more by a number of elements which exceeds state->availmem / sizeof(SortTuple), but the actual number of elements being added is memtupsize * allowedMem / memNowUsed - memtupsize, which doesn't look closely related. However, I think I figured it out. We know that memNowUsed >= memtupsize * sizeof(SortTuple); substituting, we're adding no more than memtupsize * allowedMem / (memtupsize * sizeof(SortTuple)) - memtupsize elements, which simplifies to allowedMem / sizeof(SortTuple) - memtupsize = (allowedMem - sizeof(SortTuple) * memtupsize) / sizeof(SortTuple). Since we know availMem < allowedMem - sizeof(SortTuple) * memtupsize, that means the number of new elements is less than state->availMem/sizeof(SortTuple). Whew. In the attached version, I updated the comment to reflect this, and also reversed the order of the if/else block to put the shorter and more common case first, which I think is better style. I'm still not too sure why this part is OK: /* This may not be our first time through */ if (newmemtupsize <= memtupsize) return false; Suppose we enlarge the memtuples array by something other than a multiple of 2, and then we kick out all of the tuples that are currently in memory and load a new group with a smaller average size. ISTM that it could now appear that the memtuples array can be useful further enlarged, perhaps by as little as one tuple, and that that this could happen repeatedly. I think we should either refuse to grow the array unless we can increase it by at least, say, 10%, or else set a flag after the final enlargement that acts as a hard stop to any further growth. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
sortmem_grow-v4.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers