On Tue, 18 Jan 2022 at 19:45, Julien Rouhaud <rjuju...@gmail.com> wrote: > I'm also a bit confused about which patch(es) should actually be reviewed in > that thread. My understanding is that the original patch might not be > relevant > anymore but I might be wrong. The CF entry should probably be updated to > clarify that, with an annotation/ title change / author update or something. > > In the meantime I will switch the entry to Waiting on Author.
I think what's going on here is a bit confusing. There's quite a few patches on here that aim to do very different things. The patch I originally proposed was just to make tuplesort use generation memory contexts. This helps speed up tuplesort because generation contexts allow palloc() to be faster than the standard allocator. Additionally, there's no power-of-2 wastage with generation contexts like there is with the standard allocator. This helps fit more tuples on fewer CPU cache lines. I believe Andres then mentioned that the fixed block size for generation contexts might become a problem. With the standard allocator the actual size of the underlying malloc() grows up until a maximum size. With generation contexts this is always the same, so we could end up with a very large number of blocks which means lots of small mallocs which could end up slow. Tomas then posted a series of patches to address this. I then posted another patch that has the planner make a choice on the size of the blocks to use for the generation context based on the tuple width and number of tuple estimates. My hope there was to improve performance further by making a better choice for how large to make the blocks in the first place. This reduces the need for Tomas' patch, but does not eliminate the need for it. As of now, I still believe we'll need Tomas' patches to allow the block size to grow up to a maximum size. I think those patches are likely needed before we think about making tuplesort use generation contexts. The reason I believe this is that we don't always have good estimates for the number of tuples we're going to sort. nodeSort.c is fairly easy as it just fetches tuples once and then sorts them. The use of tuplesort for nodeIncrementalSort.c is much more complex as we'll likely be performing a series of smaller sorts which are much harder to get size estimates for. This means there's likely no magic block size that we can use for the generation context that's used to store the tuples in tuplesort. This is also the case for order by aggregates and probably most other usages of tuplesort. David