On Tue, May 31, 2022 at 11:09 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Yeah, we don't have any hard data here.  It could be that it's a win to
> switch to a rule that chunks must present an offset (instead of a pointer)
> back to a block header, which'd then be required to contain a link to the
> actual context, meaning that every context has to do something like what
> I proposed for generation.c.  But nobody's coded that up let alone done
> any testing on it, and I feel like it's too late in the v15 cycle for
> changes as invasive as that.  Quite aside from that change in itself,
> you wouldn't get any actual space savings in aset.c contexts unless
> you did something with the chunk-size field too, and that seems a lot
> messier.
>
> Right now my vote would be to leave things as they stand for v15 ---
> the performance loss that started this thread occurs in a narrow
> enough set of circumstances that I don't feel too much angst about
> it being the price of winning in most other circumstances.  We can
> investigate these options at leisure for v16 or later.
I don't think it's all that narrow a case, but I think it's narrow enough
that I agree that we can live with it if we don't feel like the risk-reward
trade-offs are looking favorable. I don't think tuples whose size, after
rounding to a multiple of 8, is exactly a power of 2 are going to be
terribly uncommon. It is not very likely that many people will have tuples
between 25 and 31 bytes, but tables with lots of 57-64 byte tuples are
probably pretty common, and tables with lots of 121-128 and 249-256 byte
tuples are somewhat plausible as well. I think we will win more than we
lose, but I think we will lose often enough that I wouldn't be very
surprised if we get >0 additional complaints about this over the next 5
years.

While it may seem risky to do something about that now, it will certainly
seem a lot riskier once the release is out, and there is some risk in
doing nothing, because we don't know how many people are going to run into
the bad cases or how severely they might be impacted.

I think the biggest risk in your patch as presented is that slowing down
pfree() might suck for some set of workloads that we can't easily identify
now. I don't really know how to rule out that possibility.

In general terms, I think we're not doing ourselves any favors by relying
partly on memory context cleanup and partly on pfree. The result is that
pfree() has to work for every memory context type and can't be unreasonably
slow, which rules out a lot of really useful ideas, both in terms of how to
make allocation faster, and also in terms of how to make it denser. If
you're ever of a mind to put some effort into really driving things forward
in this area, I think figuring out some way to break that hard dependence
would be effort really well-spent.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
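[Editor's note: as a quick sanity check of the size ranges discussed above, here is a small illustrative Python script, not PostgreSQL code. It assumes MAXALIGN-style rounding to a multiple of 8 and simply enumerates which raw sizes up to 256 bytes land exactly on a power of 2 after that rounding -- the potentially regressing cases named in the message.]

```python
# Illustrative sketch (not PostgreSQL source): find raw sizes that
# round up to exactly a power of 2 after MAXALIGN-style alignment.

def maxalign(n):
    """Round n up to the next multiple of 8 (MAXALIGN on common platforms)."""
    return (n + 7) & ~7

def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0

# All raw sizes from 1 to 256 whose aligned size is exactly a power of 2.
affected = [s for s in range(1, 257) if is_power_of_two(maxalign(s))]

# Collapse consecutive sizes into (lo, hi) ranges for readability.
ranges, start = [], affected[0]
for prev, cur in zip(affected, affected[1:] + [None]):
    if cur != prev + 1:
        ranges.append((start, prev))
        start = cur
print(ranges)
# -> [(1, 16), (25, 32), (57, 64), (121, 128), (249, 256)]
```

This reproduces the ranges cited in the message (57-64, 121-128, 249-256), plus the very small sizes, which matches the intuition that exactly-power-of-2 payloads are the ones where any added per-chunk overhead can push an allocation into the next power-of-2 size class.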