Hi,

On 2023-02-21 15:00:15 -0600, Jim Nasby wrote:
> On 10/28/22 9:54 PM, Andres Freund wrote:
> > b) I found that it is quite beneficial to bulk-extend the relation with
> > smgrextend() even without concurrency. The reason for that is primarily
> > the aforementioned dirty buffers that our current extension method
> > causes.
> >
> > One bit that stumped me for quite a while is knowing how much to extend
> > the relation by. RelationGetBufferForTuple() drives the decision whether /
> > how much to bulk extend purely on the contention on the extension lock,
> > which obviously does not work for non-concurrent workloads.
> >
> > After quite a while I figured out that we actually have good information
> > on how much to extend by, at least for COPY / heap_multi_insert().
> > heap_multi_insert() can compute how much space is needed to store all
> > tuples, and pass that on to RelationGetBufferForTuple().
> >
> > For that to be accurate we need to recompute that number whenever we use
> > an already partially filled page. That's not great, but doesn't appear to
> > be a measurable overhead.
>
> Some food for thought: I think it's also completely fine to extend any
> relation over a certain size by multiple blocks, regardless of concurrency.
> E.g. 10 extra blocks on an 80MB relation is 0.1%. I don't have a good feel
> for what algorithm would make sense here; maybe something along the lines of
> extend = min(relpages / 2048, 128); if extend < 8 extend = 1; (presumably
> extending by just a couple extra pages doesn't help much without
> concurrency).
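To make the space computation quoted above concrete: what heap_multi_insert()
has available is the size of every tuple in the batch, so the number of blocks
to request is just a walk over those sizes. Below is a minimal standalone
sketch of that idea, not the actual PostgreSQL code; the page size, the
per-page overhead estimate, and the function names are illustrative
assumptions, and it ignores TOAST and assumes every tuple fits on an empty
page.

/*
 * Standalone sketch (illustrative only, not PostgreSQL source): given the
 * sizes of all tuples in a multi-insert batch and the free space left on
 * the page currently being filled, compute how many additional pages the
 * batch needs, i.e. how far to bulk-extend the relation.
 */
#include <stddef.h>
#include <stdio.h>

#define PAGE_SIZE        8192            /* assumed block size */
#define PAGE_OVERHEAD     192            /* assumed header/item-pointer overhead */
#define USABLE_PER_PAGE  (PAGE_SIZE - PAGE_OVERHEAD)

static size_t
pages_needed(const size_t *tuple_sizes, size_t ntuples,
             size_t free_on_current_page)
{
    size_t pages = 0;
    size_t space_left = free_on_current_page;

    for (size_t i = 0; i < ntuples; i++)
    {
        /* Assumes each tuple fits on an empty page. */
        if (tuple_sizes[i] > space_left)
        {
            pages++;                     /* this tuple starts a new page */
            space_left = USABLE_PER_PAGE;
        }
        space_left -= tuple_sizes[i];
    }
    return pages;
}

int
main(void)
{
    size_t sizes[] = {400, 400, 400, 7000, 120, 5000};

    /* Pretend the current target page has 1000 bytes free. */
    printf("extend by %zu pages\n",
           pages_needed(sizes, sizeof(sizes) / sizeof(sizes[0]), 1000));
    return 0;
}

Recomputing whenever an already partially filled page is used, as mentioned in
the quote, corresponds to calling this again with an updated
free_on_current_page.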
I previously implemented just that kind of size-based heuristic. It's not easy
to get right: you can easily end up with several backends each extending the
relation by quite a bit at the same time (or you re-introduce contention),
which can leave the relation considerably larger than necessary if data loading
stops at some point.

We might want something like that as well eventually, but the approach
implemented in the patchset is precise and thus always a win, so it should be
the baseline.

Greetings,

Andres Freund