On Wed, Sep 28, 2022 at 4:08 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> As far as that goes, I'm entirely prepared to accept a conclusion
> that the benefits of widening relfilenodes justify whatever space
> or speed penalties may exist there. However, we cannot honestly
> make that conclusion if we haven't measured said penalties.
> The same goes for the other issues you raise here.

I generally agree, but the devil is in the details.

I tend to agree with Robert that many individual WAL record types just don't appear frequently enough to matter (it also helps that even the per-record space overhead with wider 56-bit relfilenodes isn't so bad). Just offhand I'd say that ginxlogSplit, ginxlogDeletePage, ginxlogUpdateMeta, gistxlogPageReuse and xl_btree_reuse_page are likely to be in this category (though would be nice to see some numbers for those).

I'm much less sure about the other record types. Any WAL records with a variable number of relfilenode entries seem like they might be more of a problem. But I'm not ready to accept that that cannot be ameliorated in some way. Just for example, it wouldn't be impossible to do some kind of varbyte encoding for some record types. How many times will the cluster actually need billions of relfilenodes? It has to work, but maybe it can be suboptimal from a space overhead perspective.

I'm not saying that we need to do anything fancy just yet. I'm just saying that there definitely *are* options. Maybe it's not really necessary to come up with something like a varbyte encoding, and maybe the complexity it imposes just won't be worth it -- I really have no opinion on that just yet.

--
Peter Geoghegan
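
To make the varbyte idea above a bit more concrete, here is a minimal sketch of one possible encoding: a generic base-128 scheme (7 payload bits per byte, high bit meaning "more bytes follow"). This is purely illustrative and is not taken from any patch in this thread. The names varbyte_encode, varbyte_decode, RELNUMBER_BITS and VARBYTE_MAX_LEN are hypothetical, and the only assumption borrowed from the discussion is that a relfilenode number fits in 56 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption from the thread: relfilenode numbers are at most 56 bits wide. */
#define RELNUMBER_BITS	56
#define VARBYTE_MAX_LEN	8		/* 56 bits / 7 payload bits per byte */

/*
 * Hypothetical helper: encode "value" into "buf", least significant
 * 7-bit group first.  The high bit of each byte is set when more bytes
 * follow.  "buf" must have room for VARBYTE_MAX_LEN bytes.
 * Returns the number of bytes written (1..VARBYTE_MAX_LEN).
 */
size_t
varbyte_encode(uint64_t value, uint8_t *buf)
{
	size_t		n = 0;

	assert(value < (UINT64_C(1) << RELNUMBER_BITS));

	do
	{
		uint8_t		b = (uint8_t) (value & 0x7F);

		value >>= 7;
		if (value != 0)
			b |= 0x80;			/* continuation bit */
		buf[n++] = b;
	} while (value != 0);

	return n;
}

/*
 * Hypothetical helper: decode one value from "buf" (at most "len" bytes).
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or longer than VARBYTE_MAX_LEN bytes.
 */
size_t
varbyte_decode(const uint8_t *buf, size_t len, uint64_t *value)
{
	uint64_t	result = 0;
	size_t		n;

	for (n = 0; n < len && n < VARBYTE_MAX_LEN; n++)
	{
		result |= (uint64_t) (buf[n] & 0x7F) << (7 * n);
		if ((buf[n] & 0x80) == 0)
		{
			*value = result;
			return n + 1;
		}
	}

	return 0;
}
```

With this particular scheme, any value below 2^28 takes at most 4 bytes, and only values at or above 2^49 need the full 8. Whether that trade is worth the extra decoding complexity in WAL records is exactly the open question.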