On Tue, Jun 3, 2025 at 1:24 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
> > I think the proposed patch should be committed and back-patched, after
> > fixing it so that it's pgindent-clean and adding a comment.  Does
> > anyone have strong objection to that?
>
> Not here.  I do wonder if we can't find a more memory-efficient way,
> but I concur that any such change would likely not be back-patch
> material.

Cool. Regarding memory efficiency, I was just discussing radix trees
this morning with Sawada-san and others. "lib/radixtree.h" seems
unsuitable here because it supposes that each node should cover 1 byte
of key material, which seems unsuitable for a 20-byte key, but I
wonder if we could reuse "common/blkreftable.h" which I created for
incremental backup but which seems like it could probably work here as
well.

Implementation sketch: The "limit block" concept would be irrelevant
in this case, so you just wouldn't call BlockRefTableSetLimitBlock.
Instead, you'd call BlockRefTableMarkBlockModified for each
memory-resident block -- the name would be a misnomer, and we might
want to do some renaming. blkreftable.c/h already has infrastructure
to read and write the data structure from and to disk, so we wouldn't
need to reinvent that.

The reason I think this might work out well is that for moderately
sparse block sets, the space usage is about 2 bytes per block, while
for dense sets block sets, it converges to 1 bit per block. If you
make the set of blocks super-duper sparse then it might use more
memory than what we do today, but I'm not sure that's a very realistic
scenario in practice -- you'd probably need to have a gigantic number
of relations in the database and absolutely no locality of access. See
the comment in blkreftable.c just above "#define BLOCKS_PER_CHUNK" for
what the format actually is.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


Reply via email to