Re: XLog size reductions: smaller XLRec block header for PG17

Heikki Linnakangas Fri, 02 Feb 2024 05:53:11 -0800

On 22/01/2024 19:23, Robert Haas wrote:

In the case of this particular patch, I think the problem is that
there's no consensus on the design. There's not a ton of debate on
this thread, but thread [1] linked in the original post contains a lot
of vigorous debate about what the right thing to do is here and I
don't believe we reached any meeting of the minds.


Yeah, so it seems.

It looks like I never replied to
https://www.postgresql.org/message-id/20221019192130.ebjbycpw6bzjry4v%40awork3.anarazel.de
but, FWIW, I agree with Andres that applying the same technique to
multiple fields that are stored together (DB OID, TS OID, rel #, block
#) is unlikely in practice to produce many cases that regress. But the
question for this thread is really more about whether we're OK with
using ad-hoc bit swizzling to reduce the size of xlog records or
whether we want to insist on the use of a uniform varint encoding.
Heikki and Andres both seem to favor the latter. IIRC, I was initially
more optimistic about ad-hoc bit swizzling being a potentially
acceptable technique, but I'm not convinced enough about it to argue
against two very smart committers both of whom know more about
micro-optimizing performance than I do, and nobody else seems to be
making this argument on this thread either, so I just don't really see
how this patch is ever going to go anywhere in its current form.

I don't have a clear idea of how to proceed with this either. Some thoughts I have:

Using varint encoding makes sense for length fields. The common values are small, and if a length of anything is large, then the size of the length field itself is insignificant compared to the actual data.
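
To make the length-field argument concrete, here is a minimal sketch of a LEB128-style varint for 32-bit lengths. The names (varint32_encode, varint32_decode, VARINT32_MAX_BYTES) are made up for illustration; this is not the encoding from the patch or any existing PostgreSQL function, just one common way to do it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not PostgreSQL code: LEB128-style varint with
 * 7 payload bits per byte; the high bit is set on every byte except the last.
 */
#define VARINT32_MAX_BYTES 5

/* Encode 'value' into 'buf' (at least VARINT32_MAX_BYTES long); returns bytes written. */
static size_t
varint32_encode(uint32_t value, uint8_t *buf)
{
    size_t      n = 0;

    assert(buf != NULL);
    while (value >= 0x80)
    {
        buf[n++] = (uint8_t) (value | 0x80);    /* low 7 bits + continuation flag */
        value >>= 7;
    }
    buf[n++] = (uint8_t) value;                 /* final byte, high bit clear */
    return n;
}

/* Decode from 'buf'; returns bytes consumed, or 0 if the input is truncated or overlong. */
static size_t
varint32_decode(const uint8_t *buf, size_t avail, uint32_t *value)
{
    uint32_t    result = 0;
    size_t      n = 0;

    assert(buf != NULL && value != NULL);
    while (n < avail && n < VARINT32_MAX_BYTES)
    {
        uint8_t     b = buf[n];

        result |= (uint32_t) (b & 0x7F) << (7 * n);
        n++;
        if ((b & 0x80) == 0)
        {
            *value = result;
            return n;
        }
    }
    return 0;
}
```

With this scheme a length below 128 takes one byte and a length of 8192 takes two. The worst case is five bytes, one more than a fixed 4-byte field, but that only happens for lengths of 256 MB and up, where the extra byte is noise next to the payload.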

I don't like using varint encoding for OID. They might be small in common cases, but it feels wrong to rely on that. They're just arbitrary numbers. We could pick them randomly, it's just an implementation detail that we use a counter to choose the next one. I really dislike the idea that someone would do a pg_dump + restore, just to get smaller OIDs and smaller WAL as a result.


It does make sense to have a fast-path (small-path?) for 0 OIDs though.

To shrink OID fields, you could refer to earlier WAL records. A special value for "same relation as in previous record", or something like that. Now we're just re-inventing LZ-style compression though. Might as well use LZ4 or Snappy or something to compress the whole WAL stream. It's a bit tricky to get the crash-safety right, but shouldn't be impossible.
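
As a rough illustration of the "same relation as in previous record" idea, here is a sketch with a one-byte tag in front of the relation identifier. Everything in it is hypothetical: the RelRef struct, the tag values and the function names are invented for the example and do not correspond to the actual WAL record format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a relation identifier (tablespace, database, relation). */
typedef struct RelRef
{
    uint32_t    spc_oid;
    uint32_t    db_oid;
    uint32_t    rel_number;
} RelRef;

#define RELREF_FULL          0x00   /* tag byte followed by a full RelRef */
#define RELREF_SAME_AS_PREV  0x01   /* tag byte only: reuse the previous RelRef */
#define RELREF_MAX_BYTES     (1 + sizeof(RelRef))

static int
relref_equal(const RelRef *a, const RelRef *b)
{
    return a->spc_oid == b->spc_oid &&
           a->db_oid == b->db_oid &&
           a->rel_number == b->rel_number;
}

/* Encode 'rel' relative to '*prev' and update '*prev'; returns bytes written. */
static size_t
relref_encode(const RelRef *rel, RelRef *prev, uint8_t *buf)
{
    assert(rel != NULL && prev != NULL && buf != NULL);
    if (relref_equal(rel, prev))
    {
        buf[0] = RELREF_SAME_AS_PREV;
        return 1;
    }
    buf[0] = RELREF_FULL;
    memcpy(buf + 1, rel, sizeof(RelRef));
    *prev = *rel;
    return 1 + sizeof(RelRef);
}

/* Decode into '*rel' and update '*prev'; returns bytes consumed, or 0 on bad input. */
static size_t
relref_decode(const uint8_t *buf, size_t avail, RelRef *prev, RelRef *rel)
{
    assert(buf != NULL && prev != NULL && rel != NULL);
    if (avail < 1)
        return 0;
    if (buf[0] == RELREF_SAME_AS_PREV)
    {
        *rel = *prev;
        return 1;
    }
    if (buf[0] != RELREF_FULL || avail < 1 + sizeof(RelRef))
        return 0;
    memcpy(rel, buf + 1, sizeof(RelRef));
    *prev = *rel;
    return 1 + sizeof(RelRef);
}
```

Note that the decoder only works if its 'prev' state matches the encoder's at the point where reading starts. Any scheme that refers back to earlier records, LZ-style compression included, has to make that state recoverable wherever replay can begin.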


Has anyone seriously considered implementing wholesale compression of WAL?

--
Heikki Linnakangas
Neon (https://neon.tech)
