Hi,

On 05/08/2019 09:26, Andres Freund wrote:
Hi,

On 2019-08-05 00:19:28 +0200, Petr Jelinek wrote:
It carries that information inside the compressed value, like I said in the
other reply, that's why the extra byte.

I'm not convinced that that is a good plan - imo the reference to the
compressed data should carry that information.

I.e. in the case of toast, at least toast pointers should hold enough
information to determine the compression algorithm. And in the case of
WAL, the WAL record should contain that.

Point taken.


For external datums I suggest encoding the compression method as a
distinct VARTAG_ONDISK_COMPRESSED, and then have that include the
compression method as a field.

So the new reads/writes will use this and reads of old format won't change? Sounds fine.


For in-line compressed values (so VARATT_IS_4B_C), doing something
roughly like you did, indicating the type of metadata following using
the high bit sounds reasonable. But I think I'd make it so that if the
highbit is set, the struct is instead entirely different, keeping a full
4 byte byte length, and including the compression type header inside the
struct. Perhaps I'd combine the compression type with the high-bit-set
part?  So when the high bit is set, it'd be something like

{
     int32              vl_len_;                /* varlena header (do not touch 
directly!) */

     /*
      * Actually only 7 bit, the high bit determines whether this
      * is the old compression header (if unset), or this type of header
      * (if set).
      */
     uint8 type;

     /*
      * Stored as uint8[4], to avoid unnecessary alignment padding.
      */
     uint8[4] length;

     char va_data[FLEXIBLE_ARRAY_MEMBER];
}


Won't this break BW compatibility on big-endian (if I understand corretly what you are trying to achieve here)?

I think it's worth spending some effort trying to get this right - we'll
be bound by design choices for a while.


Sure, however I am not in business of redesigning TOAST from scratch right now, even if I do agree that the current header is far from ideal.

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/


Reply via email to