On 12/12/2017 10:33 PM, Robert Haas wrote:
> On Mon, Dec 11, 2017 at 2:53 PM, Tomas Vondra
> <tomas.von...@2ndquadrant.com> wrote:
>> But let me play the devil's advocate for a while and question the
>> usefulness of this approach to compression. Some of the questions were
>> mentioned in the thread before, but I don't think they got the attention
>> they deserve.
> 
> Sure, thanks for chiming in.  I think it is good to make sure we are
> discussing this stuff.
> 
>> But perhaps we should simply make it an initdb option (in which case the
>> whole cluster would simply use e.g. lz4 instead of pglz)?
>>
>> That seems like a much simpler approach - it would only require some
>> ./configure options to add --with-lz4 (and other compression libraries),
>> an initdb option to pick compression algorithm, and probably noting the
>> choice in cluster controldata.
>>
>> No dependencies tracking, no ALTER TABLE issues, etc.
>>
>> Of course, it would not allow using different compression algorithms for
>> different columns (although it might perhaps allow different compression
>> level, to some extent).
>>
>> Conclusion: If we want to offer a simple cluster-wide pglz alternative,
>> perhaps this patch is not the right way to do that.
> 
> I actually disagree with your conclusion here.   I mean, if you do it
> that way, then it has the same problem as checksums: changing
> compression algorithms requires a full dump-and-reload of the
> database, which makes it more or less a non-starter for large
> databases.  On the other hand, with the infrastructure provided by
> this patch, we can have a default_compression_method GUC that will be
> set to 'pglz' initially.  If the user changes it to 'lz4', or we ship
> a new release where the new default is 'lz4', then new tables created
> will use that new setting, but the existing stuff keeps working.  If
> you want to upgrade your existing tables to use lz4 rather than pglz,
> you can change the compression option for those columns to COMPRESS
> lz4 PRESERVE pglz if you want to do it incrementally or just COMPRESS
> lz4 to force a rewrite of an individual table.  That's really
> powerful, and I think users will like it a lot.
> 
> In short, your approach, while perhaps a little simpler to code, seems
> like it is fraught with operational problems which this design avoids.
> 

I agree the checksum-like limitations are annoying and make it
impossible to change the compression algorithm after the cluster is
initialized (although I recall a discussion about addressing that).

So yeah, if such flexibility is considered valuable/important, then the
patch is a better solution.
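
FWIW the reason the incremental path works, as I understand it, is that
each compressed datum records which method produced it, so flipping the
default (or the per-column setting) only affects newly written values.
A toy standalone sketch of that principle (nothing like the patch's
actual on-disk format, all names made up):

    #include <stdint.h>
    #include <stdio.h>

    typedef enum { CM_DEFAULT = 0, CM_PGLZ = 1, CM_LZ4 = 2 } CompressionMethod;

    typedef struct
    {
        uint8_t  method;    /* which compressor produced the payload */
        uint32_t rawsize;   /* uncompressed length; payload would follow */
    } CompressedHeader;

    /* stands in for the proposed default_compression_method GUC */
    static CompressionMethod default_method = CM_PGLZ;

    static CompressedHeader
    compress_datum(uint32_t rawsize, CompressionMethod column_setting)
    {
        CompressedHeader hdr;

        /* the per-column setting wins, otherwise use the default */
        hdr.method = (column_setting != CM_DEFAULT) ? column_setting
                                                    : default_method;
        hdr.rawsize = rawsize;
        /* real code would run the chosen compressor here */
        return hdr;
    }

    static void
    decompress_datum(const CompressedHeader *hdr)
    {
        /* dispatch on what the datum itself says, not on any setting */
        printf("decompress %u bytes with %s\n", hdr->rawsize,
               hdr->method == CM_LZ4 ? "lz4" : "pglz");
    }

    int
    main(void)
    {
        CompressedHeader old_val = compress_datum(100, CM_DEFAULT); /* pglz */

        default_method = CM_LZ4;    /* the default gets flipped to lz4 */

        CompressedHeader new_val = compress_datum(100, CM_DEFAULT); /* lz4 */

        decompress_datum(&old_val); /* values written earlier stay readable */
        decompress_datum(&new_val);
        return 0;
    }

Changing a column to "COMPRESS lz4 PRESERVE pglz" is then just a matter
of allowing both methods on that column instead of forcing a rewrite.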

>> Custom datatype-aware compression (e.g. the tsvector)
>> ----------------------------------------------------------------------
>>
>> Exploiting knowledge of the internal data type structure is a promising
>> way to improve compression ratio and/or performance.
>>
>> The obvious question of course is why shouldn't this be done by the data
>> type code directly, which would also allow additional benefits like
>> operating directly on the compressed values.
>>
>> Another thing is that if the datatype representation changes in some
>> way, the compression method has to change too. So it's tightly coupled
>> to the datatype anyway.
>>
>> This does not really require any new infrastructure, all the pieces are
>> already there.
>>
>> In some cases that may not be quite possible - the datatype may not be
>> flexible enough to support alternative (compressed) representation, e.g.
>> because there are no bits available for "compressed" flag, etc.
>>
>> Conclusion: IMHO if we want to exploit the knowledge of the data type
>> internal structure, perhaps doing that in the datatype code directly
>> would be a better choice.
> 
> I definitely think there's a place for compression built right into
> the data type.  I'm still happy about commit
> 145343534c153d1e6c3cff1fa1855787684d9a38 -- although really, more
> needs to be done there.  But that type of improvement and what is
> proposed here are basically orthogonal.  Having either one is good;
> having both is better.
> 

Why orthogonal?

For example, why couldn't (or shouldn't) the tsvector compression be
done by tsvector code itself? Why should we be doing that at the varlena
level (so that the tsvector code does not even know about it)?

For example, we could keep the datatype EXTERNAL (so that TOAST does not
compress it) and do the compression ourselves, using a custom algorithm.
Of course, that would require a datatype-specific implementation, but so
does tsvector_compress.

It seems to me the main reason is that tsvector simply does not allow us
to do that, as there is no good way to distinguish different internal
formats (e.g. by storing a flag or format version in some sort of
header, etc.).
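
Purely to illustrate what I mean by "does not allow us to do that" - if
the struct had reserved even a couple of header bits for a format
version, the datatype could introduce an alternative (compressed)
representation later and tell the two apart at read time. A hypothetical
layout (this is not the actual TSVector struct):

    #include <stdint.h>

    typedef struct
    {
        int32_t  vl_len_;           /* varlena length word */
        uint32_t size_and_flags;    /* high bits: format, rest: lexeme count */
        /* entries, or a custom-compressed payload, depending on the format */
    } HypotheticalTSVector;

    #define TSV_FORMAT_MASK    0xC0000000u
    #define TSV_FORMAT_PLAIN   0x00000000u
    #define TSV_FORMAT_PACKED  0x40000000u  /* hypothetical compressed layout */

    int
    tsvector_is_packed(const HypotheticalTSVector *tsv)
    {
        return (tsv->size_and_flags & TSV_FORMAT_MASK) == TSV_FORMAT_PACKED;
    }

Without something like those spare bits there is simply no place to put
the "this value uses the new representation" marker.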

> I think there may also be a place for declaring that a particular data
> type has a "privileged" type of TOAST compression; if you use that
> kind of compression for that data type, the data type will do smart
> things, and if not, it will have to decompress in more cases.  But I
> think this infrastructure makes that kind of thing easier, not harder.
> 

I don't quite understand how that would be done. Isn't TOAST meant to be
entirely transparent to the datatypes? I can imagine custom TOAST
compression (which is pretty much what the patch does, after all), but I
don't see how the datatype could do anything smart about it, because it
has no idea which particular compression method was used. And
considering the OIDs of the compression methods can change, I'm not sure
that's fixable.
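
To spell out what I think the "privileged" path would have to do - an
entirely hypothetical sketch, none of these names exist in the patch or
in core:

    #include <stdio.h>

    typedef enum
    {
        CM_PGLZ = 1,
        CM_LZ4 = 2,
        CM_TSVECTOR_SPECIAL = 3     /* the "privileged" method */
    } CompressionMethod;

    typedef struct
    {
        CompressionMethod method;   /* recorded with the toasted value */
        /* compressed bytes would follow */
    } ToastedValue;

    static void
    tsvector_some_operation(const ToastedValue *val)
    {
        if (val->method == CM_TSVECTOR_SPECIAL)
        {
            /* smart path: operate on the compressed form directly */
            printf("operating directly on the compressed form\n");
        }
        else
        {
            /* generic path: decompress first, then operate */
            printf("decompressing (method %d) first\n", (int) val->method);
        }
    }

    int
    main(void)
    {
        ToastedValue a = {CM_TSVECTOR_SPECIAL};
        ToastedValue b = {CM_LZ4};

        tsvector_some_operation(&a);
        tsvector_some_operation(&b);
        return 0;
    }

The branch on val->method is the bit that needs a stable identifier for
the method, which is where the OID issue comes in.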

>> Custom datatype-aware compression with additional column-specific
>> metadata (e.g. the jsonb with external dictionary).
>> ----------------------------------------------------------------------
>>
>> Exploiting redundancy in multiple values in the same column (instead of
>> compressing them independently) is another attractive way to help the
>> compression. It is inherently datatype-aware, but currently can't be
>> implemented directly in datatype code as there's no concept of
>> column-specific storage (e.g. to store dictionary shared by all values
>> in a particular column).
>>
>> I believe any patch addressing this use case would have to introduce
>> such column-specific storage, and any solution doing that would probably
>> need to introduce the same catalogs, etc.
>>
>> The obvious disadvantage of course is that we need to decompress the
>> varlena value before doing pretty much anything with it, because the
>> datatype is not aware of the compression.
>>
>> So I wonder if the patch should instead provide infrastructure for doing
>> that in the datatype code directly.
>>
>> The other question is if the patch should introduce some infrastructure
>> for handling the column context (e.g. column dictionary). Right now,
>> whoever implements the compression has to implement this bit too.
> 
> I agree that having a place to store a per-column compression
> dictionary would be awesome, but I think that could be added later on
> top of this infrastructure.  For example, suppose we stored each
> per-column compression dictionary in a separate file and provided some
> infrastructure for WAL-logging changes to the file on a logical basis
> and checkpointing those updates.  Then we wouldn't be tied to the
> MVCC/transactional issues which storing the blobs in a table would
> have, which seems like a big win.  Of course, it also creates a lot of
> little tiny files inside a directory that already tends to have too
> many files, but maybe with some more work we can figure out a way
> around that problem.  Here again, it seems to me that the proposed
> design is going more in the right direction than the wrong direction:
> if some day we have per-column dictionaries, they will need to be tied
> to specific compression methods on specific columns.  If we already
> have that concept, extending it to do something new is easier than if
> we have to create it from scratch.
> 

Well, it wasn't my goal to suddenly widen the scope of the patch and
require that it add all these pieces. My intent was more to point out
the pieces that will need to be filled in eventually.
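
Just to make the dictionary piece a bit more concrete (a purely
hypothetical sketch, nothing to do with actual jsonb code): the reason
column-specific storage seems unavoidable is that the compressed value
only carries small ids, so it cannot be decoded without the dictionary
shared by the whole column.

    #include <stdio.h>

    /*
     * Shared by all values in the column - this is the piece that needs
     * column-specific storage (catalog, file, whatever we pick).
     */
    typedef struct
    {
        const char *keys[256];
        int         nkeys;
    } ColumnDictionary;

    /* a "compressed" value: object keys replaced by dictionary ids */
    typedef struct
    {
        unsigned char key_ids[8];
        int           nkeys;
    } DictCompressedValue;

    int
    main(void)
    {
        ColumnDictionary    dict = {{"customer_id", "order_date", "total"}, 3};
        DictCompressedValue val = {{0, 2}, 2};
        int                 i;

        /* decoding needs the column-level dictionary, not just the value */
        for (i = 0; i < val.nkeys; i++)
            printf("key: %s\n", dict.keys[val.key_ids[i]]);

        return 0;
    }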

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
