On 06/15/2013 01:56 PM, Andres Freund wrote:
> On 2013-06-15 13:25:49 +0200, Hannu Krosing wrote:
>> On 06/15/2013 02:20 AM, Andres Freund wrote:
>>> On 2013-06-14 17:12:01 -0700, Josh Berkus wrote:
>>>> On 06/14/2013 04:01 PM, Andres Freund wrote:
>>>>> It still contains a GUC, as described in the above message, to control
>>>>> the algorithm used for compressing new tuples, but I think we should
>>>>> remove that GUC after testing.
>>>> Did you add the storage attribute?
>>> No. I think as long as we only have pglz and one new algorithm (even if
>>> that is lz4 instead of the current snappy) we should just always use the
>>> new algorithm. Unless I missed it, nobody seems to have voiced a
>>> contrary position?
>>> For testing/evaluation, the GUC seems to be sufficient.
>> If it is not significantly harder than what you currently do, I'd prefer
>> true pluggable compression support which is
>> a) dynamically configurable, say by using a GUC,
>> and
>> b) self-describing; that is, the compressed data should have enough
>> info to determine how to decompress it.
> Could you perhaps actually read the description and the discussion
> before making wild suggestions? Possibly even the patch.
> Compressed toast datums now *do* have an identifier of the compression
> algorithm used. 
> That's how we can discern between pglz and whatever
> algorithm we come up with.
Claiming that the algorithm will be one of only two (the current one and
"whatever algorithm we come up with") suggests that the identifier is
only one bit wide, which is undoubtedly too little for a "pluggable"
compression API :)


> But those identifiers should be *small* (since they are added to all
> Datums) 
If there is any alignment at all between the datums, then one byte
will be lost in the noise ("remember: nobody will need more than
256 compression algorithms").
OTOH, if you plan to put these format markers inside the compressed
stream and switch compression algorithms while reading it, I am lost.
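To make the arithmetic concrete, here is a minimal C sketch of what a
one-byte algorithm tag could look like. The names and layout are invented
for illustration; this is not the patch's actual on-disk format:

#include <stdint.h>

/*
 * Illustrative only: a single byte is enough to distinguish up to 256
 * compression algorithms when decompressing a toast datum.
 */
typedef enum ToastCompressionIdSketch
{
    TOAST_COMPRESSION_PGLZ = 0,
    TOAST_COMPRESSION_SNAPPY = 1    /* or lz4, per the discussion above */
    /* room for 254 more algorithms in the same byte */
} ToastCompressionIdSketch;

typedef struct CompressedToastHeaderSketch
{
    uint32_t rawsize;       /* original, uncompressed length in bytes */
    uint8_t  algorithm;     /* ToastCompressionIdSketch for this datum */
    /* compressed payload follows */
} CompressedToastHeaderSketch;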
> and they need to be stable, even across pg_upgrade. So I think
> making this user configurable would be a grave error at this point.
"at this point" in what sense ?
>
>> Additionally, it *could* have the property Simon proposed earlier
>> of *uncompressed* pages having some predetermined size, so we
>> could retain the substring() optimisation even on compressed TOAST
>> values.
> We are not changing the toast format here, so I don't think that's
> applicable. That's a completely separate feature.
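For clarity, this is how I picture that (admittedly separate) feature
working. Just a sketch with invented names, not anything in the patch:

#include <stddef.h>

/*
 * Hypothetical: every BLOCK_SIZE bytes of the *uncompressed* value are
 * compressed independently, so substring(off, len) only needs to
 * decompress the blocks overlapping the requested range.
 */
#define BLOCK_SIZE 8192     /* predetermined *uncompressed* block size */

static void
blocks_for_substring(size_t off, size_t len, size_t *first, size_t *last)
{
    /* assumes len > 0 */
    *first = off / BLOCK_SIZE;              /* first block to decompress */
    *last = (off + len - 1) / BLOCK_SIZE;   /* last block to decompress */
    /* all blocks outside [*first, *last] can be skipped entirely */
}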
>
>> The latter could of course also be achieved by adding an offset
>> column to the toast tables.
>> One more idea - if we are already changing the toast table structure,
>> we could introduce the notion of a "compress block", which could run
>> over several storage pages for much improved compression compared
>> to compressing only a single page at a time.
> We aren't changing the toast table structure. And we can't easily do
> so; think of pg_upgrade.
Where was the page for "features rejected because of pg_upgrade"? ;)
> Besides, toast already stores compressed datums over several chunks.
> What would be beneficial is compressing several datums together, but
> that's orders of magnitude more complex and unrelated to this feature.
>
> Greetings,
>
> Andres Freund
>


-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ


