Apologies for the very delayed reply.

> Does unframed LZ4 provide a checksum of the content before compression?


I don't believe so; we would need to add some minimal metadata, such as a
checksum and the uncompressed length.  I think this is still fairly simple
compared to implementing the block format.
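
To make the idea concrete, here is a rough sketch of what such minimal
metadata could look like. Everything in it is an assumption for illustration:
the class name ChecksummedBlock, the 8-byte little-endian header, and the
choice of CRC32 are hypothetical and not part of the Puffin spec. The LZ4
step itself is left out; the compressed bytes are treated as an opaque
payload produced by whichever unframed LZ4 implementation is used.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical envelope (not from the Puffin spec):
// [4-byte LE uncompressed length][4-byte LE CRC32 of uncompressed bytes][raw LZ4 block]
final class ChecksummedBlock {
  static final int HEADER_SIZE = 8;

  // CRC32 of the uncompressed content, truncated to its 32 significant bits.
  static int crc32(byte[] data) {
    CRC32 crc = new CRC32();
    crc.update(data, 0, data.length);
    return (int) crc.getValue();
  }

  // 'compressed' is assumed to be the unframed LZ4 output for 'uncompressed'.
  static byte[] wrap(byte[] uncompressed, byte[] compressed) {
    ByteBuffer buf =
        ByteBuffer.allocate(HEADER_SIZE + compressed.length).order(ByteOrder.LITTLE_ENDIAN);
    buf.putInt(uncompressed.length);
    buf.putInt(crc32(uncompressed));
    buf.put(compressed);
    return buf.array();
  }

  // Lets the reader size the output buffer before decompressing.
  static int uncompressedLength(byte[] envelope) {
    return ByteBuffer.wrap(envelope).order(ByteOrder.LITTLE_ENDIAN).getInt(0);
  }

  // The raw LZ4 block to hand to the decompressor.
  static byte[] payload(byte[] envelope) {
    return Arrays.copyOfRange(envelope, HEADER_SIZE, envelope.length);
  }

  // Checks the decompressed bytes against the length and checksum in the header.
  static void verify(byte[] envelope, byte[] decompressed) {
    ByteBuffer buf = ByteBuffer.wrap(envelope).order(ByteOrder.LITTLE_ENDIAN);
    int expectedLength = buf.getInt(0);
    int expectedCrc = buf.getInt(4);
    if (decompressed.length != expectedLength) {
      throw new IllegalStateException("uncompressed length mismatch");
    }
    if (crc32(decompressed) != expectedCrc) {
      throw new IllegalStateException("checksum mismatch");
    }
  }
}
```

A reader would call uncompressedLength to allocate the output, decompress
payload, and then call verify on the result.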

On Sat, Aug 31, 2024 at 12:11 PM Piotr Findeisen <piotr.findei...@gmail.com>
wrote:

> Hi Micah,
>
> Good point.
> Does unframed LZ4 provide a checksum of the content before compression?
>
> Best
> Piotr
>
>
> On Fri, 30 Aug 2024 at 23:34, Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
>>> The Iceberg implementation was supposed to be based on aircompressor pure
>>> Java implementation https://github.com/airlift/aircompressor/pull/142.
>>> AFAICT, aircompressor started to favor (or be more OK with) native
>>> implementations (because of Project Panama), so adding LZ4 framed
>>> compression might be simpler these days.
>>
>>
>> Since this work was never completed, I'd personally be in favor of
>> deprecating LZ4 framed and using LZ4 without framing, which already has a
>> high-quality native Java implementation.
>>
>> Cheers,
>> Micah
>>
>> On Tue, Aug 27, 2024 at 5:44 AM Piotr Findeisen <
>> piotr.findei...@gmail.com> wrote:
>>
>>> Hi Gabor
>>>
>>> Thanks for creating this discussion thread. This is indeed a good topic
>>> to discuss.
>>>
>>> The idea was to have lightweight compression for the footer for cases
>>> when Puffin files are bigger.
>>> It is true that the implementation doesn't follow the spec yet.
>>> If we remove this from the Puffin spec, we will probably want to add it
>>> later.
>>>
>>> The Iceberg implementation was supposed to be based on
>>> aircompressor pure Java implementation
>>> https://github.com/airlift/aircompressor/pull/142.
>>> AFAICT, aircompressor started to favor (or be more OK with) native
>>> implementations (because of Project Panama), so adding LZ4 framed
>>> compression might be simpler these days.
>>>
>>> I would prefer to spend the effort on completing the compression.
>>>
>>> Best
>>> Piotr
>>>
>>>
>>>
>>>
>>> On Tue, 27 Aug 2024 at 14:29, Gabor Kaszab
>>> <gaborkas...@cloudera.com.invalid> wrote:
>>>
>>>> Hi Iceberg Community,
>>>>
>>>> I saw in the Puffin spec <https://iceberg.apache.org/puffin-spec> that
>>>> the footer of the Puffin file or the blobs themselves could be compressed
>>>> by LZ4. I checked the code
>>>> <https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/puffin/PuffinFormat.java#L110>
>>>> however, and for me it seems that currently LZ4 is not supported.
>>>> My first question is: am I missing anything here?
>>>> The second is: if we in fact don't support LZ4, can I remove it from
>>>> the spec to avoid confusion? (I believe this requires a vote in a separate
>>>> thread)
>>>>
>>>> Thanks,
>>>> Gabor
>>>>
>>>>
