Re: Zstd decoder support

2018-05-17 Thread Dain Sundstrom
Our expectation is maybe in a quarter.  

-dain

> On May 17, 2018, at 11:42 AM, Xiening Dai  wrote:
> 
> Hi Dain,
> 
> Do you have a rough timeline for when the Java zstd compressor will 
> be available? Thanks.
> 
> 
>> On May 7, 2018, at 12:34 PM, Dain Sundstrom  wrote:
>> 
>> The fixes are released in v0.11
>> 
>> -dain
>> 
>>> On May 6, 2018, at 9:36 PM, Xiening Dai  wrote:
>>> 
>>> Thanks for the clarification. It makes sense to wait for your fixes. Thx.
>>> 
>>>> On May 5, 2018, at 1:04 PM, Dain Sundstrom  wrote:
>>>> 
>>>> 
>>>>> On May 5, 2018, at 11:46 AM, Xiening Dai  wrote:
>>>>> 
>>>>>>>> BTW we are about to do a release that fixes a bug with zstd.
>>>>> 
>>>>> I am curious which bug you are referring to. Is it a bug in the Java 
>>>>> implementation, or does it affect C++ as well?
>>>> 
>>>> The bugs were in the Java implementation.  IIRC there were two problems.  
>>>> In v0.10, we added support for zstd concatenated frames, and it had a rare 
>>>> buffer overrun problem.  The second problem has been around since the 
>>>> beginning.  When the file contains checksums they weren’t being validated 
>>>> correctly.  We missed this one because the default native implementation 
>>>> was not adding checksums so the code wasn’t actually being tested.
>>>> 
>>>> -dain
>>> 
>> 
> 



Re: Zstd decoder support

2018-05-17 Thread Xiening Dai
Hi Dain,

Do you have a rough timeline for when the Java zstd compressor will be 
available? Thanks.


> On May 7, 2018, at 12:34 PM, Dain Sundstrom  wrote:
> 
> The fixes are released in v0.11
> 
> -dain
> 
>> On May 6, 2018, at 9:36 PM, Xiening Dai  wrote:
>> 
>> Thanks for the clarification. It makes sense to wait for your fixes. Thx.
>> 
>>> On May 5, 2018, at 1:04 PM, Dain Sundstrom  wrote:
>>> 
>>> 
>>>> On May 5, 2018, at 11:46 AM, Xiening Dai  wrote:
>>>> 
>>>>>>> BTW we are about to do a release that fixes a bug with zstd.
>>>> 
>>>> I am curious which bug you are referring to. Is it a bug in the Java 
>>>> implementation, or does it affect C++ as well?
>>> 
>>> The bugs were in the Java implementation.  IIRC there were two problems.  
>>> In v0.10, we added support for zstd concatenated frames, and it had a rare 
>>> buffer overrun problem.  The second problem has been around since the 
>>> beginning.  When the file contains checksums they weren’t being validated 
>>> correctly.  We missed this one because the default native implementation 
>>> was not adding checksums so the code wasn’t actually being tested.
>>> 
>>> -dain
>> 
> 



Re: Zstd decoder support

2018-05-07 Thread Dain Sundstrom
The fixes are released in v0.11

-dain

> On May 6, 2018, at 9:36 PM, Xiening Dai  wrote:
> 
> Thanks for the clarification. It makes sense to wait for your fixes. Thx.
> 
>> On May 5, 2018, at 1:04 PM, Dain Sundstrom  wrote:
>> 
>> 
>>> On May 5, 2018, at 11:46 AM, Xiening Dai  wrote:
>>> 
>>>>>> BTW we are about to do a release that fixes a bug with zstd.
>>> 
>>> I am curious which bug you are referring to. Is it a bug in the Java 
>>> implementation, or does it affect C++ as well?
>> 
>> The bugs were in the Java implementation.  IIRC there were two problems.  In 
>> v0.10, we added support for zstd concatenated frames, and it had a rare 
>> buffer overrun problem.  The second problem has been around since the 
>> beginning.  When the file contains checksums they weren’t being validated 
>> correctly.  We missed this one because the default native implementation was 
>> not adding checksums so the code wasn’t actually being tested.
>> 
>> -dain
> 



Re: Zstd decoder support

2018-05-06 Thread Xiening Dai
Thanks for the clarification. It makes sense to wait for your fixes. Thx.

> On May 5, 2018, at 1:04 PM, Dain Sundstrom  wrote:
> 
> 
>> On May 5, 2018, at 11:46 AM, Xiening Dai  wrote:
>> 
>>>>> BTW we are about to do a release that fixes a bug with zstd.
>> 
>> I am curious which bug you are referring to. Is it a bug in the Java 
>> implementation, or does it affect C++ as well?
> 
> The bugs were in the Java implementation.  IIRC there were two problems.  In 
> v0.10, we added support for zstd concatenated frames, and it had a rare 
> buffer overrun problem.  The second problem has been around since the 
> beginning.  When the file contains checksums they weren’t being validated 
> correctly.  We missed this one because the default native implementation was 
> not adding checksums so the code wasn’t actually being tested.
> 
> -dain



Re: Zstd decoder support

2018-05-05 Thread Dain Sundstrom

> On May 5, 2018, at 11:46 AM, Xiening Dai  wrote:
> 
>>>> BTW we are about to do a release that fixes a bug with zstd.
> 
> I am curious which bug you are referring to. Is it a bug in the Java 
> implementation, or does it affect C++ as well?

The bugs were in the Java implementation.  IIRC there were two problems.  In 
v0.10, we added support for zstd concatenated frames, and it had a rare buffer 
overrun problem.  The second problem has been around since the beginning.  When 
the file contains checksums they weren’t being validated correctly.  We missed 
this one because the default native implementation was not adding checksums so 
the code wasn’t actually being tested.

-dain
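
For readers unfamiliar with the two features mentioned above, here is a minimal
sketch of how concatenated frames and the optional per-frame checksum appear in
the byte layout. It follows the Zstandard frame format (RFC 8878) and is not the
aircompressor code. The names frame_header_size, walk_frames and raw_frame are
hypothetical. The sketch only walks frame and block headers: it does not
decompress anything, does not handle skippable frames, and does not verify the
checksum (which the format defines as the low 32 bits of XXH64 over the
decompressed content).

```python
import struct

ZSTD_MAGIC = 0xFD2FB528  # stored little-endian as bytes 28 B5 2F FD


def frame_header_size(data, pos):
    """Return (header_size, has_checksum) for the frame starting at pos."""
    (magic,) = struct.unpack_from("<I", data, pos)
    if magic != ZSTD_MAGIC:
        raise ValueError(f"not a zstd frame at offset {pos}")
    descriptor = data[pos + 4]
    fcs_flag = descriptor >> 6             # Frame_Content_Size_flag
    single_segment = (descriptor >> 5) & 1
    has_checksum = bool((descriptor >> 2) & 1)
    dict_id_flag = descriptor & 3

    size = 5  # magic number + frame header descriptor
    if not single_segment:
        size += 1  # Window_Descriptor is present
    size += (0, 1, 2, 4)[dict_id_flag]
    if fcs_flag == 0:
        size += 1 if single_segment else 0
    else:
        size += (2, 4, 8)[fcs_flag - 1]
    return size, has_checksum


def walk_frames(data):
    """Yield (offset, length, has_checksum) for each concatenated frame."""
    pos = 0
    while pos < len(data):
        start = pos
        header_size, has_checksum = frame_header_size(data, pos)
        pos += header_size
        while True:
            if pos + 3 > len(data):
                raise ValueError("truncated block header")
            header = int.from_bytes(data[pos:pos + 3], "little")
            pos += 3
            last_block = header & 1
            block_type = (header >> 1) & 3
            block_size = header >> 3
            if block_type == 3:
                raise ValueError("reserved block type")
            # An RLE block stores one byte that is repeated block_size times.
            pos += 1 if block_type == 1 else block_size
            if last_block:
                break
        if has_checksum:
            pos += 4  # 32-bit content checksum follows the last block
        if pos > len(data):
            raise ValueError("truncated frame")
        yield start, pos - start, has_checksum


def raw_frame(payload, with_checksum):
    """Demo helper: single-segment frame with one raw block (payload <= 255 bytes).

    The checksum bytes are a zero placeholder, not a real XXH64 value.
    """
    descriptor = 0x20 | (0x04 if with_checksum else 0)
    header = struct.pack("<IBB", ZSTD_MAGIC, descriptor, len(payload))
    block = ((len(payload) << 3) | 1).to_bytes(3, "little") + payload
    return header + block + (b"\0" * 4 if with_checksum else b"")


demo = raw_frame(b"hello", True) + raw_frame(b"orc", False)
for offset, length, has_checksum in walk_frames(demo):
    print(offset, length, has_checksum)
```

A decoder that stops after the first frame, or that skips the four checksum
bytes without comparing them, would still appear to work on input like this,
which is consistent with how both problems could go unnoticed in testing.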

Re: Zstd decoder support

2018-05-05 Thread Xiening Dai
>>> BTW we are about to do a release that fixes a bug with zstd.

I am curious which bug you are referring to. Is it a bug in the Java 
implementation, or does it affect C++ as well?


> On May 4, 2018, at 7:56 PM, Dain Sundstrom  wrote:
> 
> 0.11 is released.
> 
> -dain
> 
> Sent from my iPhone
> 
>> On May 4, 2018, at 1:41 PM, Owen O'Malley  wrote:
>> 
>> I just upgraded ORC to use aircompressor 0.10. I assume we'll want to move
>> to 0.11 before we use zstd?
>> 
>> .. Owen
>> 
>>> On Fri, May 4, 2018 at 12:49 PM, Dain Sundstrom  wrote:
>>> 
>>> The maintained location (and the version we use in prod) is:
>>> https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/zstd
>>> 
>>> We plan on writing the compressor soon as we need it for our production
>>> systems.
>>> 
>>> BTW we are about to do a release that fixes a bug with zstd.
>>> 
>>> -dain
>>> 
>>>> On May 4, 2018, at 11:19 AM, Xiening Dai  wrote:
>>>> 
>>>> Hi all,
>>>> 
>>>> I think the major reason that we don’t support the zstd compressor today
>>>> is that there’s currently no native Java library. But I do see a Java
>>>> decompressor in the Presto code base:
>>>> 
>>>> https://github.com/prestodb/presto/blob/8f4e5bb9340890f01291ee1b777a1b2b921a90c4/presto-orc/src/main/java/com/facebook/presto/orc/zstd/ZstdDecompressor.java
>>>> 
>>>> I wonder if we can just integrate this so that we at least have the
>>>> ability to decompress. I believe this is valuable, as there are ORC files
>>>> compressed with zstd, including some Presto workloads and some of the
>>>> workloads in our installation (generated by the C++ writer with zstd
>>>> compression).
>>>> 
>>>> 
>>> 
>>> 



Re: Zstd decoder support

2018-05-04 Thread Dain Sundstrom
0.11 is released.

-dain

Sent from my iPhone

> On May 4, 2018, at 1:41 PM, Owen O'Malley  wrote:
>
> I just upgraded ORC to use aircompressor 0.10. I assume we'll want to move
> to 0.11 before we use zstd?
>
> .. Owen
>
>> On Fri, May 4, 2018 at 12:49 PM, Dain Sundstrom  wrote:
>>
>> The maintained location (and the version we use in prod) is:
>> https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/zstd
>>
>> We plan on writing the compressor soon as we need it for our production
>> systems.
>>
>> BTW we are about to do a release that fixes a bug with zstd.
>>
>> -dain
>>
>>> On May 4, 2018, at 11:19 AM, Xiening Dai  wrote:
>>>
>>> Hi all,
>>>
>>> I think the major reason that we don’t support the zstd compressor today
>>> is that there’s currently no native Java library. But I do see a Java
>>> decompressor in the Presto code base:
>>>
>>> https://github.com/prestodb/presto/blob/8f4e5bb9340890f01291ee1b777a1b2b921a90c4/presto-orc/src/main/java/com/facebook/presto/orc/zstd/ZstdDecompressor.java
>>>
>>> I wonder if we can just integrate this so that we at least have the
>>> ability to decompress. I believe this is valuable, as there are ORC files
>>> compressed with zstd, including some Presto workloads and some of the
>>> workloads in our installation (generated by the C++ writer with zstd
>>> compression).
>>>
>>>
>>
>>


Re: Zstd decoder support

2018-05-04 Thread Owen O'Malley
I just upgraded ORC to use aircompressor 0.10. I assume we'll want to move
to 0.11 before we use zstd?

.. Owen

On Fri, May 4, 2018 at 12:49 PM, Dain Sundstrom  wrote:

> The maintained location (and the version we use in prod) is:
> https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/zstd
>
> We plan on writing the compressor soon as we need it for our production
> systems.
>
> BTW we are about to do a release that fixes a bug with zstd.
>
> -dain
>
> > On May 4, 2018, at 11:19 AM, Xiening Dai  wrote:
> >
> > Hi all,
> >
> > I think the major reason that we don’t support the zstd compressor today
> > is that there’s currently no native Java library. But I do see a Java
> > decompressor in the Presto code base:
> >
> > https://github.com/prestodb/presto/blob/8f4e5bb9340890f01291ee1b777a1b2b921a90c4/presto-orc/src/main/java/com/facebook/presto/orc/zstd/ZstdDecompressor.java
> >
> > I wonder if we can just integrate this so that we at least have the
> > ability to decompress. I believe this is valuable, as there are ORC files
> > compressed with zstd, including some Presto workloads and some of the
> > workloads in our installation (generated by the C++ writer with zstd
> > compression).
> >
> >
>
>


Re: Zstd decoder support

2018-05-04 Thread Dain Sundstrom
The maintained location (and the version we use in prod) is: 
https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/zstd
 

We plan on writing the compressor soon as we need it for our production systems.

BTW we are about to do a release that fixes a bug with zstd.

-dain

> On May 4, 2018, at 11:19 AM, Xiening Dai  wrote:
> 
> Hi all,
> 
> I think the major reason that we don’t support the zstd compressor today is 
> that there’s currently no native Java library. But I do see a Java 
> decompressor in the Presto code base:
> 
> https://github.com/prestodb/presto/blob/8f4e5bb9340890f01291ee1b777a1b2b921a90c4/presto-orc/src/main/java/com/facebook/presto/orc/zstd/ZstdDecompressor.java
> 
> I wonder if we can just integrate this so that we at least have the ability 
> to decompress. I believe this is valuable, as there are ORC files compressed 
> with zstd, including some Presto workloads and some of the workloads in our 
> installation (generated by the C++ writer with zstd compression).
> 
>