Re: Zstd decoder support
We expect it in maybe a quarter.

-dain

> On May 17, 2018, at 11:42 AM, Xiening Dai wrote:
>
> Hi Dain,
>
> Do you have a rough timeline for when the Java zstd compressor will be
> available? Thanks.
Re: Zstd decoder support
Hi Dain,

Do you have a rough timeline for when the Java zstd compressor will be available? Thanks.

> On May 7, 2018, at 12:34 PM, Dain Sundstrom wrote:
>
> The fixes are released in v0.11
>
> -dain
Re: Zstd decoder support
The fixes are released in v0.11.

-dain

> On May 6, 2018, at 9:36 PM, Xiening Dai wrote:
>
> Thanks for clarification. It makes sense to wait for your fixes. Thx.
Re: Zstd decoder support
Thanks for the clarification. It makes sense to wait for your fixes. Thx.

> On May 5, 2018, at 1:04 PM, Dain Sundstrom wrote:
>
> The bugs were in the Java implementation. IIRC there were two problems. In
> v0.10, we added support for zstd concatenated frames, and it had a rare
> buffer overrun problem. The second problem has been around since the
> beginning. When the file contains checksums they weren’t being validated
> correctly. We missed this one because the default native implementation was
> not adding checksums so the code wasn’t actually being tested.
>
> -dain
Re: Zstd decoder support
> On May 5, 2018, at 11:46 AM, Xiening Dai wrote:
>
>> BTW we are about to do a release that fixes a bug with zstd.
>
> I am curious which bug you are referring to. Is it a bug with Java
> implementation or it affects C++ as well?

The bugs were in the Java implementation. IIRC there were two problems. The first came in with v0.10: we added support for zstd concatenated frames, and that code had a rare buffer overrun. The second problem has been around since the beginning: when a file contained checksums, they weren’t being validated correctly. We missed this one because the default native implementation was not adding checksums, so the code wasn’t actually being tested.

-dain
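
For readers unfamiliar with the two features mentioned above: a zstd stream may hold several frames back to back ("concatenated frames"), and each frame header carries a flag saying whether a 4-byte content checksum follows the last block. Below is a minimal sketch of walking such a stream, assuming the frame layout described in the zstd format specification (RFC 8878). The names `ZstdFrameScan`, `FrameInfo`, and `scan` are hypothetical, and this is not aircompressor's code. It only finds frame boundaries and the checksum flag. It does not decompress, verify checksums, or handle skippable frames.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: walks zstd frame boundaries per the layout in RFC 8878.
final class ZstdFrameScan {
    // Frame magic number; stored little-endian in the stream as 28 B5 2F FD.
    static final int MAGIC = 0xFD2FB528;
    // Dictionary_ID field size in bytes, indexed by Dictionary_ID_flag.
    private static final int[] DICT_ID_BYTES = {0, 1, 2, 4};

    static final class FrameInfo {
        final int offset;          // where the frame starts in the input
        final int length;          // total frame size, checksum included
        final boolean hasChecksum; // Content_Checksum_flag from the header

        FrameInfo(int offset, int length, boolean hasChecksum) {
            this.offset = offset;
            this.length = length;
            this.hasChecksum = hasChecksum;
        }
    }

    // Returns one entry per frame found in a stream of concatenated frames.
    static List<FrameInfo> scan(byte[] data) {
        List<FrameInfo> frames = new ArrayList<>();
        int pos = 0;
        while (pos < data.length) {
            int start = pos;
            if (readLE(data, pos, 4) != MAGIC) {
                throw new IllegalArgumentException("not a zstd frame at offset " + pos);
            }
            pos += 4;

            int descriptor = readLE(data, pos, 1);
            pos += 1;
            int fcsFlag = descriptor >>> 6;                    // Frame_Content_Size_flag
            boolean singleSegment = (descriptor & 0x20) != 0;  // Single_Segment_flag
            boolean hasChecksum = (descriptor & 0x04) != 0;    // Content_Checksum_flag
            int dictIdFlag = descriptor & 0x03;                // Dictionary_ID_flag

            if (!singleSegment) {
                pos += 1; // Window_Descriptor is present only without single-segment
            }
            pos += DICT_ID_BYTES[dictIdFlag];
            // Frame_Content_Size: 0 or 1 byte for flag 0, otherwise 2, 4, or 8 bytes.
            pos += (fcsFlag == 0) ? (singleSegment ? 1 : 0) : (1 << fcsFlag);

            boolean last = false;
            while (!last) {
                int header = readLE(data, pos, 3); // 3-byte block header
                pos += 3;
                last = (header & 1) != 0;
                int type = (header >>> 1) & 0x03;
                int size = header >>> 3;
                if (type == 3) {
                    throw new IllegalArgumentException("reserved block type");
                }
                // An RLE block (type 1) stores one byte; raw and compressed store size bytes.
                pos += (type == 1) ? 1 : size;
            }
            if (hasChecksum) {
                pos += 4; // low 32 bits of XXH64 over the decompressed content
            }
            if (pos > data.length) {
                throw new IllegalArgumentException("truncated frame at offset " + start);
            }
            frames.add(new FrameInfo(start, pos - start, hasChecksum));
        }
        return frames;
    }

    // Reads `count` (1 to 4) bytes as a little-endian integer, with bounds checking.
    private static int readLE(byte[] data, int pos, int count) {
        if (pos < 0 || pos + count > data.length) {
            throw new IllegalArgumentException("truncated input at offset " + pos);
        }
        int value = 0;
        for (int i = 0; i < count; i++) {
            value |= (data[pos + i] & 0xFF) << (8 * i);
        }
        return value;
    }
}
```

A decoder has to loop like this until the input is exhausted rather than stop after the first frame, and it has to consume (and ideally verify) the trailing checksum whenever the flag is set. If the test data never sets the flag, that path never runs, which matches the reason given above for the second bug going unnoticed.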
Re: Zstd decoder support
> BTW we are about to do a release that fixes a bug with zstd.

I am curious which bug you are referring to. Is it a bug in the Java implementation, or does it affect C++ as well?

> On May 4, 2018, at 7:56 PM, Dain Sundstrom wrote:
>
> 0.11 is released.
>
> -dain
Re: Zstd decoder support
0.11 is released.

-dain

> On May 4, 2018, at 1:41 PM, Owen O'Malley wrote:
>
> I just upgraded ORC to use aircompressor 0.10. I assume we'll want to move
> to 0.11 before we use zstd?
>
> .. Owen
Re: Zstd decoder support
I just upgraded ORC to use aircompressor 0.10. I assume we'll want to move to 0.11 before we use zstd?

.. Owen

On Fri, May 4, 2018 at 12:49 PM, Dain Sundstrom wrote:

> The maintained location (and the version we use in prod) is:
> https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/zstd
>
> We plan on writing the compressor soon as we need it for our production
> systems.
>
> BTW we are about to do a release that fixes a bug with zstd.
>
> -dain
Re: Zstd decoder support
The maintained location (and the version we use in prod) is:

https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/zstd

We plan on writing the compressor soon, as we need it for our production systems.

BTW we are about to do a release that fixes a bug with zstd.

-dain

> On May 4, 2018, at 11:19 AM, Xiening Dai wrote:
>
> Hi all,
>
> I think the major reason that we don’t support the zstd compressor today is
> that there’s currently no native Java library. But I do see a Java
> decompressor in the Presto code base -
>
> https://github.com/prestodb/presto/blob/8f4e5bb9340890f01291ee1b777a1b2b921a90c4/presto-orc/src/main/java/com/facebook/presto/orc/zstd/ZstdDecompressor.java
>
> I wonder if we can just integrate this so at least we have the ability to
> decompress. I believe this is valuable as there are ORC files compressed
> with zstd, including some Presto workloads and some of the workloads in our
> installation (generated by the C++ writer with zstd compression).