Thank you so much, Luben! Here is the PR:
<https://github.com/apache/parquet-mr/pull/793>
Please have a look!

On Wed, May 20, 2020 at 6:51 PM Любен <karave...@gmail.com> wrote:

> Hi,
>
> I don't know of any performance or correctness problems with Zstd-JNI. It
> tracks the upstream (the native part) very closely and tries to expose most
> of its functionality. Regarding the streaming interfaces, assuming that you
> are going to use them, there are currently two approaches:
>
> - ZstdInputStream/ZstdOutputStream filters that decompress/compress
> streams, similar to the Gzip implementation from the standard library.
> - Variants that work with direct buffers. If they fit how your code is
> structured, they may be slightly faster.
>
> If you have any specific questions, please let me know. Also, you can send
> me your PR when it's ready; I may have suggestions.
>
> BTW, it's strange Hadoop decided to reimplement it their own way. The rest
> of the ecosystem is using Zstd-JNI, e.g. Spark, Flink, Cassandra, etc.
>
> Regards,
> luben
>
>
>
>
> On Thu, May 21, 2020 at 2:34 AM Xinli shang <sha...@uber.com> wrote:
>
>> Hi all,
>>
>> I see parquet-mr has been using ZSTD-JNI
>> <https://github.com/luben/zstd-jni> for the parquet-cli
>> <https://github.com/apache/parquet-mr/blob/master/parquet-cli/pom.xml#L48>
>> project. Using this JNI binding is a clean way to test ZSTD without the
>> Hadoop implementation, especially when testing on localhost. I am
>> wondering whether we could promote it to the parquet-hadoop project, as
>> ZSTD is becoming more and more popular. I have a prototype working, but I
>> would like to ask whether anybody knows of any issues (performance,
>> reliability, etc.) with ZSTD-JNI <https://github.com/luben/zstd-jni>.
>> Any feedback on using this JNI is welcome.
>>
>> BTW, I am also trying out the AirCompressor
>> <https://github.com/airlift/aircompressor> approach, but it seems the
>> ZSTD compression level is not adjustable there.
>>
>> --
>> Xinli Shang
>>
>
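
As a rough illustration of the stream-filter approach described in the quoted reply, here is a minimal sketch. It deliberately uses the JDK's GZIPOutputStream/GZIPInputStream, which the reply names as the analogous API, so that it runs without extra dependencies. The class name StreamFilterRoundTrip and its helper methods are hypothetical. The assumption is that, with zstd-jni on the classpath, com.github.luben.zstd.ZstdOutputStream and ZstdInputStream could be swapped in at the two marked lines. This is not taken from the PR.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper showing the "filter stream" pattern: a compressing
// stream wraps a plain OutputStream, a decompressing one wraps an InputStream.
class StreamFilterRoundTrip {

    // Compress a byte array by writing it through a filter stream.
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Assumption: with zstd-jni this line would use ZstdOutputStream(sink).
        try (OutputStream out = new GZIPOutputStream(sink)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the filter stream above flushes the trailer into the sink.
        return sink.toByteArray();
    }

    // Decompress by reading through the matching input filter stream.
    static byte[] decompress(byte[] packed) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        // Assumption: with zstd-jni this line would use ZstdInputStream(...).
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                result.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "parquet page bytes".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println("round trip ok: " + Arrays.equals(original, restored));
    }
}
```

Because the calling code only sees OutputStream and InputStream, switching codecs in a sketch like this touches just the two constructor calls.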

-- 
Xinli Shang
