Hi Falak,

I was able to set the compression level in Spark using
spark.io.compression.zstd.level.
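
As a rough illustration, here is a minimal sketch of how that property could be passed on the command line with spark-submit's --conf flag. The helper name zstd_level_conf_args and the script name job.py are hypothetical, and the [1, 22] bounds are simply the ZSTD range quoted further down in this thread. Whether the setting changes Parquet output in a particular setup is something to verify, not something this sketch guarantees.

```python
from typing import List

# Bounds taken from the ZSTD range quoted in this thread ([1, 22]).
ZSTD_MIN_LEVEL = 1
ZSTD_MAX_LEVEL = 22

# Property name as given in the message above.
ZSTD_LEVEL_KEY = "spark.io.compression.zstd.level"


def zstd_level_conf_args(level: int) -> List[str]:
    """Hypothetical helper: build spark-submit arguments for the zstd level.

    The level is range-checked here because a caller cannot assume that the
    layers below reject out-of-range values.
    """
    if isinstance(level, bool) or not isinstance(level, int):
        raise TypeError("compression level must be an int")
    if not ZSTD_MIN_LEVEL <= level <= ZSTD_MAX_LEVEL:
        raise ValueError(
            f"zstd level {level} is outside [{ZSTD_MIN_LEVEL}, {ZSTD_MAX_LEVEL}]"
        )
    return ["--conf", f"{ZSTD_LEVEL_KEY}={level}"]


if __name__ == "__main__":
    # job.py is a placeholder for the actual application script.
    print(" ".join(["spark-submit", *zstd_level_conf_args(19), "job.py"]))
```

Comparing the output size of the same job at two clearly different levels (for example 1 and 19) is a quick way to see whether the value is actually being picked up.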

Cheers, Fokko

On Thu, 17 Oct 2019 at 20:53, Radev, Martin <[email protected]> wrote:

> Hi Falak,
>
>
> I was one of the people who recently exposed this to Arrow but this is not
> part of the Parquet specification.
>
> In particular, any implementation for writing parquet files can decide
> whether to expose this or select a reasonable value internally.
>
>
> If you're using Arrow, you would have to read the documentation of the
> specified compressor. Arrow doesn't check whether the specified
> compression level is within the range supported by the codec. For
> ZSTD, the range should be [1, 22].
>
> Let me know if you're using Arrow and I can check locally that there isn't
> by any chance a bug with propagating the value. At the moment there are
> only smoke tests that nothing crashes.
>
>
> Regards,
>
> Martin
> ------------------------------
> *From:* Falak Kansal <[email protected]>
> *Sent:* Thursday, October 17, 2019 4:43:54 PM
> *To:* Driesprong, Fokko
> *Cc:* [email protected]
> *Subject:* Re: custom CompressionCodec support
>
> Hi Fokko,
>
> Thanks for replying, yes sure.
> The problem we are facing is that with Parquet zstd we are not able to
> control the compression level. We tried setting different compression
> levels, but it doesn't make any difference in the size. We tested and made
> sure that we are getting the same compression level in
> *ZStandardCompressor* as we are setting in the configuration file. Are we
> missing something? How can we set a different compression level for zstd?
> Help would be appreciated.
>
> Thanks
> Falak
>
> On Thu, Oct 17, 2019 at 7:47 PM Driesprong, Fokko <[email protected]>
> wrote:
>
> > Hi Manik,
> >
> > The supported compression codecs that ship with Parquet are tested and
> > validated in the CI pipeline. Sometimes there are issues with compressors,
> > therefore they are not easily pluggable. Feel free to open up a PR to the
> > project if you believe there are compressors missing, then we can have a
> > discussion.
> >
> > The set of supported codecs is part of the Thrift definition:
> >
> > https://github.com/apache/parquet-format/blob/37bdba0a18cff18da706a0d353c65e726c8edca6/src/main/thrift/parquet.thrift#L470-L478
> >
> > Hope this clarifies the design decision.
> >
> > Cheers, Fokko
> >
> > On Tue, 15 Oct 2019 at 11:52, Manik Singla <[email protected]> wrote:
> >
> >> Hi
> >>
> >> The current Java code is not open to using a custom compressor.
> >> I believe reads and writes are mostly done by the same team/company. In
> >> that case, it would be beneficial to add this support so that users can
> >> plug in a new compressor easily instead of making local changes, which
> >> are prone to issues across version upgrades.
> >>
> >> Do you think it would be worth adding?
> >>
> >> Regards
> >> Manik Singla
> >> +91-9996008893
> >> +91-9665639677
> >>
> >> "Life doesn't consist in holding good cards but playing those you hold
> >> well."
> >>
> >
>
