Hi Falak,

I was able to set the compression level in Spark using the
spark.io.compression.zstd.level property.
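
For illustration, here is a minimal sketch of passing that property at submit time with spark-submit's --conf flag. The level 19 and the script name job.py are hypothetical placeholders, and whether this property influences the size of the Parquet output may depend on the Spark and Parquet versions in use.

```python
import shlex


def zstd_level_conf(level):
    """Return spark-submit arguments that set the zstd level property."""
    # The property name is the one quoted in this message.
    return ["--conf", f"spark.io.compression.zstd.level={level}"]


# Hypothetical level and job script, used only to show the shape of the command.
command = ["spark-submit", *zstd_level_conf(19), "job.py"]
print(shlex.join(command))
```

The same key/value pair could also be set through the session builder or spark-defaults.conf; the command line is used here only because it needs no running cluster to show.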

Cheers, Fokko

On Thu, 17 Oct 2019 at 20:53, Radev, Martin <[email protected]> wrote:

> Hi Falak,
>
> I was one of the people who recently exposed this to Arrow, but this is
> not part of the Parquet specification.
>
> In particular, any implementation for writing Parquet files can decide
> whether to expose this or select a reasonable value internally.
>
> If you're using Arrow, you would have to read the documentation of the
> specified compressor. Arrow doesn't check whether the specified
> compression level is within the range supported by the codec. For ZSTD,
> the range should be [1, 22].
>
> Let me know if you're using Arrow and I can check locally that there
> isn't by any chance a bug with propagating the value. At the moment
> there are only smoke tests that nothing crashes.
>
> Regards,
>
> Martin
> ------------------------------
> *From:* Falak Kansal <[email protected]>
> *Sent:* Thursday, October 17, 2019 4:43:54 PM
> *To:* Driesprong, Fokko
> *Cc:* [email protected]
> *Subject:* Re: custom CompressionCodec support
>
> Hi Fokko,
>
> Thanks for replying, yes sure.
> The problem we are facing is that with Parquet zstd we are not able to
> control the compression level. We tried setting different compression
> levels, but it doesn't make any difference in the size. We have tested
> and made sure that we are getting the same compression level in
> *ZStandardCompressor* as we are setting in the configuration file. Are
> we missing something? How can we set a different compression level for
> zstd? Help would be appreciated.
>
> Thanks
> Falak
>
> On Thu, Oct 17, 2019 at 7:47 PM Driesprong, Fokko <[email protected]>
> wrote:
>
> > Hi Manik,
> >
> > The supported compression codecs that ship with Parquet are tested and
> > validated in the CI pipeline. Sometimes there are issues with
> > compressors, therefore they are not easily pluggable. Feel free to
> > open up a PR to the project if you believe there are compressors
> > missing, then we can have a discussion.
> >
> > It is part of the Thrift definition:
> > https://github.com/apache/parquet-format/blob/37bdba0a18cff18da706a0d353c65e726c8edca6/src/main/thrift/parquet.thrift#L470-L478
> >
> > Hope this clarifies the design decision.
> >
> > Cheers, Fokko
> >
> > On Tue, 15 Oct 2019 at 11:52, Manik Singla <[email protected]> wrote:
> >
> >> Hi
> >>
> >> The current Java code is not open to using a custom compressor.
> >> I believe reads and writes are mostly done by the same team/company.
> >> In that case, it would be beneficial to add support so that a user
> >> can plug in a new compressor easily instead of making local changes,
> >> which are prone to issues across version upgrades.
> >>
> >> Do you think it would be worth adding?
> >>
> >> Regards
> >> Manik Singla
> >> +91-9996008893
> >> +91-9665639677
> >>
> >> "Life doesn't consist in holding good cards but playing those you
> >> hold well."
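
Regarding Martin's note that Arrow does not check the compression level against the codec's range: one way to guard against an out-of-range value is to validate it before handing it to the writer. The sketch below is only an illustration, not part of Arrow. The helper names checked_zstd_level and write_zstd_parquet are hypothetical, the [1, 22] bounds are the ones quoted in the thread, and the write step assumes pyarrow is installed and uses the compression and compression_level arguments of pyarrow.parquet.write_table.

```python
ZSTD_MIN_LEVEL = 1   # lower bound quoted in the thread
ZSTD_MAX_LEVEL = 22  # upper bound quoted in the thread


def checked_zstd_level(level):
    """Return level unchanged if it lies in [1, 22], otherwise raise."""
    if not ZSTD_MIN_LEVEL <= level <= ZSTD_MAX_LEVEL:
        raise ValueError(
            f"zstd level {level} is outside "
            f"[{ZSTD_MIN_LEVEL}, {ZSTD_MAX_LEVEL}]"
        )
    return level


def write_zstd_parquet(table, path, level):
    """Write a pyarrow Table as Parquet with a validated zstd level."""
    # Imported lazily so the range check above works without pyarrow.
    import pyarrow.parquet as pq

    pq.write_table(
        table,
        path,
        compression="zstd",
        compression_level=checked_zstd_level(level),
    )
```

Writing the same table at two clearly different levels (for example 1 and 19) and comparing the file sizes is a quick way to see whether the value is actually being propagated.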
