We checked several levels: 3, 7, 10, and 19. I can retry the experiments with one or two more levels if that would help.
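For concreteness, this is a minimal sketch of how we set the level on our side, assuming parquet-mr's example writer and Hadoop's ZStandardCodec. The path, schema, and data are made up for illustration; the property name io.compression.codec.zstd.level is the one from Hadoop's CommonConfigurationKeys, so please verify it against your Hadoop version:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.example.data.simple.SimpleGroupFactory;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.example.ExampleParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

public class ZstdLevelTest {
  public static void main(String[] args) throws Exception {
    // Illustrative one-column schema.
    MessageType schema = MessageTypeParser.parseMessageType(
        "message example { required binary payload (UTF8); }");

    Configuration conf = new Configuration();
    // Hadoop's ZStandardCodec reads this property and hands the level
    // to the native ZStandardCompressor in init().
    conf.setInt("io.compression.codec.zstd.level", 19);

    try (ParquetWriter<Group> writer = ExampleParquetWriter
        .builder(new Path("/tmp/zstd-level-19.parquet"))  // illustrative path
        .withConf(conf)
        .withCompressionCodec(CompressionCodecName.ZSTD)
        .withType(schema)
        .build()) {
      SimpleGroupFactory groups = new SimpleGroupFactory(schema);
      for (int i = 0; i < 100_000; i++) {
        writer.write(groups.newGroup().append("payload", "row-" + i));
      }
    }
  }
}

With our injected zstd implementation, changing the level in this setup changes the file size; with the stock codec it does not.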
Regards,
Manik Singla
+91-9996008893
+91-9665639677

"Life doesn't consist in holding good cards but playing those you hold well."


On Tue, Oct 22, 2019 at 1:49 AM Radev, Martin <[email protected]> wrote:

> Hello Manik,
>
> If the compression level is really propagated to the library, what
> compression levels did you check?
>
> Regards,
> Martin
> ------------------------------
> *From:* Manik Singla <[email protected]>
> *Sent:* Monday, October 21, 2019 10:11:36 PM
> *To:* Parquet Dev
> *Cc:* [email protected]; Radev, Martin
> *Subject:* Re: custom CompressionCodec support
>
> Yes, that's the flag we tried, and we ensured it is being read and
> propagated.
>
> Regards,
> Manik Singla
>
> On Mon, Oct 21, 2019 at 12:51 PM Driesprong, Fokko <[email protected]> wrote:
>
>> Thanks Manik,
>>
>> Did you try setting the Hadoop io.compression.codec.zstd.level config?
>>
>> Cheers, Fokko
>>
>> Op za 19 okt. 2019 om 12:24 schreef Manik Singla <[email protected]>:
>>
>>> Hi Fokko and Martin,
>>>
>>> We are using parquet-hadoop, which supports the compression codecs
>>> from parquet-format. In our case, we were getting the same compressed
>>> size even after changing the zstd compression level. We confirmed
>>> that the configured level is passed to ZStandardCompressor in init,
>>> which is a native call.
>>>
>>> To confirm the issue, we tried the same thing with our own injected
>>> zstd implementation, and that seemed to work fine. We will look into
>>> why it works for Spark but not for us.
>>>
>>> Regards,
>>> Manik Singla
>>>
>>> On Fri, Oct 18, 2019 at 5:20 PM Driesprong, Fokko <[email protected]> wrote:
>>>
>>>> Hi Falak,
>>>>
>>>> I was able to set the compression level in Spark using
>>>> spark.io.compression.zstd.level.
>>>>
>>>> Cheers, Fokko
>>>>
>>>> Op do 17 okt. 2019 om 20:53 schreef Radev, Martin <[email protected]>:
>>>>
>>>>> Hi Falak,
>>>>>
>>>>> I was one of the people who recently exposed this in Arrow, but it
>>>>> is not part of the Parquet specification. In particular, any
>>>>> implementation that writes Parquet files can decide whether to
>>>>> expose it or to select a reasonable value internally.
>>>>>
>>>>> If you're using Arrow, you would have to read the documentation of
>>>>> the specified compressor. Arrow doesn't check whether the specified
>>>>> compression level is within the range supported by the codec. For
>>>>> ZSTD, the range should be [1, 22].
>>>>>
>>>>> Let me know if you're using Arrow and I can check locally that
>>>>> there isn't by any chance a bug in propagating the value. At the
>>>>> moment there are only smoke tests that nothing crashes.
>>>>>
>>>>> Regards,
>>>>> Martin
>>>>> ------------------------------
>>>>> *From:* Falak Kansal <[email protected]>
>>>>> *Sent:* Thursday, October 17, 2019 4:43:54 PM
>>>>> *To:* Driesprong, Fokko
>>>>> *Cc:* [email protected]
>>>>> *Subject:* Re: custom CompressionCodec support
>>>>>
>>>>> Hi Fokko,
>>>>>
>>>>> Thanks for replying, yes, sure. The problem we are facing is that
>>>>> with Parquet's zstd we are not able to control the compression
>>>>> level: we tried setting different levels, but it makes no
>>>>> difference in the output size. We have verified that the level we
>>>>> set in the configuration file is the one that reaches
>>>>> ZStandardCompressor. Are we missing something? How can we set a
>>>>> different zstd compression level? Any help would be appreciated.
>>>>>
>>>>> Thanks,
>>>>> Falak
>>>>>
>>>>> On Thu, Oct 17, 2019 at 7:47 PM Driesprong, Fokko <[email protected]> wrote:
>>>>>
>>>>>> Hi Manik,
>>>>>>
>>>>>> The supported compression codecs that ship with Parquet are tested
>>>>>> and validated in the CI pipeline. Sometimes there are issues with
>>>>>> compressors; therefore they are not easily pluggable. Feel free to
>>>>>> open a PR against the project if you believe compressors are
>>>>>> missing, and then we can have a discussion.
>>>>>>
>>>>>> It is part of the Thrift definition:
>>>>>> https://github.com/apache/parquet-format/blob/37bdba0a18cff18da706a0d353c65e726c8edca6/src/main/thrift/parquet.thrift#L470-L478
>>>>>>
>>>>>> Hope this clarifies the design decision.
>>>>>>
>>>>>> Cheers, Fokko
>>>>>>
>>>>>> Op di 15 okt. 2019 om 11:52 schreef Manik Singla <[email protected]>:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> The current Java code is not open to using a custom compressor.
>>>>>>> I believe reads and writes are mostly done by the same
>>>>>>> team/company; in that case, it would be beneficial to let users
>>>>>>> plug in a new compressor easily, instead of maintaining local
>>>>>>> changes that are fragile across version upgrades.
>>>>>>>
>>>>>>> Do you think this would be worth adding?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Manik Singla
>>>>>>> +91-9996008893
>>>>>>> +91-9665639677
>>>>>>>
>>>>>>> "Life doesn't consist in holding good cards but playing those you
>>>>>>> hold well."
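This is roughly the Spark job we plan to use for the cross-check mentioned above (a sketch under assumptions: spark.io.compression.zstd.level is the flag Fokko mentioned and configures Spark's own internal codec; forwarding io.compression.codec.zstd.level via the spark.hadoop. prefix is our assumption about how a level would reach Parquet's ZStandardCodec; paths are illustrative, and both flags should be verified against your Spark/Hadoop versions):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkZstdLevelTest {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("zstd-level-test")
        .master("local[*]")
        // The flag Fokko mentioned; it configures Spark's internal codec.
        .config("spark.io.compression.zstd.level", "19")
        // Assumption: the spark.hadoop. prefix forwards the property into
        // the Hadoop Configuration that Parquet's ZStandardCodec reads.
        .config("spark.hadoop.io.compression.codec.zstd.level", "19")
        .getOrCreate();

    Dataset<Row> df = spark.range(100_000).toDF("id");
    df.write()
        .mode("overwrite")
        .option("compression", "zstd")  // Parquet page compression codec
        .parquet("/tmp/spark-zstd-level-19.parquet");  // illustrative path

    spark.stop();
  }
}

If files written at levels 3 and 19 come out the same size, the level is being dropped somewhere between the configuration and the native call.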
