[ https://issues.apache.org/jira/browse/HADOOP-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233987#comment-17233987 ]
Eric Payne commented on HADOOP-17381:
-------------------------------------

One hacky way that might work and be extensible to all codecs is to create
an explicit codec for intermediate outputs, like IntermediateOutputCodec,
that has its own config namespace with a prefix, like
io.compression.codec.intermediate. Users can then configure the same codec
being used for the output codec but as a sub-codec to
IntermediateOutputCodec. IntermediateOutputCodec would gather all configs
with the io.compression.codec.intermediate prefix, strip the prefix, then
reapply to a new Configuration that becomes the config object for the
sub-codec, allowing it to have different settings than the output codec even
though the same codec type is being used underneath.

> Ability to specify separate compression settings when intermediate and final
> output use the same codec
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-17381
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17381
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Eric Payne
>            Priority: Major
>
> The ZStandard codec may become a codec that users will want to use for both
> intermediate data and for final output data yet specify different compression
> levels for those use cases.
> It would be nice if there was a way we could create a "meta codec" like
> IntermediateCodec that used conf prefix techniques, like Oozie does with
> oozie.launcher for the Oozie launcher configs, to create a custom config
> namespace of sorts for setting arbitrary codec settings specific to the
> intermediate codec separate from the final output codec even if the same
> underlying codec is used for both.
> However Codecs don't allow a configuration to be passed when obtaining a
> codec stream, and I think we would have to bypass the CodecPool entirely to
> be able to pass a custom conf to an arbitrary Codec.
> Another approach is to skip trying to generalize the solution and
> specifically focus on ZStandard. It would be easy to create a wrapper codec
> around the existing ZStandardCompressor and ZStandardDecompressor which take
> the relevant parameters directly in their constructors.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
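
For illustration only, the gather / strip / reapply step from the comment could be sketched roughly as below. This is not code from the issue or from Hadoop: the class IntermediateConfSketch and the method subCodecConf are hypothetical names, and a plain Map stands in for Hadoop's Configuration so the sketch runs with the JDK alone. Two further assumptions: the prefix is followed by a dot, and the sub-codec's settings start as a copy of the job settings before the overrides are applied. The zstd level key is used only as an example key.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of Hadoop. A Map stands in for Configuration. */
final class IntermediateConfSketch {
    /** Prefix from the comment; the trailing dot is an assumption. */
    static final String PREFIX = "io.compression.codec.intermediate.";

    /**
     * Builds the settings the sub-codec would see: a copy of the job settings
     * in which every PREFIX-ed entry is re-applied under its stripped key.
     */
    static Map<String, String> subCodecConf(Map<String, String> jobConf) {
        Map<String, String> sub = new HashMap<>();
        // Pass 1: copy every entry that is not an intermediate override.
        for (Map.Entry<String, String> e : jobConf.entrySet()) {
            if (!e.getKey().startsWith(PREFIX)) {
                sub.put(e.getKey(), e.getValue());
            }
        }
        // Pass 2: strip the prefix and let the override replace the copied value.
        for (Map.Entry<String, String> e : jobConf.entrySet()) {
            String key = e.getKey();
            if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
                sub.put(key.substring(PREFIX.length()), e.getValue());
            }
        }
        return sub;
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        // Example values: level 9 for final output, level 1 for intermediate data.
        jobConf.put("io.compression.codec.zstd.level", "9");
        jobConf.put(PREFIX + "io.compression.codec.zstd.level", "1");

        Map<String, String> sub = subCodecConf(jobConf);
        System.out.println(sub.get("io.compression.codec.zstd.level")); // prints 1
    }
}
```

The job-level map is left untouched, so the final output codec would still see level 9 while the sub-codec sees level 1. Whether the sub-codec should inherit the unprefixed settings at all, or see only the stripped overrides, is a design choice the comment leaves open.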