[ 
https://issues.apache.org/jira/browse/HADOOP-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233987#comment-17233987
 ] 

Eric Payne commented on HADOOP-17381:
-------------------------------------

One hacky approach that might work, and that would extend to all codecs, is to 
create an explicit codec for intermediate outputs, such as 
IntermediateOutputCodec, with its own config namespace under a prefix like 
io.compression.codec.intermediate. Users would then configure the same codec 
used for the final output as a sub-codec of IntermediateOutputCodec. 
IntermediateOutputCodec would gather all configs with the 
io.compression.codec.intermediate prefix, strip the prefix, and reapply them to 
a new Configuration that becomes the config object for the sub-codec. The 
sub-codec could then have different settings than the output codec even though 
the same codec type is used underneath.
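
A rough sketch of the prefix-stripping step, to make the idea concrete. This 
is only an illustration under some assumptions: a plain Map stands in for 
Hadoop's Configuration so the snippet runs on its own, the class and method 
names (IntermediateConfSketch, subCodecConf) are hypothetical, and it assumes 
the sub-codec conf starts as a copy of the job conf with the prefixed values 
layered on top. io.compression.codec.zstd.level is used as the example 
setting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: a plain Map stands in for Hadoop's Configuration so the
// example runs without Hadoop on the classpath. All names are hypothetical.
class IntermediateConfSketch {
    // Prefix proposed in the comment, plus a separating dot (an assumption).
    static final String PREFIX = "io.compression.codec.intermediate.";

    // Returns a copy of jobConf in which every PREFIX-ed key is reapplied
    // with the prefix stripped, overriding any unprefixed value.
    // The original jobConf is left untouched.
    static Map<String, String> subCodecConf(Map<String, String> jobConf) {
        Map<String, String> sub = new LinkedHashMap<>(jobConf);
        for (Map.Entry<String, String> e : jobConf.entrySet()) {
            String key = e.getKey();
            if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
                sub.put(key.substring(PREFIX.length()), e.getValue());
            }
        }
        return sub;
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new LinkedHashMap<>();
        // Level seen by the final output codec.
        jobConf.put("io.compression.codec.zstd.level", "9");
        // Level intended only for the intermediate (sub-)codec.
        jobConf.put(PREFIX + "io.compression.codec.zstd.level", "1");

        Map<String, String> sub = subCodecConf(jobConf);
        System.out.println("output level:       "
            + jobConf.get("io.compression.codec.zstd.level"));
        System.out.println("intermediate level: "
            + sub.get("io.compression.codec.zstd.level"));
    }
}
```

In a real implementation the returned map would presumably be a new 
Configuration handed to the sub-codec, which is where the CodecPool concern 
from the issue description comes in.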

> Ability to specify separate compression settings when intermediate and final 
> output use the same codec
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-17381
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17381
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Eric Payne
>            Priority: Major
>
> The ZStandard codec may become a codec that users will want to use for both 
> intermediate data and for final output data yet specify different compression 
> levels for those use cases.
> It would be nice if there was a way we could create a "meta codec" like 
> IntermediateCodec that used conf prefix techniques, like Oozie does with 
> oozie.launcher for the Oozie launcher configs, to create a custom config 
> namespace of sorts for setting arbitrary codec settings specific to the 
> intermediate codec separate from the final output codec even if the same 
> underlying codec is used for both.
> However, codecs don't allow a configuration to be passed when obtaining a 
> codec stream, and I think we would have to bypass the CodecPool entirely to 
> be able to pass a custom conf to an arbitrary Codec.
> Another approach is to skip trying to generalize the solution and 
> specifically focus on ZStandard. It would be easy to create a wrapper codec 
> around the existing ZStandardCompressor and ZStandardDecompressor which take 
> the relevant parameters directly in their constructors.



