Eric Payne created HADOOP-17381:
-----------------------------------
Summary: Ability to specify separate compression settings when
intermediate and final output use the same codec
Key: HADOOP-17381
URL: https://issues.apache.org/jira/browse/HADOOP-17381
Project: Hadoop Common
Issue Type: Improvement
Reporter: Eric Payne
Users may well want to use the ZStandard codec for both intermediate data and
final output data, but with a different compression level for each of those
use cases.
It would be nice if there were a way to create a "meta codec", such as an
IntermediateCodec, that used conf prefix techniques (as Oozie does with the
oozie.launcher prefix for the Oozie launcher configs) to create a custom config
namespace for arbitrary codec settings. Settings for the intermediate codec
could then be kept separate from those of the final output codec, even if the
same underlying codec is used for both.
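
To make the prefix idea concrete, here is a minimal sketch that uses a plain
java.util.Map in place of a Hadoop Configuration. The PrefixedConf class, the
overlay method and the "intermediate." prefix are all hypothetical names chosen
for illustration; nothing like them exists in Hadoop today.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: derives a per-use-case view of a flat key/value config. */
final class PrefixedConf {
    private PrefixedConf() {}

    /**
     * Returns a copy of base in which every key that starts with prefix is
     * also stored with the prefix stripped, overriding the unprefixed value.
     * The base map itself is left untouched.
     */
    static Map<String, String> overlay(Map<String, String> base, String prefix) {
        Map<String, String> result = new HashMap<>(base);
        for (Map.Entry<String, String> e : base.entrySet()) {
            String key = e.getKey();
            if (key.startsWith(prefix) && key.length() > prefix.length()) {
                result.put(key.substring(prefix.length()), e.getValue());
            }
        }
        return result;
    }
}
```

With io.compression.codec.zstd.level set to 9 and a hypothetical
intermediate.io.compression.codec.zstd.level set to 1, calling
overlay(conf, "intermediate.") would yield a view in which the level is 1,
while the original settings still describe the final output.
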
However, codecs don't allow a configuration to be passed when obtaining a codec
stream, and I think we would have to bypass the CodecPool entirely to be able
to pass a custom conf to an arbitrary codec.
Another approach is to skip trying to generalize the solution and focus
specifically on ZStandard. It would be easy to create a wrapper codec around
the existing ZStandardCompressor and ZStandardDecompressor, which take the
relevant parameters directly in their constructors.
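
The shape of such a wrapper can be sketched without Hadoop on the classpath.
The example below is only a stand-in: it uses java.util.zip.Deflater and
Inflater from the JDK instead of ZStandardCompressor and ZStandardDecompressor,
and the LeveledCodec class is a hypothetical name. The point it illustrates is
that the level is fixed per instance in the constructor rather than read from
a shared configuration.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical wrapper: one underlying codec, level chosen per instance. */
final class LeveledCodec {
    private final int level;

    LeveledCodec(int level) {
        if (level < 0 || level > 9) {
            throw new IllegalArgumentException("level must be in 0..9: " + level);
        }
        this.level = level;
    }

    /** Compresses input with the level given at construction time. */
    byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                out.write(buf, 0, deflater.deflate(buf));
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    /** Decompression does not depend on the level used to compress. */
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt input", e);
        } finally {
            inflater.end();
        }
    }
}
```

An intermediate stage could then use new LeveledCodec(1) for speed while the
final output uses new LeveledCodec(9), with both readable by the same
decompress call.
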