[
https://issues.apache.org/jira/browse/HADOOP-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Douglas updated HADOOP-5879:
----------------------------------
Status: Open (was: Patch Available)
{quote}
So one possible way is to let CodecPool special-case the Gzip codec and
either
1) keep a map holding gzip codecs with different settings,
or
2) treat the setting as a global setting and, when the setting is changed,
clear all gzip codecs cached in CodecPool.
Do these changes to CodecPool sound reasonable/acceptable?
{quote}
I'm not sure the "clean" semantics have clear triggers (or they're not clear to
me). I'd suggest an analog to {{end}} in the {{(Dec|C)ompressor}} interface
that reinitializes a (de)compressor, then use those interfaces in the
{{CodecPool}}. This would be a better fix for HADOOP-5281, but it requires
updates to other implementors of {{Compressor}}. Something like {{reinit}} that
destroys (with {{end}}) and recreates (with {{init}}) the underlying stream.
Overloading {{CodecPool::getCompressor}} to take a {{Configuration}} and...
well, tracing the implications through the rest of the Codec classes makes it
easy to find where compressors are recycled. Calling {{reinit}} with
parameters matching the current ones should be a noop, and calling
{{CodecPool::getCompressor}} without any arguments should use default params.
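A minimal sketch of how that could fit together, under stated assumptions: {{ReinitCompressor}} and {{ReinitCodecPool}} are hypothetical stand-ins rather than the real Hadoop classes, a plain {{int}} level stands in for the {{Configuration}} argument, and {{java.util.zip.Deflater}} plays the role of the underlying stream.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.zip.Deflater;

/** Hypothetical stand-in for a pooled Compressor that supports reinit. */
final class ReinitCompressor {
    private Deflater stream;   // underlying zlib stream
    private int level;         // setting the stream was created with
    int initCount = 0;         // how many times the stream was (re)created

    ReinitCompressor(int level) {
        init(level);
    }

    // Private, so no subclass override can run from the constructor.
    private void init(int level) {
        this.level = level;
        this.stream = new Deflater(level);
        initCount++;
    }

    /** Releases the underlying stream. */
    void end() {
        stream.end();
    }

    /** Clears buffered state so the instance can be reused as-is. */
    void reset() {
        stream.reset();
    }

    /** Destroys and recreates the stream, unless the settings already match. */
    void reinit(int newLevel) {
        if (newLevel == level) {
            return;            // matching parameters: no-op
        }
        end();
        init(newLevel);
    }

    int level() {
        return level;
    }
}

/** Hypothetical pool; an int level stands in for a Configuration argument. */
final class ReinitCodecPool {
    private final Deque<ReinitCompressor> idle = new ArrayDeque<>();

    /** No-argument form uses the default parameters. */
    ReinitCompressor getCompressor() {
        return getCompressor(Deflater.DEFAULT_COMPRESSION);
    }

    /** Recycles an idle compressor if one exists, reinitializing it as needed. */
    ReinitCompressor getCompressor(int level) {
        ReinitCompressor c = idle.poll();
        if (c == null) {
            return new ReinitCompressor(level);
        }
        c.reinit(level);
        return c;
    }

    void returnCompressor(ReinitCompressor c) {
        c.reset();
        idle.push(c);
    }
}
```

In this sketch, a recycled compressor only pays for a new stream when the requested settings differ from the ones it already has.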
Since this is a fair amount of work, if you wanted to narrow the issue to be
global settings for GzipCodec, then an approach like that in the current patch
is probably sufficient for many applications.
Quick asides on the current patch: {{ZlibCompressor::construct}} should be
final; if it were overridden in a subclass, the base constructor would call the
subclass override on a partially constructed object. Also, since the parameters
are specific to GzipCodec, they should not have generic names like
"io.compress.level".
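To illustrate the constructor concern, here is a small self-contained sketch. {{PitfallBase}} and {{PitfallDerived}} are hypothetical names, not classes from the patch; they only show what happens when a base constructor calls an overridable method.

```java
/** Hypothetical base class whose constructor calls an overridable method. */
class PitfallBase {
    final String seenByBaseConstructor;

    PitfallBase() {
        // Runs before any subclass constructor body has assigned its fields.
        seenByBaseConstructor = describe();
    }

    // Not final, so a subclass can override it. Declaring it final (or
    // private) would prevent the problem shown below.
    String describe() {
        return "base";
    }
}

/** Hypothetical subclass whose override reads a field it has not set yet. */
class PitfallDerived extends PitfallBase {
    private final String name;

    PitfallDerived(String name) {
        super();               // describe() is invoked here, while name is still null
        this.name = name;
    }

    @Override
    String describe() {
        return "name=" + name;
    }
}
```

Constructing {{new PitfallDerived("gzip")}} leaves {{seenByBaseConstructor}} equal to "name=null", even though {{describe()}} returns "name=gzip" once construction has finished.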
> GzipCodec should read compression level etc from configuration
> --------------------------------------------------------------
>
> Key: HADOOP-5879
> URL: https://issues.apache.org/jira/browse/HADOOP-5879
> Project: Hadoop Core
> Issue Type: Improvement
> Components: io
> Reporter: Zheng Shao
> Attachments: hadoop-5879-5-21.patch
>
>
> GzipCodec currently uses the default compression level. We should allow
> overriding the default value from Configuration.
> {code}
>   static final class GzipZlibCompressor extends ZlibCompressor {
>     public GzipZlibCompressor() {
>       super(ZlibCompressor.CompressionLevel.DEFAULT_COMPRESSION,
>             ZlibCompressor.CompressionStrategy.DEFAULT_STRATEGY,
>             ZlibCompressor.CompressionHeader.GZIP_FORMAT, 64*1024);
>     }
>   }
> {code}