[ 
https://issues.apache.org/jira/browse/HADOOP-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HADOOP-5879:
----------------------------------

    Status: Open  (was: Patch Available)

{quote}
So one possible way is to let CodecPool special-case the Gzip codec, and 
either
1) keep a map holding gzip codecs with different settings,
or
2) treat the setting as a global setting and, when the setting is changed, 
clean all gzip codecs cached in CodecPool.

Do the changes to CodecPool sound reasonable/acceptable?
{quote}

I'm not sure the "clean" semantics have clear triggers (or they're not clear to 
me). I'd suggest adding an analog to {{end}} in the {{(Dec|C)ompressor}} 
interfaces that reinitializes a (de)compressor, then using it in the 
{{CodecPool}}: something like {{reinit}}, which destroys (with {{end}}) and 
recreates (with {{init}}) the underlying stream. This would be a better fix for 
HADOOP-5281, but it requires updates to the other implementors of 
{{Compressor}}. Overloading {{CodecPool::getCompressor}} to take a 
{{Configuration}} and following the implications through the rest of the Codec 
classes makes it easy to find where compressors are recycled. Calling 
{{reinit}} with parameters matching the current ones should be a no-op, and 
calling {{CodecPool::getCompressor}} without a {{Configuration}} should use the 
default params.
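
A minimal sketch of the {{reinit}} idea, to make the intended semantics concrete. Everything here is hypothetical: {{PooledCompressor}} and {{Pool}} are stand-ins written for illustration, not Hadoop's {{Compressor}} or {{CodecPool}}, and a plain {{java.util.zip.Deflater}} plays the role of the underlying stream. It only shows the two rules above: reinitializing with matching parameters does nothing, and the no-argument lookup falls back to defaults.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.zip.Deflater;

// Hypothetical sketch; none of these types are Hadoop's actual classes.
final class ReinitSketch {

  /** Stand-in for a Compressor that has the proposed reinit hook. */
  static final class PooledCompressor {
    private int level;
    private Deflater deflater;   // plays the role of the underlying stream
    int rebuilds = 0;            // counts how often the stream was recreated

    PooledCompressor(int level) {
      this.level = level;
      this.deflater = new Deflater(level);
    }

    /** Destroys and recreates the stream unless the parameters already match. */
    void reinit(int newLevel) {
      if (newLevel == level) {
        return;                           // matching parameters: no-op
      }
      deflater.end();                     // analog of end()
      deflater = new Deflater(newLevel);  // analog of init()
      level = newLevel;
      rebuilds++;
    }

    int level() {
      return level;
    }
  }

  /** Stand-in for CodecPool: recycles compressors and reinitializes on checkout. */
  static final class Pool {
    private final Deque<PooledCompressor> idle = new ArrayDeque<>();

    /** Lookup without explicit parameters uses the default level. */
    PooledCompressor get() {
      return get(Deflater.DEFAULT_COMPRESSION);
    }

    /** Lookup with explicit parameters; a recycled instance is reinitialized. */
    PooledCompressor get(int level) {
      PooledCompressor c = idle.poll();
      if (c == null) {
        return new PooledCompressor(level);
      }
      c.reinit(level);
      return c;
    }

    /** Returns a compressor to the pool for later reuse. */
    void give(PooledCompressor c) {
      idle.push(c);
    }
  }
}
```

With this shape the pool never needs a separate "clean" trigger: a recycled instance is brought in line with the requested settings at the single point where it is handed out.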

Since this is a fair amount of work, if you wanted to narrow the issue to be 
global settings for GzipCodec, then an approach like that in the current patch 
is probably sufficient for many applications.

Quick asides on the current patch: {{ZlibCompressor::construct}} should be 
final; if it were overridden in a subclass, the base constructor would call the 
subclass method on a partially constructed object. Also, since the parameters 
are specific to GzipCodec, they should not have generic names like 
"io.compress.level".
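
To illustrate the first aside, here is a small self-contained sketch of the constructor hazard. The classes {{Base}}, {{Sub}} and {{SafeBase}} are hypothetical and are not taken from the patch; they only show what happens in Java when a base constructor calls a method that a subclass overrides.

```java
// Hypothetical classes illustrating the hazard; not the patch's actual code.
final class ConstructorOverrideSketch {

  static class Base {
    Base() {
      construct();          // virtual call from the base constructor
    }

    void construct() { }    // overridable because it is not final
  }

  static class Sub extends Base {
    private String header = "gzip";  // assigned only after Base() returns
    boolean headerWasNull;           // no initializer, so the value set below survives

    @Override
    void construct() {
      // Runs before Sub's field initializers, so header is still null here.
      headerWasNull = (header == null);
    }
  }

  /** Same shape with the method made final: subclasses cannot intercept the call. */
  static class SafeBase {
    SafeBase() {
      construct();
    }

    final void construct() { }
  }
}
```

Creating a {{Sub}} leaves {{headerWasNull}} set to true, because the override ran before the subclass's own fields were initialized. Declaring the method final (or private) rules this out at compile time.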

> GzipCodec should read compression level etc from configuration
> --------------------------------------------------------------
>
>                 Key: HADOOP-5879
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5879
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: io
>            Reporter: Zheng Shao
>         Attachments: hadoop-5879-5-21.patch
>
>
> GzipCodec currently uses the default compression level. We should allow 
> overriding the default value from Configuration.
> {code}
>   static final class GzipZlibCompressor extends ZlibCompressor {
>     public GzipZlibCompressor() {
>       super(ZlibCompressor.CompressionLevel.DEFAULT_COMPRESSION,
>           ZlibCompressor.CompressionStrategy.DEFAULT_STRATEGY,
>           ZlibCompressor.CompressionHeader.GZIP_FORMAT, 64*1024);
>     }
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
