phrocker edited a comment on pull request #140:
URL: https://github.com/apache/accumulo/pull/140#issuecomment-1018608281


   > After some research into recent Hadoop improvements, and since production 
hasn't encountered memory issues with compressors recently, I think this can be 
closed. Hadoop made changes to `CodecPool` in version 2.9.0 that essentially 
skip pooling, which was one of the improvements this PR was making. Here is the 
code from `CodecPool` that checks for the `DoNotPool` annotation: 
https://github.com/apache/hadoop/blob/rel/release-2.9.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CodecPool.java#L160-L165
   
   @milleruntime thanks for doing that legwork! I suspect this issue wouldn't 
be encountered while source limiting exists on production systems. To be 
honest, I haven't been part of enough deployments to know, but I did confirm it 
in 2020 (my prior comment was meant to say 2016, since I've encountered it on a 
production system) in the last round of comments, using an iterator that 
deep-copied a lot of sources. I believe that annotation is needed on that 
class, though. A cursory search shows that only the built-in gzip 
compressor/decompressor does not pool 
(https://github.com/apache/hadoop/search?q=DoNotPool), but I'm likely mistaken.
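   For anyone following along, the mechanism under discussion is roughly this: 
`CodecPool` refuses to cache (de)compressor instances whose class carries the 
`DoNotPool` annotation. Below is a minimal, self-contained sketch of that 
pattern; the annotation, pool, and class names here are hypothetical stand-ins, 
not Hadoop's actual types (Hadoop's real check lives in 
`org.apache.hadoop.io.compress.CodecPool`).

   ```java
   import java.lang.annotation.ElementType;
   import java.lang.annotation.Retention;
   import java.lang.annotation.RetentionPolicy;
   import java.lang.annotation.Target;
   import java.util.ArrayDeque;
   import java.util.Deque;

   // Hypothetical stand-in for Hadoop's org.apache.hadoop.io.compress.DoNotPool.
   @Retention(RetentionPolicy.RUNTIME)
   @Target(ElementType.TYPE)
   @interface DoNotPool {}

   // Minimal pool that refuses to cache instances of annotated classes,
   // mirroring the check CodecPool performs before returning a codec to its pool.
   class SimplePool {
       private final Deque<Object> pool = new ArrayDeque<>();

       // Returns true if the instance was pooled, false if it was rejected.
       boolean release(Object o) {
           if (o.getClass().isAnnotationPresent(DoNotPool.class)) {
               return false; // annotated: skip pooling entirely
           }
           pool.push(o);
           return true;
       }

       int size() {
           return pool.size();
       }
   }

   // Hypothetical example classes: one opted out of pooling, one not.
   @DoNotPool
   class GzipLikeDecompressor {}

   class PoolableDecompressor {}
   ```

   The point of the annotation check is that codecs which hold large native or 
heap buffers can opt out of reuse, so a burst of deep-copied sources doesn't 
pin that memory in the pool indefinitely.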
   
   I think the primary motivation was changing how delegation of compressors 
worked and making it pluggable, less so correcting the initial issue; but as 
Keith identified, it may be better suited to the SPI anyway, which means 
closing this makes sense. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

