phrocker edited a comment on pull request #140:
URL: https://github.com/apache/accumulo/pull/140#issuecomment-1018608281


   > After some research into recent Hadoop improvements, and since production hasn't encountered memory issues with compressors recently, I think this can be closed. Hadoop made changes to `CodecPool` in version 2.9.0 that essentially do no pooling, one improvement this PR was making. Here is the code from `CodecPool` that checks for the `DoNotPool` annotation: 
https://github.com/apache/hadoop/blob/rel/release-2.9.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CodecPool.java#L160-L165
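   The check in the linked `CodecPool` lines can be sketched roughly as follows. This is a minimal standalone model, not Hadoop's actual classes: the annotation, compressor types, and pool here are simplified stand-ins, but the mechanism is the same, returning an instance whose class carries `@DoNotPool` releases it instead of caching it for reuse.

   ```java
   import java.lang.annotation.ElementType;
   import java.lang.annotation.Retention;
   import java.lang.annotation.RetentionPolicy;
   import java.lang.annotation.Target;
   import java.util.ArrayDeque;
   import java.util.Deque;

   public class PoolSketch {
     // Stand-in for Hadoop's org.apache.hadoop.io.compress.DoNotPool
     @Retention(RetentionPolicy.RUNTIME)
     @Target(ElementType.TYPE)
     @interface DoNotPool {}

     interface Compressor {
       void end(); // release native/heap resources
     }

     static class PooledCompressor implements Compressor {
       public void end() {}
     }

     @DoNotPool // e.g. the built-in gzip compressor in Hadoop
     static class UnpooledCompressor implements Compressor {
       public void end() {}
     }

     static final Deque<Compressor> pool = new ArrayDeque<>();

     static void returnCompressor(Compressor c) {
       // Mirrors the linked CodecPool check: annotated classes are
       // ended immediately rather than put back in the pool.
       if (c.getClass().isAnnotationPresent(DoNotPool.class)) {
         c.end();
         return;
       }
       pool.push(c);
     }

     public static void main(String[] args) {
       returnCompressor(new PooledCompressor());
       returnCompressor(new UnpooledCompressor());
       System.out.println(pool.size()); // prints 1: only the non-annotated one is pooled
     }
   }
   ```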
   
   @milleruntime thanks for doing that legwork! I suspect this issue wouldn't be encountered while source limiting exists on production systems. To be honest, I haven't been part of enough deployments to know, but I did confirm it in 2020 (my prior comment was meant to say 2016, since I've encountered it on a production system) in the last round of comments, with an iterator that deep copied a lot of sources. I believe that annotation is needed on that class, though. A cursory search shows that only the built-in gzip compressor/decompressor does not pool (https://github.com/apache/hadoop/search?q=DoNotPool). I think the primary motivation was changing how delegation of compressors worked and making it pluggable, less so correcting the initial issue, but as Keith identified it may be better suited in the SPI anyway, which means closing this makes sense. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

