phrocker commented on pull request #140:
URL: https://github.com/apache/accumulo/pull/140#issuecomment-1018608281


   > After some research into recent Hadoop improvements, and since production 
hasn't encountered memory issues with compressors recently, I think this can be 
closed. Hadoop made changes to `CodecPool` in version 2.9.0 so that it 
essentially does no pooling for annotated codecs, which was one of the 
improvements this PR was making. Here is the code from `CodecPool` that checks 
for the `DoNotPool` annotation: 
https://github.com/apache/hadoop/blob/rel/release-2.9.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CodecPool.java#L160-L165
   
   @milleruntime thanks for doing that legwork! I suspect this issue wouldn't 
be encountered while source limiting exists on production systems. To be 
honest, I haven't been part of enough deployments to know, but I did confirm it 
in 2020 (my prior comment was meant to refer to a production system) in the 
last round of comments, with an iterator that deep copied a lot of sources. I 
believe that annotation is needed on that class, though. A cursory search 
shows that only the built-in gzip compressor/decompressor does not pool 
(https://github.com/apache/hadoop/search?q=DoNotPool). I think the primary 
motivation was changing how delegation of compressors worked and making it 
pluggable, rather than correcting the initial issue, but as Keith identified 
it may be better suited in the SPI anyway, which means closing this makes 
sense.
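   For reference, the behavior under discussion can be sketched as below. This 
is a simplified, self-contained illustration (the annotation and pool here are 
local stand-ins modeled on Hadoop's `DoNotPool` marker and `CodecPool.payback` 
check, not Hadoop's actual classes): a codec class carrying the annotation is 
skipped when returned to the pool, so a fresh instance is created on the next 
request instead of reusing a cached one.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayDeque;
import java.util.Deque;

public class DoNotPoolSketch {

  // Stand-in for Hadoop's org.apache.hadoop.io.compress.DoNotPool marker.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface DoNotPool {}

  interface Decompressor {
    void reset();
  }

  // Analogous to the built-in gzip decompressor, which carries @DoNotPool.
  @DoNotPool
  static class GzipStyleDecompressor implements Decompressor {
    public void reset() {}
  }

  static class PoolableDecompressor implements Decompressor {
    public void reset() {}
  }

  static final Deque<Decompressor> POOL = new ArrayDeque<>();

  // Simplified version of the 2.9.0 check: if the instance's class is
  // annotated with @DoNotPool, skip caching and leave it for GC.
  static boolean returnDecompressor(Decompressor d) {
    if (d.getClass().isAnnotationPresent(DoNotPool.class)) {
      return false; // not pooled
    }
    d.reset();
    POOL.push(d);
    return true; // cached for reuse
  }

  public static void main(String[] args) {
    System.out.println(returnDecompressor(new GzipStyleDecompressor())); // false
    System.out.println(returnDecompressor(new PoolableDecompressor()));  // true
  }
}
```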
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
