SequenceFile.Reader keeps around buffer whose size is that of largest item read 
-> results in lots of dead heap
---------------------------------------------------------------------------------------------------------------

                 Key: HBASE-1097
                 URL: https://issues.apache.org/jira/browse/HBASE-1097
             Project: Hadoop HBase
          Issue Type: Bug
         Environment: apurtell 25 node TRUNK on hadoop 0.18.1 cluster
            Reporter: stack
             Fix For: 0.19.0


Andrew is OOMEing again.  Looking at some of his heap dumps, I can count Readers 
holding ~600MB of DataOutputBuffers in a 2G heap.  Testing, I see that the 
DataOutputBuffer allocated at the head of MapFile.Reader is reused each time we 
call next; only a reset is called on it.  Tracing it, the DataOutputBuffer has 
an internal Buffer class which is based on ByteArrayOutputStream.  Reset of the 
DOB eventually goes through to the BAOS reset, which just sets the position 
back to zero.  The backing array stays sized to whatever it grew to the last 
time this BAOS was used, so one large item leaves that much heap held for as 
long as the Reader lives.  (Figuring this out was a little complicated by the 
fact that DOB does some fancy footwork in a reset override to avoid copies.)
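
The retention behaviour described above can be reproduced with plain 
java.io.ByteArrayOutputStream, without any Hadoop classes.  This is only a 
minimal sketch: BufferRetentionDemo, InspectableBuffer and capacityAfterReset 
are hypothetical names made up for illustration, not code from SequenceFile or 
DataOutputBuffer, and the subclass exists only to expose the protected backing 
array.

```java
import java.io.ByteArrayOutputStream;

public class BufferRetentionDemo {

    /** Hypothetical helper: exposes the length of the protected backing array. */
    static class InspectableBuffer extends ByteArrayOutputStream {
        int capacity() {
            return buf.length; // buf is a protected field of ByteArrayOutputStream
        }
    }

    /**
     * Simulates reading one record into a reused buffer: reset, fill with
     * recordSize bytes, reset again. Returns the backing array length
     * that remains allocated after the final reset.
     */
    static int capacityAfterReset(InspectableBuffer buffer, int recordSize) {
        buffer.reset();
        buffer.write(new byte[recordSize], 0, recordSize);
        buffer.reset(); // only rewinds the write position, keeps the array
        return buffer.capacity();
    }

    public static void main(String[] args) {
        InspectableBuffer buffer = new InspectableBuffer();

        int afterLarge = capacityAfterReset(buffer, 8 * 1024 * 1024);
        int afterSmall = capacityAfterReset(buffer, 16);

        System.out.println("bytes in use after reset:     " + buffer.size());
        System.out.println("capacity after large record:  " + afterLarge);
        System.out.println("capacity after small record:  " + afterSmall);
    }
}
```

After the large record, the capacity stays at least as big as that record even 
though size() reports zero, and reading a small record afterwards does not 
shrink it.  Multiply that by the number of open Readers and you get the dead 
heap seen in the dumps.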

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
