SequenceFile.Reader keeps around buffer whose size is that of largest item read
-> results in lots of dead heap
---------------------------------------------------------------------------------------------------------------
Key: HBASE-1097
URL: https://issues.apache.org/jira/browse/HBASE-1097
Project: Hadoop HBase
Issue Type: Bug
Environment: apurtell 25 node TRUNK on hadoop 0.18.1 cluster
Reporter: stack
Fix For: 0.19.0
Andrew is OOMEing again. Looking at some of his heaps, I count Readers holding
DataOutputBuffers of ~600MB in a 2G heap. Testing, I see that the
DataOutputBuffer allocated at the head of MapFile.Reader is reused: each time
we call next, reset is called on it. Tracing through, the DataOutputBuffer
holds an internal Buffer class which is based on ByteArrayOutputStream. Reset
of the DOB eventually goes through to the BAOS reset. This just resets the
position. It keeps the buffer sized to whatever it grew to the last time this
BAOS was used. (Figuring this out was complicated a little by the fact that DOB
does some fancy footwork in a reset override to avoid copies.)
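
The BAOS behaviour described above can be reproduced with the JDK alone. The
following is a minimal sketch, not the Hadoop code: InspectableBuffer and
ResetKeepsCapacity are hypothetical names made up for illustration, and the
subclass exists only to expose the length of the protected backing array. The
1MB write stands in for one large item passing through a Reader.

```java
import java.io.ByteArrayOutputStream;

class ResetKeepsCapacity {
    // Hypothetical helper: exposes the length of the protected backing array.
    static class InspectableBuffer extends ByteArrayOutputStream {
        int capacity() {
            return buf.length;
        }
    }

    public static void main(String[] args) {
        InspectableBuffer buffer = new InspectableBuffer();
        System.out.println("initial capacity: " + buffer.capacity());

        // Simulate reading one large item: the backing array grows to fit it.
        byte[] largeItem = new byte[1 << 20];
        buffer.write(largeItem, 0, largeItem.length);
        int grown = buffer.capacity();
        System.out.println("capacity after large write: " + grown);

        // reset() only rewinds the write position; the array is not shrunk.
        buffer.reset();
        System.out.println("size after reset: " + buffer.size());
        System.out.println("capacity after reset: " + buffer.capacity());

        // Small items written afterwards reuse the same oversized array.
        buffer.write(new byte[16], 0, 16);
        System.out.println("capacity after small write: " + buffer.capacity());
    }
}
```

If the same holds inside DataOutputBuffer, one way to bound the dead heap would
be to replace the buffer with a fresh one once its capacity passes some
threshold rather than resetting it. That is a suggestion only, not something
this report confirms as the fix.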