[
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589621#comment-16589621
]
Hudson commented on HBASE-21031:
--------------------------------
Results for branch branch-2.1
[build #226 on
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226/]:
(x) *{color:red}-1 overall{color}*
----
details (if available):
(/) {color:green}+1 general checks{color}
-- For more information [see general
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226//General_Nightly_Build_Report/]
(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2)
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226//JDK8_Nightly_Build_Report_(Hadoop2)/]
(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3)
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) {color:green}+1 source release artifact{color}
-- See build output for details.
(/) {color:green}+1 client integration test{color}
> Memory leak if replay edits failed during region opening
> --------------------------------------------------------
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.0, 2.0.1
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Fix For: 3.0.0, 2.1.1, 2.0.2
>
> Attachments: HBASE-21031.branch-2.0.001.patch,
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch,
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch,
> HBASE-21031.branch-2.0.006.patch, HBASE-21031.branch-2.0.006.patch,
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits that contain many identical cells,
> the memstore is not flushed, and an exception is thrown once all heap space
> has been used:
> {code}
> 2018-08-06 15:52:27,590 ERROR [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] handler.OpenRegionHandler(302): Failed open of region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore is not rolled back, and since MSLAB is
> used, none of the allocated chunks are ever released. That memory is leaked
> for good.
> We need to roll back the memory if opening the region fails (for now, only
> the global memstore size is decreased after a failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting
> to record how much memory is used during replay, and use it to decrease the
> global memstore size if replay fails. This is not right: during replay we
> may also flush the memstore, so the size recorded in the replayEditsPerRegion
> map is not accurate at all!
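
The accounting problem in the last paragraph can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code from the attached patches: the class ReplayAccountingSketch and all of its fields and methods are hypothetical and are not HBase APIs. It models a global counter, a cumulative "bytes replayed" map (standing in for replayEditsPerRegion), and a "bytes currently held" map, and shows that rolling back by the cumulative figure over-subtracts once a flush has happened during replay. It does not model releasing MSLAB chunks.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; none of these names are real HBase classes or methods. */
class ReplayAccountingSketch {
    // Hypothetical stand-in for the server-wide memstore size counter.
    long globalMemStoreSize = 0;
    // Hypothetical stand-in for replayEditsPerRegion: cumulative bytes replayed.
    final Map<String, Long> replayedBytes = new HashMap<>();
    // Bytes a region's memstore currently holds (what is really in memory).
    final Map<String, Long> liveBytes = new HashMap<>();

    void replayEdit(String region, long bytes) {
        globalMemStoreSize += bytes;
        replayedBytes.merge(region, bytes, Long::sum);
        liveBytes.merge(region, bytes, Long::sum);
    }

    /** A flush during replay frees live memory; the cumulative map is untouched. */
    void flush(String region) {
        long flushed = liveBytes.getOrDefault(region, 0L);
        globalMemStoreSize -= flushed;
        liveBytes.put(region, 0L);
    }

    /** Rollback that trusts the cumulative replay figure. */
    void rollbackByReplayedBytes(String region) {
        Long replayed = replayedBytes.remove(region);
        liveBytes.remove(region);
        globalMemStoreSize -= (replayed == null ? 0L : replayed);
    }

    /** Rollback that subtracts only what the region still holds. */
    void rollbackByLiveBytes(String region) {
        Long live = liveBytes.remove(region);
        replayedBytes.remove(region);
        globalMemStoreSize -= (live == null ? 0L : live);
    }

    /** Replays 150 bytes, flushes, replays 30 more, then fails and rolls back. */
    static long scenario(boolean useLiveBytes) {
        ReplayAccountingSketch s = new ReplayAccountingSketch();
        s.replayEdit("r1", 100);
        s.replayEdit("r1", 50);
        s.flush("r1");           // global = 0, live = 0, replayed = 150
        s.replayEdit("r1", 30);  // global = 30, live = 30, replayed = 180
        if (useLiveBytes) {
            s.rollbackByLiveBytes("r1");
        } else {
            s.rollbackByReplayedBytes("r1");
        }
        return s.globalMemStoreSize;
    }

    public static void main(String[] args) {
        // Cumulative rollback subtracts 180 from 30 and drives the counter negative.
        System.out.println("rollback by replayed bytes: " + scenario(false));
        // Live-size rollback subtracts 30 from 30 and returns the counter to zero.
        System.out.println("rollback by live bytes:     " + scenario(true));
    }
}
```

In this toy scenario the cumulative rollback leaves the global counter at -150, while rolling back by the region's live size leaves it at 0. Whether the actual fix tracks the size this way is not stated in this message.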
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)