[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

Hudson (JIRA) Wed, 22 Aug 2018 19:55:35 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589621#comment-16589621
 ]


Hudson commented on HBASE-21031:
--------------------------------

Results for branch branch-2.1
        [build #226 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226/]: 
(x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Memory leak if replay edits failed during region opening
> --------------------------------------------------------
>
>                 Key: HBASE-21031
>                 URL: https://issues.apache.org/jira/browse/HBASE-21031
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>             Fix For: 3.0.0, 2.1.1, 2.0.2
>
>         Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch, 
> HBASE-21031.branch-2.0.006.patch, HBASE-21031.branch-2.0.006.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
>         at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>         at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
>         at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
>         at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
>         at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
>         at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
>         at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
>         at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
>         at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
>         at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
>         at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
>         at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
>         at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
>         at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

Reply via email to