[
https://issues.apache.org/jira/browse/HBASE-23591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Szabolcs Bukros updated HBASE-23591:
------------------------------------
Description:
After a flush on the replica region, the memStoreSizing becomes negative:
{code:java}
2019-12-17 08:31:59,983 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action: COMMIT_FLUSH
table_name: "IntegrationTestRegionReplicaReplicati
on" encoded_region_name: "544affde3e027454f67c8ea46c8f69ee"
flush_sequence_number: 41392 store_flushes { family_name: "f1" store_home_dir:
"f1" flush_output: "3c48a23eac784a348a18e10e337d80a2" } store_flushes {
family_name: "f2" store_home_dir: "f2" flush_output:
"9a5283ec95694667b4ead2398af5f01e" } store_flushes { family_name: "f3"
store_home_dir: "f3" flush_output: "e6f25e6b0eca4d22af15d0626d0f8759" }
region_name:
"IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
2019-12-17 08:31:59,984 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
0beaae111b0f6e98bfde31ba35be5408 : Received a flush commit marker with
seqId:41392 and a previous prepared snapshot was found
2019-12-17 08:31:59,993 INFO org.apache.hadoop.hbase.regionserver.HStore:
Region: 0beaae111b0f6e98bfde31ba35be5408 added
hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/544affde3e027454f67c8ea46c8f69ee/f1/3c48a23eac784a348a18e10e337d80a2,
entries=32445, sequenceid=41392, filesize=27.6 M
2019-12-17 08:32:00,016 INFO org.apache.hadoop.hbase.regionserver.HStore:
Region: 0beaae111b0f6e98bfde31ba35be5408 added
hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/544affde3e027454f67c8ea46c8f69ee/f2/9a5283ec95694667b4ead2398af5f01e,
entries=12264, sequenceid=41392, filesize=10.9 M
2019-12-17 08:32:00,121 INFO org.apache.hadoop.hbase.regionserver.HStore:
Region: 0beaae111b0f6e98bfde31ba35be5408 added
hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/544affde3e027454f67c8ea46c8f69ee/f3/e6f25e6b0eca4d22af15d0626d0f8759,
entries=32379, sequenceid=41392, filesize=27.5 M
2019-12-17 08:32:00,122 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
CustomLog decrMemStoreSize. Current: dataSize=135810071, getHeapSize=148400960,
getOffHeapSize=0, getCellsCount=167243 delta: dataSizeDelta=155923644,
heapSizeDelta=170112320, offHeapSizeDelta=0, cellsCountDelta=188399
2019-12-17 08:32:00,122 ERROR org.apache.hadoop.hbase.regionserver.HRegion:
Asked to modify this region's
(IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.) memStoreSizing to a negative value which is incorrect. Current
memStoreSizing=135810071, delta=-155923644
java.lang.Exception
    at org.apache.hadoop.hbase.regionserver.HRegion.checkNegativeMemStoreDataSize(HRegion.java:1323)
    at org.apache.hadoop.hbase.regionserver.HRegion.decrMemStoreSize(HRegion.java:1316)
    at org.apache.hadoop.hbase.regionserver.HRegion.decrMemStoreSize(HRegion.java:1303)
    at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushCommitMarker(HRegion.java:5194)
    at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushMarker(HRegion.java:5025)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.doReplayBatchOp(RSRpcServices.java:1143)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:2232)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:29754)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
{code}
I added some custom logging to the snapshot logic to be able to see the snapshot
sizes (a rough reconstruction of this logging follows the excerpt below):
{code:java}
2019-12-17 08:31:56,900 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action: START_FLUSH
table_name: "IntegrationTestRegionReplicaReplication" encoded_region_name:
"544affde3e027454f67c8ea46c8f69ee" flush_sequence_number: 41392 store_flushes {
family_name: "f1" store_home_dir: "f1" } store_flushes { family_name: "f2"
store_home_dir: "f2" } store_flushes { family_name: "f3" store_home_dir: "f3" }
region_name:
"IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
2019-12-17 08:31:56,900 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Flushing 0beaae111b0f6e98bfde31ba35be5408 3/3 column families, dataSize=126.49
MB heapSize=138.24 MB
2019-12-17 08:31:56,900 WARN
org.apache.hadoop.hbase.regionserver.DefaultMemStore: Snapshot called again
without clearing previous. Doing nothing. Another ongoing flush or did we fail
last attempt?
2019-12-17 08:31:56,900 DEBUG
org.apache.hadoop.hbase.regionserver.DefaultMemStore: CustomLog snapshot for
region: {ENCODED => 0beaae111b0f6e98bfde31ba35be5408, NAME =>
'IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.',
STARTKEY => 'bbbbbbbb', ENDKEY => 'cccccccc', REPLICA_ID => 1} memstore:
dataSize=41660529, getHeapSize=45422304, getOffHeapSize=0, getCellsCount=49951
2019-12-17 08:31:56,901 DEBUG
org.apache.hadoop.hbase.regionserver.DefaultMemStore: CustomLog snapshot for
region: {ENCODED => 0beaae111b0f6e98bfde31ba35be5408, NAME =>
'IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.',
STARTKEY => 'bbbbbbbb', ENDKEY => 'cccccccc', REPLICA_ID => 1} memstore:
dataSize=72246315, getHeapSize=78890240, getOffHeapSize=0, getCellsCount=88222
2019-12-17 08:31:56,901 WARN
org.apache.hadoop.hbase.regionserver.DefaultMemStore: Snapshot called again
without clearing previous. Doing nothing. Another ongoing flush or did we fail
last attempt?
2019-12-17 08:31:56,901 DEBUG
org.apache.hadoop.hbase.regionserver.DefaultMemStore: CustomLog snapshot for
region: {ENCODED => 0beaae111b0f6e98bfde31ba35be5408, NAME =>
'IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.',
STARTKEY => 'bbbbbbbb', ENDKEY => 'cccccccc', REPLICA_ID => 1} memstore:
dataSize=42016800, getHeapSize=45799776, getOffHeapSize=0, getCellsCount=50226
2019-12-17 08:31:56,901 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
0beaae111b0f6e98bfde31ba35be5408 : Prepared flush with seqId:41392
{code}
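For reference, the "CustomLog snapshot" lines above come from debug logging added to DefaultMemStore for this investigation. A rough reconstruction of what such a line could look like is shown below; the region handle and the size accessor are assumptions based on the fields printed in the output, not the actual diff:
{code:java}
// Hypothetical reconstruction of the added debug line (slf4j style). Illustration only:
// 'region' and 'snapshotSize' stand for whatever the surrounding DefaultMemStore code
// exposes for the replica region and the segment being snapshotted.
LOG.debug("CustomLog snapshot for region: {} memstore: dataSize={}, getHeapSize={},"
    + " getOffHeapSize={}, getCellsCount={}",
    region,
    snapshotSize.getDataSize(),
    snapshotSize.getHeapSize(),
    snapshotSize.getOffHeapSize(),
    snapshotSize.getCellsCount());
{code}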
These logged numbers show that the snapshots' combined size is actually bigger than the
region's memStoreSizing; the quick arithmetic check below confirms it matches the failing
delta exactly. So when do we decrease memStoreSizing without actually flushing?
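The mismatch is easy to verify from the logged values: the three snapshot sizes add up to exactly the delta of the failing decrement and exceed the region-level counter. Plain arithmetic, no HBase code involved:
{code:java}
public class SnapshotSizeCheck {
  public static void main(String[] args) {
    long snapshotDataSize = 41660529L + 72246315L + 42016800L; // = 155923644, the dataSizeDelta in the error
    long snapshotHeapSize = 45422304L + 78890240L + 45799776L; // = 170112320, the heapSizeDelta in the error
    long snapshotCells    = 49951L + 88222L + 50226L;          // = 188399, the cellsCountDelta in the error
    long regionMemStoreSizing = 135810071L;                    // region-level dataSize at commit time
    System.out.println(regionMemStoreSizing - snapshotDataSize); // -20113573: the decrement goes negative
  }
}
{code}
So the counter is roughly 20 MB short of what the memstore actually holds, which points to an earlier unbalanced decrement.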
A bit earlier in the log I found this:
{code:java}
2019-12-17 08:31:43,581 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action: START_FLUSH
table_name: "IntegrationTestRegionReplicaReplication" encoded_region_name:
"544affde3e027454f67c8ea46c8f69ee" flush_sequence_number: 40445 store_flushes {
family_name: "f2" store_home_dir: "f2" } region_name:
"IntegrationTestRegionReplicaReplication,bbbbbb
bb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
2019-12-17 08:31:43,582 WARN org.apache.hadoop.hbase.regionserver.HRegion:
0beaae111b0f6e98bfde31ba35be5408 : Received a flush prepare marker with a
larger seqId: 40445 before clearing the previous one with seqId: 20747. Ignoring
2019-12-17 08:31:43,582 TRACE org.apache.hadoop.hbase.regionserver.HRegion:
0beaae111b0f6e98bfde31ba35be5408 : Store files for region:
2019-12-17 08:31:43,582 INFO org.apache.hadoop.hbase.regionserver.HRegion:
writing data to region
IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.
with WAL disabled. Data may be lost in the event of a crash.
2019-12-17 08:31:43,586 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action: COMMIT_FLUSH
table_name: "IntegrationTestRegionReplicaReplication" encoded_region_name:
"544affde3e027454f67c8ea46c8f69ee" flush_sequence_number: 40445 store_flushes {
family_name: "f2" store_home_dir: "f2" flush_output:
"3f6a5302b36a4a65a6caa291a03f108d" } region_name:
"IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
2019-12-17 08:31:43,586 WARN org.apache.hadoop.hbase.regionserver.HRegion:
0beaae111b0f6e98bfde31ba35be5408 : Received a flush commit marker with larger
seqId: 40445 than what we have prepared with seqId:
20747. Picking up new file and dropping prepared memstore snapshot
2019-12-17 08:31:43,587 WARN org.apache.hadoop.hbase.regionserver.HRegion:
0beaae111b0f6e98bfde31ba35be5408 : Unexpected: flush commit marker received
from store f2 but no associated flush context. Ignoring
2019-12-17 08:31:43,587 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
CustomLog decrMemStoreSize. Current: dataSize=184492444, getHeapSize=201769960,
getOffHeapSize=0, getCellsCount=229450 delta: dataSizeDelta=83677329,
heapSizeDelta=91222080, offHeapSizeDelta=0, cellsCountDelta=100177
2019-12-17 08:31:43,587 INFO org.apache.hadoop.hbase.regionserver.HRegion:
0beaae111b0f6e98bfde31ba35be5408 : Not dropping memstore contents since
replayed flush seqId: 40445 is smaller than current seqId:40481
{code}
Based on this, the issue seems to originate from HRegion.replayFlushInStores. When the
StoreFlushContext is missing, we log a warning and continue with the next
StoreFlushDescriptor without actually flushing the snapshot in question. However,
replayWALFlushCommitMarker is unaware of the warning and carries on as if the flush had
succeeded, decreasing the memStoreSizing (and doing the cleanup).
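A self-contained toy model of that flow makes the effect easy to see. It uses no HBase classes, the store names and byte counts are made up, and the control flow only mirrors what the logs and the description above suggest:
{code:java}
// Toy model of the accounting bug described above; illustration only, not HRegion code.
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReplicaFlushAccountingModel {
  // Region-level counter, analogous to the region's memStoreSizing dataSize.
  static long regionDataSize;
  // Per-store prepared snapshot sizes, analogous to the prepared flush's store contexts.
  static Map<String, Long> preparedSnapshots = new LinkedHashMap<>();

  public static void main(String[] args) {
    // A prepare marker (think seqId 20747) snapshotted two stores.
    preparedSnapshots.put("f1", 50_000_000L);
    preparedSnapshots.put("f3", 33_000_000L);
    long totalPreparedSize = 83_000_000L;
    regionDataSize = 120_000_000L; // counter also covers data written after the prepare

    // A later commit marker (think seqId 40445) only names a store with no flush context.
    replayFlushInStores(List.of("f2"));

    // The commit path still subtracts the whole prepared size, as described above.
    regionDataSize -= totalPreparedSize;
    System.out.println("regionDataSize after commit = " + regionDataSize);
  }

  static void replayFlushInStores(List<String> storesInCommitMarker) {
    for (String family : storesInCommitMarker) {
      Long snapshotSize = preparedSnapshots.remove(family);
      if (snapshotSize == null) {
        // "flush commit marker received from store f2 but no associated flush context.
        // Ignoring" -- nothing is flushed for this store.
        continue;
      }
      // A real replay would pick up the flushed file and clear this store's snapshot here.
    }
  }
}
{code}
After the commit the counter reads 37,000,000 even though nothing was flushed, so the next commit that really flushes those snapshots will subtract more than the counter holds.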
We should consider decreasing the memStoreSizing at the per-store level instead of at the
region level, and doing the negativity check before applying the decrement, potentially
preventing a flush if needed, to make sure the memStoreSizing stays positive; a rough
sketch of such a guard follows.
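This is a minimal sketch of the per-store guard, with hypothetical names and no claim about where it would hook into HRegion:
{code:java}
// Minimal sketch of the proposed per-store accounting guard; names are hypothetical and
// this is not a patch against HRegion, just an illustration of the idea.
import java.util.concurrent.atomic.AtomicLong;

public class GuardedStoreAccounting {
  private final AtomicLong regionDataSize = new AtomicLong();

  public void incr(long delta) {
    regionDataSize.addAndGet(delta);
  }

  /**
   * Decrement the region-level counter by what a single store actually flushed.
   * Returns false (and leaves the counter untouched) if the decrement would go negative,
   * so the caller can log the inconsistency instead of corrupting the accounting.
   */
  public boolean decrForStore(long flushedDataSize) {
    while (true) {
      long current = regionDataSize.get();
      long next = current - flushedDataSize;
      if (next < 0) {
        return false;
      }
      if (regionDataSize.compareAndSet(current, next)) {
        return true;
      }
    }
  }

  public static void main(String[] args) {
    GuardedStoreAccounting sizing = new GuardedStoreAccounting();
    sizing.incr(135_810_071L);
    // A decrement that overshoots is refused instead of driving the counter negative;
    // a store whose flush was never replayed simply never calls decrForStore at all.
    System.out.println(sizing.decrForStore(155_923_644L)); // false, counter unchanged
    System.out.println(sizing.decrForStore(41_660_529L));  // true
  }
}
{code}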
> Negative memStoreSizing
> -----------------------
>
> Key: HBASE-23591
> URL: https://issues.apache.org/jira/browse/HBASE-23591
> Project: HBase
> Issue Type: Bug
> Components: read replicas
> Reporter: Szabolcs Bukros
> Priority: Major
> Fix For: 2.2.2
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)