[
https://issues.apache.org/jira/browse/HBASE-23591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Guanghao Zhang updated HBASE-23591:
-----------------------------------
Fix Version/s: (was: 2.2.2)
2.2.7
> Negative memStoreSizing
> -----------------------
>
> Key: HBASE-23591
> URL: https://issues.apache.org/jira/browse/HBASE-23591
> Project: HBase
> Issue Type: Bug
> Components: read replicas
> Reporter: Szabolcs Bukros
> Priority: Major
> Fix For: 2.2.7
>
>
> After a flush on the replica region, the memStoreSizing becomes negative:
> {code:java}
> 2019-12-17 08:31:59,983 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> 0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action:
> COMMIT_FLUSH table_name: "IntegrationTestRegionReplicaReplication"
> encoded_region_name: "544affde3e027454f67c8ea46c8f69ee"
> flush_sequence_number: 41392 store_flushes { family_name: "f1"
> store_home_dir: "f1" flush_output: "3c48a23eac784a348a18e10e337d80a2" }
> store_flushes { family_name: "f2" store_home_dir: "f2" flush_output:
> "9a5283ec95694667b4ead2398af5f01e" } store_flushes { family_name: "f3"
> store_home_dir: "f3" flush_output: "e6f25e6b0eca4d22af15d0626d0f8759" }
> region_name:
> "IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
> 2019-12-17 08:31:59,984 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> 0beaae111b0f6e98bfde31ba35be5408 : Received a flush commit marker with
> seqId:41392 and a previous prepared snapshot was found
> 2019-12-17 08:31:59,993 INFO org.apache.hadoop.hbase.regionserver.HStore:
> Region: 0beaae111b0f6e98bfde31ba35be5408 added
> hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/544affde3e027454f67c8ea46c8f69ee/f1/3c48a23eac784a348a18e10e337d80a2,
> entries=32445, sequenceid=41392, filesize=27.6 M
> 2019-12-17 08:32:00,016 INFO org.apache.hadoop.hbase.regionserver.HStore:
> Region: 0beaae111b0f6e98bfde31ba35be5408 added
> hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/544affde3e027454f67c8ea46c8f69ee/f2/9a5283ec95694667b4ead2398af5f01e,
> entries=12264, sequenceid=41392, filesize=10.9 M
> 2019-12-17 08:32:00,121 INFO org.apache.hadoop.hbase.regionserver.HStore:
> Region: 0beaae111b0f6e98bfde31ba35be5408 added
> hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/544affde3e027454f67c8ea46c8f69ee/f3/e6f25e6b0eca4d22af15d0626d0f8759,
> entries=32379, sequenceid=41392, filesize=27.5 M
> 2019-12-17 08:32:00,122 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> CustomLog decrMemStoreSize. Current: dataSize=135810071,
> getHeapSize=148400960, getOffHeapSize=0, getCellsCount=167243 delta:
> dataSizeDelta=155923644, heapSizeDelta=170112320, offHeapSizeDelta=0,
> cellsCountDelta=188399
> 2019-12-17 08:32:00,122 ERROR org.apache.hadoop.hbase.regionserver.HRegion:
> Asked to modify this region's
> (IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.)
> memStoreSizing to a negative value which is incorrect. Current
> memStoreSizing=135810071, delta=-155923644
> java.lang.Exception
> at
> org.apache.hadoop.hbase.regionserver.HRegion.checkNegativeMemStoreDataSize(HRegion.java:1323)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.decrMemStoreSize(HRegion.java:1316)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.decrMemStoreSize(HRegion.java:1303)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushCommitMarker(HRegion.java:5194)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushMarker(HRegion.java:5025)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doReplayBatchOp(RSRpcServices.java:1143)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:2232)
> at
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:29754)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> {code}
> I added some custom logging to the snapshot logic to be able to see snapshot
> sizes:
> {code:java}
> 2019-12-17 08:31:56,900 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> 0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action: START_FLUSH
> table_name: "IntegrationTestRegionReplicaReplication" encoded_region_name:
> "544affde3e027454f67c8ea46c8f69ee" flush_sequence_number: 41392 store_flushes
> { family_name: "f1" store_home_dir: "f1" } store_flushes { family_name: "f2"
> store_home_dir: "f2" } store_flushes { family_name: "f3" store_home_dir: "f3"
> } region_name:
> "IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
> 2019-12-17 08:31:56,900 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Flushing 0beaae111b0f6e98bfde31ba35be5408 3/3 column families,
> dataSize=126.49 MB heapSize=138.24 MB
> 2019-12-17 08:31:56,900 WARN
> org.apache.hadoop.hbase.regionserver.DefaultMemStore: Snapshot called again
> without clearing previous. Doing nothing. Another ongoing flush or did we
> fail last attempt?
> 2019-12-17 08:31:56,900 DEBUG
> org.apache.hadoop.hbase.regionserver.DefaultMemStore: CustomLog snapshot for
> region: {ENCODED => 0beaae111b0f6e98bfde31ba35be5408, NAME =>
> 'IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.',
> STARTKEY => 'bbbbbbbb', ENDKEY => 'cccccccc', REPLICA_ID => 1} memstore:
> dataSize=41660529, getHeapSize=45422304, getOffHeapSize=0, getCellsCount=49951
> 2019-12-17 08:31:56,901 DEBUG
> org.apache.hadoop.hbase.regionserver.DefaultMemStore: CustomLog snapshot for
> region: {ENCODED => 0beaae111b0f6e98bfde31ba35be5408, NAME =>
> 'IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.',
> STARTKEY => 'bbbbbbbb', ENDKEY => 'cccccccc', REPLICA_ID => 1} memstore:
> dataSize=72246315, getHeapSize=78890240, getOffHeapSize=0, getCellsCount=88222
> 2019-12-17 08:31:56,901 WARN
> org.apache.hadoop.hbase.regionserver.DefaultMemStore: Snapshot called again
> without clearing previous. Doing nothing. Another ongoing flush or did we
> fail last attempt?
> 2019-12-17 08:31:56,901 DEBUG
> org.apache.hadoop.hbase.regionserver.DefaultMemStore: CustomLog snapshot for
> region: {ENCODED => 0beaae111b0f6e98bfde31ba35be5408, NAME =>
> 'IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.',
> STARTKEY => 'bbbbbbbb', ENDKEY => 'cccccccc', REPLICA_ID => 1} memstore:
> dataSize=42016800, getHeapSize=45799776, getOffHeapSize=0, getCellsCount=50226
> 2019-12-17 08:31:56,901 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> 0beaae111b0f6e98bfde31ba35be5408 : Prepared flush with seqId:41392
> {code}
> This shows that the snapshots' combined size is actually bigger than the
> region's memStoreSizing (see the arithmetic check below). So when do we
> decrease memStoreSizing without actually flushing?
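> For reference, the three snapshot sizes logged above sum to exactly the
> delta in the failing decrement (a quick arithmetic check in plain Java,
> not HBase code; all constants are values from the logs above):
> {code:java}
> // The snapshot dataSizes logged above for f1, f2 and f3 sum to exactly
> // the dataSizeDelta that made the decrement go negative.
> public class SnapshotSumCheck {
>   public static void main(String[] args) {
>     long f1 = 41660529L, f2 = 72246315L, f3 = 42016800L;
>     long combined = f1 + f2 + f3;       // 155923644 == dataSizeDelta above
>     long regionDataSize = 135810071L;   // region memStoreSizing at commit
>     System.out.println(combined - regionDataSize); // 20113573 too large
>   }
> }
> {code}
> (The heap sizes match the same way: 45422304 + 78890240 + 45799776 =
> 170112320, the heapSizeDelta in the error.)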
> A bit earlier in the log I found this:
> {code:java}
> 2019-12-17 08:31:43,581 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> 0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action: START_FLUSH
> table_name: "IntegrationTestRegionReplicaReplication" encoded_region_name:
> "544affde3e027454f67c8ea46c8f69ee" flush_sequence_number: 40445 store_flushes
> { family_name: "f2" store_home_dir: "f2" } region_name:
> "IntegrationTestRegionReplicaReplication,bbbbbb
> bb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
> 2019-12-17 08:31:43,582 WARN org.apache.hadoop.hbase.regionserver.HRegion:
> 0beaae111b0f6e98bfde31ba35be5408 : Received a flush prepare marker with a
> larger seqId: 40445 before clearing the previous one with seqId: 20747.
> Ignoring
> 2019-12-17 08:31:43,582 TRACE org.apache.hadoop.hbase.regionserver.HRegion:
> 0beaae111b0f6e98bfde31ba35be5408 : Store files for region:
> 2019-12-17 08:31:43,582 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> writing data to region
> IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697_0001.0beaae111b0f6e98bfde31ba35be5408.
> with WAL disabled. Data may be lost in the event of a crash.
> 2019-12-17 08:31:43,586 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> 0beaae111b0f6e98bfde31ba35be5408 : Replaying flush marker action:
> COMMIT_FLUSH table_name: "IntegrationTestRegionReplicaReplication"
> encoded_region_name: "544affde3e027454f67c8ea46c8f69ee"
> flush_sequence_number: 40445 store_flushes { family_name: "f2"
> store_home_dir: "f2" flush_output: "3f6a5302b36a4a65a6caa291a03f108d" }
> region_name:
> "IntegrationTestRegionReplicaReplication,bbbbbbbb,1576599911697.544affde3e027454f67c8ea46c8f69ee."
> 2019-12-17 08:31:43,586 WARN org.apache.hadoop.hbase.regionserver.HRegion:
> 0beaae111b0f6e98bfde31ba35be5408 : Received a flush commit marker with larger
> seqId: 40445 than what we have prepared with seqId: 20747. Picking up
> new file and dropping prepared memstore snapshot
> 2019-12-17 08:31:43,587 WARN org.apache.hadoop.hbase.regionserver.HRegion:
> 0beaae111b0f6e98bfde31ba35be5408 : Unexpected: flush commit marker received
> from store f2 but no associated flush context. Ignoring
> 2019-12-17 08:31:43,587 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> CustomLog decrMemStoreSize. Current: dataSize=184492444,
> getHeapSize=201769960, getOffHeapSize=0, getCellsCount=229450 delta:
> dataSizeDelta=83677329, heapSizeDelta=91222080, offHeapSizeDelta=0,
> cellsCountDelta=100177
> 2019-12-17 08:31:43,587 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> 0beaae111b0f6e98bfde31ba35be5408 : Not dropping memstore contents since
> replayed flush seqId: 40445 is smaller than current seqId:40481
> {code}
> Based on this, the issue seems to originate from
> HRegion.replayFlushInStores. When the StoreFlushContext is missing, we log
> a warning and continue with the next StoreFlushDescriptor without actually
> flushing the snapshot in question. However, replayWALFlushCommitMarker is
> unaware of the warning: it continues as if the flush had succeeded,
> decreasing the memStoreSizing (and doing cleanup).
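>
> To illustrate the mismatch (a minimal sketch with hypothetical names, and
> numbers taken from the 08:31:43 log above; this is not the real HRegion
> code):
> {code:java}
> // Illustrative only: the commit path decrements the region-level size
> // even though replayFlushInStores() skipped the store.
> class FlushCommitSketch {
>   long regionDataSize = 184492444L;      // memStoreSizing before the commit
>   long preparedSnapshotSize = 83677329L; // prepared (seqId 20747) snapshot
>
>   void replayCommitMarker(boolean storeFlushContextFound) {
>     if (!storeFlushContextFound) {
>       // replayFlushInStores(): "flush commit marker received from store f2
>       // but no associated flush context. Ignoring" -- nothing is flushed,
>       // but execution simply continues.
>     }
>     // The caller never learns the store was skipped and decrements anyway,
>     // so the accounting drifts low and a later commit drives it negative:
>     regionDataSize -= preparedSnapshotSize;
>   }
> }
> {code}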
>
> We should consider decreasing the memStoreSizing at the per-store level
> instead of at the region level, and performing the negativity check before
> applying the decrement (potentially preventing a flush if needed), to make
> sure the memStoreSizing stays positive.
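>
> A rough sketch of what that per-store guard could look like (hypothetical
> API, illustrative only, not an actual HBase interface):
> {code:java}
> // Hypothetical per-store accounting with the negativity check applied
> // before the decrement, as proposed above.
> class StoreSizingSketch {
>   private long dataSize; // per-store memstore data size
>
>   /** Refuses the decrement instead of letting the counter go negative. */
>   synchronized boolean tryDecrement(long delta) {
>     if (dataSize - delta < 0) {
>       return false; // caller can skip cleanup or re-trigger the flush
>     }
>     dataSize -= delta;
>     return true;
>   }
> }
> {code}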
--
This message was sent by Atlassian Jira
(v8.3.4#803005)