[
https://issues.apache.org/jira/browse/HBASE-26026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
chenglei updated HBASE-26026:
-----------------------------
Description:
I have sometimes observed that HBase writes can get stuck in my HBase cluster,
which has {{CompactingMemStore}} enabled. I have reproduced the problem with a
unit test in my PR. The problem is caused by
{{CompactingMemStore.checkAndAddToActiveSize}}:
{code:java}
425   private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd,
426       MemStoreSizing memstoreSizing) {
427     if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) {
428       if (currActive.setInMemoryFlushed()) {
429         flushInMemory(currActive);
430         if (setInMemoryCompactionFlag()) {
431           // The thread is dispatched to do in-memory compaction in the background
              ......
    }
{code}
In line 427, if {{currActive.getDataSize}} plus the size of {{cellToAdd}}
exceeds {{CompactingMemStore.inmemoryFlushSize}}, then {{currActive}} should
be flushed, and {{MutableSegment.setInMemoryFlushed()}} is invoked in line 428
above:
{code:java}
public boolean setInMemoryFlushed() {
  return flushed.compareAndSet(false, true);
}
{code}
So by line 429 above, {{currActive.flushed}} is true, and
{{CompactingMemStore.flushInMemory}} in turn invokes
{{CompactingMemStore.pushActiveToPipeline}}:
{code:java}
protected void pushActiveToPipeline(MutableSegment currActive) {
  if (!currActive.isEmpty()) {
    pipeline.pushHead(currActive);
    resetActive();
  }
}
{code}
In {{CompactingMemStore.pushActiveToPipeline}} above, if
{{currActive.cellSet}} is empty, nothing is done. However, writes are
concurrent, and each write first adds the cell size to
{{currActive.getDataSize}} and only then adds the cell to
{{currActive.cellSet}}. It is therefore possible that
{{currActive.getDataSize}} cannot accommodate any more cells while
{{currActive.cellSet}} is still empty, because the pending writes have not yet
added their cells to {{currActive.cellSet}}.
At this point {{currActive.flushed}} is true and new writes still target
{{currActive}}, but {{currActive}} can never enter {{flushInMemory}} again, so
no new active segment can be created, and in the end all writes are stuck.
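The interleaving described above can be replayed deterministically in a single thread. The following is only a minimal sketch, not HBase code: {{StuckWriteSketch}}, its nested {{Segment}}, the threshold {{FLUSH_SIZE}} and the cell sizes are all hypothetical stand-ins, and it assumes that a caller retries whenever {{checkAndAddToActiveSize}} returns false.

```java
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of the race; none of these types are HBase classes.
class StuckWriteSketch {
  static final long FLUSH_SIZE = 100; // assumed stand-in for inmemoryFlushSize

  // Stand-in for MutableSegment: the size is tracked separately from the cell set.
  static class Segment {
    final AtomicLong dataSize = new AtomicLong();
    final ConcurrentSkipListSet<String> cellSet = new ConcurrentSkipListSet<>();
    final AtomicBoolean flushed = new AtomicBoolean(false);

    boolean setInMemoryFlushed() {
      return flushed.compareAndSet(false, true);
    }

    boolean isEmpty() {
      return cellSet.isEmpty();
    }
  }

  Segment active = new Segment();
  int pipelineSize = 0;

  // Same shape as the quoted method: reserve the size, or try to flush and report failure.
  boolean checkAndAddToActiveSize(Segment curr, long cellSize) {
    if (curr.dataSize.get() + cellSize > FLUSH_SIZE) {
      if (curr.setInMemoryFlushed()) {
        pushActiveToPipeline(curr);
      }
      return false; // assumption: the caller retries against the active segment
    }
    curr.dataSize.addAndGet(cellSize);
    return true;
  }

  // Same shape as the quoted method: an empty segment is left in place.
  void pushActiveToPipeline(Segment curr) {
    if (!curr.isEmpty()) {
      pipelineSize++;
      active = new Segment();
    }
  }

  // Replays the problematic interleaving and reports whether writes end up stuck.
  static boolean reproduce() {
    StuckWriteSketch store = new StuckWriteSketch();
    Segment seg = store.active;
    // Writer A reserves 80 bytes but has not added its cell to cellSet yet.
    boolean reserved = store.checkAndAddToActiveSize(seg, 80);
    // Writer B overflows the segment: flushed becomes true, but cellSet is
    // still empty, so the segment is neither pushed nor replaced.
    store.checkAndAddToActiveSize(seg, 50);
    // Writer A finally adds its cell, too late for the emptiness check.
    seg.cellSet.add("cell-A");
    // Every later attempt fails: the CAS on flushed can never succeed again.
    boolean anySucceeded = false;
    for (int i = 0; i < 3; i++) {
      anySucceeded |= store.checkAndAddToActiveSize(store.active, 50);
    }
    return reserved && !anySucceeded && store.active == seg
        && seg.flushed.get() && store.pipelineSize == 0;
  }

  public static void main(String[] args) {
    System.out.println("writes stuck: " + reproduce());
  }
}
```

In this model the segment stays active with {{flushed}} already true, so the only code path that could replace it is never taken again.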
In my opinion, once {{currActive.flushed}} is set to true, the segment must not
be used as the {{ActiveSegment}} again. And because of concurrent pending
writes, we can safely check whether {{currActive}} is empty only after
{{currActive.updatesLock.writeLock()}} is acquired in
{{CompactingMemStore.inMemoryCompaction}}.
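One way to picture this proposal is the sketch below. It is not the patch in the PR and not HBase code: {{RetireOnFlushSketch}}, {{Seg}}, {{FLUSH_SIZE}} and the helper {{runWriters}} are hypothetical, and the sketch assumes that writers reserve size and add their cell while holding the read lock. The thread that wins the CAS on {{flushed}} then takes the write lock, which waits for all pending writers, and only then checks emptiness and always installs a fresh active segment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of "retire the segment once flushed is set"; not HBase code.
class RetireOnFlushSketch {
  static final long FLUSH_SIZE = 100; // assumed stand-in for inmemoryFlushSize

  static class Seg {
    final AtomicLong dataSize = new AtomicLong();
    final ConcurrentLinkedQueue<Long> cells = new ConcurrentLinkedQueue<>();
    final AtomicBoolean flushed = new AtomicBoolean(false);

    // Reserves cellSize bytes unless the segment is full or already retired.
    boolean tryReserve(long cellSize) {
      while (!flushed.get()) {
        long size = dataSize.get();
        if (size + cellSize > FLUSH_SIZE) return false;
        if (dataSize.compareAndSet(size, size + cellSize)) return true;
      }
      return false;
    }
  }

  private final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();
  private volatile Seg active = new Seg();
  private final List<Seg> pipeline = new ArrayList<>(); // mutated under the write lock

  void add(long cellSize) {
    if (cellSize > FLUSH_SIZE) throw new IllegalArgumentException("cell too large");
    while (true) {
      Seg curr;
      updatesLock.readLock().lock();
      try {
        curr = active;
        if (curr.tryReserve(cellSize)) {
          curr.cells.add(cellSize); // reserve and add under the same read lock
          return;
        }
      } finally {
        updatesLock.readLock().unlock();
      }
      if (curr.flushed.compareAndSet(false, true)) {
        // The write lock waits for pending writers, so the emptiness check is safe.
        updatesLock.writeLock().lock();
        try {
          if (!curr.cells.isEmpty()) pipeline.add(curr);
          active = new Seg(); // a flushed segment is never reused as active
        } finally {
          updatesLock.writeLock().unlock();
        }
      } else {
        Thread.yield(); // another thread is replacing the segment; retry
      }
    }
  }

  long totalCells() {
    updatesLock.readLock().lock();
    try {
      long n = active.cells.size();
      for (Seg s : pipeline) n += s.cells.size();
      return n;
    } finally {
      updatesLock.readLock().unlock();
    }
  }

  // Runs concurrent writers of 10-byte cells and returns how many cells were stored.
  static long runWriters(int threads, int writesPerThread) {
    RetireOnFlushSketch store = new RetireOnFlushSketch();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        for (int i = 0; i < writesPerThread; i++) store.add(10);
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("writers appear to be stuck");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return store.totalCells();
  }

  public static void main(String[] args) {
    System.out.println("cells stored: " + runWriters(4, 1000));
  }
}
```

The design choice in this sketch is that the CAS winner replaces the active segment unconditionally, so a segment whose {{flushed}} flag is true can never remain the write target.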
> HBase Write may be stuck forever when using CompactingMemStore
> --------------------------------------------------------------
>
> Key: HBASE-26026
> URL: https://issues.apache.org/jira/browse/HBASE-26026
> Project: HBase
> Issue Type: Bug
> Components: in-memory-compaction
> Affects Versions: 2.3.0, 2.4.0
> Reporter: chenglei
> Priority: Major