[ https://issues.apache.org/jira/browse/HBASE-26026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chenglei updated HBASE-26026:
-----------------------------
    Description: 
Sometimes I have observed that HBase writes may get stuck in my HBase cluster when {{CompactingMemStore}} is enabled. I have reproduced the problem with a unit test in my PR.
The problem is caused by {{CompactingMemStore.checkAndAddToActiveSize}}:
{code:java}
425   private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd,
426       MemStoreSizing memstoreSizing) {
427     if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) {
428       if (currActive.setInMemoryFlushed()) {
429         flushInMemory(currActive);
430         if (setInMemoryCompactionFlag()) {
431           // The thread is dispatched to do in-memory compaction in the background
              ......
}
{code}
In line 427, if the sum of {{currActive.getDataSize}} and the size of {{cellToAdd}} exceeds {{CompactingMemStore.inmemoryFlushSize}}, then {{currActive}} should be flushed, and {{MutableSegment.setInMemoryFlushed()}} is invoked in line 428:
{code:java}
public boolean setInMemoryFlushed() {
    return flushed.compareAndSet(false, true);
  }
{code}
After {{currActive.flushed}} is set to true, {{flushInMemory(currActive)}} in line 429 in turn invokes {{CompactingMemStore.pushActiveToPipeline}}:
{code:java}
 protected void pushActiveToPipeline(MutableSegment currActive) {
    if (!currActive.isEmpty()) {
      pipeline.pushHead(currActive);
      resetActive();
    }
  }
{code}
In {{CompactingMemStore.pushActiveToPipeline}} above, if {{currActive.cellSet}} is empty, nothing is done. But under concurrent writes, because a writer first adds the cell size to {{currActive.getDataSize}} and only afterwards actually adds the cell to {{currActive.cellSet}}, it is possible that {{currActive.getDataSize}} can no longer accommodate any cell while {{currActive.cellSet}} is still empty, because the pending writes have not yet added their cells to {{currActive.cellSet}}.
So now {{currActive.flushed}} is true, and new writes still target {{currActive}}, but {{currActive}} can never enter {{flushInMemory}} again (the CAS in {{setInMemoryFlushed}} always fails), so no new active segment can be created, and in the end all writes are stuck.
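The interleaving above can be replayed deterministically with a minimal model. The class and field names below ({{SegmentModel}}, {{IN_MEMORY_FLUSH_SIZE}}) are illustrative stand-ins for the real HBase classes, not the actual implementation; the two "writers" are executed sequentially to pin down the problematic ordering:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical minimal model of the segment state involved in the race.
class SegmentModel {
    final AtomicLong dataSize = new AtomicLong();    // models currActive.getDataSize
    final List<String> cellSet = new ArrayList<>();  // models currActive.cellSet
    final AtomicBoolean flushed = new AtomicBoolean(false);

    boolean setInMemoryFlushed() { return flushed.compareAndSet(false, true); }
    boolean isEmpty() { return cellSet.isEmpty(); }
}

public class StuckWriteDemo {
    static final long IN_MEMORY_FLUSH_SIZE = 100;  // stand-in for inmemoryFlushSize

    public static void main(String[] args) {
        SegmentModel active = new SegmentModel();

        // Writer 1: accounts its cell size first ...
        active.dataSize.addAndGet(120);
        // ... but is descheduled before it inserts the cell into cellSet.

        // Writer 2: sees the threshold exceeded and wins the CAS.
        if (active.dataSize.get() > IN_MEMORY_FLUSH_SIZE && active.setInMemoryFlushed()) {
            // pushActiveToPipeline: cellSet is still empty, so the segment is
            // neither pushed nor replaced -- resetActive() never runs.
            if (!active.isEmpty()) {
                throw new AssertionError("not reached in this interleaving");
            }
        }

        // Writer 1 resumes and inserts its cell -- into a segment that is
        // already marked flushed and will never be swapped out.
        active.cellSet.add("cell");

        // Every later writer now fails the CAS, so no in-memory flush can
        // happen again and the active segment is never replaced: writes stall.
        System.out.println("flushed=" + active.flushed.get()
            + " isEmpty=" + active.isEmpty()
            + " canFlushAgain=" + active.setInMemoryFlushed());
    }
}
{code}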

In my opinion, once {{currActive.flushed}} is set to true, {{currActive}} must never be used as the active segment again; and because of concurrent pending writes, only after {{currActive.updatesLock.writeLock()}} is acquired in {{CompactingMemStore.inMemoryCompaction}} can we safely check whether {{currActive}} is empty or not.
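The idea can be sketched as follows. This is only an illustration of the locking discipline, not the actual HBase fix: {{MiniSegment}} and the method names are simplified stand-ins, and the assumption is that every writer holds {{updatesLock.readLock()}} while inserting a cell, so taking the write lock drains all in-flight writers before the emptiness check:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified stand-in for the real segment class; names are illustrative only.
class MiniSegment {
    final List<String> cellSet = new ArrayList<>();
    boolean isEmpty() { return cellSet.isEmpty(); }
}

public class SafeEmptyCheckSketch {
    // Models updatesLock: writers hold the read lock while adding a cell,
    // so the write lock cannot be acquired while any write is in flight.
    final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();
    MiniSegment active = new MiniSegment();

    void write(String cell) {
        updatesLock.readLock().lock();
        try {
            active.cellSet.add(cell);  // size accounting omitted for brevity
        } finally {
            updatesLock.readLock().unlock();
        }
    }

    // The emptiness check is only reliable once no writer can be between
    // "size accounted" and "cell inserted" -- i.e. under the write lock.
    void inMemoryCompaction() {
        updatesLock.writeLock().lock();
        try {
            if (!active.isEmpty()) {
                // push active to the pipeline ...
            }
            // ... and always install a fresh active segment, so a flushed
            // (possibly empty) segment is never reused as the active one.
            active = new MiniSegment();
        } finally {
            updatesLock.writeLock().unlock();
        }
    }
}
{code}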


> HBase Write may be stuck forever when using CompactingMemStore
> --------------------------------------------------------------
>
>                 Key: HBASE-26026
>                 URL: https://issues.apache.org/jira/browse/HBASE-26026
>             Project: HBase
>          Issue Type: Bug
>          Components: in-memory-compaction
>    Affects Versions: 2.3.0, 2.4.0
>            Reporter: chenglei
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
