[ 
https://issues.apache.org/jira/browse/HBASE-26026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-26026:
------------------------------
    Hadoop Flags: Reviewed
      Resolution: Fixed
          Status: Resolved  (was: Patch Available)

Pushed to branch-2.3+.

Thanks [~comnetwork] for contributing.

> HBase Write may be stuck forever when using CompactingMemStore
> --------------------------------------------------------------
>
>                 Key: HBASE-26026
>                 URL: https://issues.apache.org/jira/browse/HBASE-26026
>             Project: HBase
>          Issue Type: Bug
>          Components: in-memory-compaction
>    Affects Versions: 3.0.0-alpha-1, 2.3.0, 2.4.0
>            Reporter: chenglei
>            Assignee: chenglei
>            Priority: Critical
>             Fix For: 2.5.0, 3.0.0-alpha-2, 2.4.6, 2.3.7
>
>
> Sometimes I observed that HBase writes could get stuck in my HBase cluster 
> with {{CompactingMemStore}} enabled. I have reproduced the problem with a 
> unit test in my PR. 
> The problem is caused by {{CompactingMemStore.checkAndAddToActiveSize}} : 
> {code:java}
> 425   private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd,
> 426       MemStoreSizing memstoreSizing) {
> 427     if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) {
> 428       if (currActive.setInMemoryFlushed()) {
> 429         flushInMemory(currActive);
> 430         if (setInMemoryCompactionFlag()) {
> 431           // The thread is dispatched to do in-memory compaction in the background
>               ......
>  }
> {code}
> In line 427, {{shouldFlushInMemory}} checks whether 
> {{currActive.getDataSize}} plus the size of {{cellToAdd}} exceeds 
> {{CompactingMemStore.inmemoryFlushSize}}. If it does, {{currActive}} should 
> be flushed, and {{currActive.setInMemoryFlushed()}} is invoked in line 428:
> {code:java}
> public boolean setInMemoryFlushed() {
>   return flushed.compareAndSet(false, true);
> }
> {code}
> After successfully setting {{currActive.flushed}} to true, 
> {{flushInMemory(currActive)}} in line 429 above invokes 
> {{CompactingMemStore.pushActiveToPipeline}}:
> {code:java}
> protected void pushActiveToPipeline(MutableSegment currActive) {
>   if (!currActive.isEmpty()) {
>     pipeline.pushHead(currActive);
>     resetActive();
>   }
> }
> {code}
> In the {{CompactingMemStore.pushActiveToPipeline}} method above, if 
> {{currActive.cellSet}} is empty, nothing is done. Because writes are 
> concurrent, and because we first add the cell size to 
> {{currActive.getDataSize}} and only then add the cell to 
> {{currActive.cellSet}}, it is possible that {{currActive.getDataSize}} 
> cannot accommodate {{cellToAdd}} while {{currActive.cellSet}} is still 
> empty, because pending writes have not yet added their cells to 
> {{currActive.cellSet}}.
> So if {{currActive.cellSet}} is empty at that moment, no new 
> {{ActiveSegment}} is created and new writes continue to target 
> {{currActive}}. But {{currActive.flushed}} is already true, so 
> {{currActive}} can never enter {{flushInMemory(currActive)}} again, and a 
> new {{ActiveSegment}} can never be created. In the end all writes are stuck.
> In my opinion, once {{currActive.flushed}} is set to true, the segment can 
> no longer be used as the {{ActiveSegment}}. Because of concurrent pending 
> writes, only after {{currActive.updatesLock.writeLock()}} is acquired (i.e. 
> {{currActive.waitForUpdates}} is called) in 
> {{CompactingMemStore.inMemoryCompaction}} can we safely say whether 
> {{currActive}} is empty or not.
> My fix is to remove the {{if (!currActive.isEmpty())}} check here and leave 
> the check to the background {{InMemoryCompactionRunnable}}, after 
> {{currActive.waitForUpdates}} is called. An alternative fix is to use a 
> synchronization mechanism in the {{checkAndAddToActiveSize}} method to block 
> all writes, wait for all pending writes to complete (i.e. 
> {{currActive.waitForUpdates}} is called), and, if {{currActive}} is still 
> empty, set {{currActive.flushed}} back to false. However, I am not inclined 
> to use such heavy synchronization in the write path. I think we had better 
> keep the lockless implementation of {{CompactingMemStore.add}} as it is 
> now, and leave {{currActive.waitForUpdates}} to the background 
> {{InMemoryCompactionRunnable}}.
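
The interleaving described in the issue can be replayed deterministically with a toy model. The sketch below is not HBase code: {{Segment}}, {{Store}}, {{checkEmptyBeforePush}} and {{writerBMakesProgress}} are hypothetical names introduced only for illustration, the size check is reduced to a plain counter, and the writer's retry loop is an assumption about how a rejected write is retried. It only shows why a one-shot {{flushed}} flag combined with an {{isEmpty}} guard can leave the active segment permanently unusable, and why dropping the guard lets writes proceed.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

public class StuckSegmentSketch {

  /** Toy stand-in for a mutable segment: size is reserved before the cell is inserted. */
  static final class Segment {
    final AtomicLong dataSize = new AtomicLong();
    final AtomicBoolean flushed = new AtomicBoolean(false);
    final ConcurrentSkipListSet<String> cellSet = new ConcurrentSkipListSet<>();

    // One-shot flag: only the first caller ever gets true.
    boolean setInMemoryFlushed() {
      return flushed.compareAndSet(false, true);
    }

    boolean isEmpty() {
      return cellSet.isEmpty();
    }
  }

  /** Toy stand-in for the memstore (hypothetical, heavily simplified). */
  static final class Store {
    final long flushSize;
    final boolean checkEmptyBeforePush; // true = guarded behaviour from the issue
    final Deque<Segment> pipeline = new ArrayDeque<>();
    Segment active = new Segment();

    Store(long flushSize, boolean checkEmptyBeforePush) {
      this.flushSize = flushSize;
      this.checkEmptyBeforePush = checkEmptyBeforePush;
    }

    /** Reserves size in curr, or triggers a flush; false means the writer must retry. */
    boolean checkAndAddToActiveSize(Segment curr, long cellSize) {
      if (curr.dataSize.get() + cellSize > flushSize) {
        if (curr.setInMemoryFlushed()) {
          pushActiveToPipeline(curr);
        }
        return false;
      }
      curr.dataSize.addAndGet(cellSize);
      return true;
    }

    void pushActiveToPipeline(Segment curr) {
      // With the guard on, an apparently empty segment is never replaced.
      if (!checkEmptyBeforePush || !curr.isEmpty()) {
        pipeline.push(curr);
        active = new Segment();
      }
    }
  }

  /** Replays the interleaving; returns true if writer B gets its size reserved. */
  public static boolean writerBMakesProgress(boolean checkEmptyBeforePush) {
    Store store = new Store(100, checkEmptyBeforePush);

    // Writer A reserves 80 bytes but has not yet inserted its cell.
    Segment seenByA = store.active;
    store.checkAndAddToActiveSize(seenByA, 80);

    // Writer B needs 50 bytes: 80 + 50 > 100, so it triggers the flush and retries.
    boolean reserved = false;
    for (int attempt = 0; attempt < 5 && !reserved; attempt++) {
      reserved = store.checkAndAddToActiveSize(store.active, 50);
    }

    // Writer A inserts its cell only now, after the emptiness check has run.
    seenByA.cellSet.add("cell-A");
    return reserved;
  }

  public static void main(String[] args) {
    System.out.println("with isEmpty check:    " + writerBMakesProgress(true));
    System.out.println("without isEmpty check: " + writerBMakesProgress(false));
  }
}
```

In this model the guarded variant never lets writer B reserve space, because the flag has already been consumed and the segment is never replaced. The unguarded variant pushes the segment to the pipeline regardless, so the next retry lands on a fresh segment. Deciding whether the pushed segment is really empty is then left to a later step, which in the issue's proposal is the background compaction after pending updates have drained.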



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
