Vikas Vishwakarma created HBASE-13592:
-----------------------------------------
Summary: RegionServer sometimes gets stuck during shutdown in case
of cache flush failures
Key: HBASE-13592
URL: https://issues.apache.org/jira/browse/HBASE-13592
Project: HBase
Issue Type: Bug
Affects Versions: 0.98.10
Reporter: Vikas Vishwakarma
Assignee: Vikas Vishwakarma
Observed that the RegionServer sometimes gets stuck during shutdown after
cache flush failures. After adding a few debug logs and looking through the stack
trace, the RegionServer process appears to be stuck in closeWAL -> hlog.close ->
closeBarrier.stopAndDrainOps() during the shutdown sequence in the run method.
From the RegionServer logs we see multiple attempts to flush the cache
for a particular region, each of which increments the beginOp count in DrainBarrier, but
all the flush attempts fail somewhere in WAL sync and the matching DrainBarrier endOp
decrement never happens. Later, when shutdown is initiated, the
RegionServer process is permanently stuck here.
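The failure mode described above can be reproduced in isolation with a minimal sketch. SimpleDrainBarrier and FlushDemo below are hypothetical stand-ins written for illustration, not HBase's DrainBarrier or flush code, and the drain takes a timeout only so that the demo terminates instead of hanging. It shows how an exception between beginOp and endOp leaves the operation count above zero, and how placing endOp in a finally block avoids that.

```java
/** Simplified, hypothetical stand-in for a drain barrier (not HBase's class). */
class SimpleDrainBarrier {
    private int activeOps = 0;
    private boolean draining = false;

    /** Registers an operation; returns false once draining has begun. */
    synchronized boolean beginOp() {
        if (draining) {
            return false;
        }
        activeOps++;
        return true;
    }

    /** Must be called once for every successful beginOp(). */
    synchronized void endOp() {
        activeOps--;
        notifyAll();
    }

    /** Rejects new ops, then waits up to timeoutMs; returns true if fully drained. */
    synchronized boolean stopAndDrainOps(long timeoutMs) throws InterruptedException {
        draining = true;
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (activeOps > 0) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false; // an op never called endOp(); an untimed wait would hang here
            }
            wait(remaining);
        }
        return true;
    }
}

class FlushDemo {
    /** Stands in for a WAL sync that fails. */
    static void failingSync() {
        throw new IllegalStateException("simulated WAL sync failure");
    }

    /** Leaky pattern: the exception skips endOp(), so the count stays at 1. */
    static void leakyFlush(SimpleDrainBarrier barrier) {
        if (!barrier.beginOp()) {
            return;
        }
        failingSync();
        barrier.endOp();
    }

    /** Safer pattern: endOp() runs even when the sync throws. */
    static void safeFlush(SimpleDrainBarrier barrier) {
        if (!barrier.beginOp()) {
            return;
        }
        try {
            failingSync();
        } finally {
            barrier.endOp();
        }
    }

    /** Runs one failed flush, then reports whether the barrier drains within 200 ms. */
    static boolean drainsAfterFailedFlush(boolean useFinally) throws InterruptedException {
        SimpleDrainBarrier barrier = new SimpleDrainBarrier();
        try {
            if (useFinally) {
                safeFlush(barrier);
            } else {
                leakyFlush(barrier);
            }
        } catch (IllegalStateException expected) {
            // The flush failure itself is expected; only the barrier state matters.
        }
        return barrier.stopAndDrainOps(200);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("leaky drains: " + drainsAfterFailedFlush(false)); // false
        System.out.println("safe drains: " + drainsAfterFailedFlush(true));   // true
    }
}
```

With an untimed drain, as in the stack trace reported here, the leaky case would block indefinitely rather than return false.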
In this case hbase stop does not work either, and the RegionServer process has to be
explicitly killed using kill -9.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)