[
https://issues.apache.org/jira/browse/HBASE-13592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vikas Vishwakarma updated HBASE-13592:
--------------------------------------
Attachment: HBASE-13592-0.98.patch
> RegionServer sometimes gets stuck during shutdown in case of cache flush
> failures
> ---------------------------------------------------------------------------------
>
> Key: HBASE-13592
> URL: https://issues.apache.org/jira/browse/HBASE-13592
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.10
> Reporter: Vikas Vishwakarma
> Assignee: Vikas Vishwakarma
> Attachments: HBASE-13592-0.98.patch
>
>
> We observed that the RegionServer sometimes gets stuck during shutdown after
> cache flush failures. After adding a few debug logs and examining the stack
> trace, we found the RegionServer process stuck in closeWAL -> hlog.close ->
> closeBarrier.stopAndDrainOps() during the shutdown sequence in the run method.
> The RegionServer logs show multiple attempts to flush the cache for a
> particular region. Each attempt increments the beginOp count in DrainBarrier,
> but all the flush attempts fail somewhere in WAL sync, so the matching
> DrainBarrier endOp decrement never happens. When shutdown is initiated later,
> the RegionServer process blocks permanently at this point.
> In this case hbase stop does not work either, and the RegionServer process
> has to be killed explicitly with kill -9.
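
The begin/end imbalance described above can be reproduced in isolation. The following is a minimal sketch, not the HBase DrainBarrier source and not necessarily what the attached patch does: DrainBarrierSketch, syncWal, leakyFlush and safeFlush are hypothetical names, and stopAndDrainOps takes a timeout here only so that the demo terminates (the report describes the real call as blocking indefinitely). It assumes the failure is an exception thrown between beginOp and endOp, and shows try/finally as one plausible way to keep the counts balanced.

```java
import java.io.IOException;

/** Simplified stand-in for a drain barrier (hypothetical, not HBase code). */
public class DrainBarrierSketch {
    private int inFlight = 0;        // operations begun but not yet ended
    private boolean stopped = false; // once true, no new operations may begin

    public synchronized boolean beginOp() {
        if (stopped) return false;
        inFlight++;
        return true;
    }

    public synchronized void endOp() {
        inFlight--;
        notifyAll(); // wake a thread waiting in stopAndDrainOps
    }

    /** Waits up to timeoutMs for in-flight operations to end; true if drained. */
    public synchronized boolean stopAndDrainOps(long timeoutMs) throws InterruptedException {
        stopped = true;
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (inFlight > 0) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) return false; // still undrained when time ran out
            wait(remaining);
        }
        return true;
    }

    /** Simulated WAL sync that can be made to fail. */
    static void syncWal(boolean fail) throws IOException {
        if (fail) throw new IOException("simulated WAL sync failure");
    }

    /** Leaky pattern: an exception in syncWal skips endOp. */
    static void leakyFlush(DrainBarrierSketch b, boolean fail) throws IOException {
        if (!b.beginOp()) return;
        syncWal(fail);
        b.endOp();
    }

    /** Balanced pattern: endOp runs even when syncWal throws. */
    static void safeFlush(DrainBarrierSketch b, boolean fail) throws IOException {
        if (!b.beginOp()) return;
        try {
            syncWal(fail);
        } finally {
            b.endOp();
        }
    }

    /** Runs one failing flush, then reports whether shutdown can drain. */
    static boolean drainsAfterFailedFlush(boolean useSafe) {
        DrainBarrierSketch b = new DrainBarrierSketch();
        try {
            if (useSafe) safeFlush(b, true); else leakyFlush(b, true);
        } catch (IOException expected) {
            // the flush failure itself is expected in this demo
        }
        try {
            return b.stopAndDrainOps(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("leaky flush drains: " + drainsAfterFailedFlush(false)); // false
        System.out.println("safe flush drains:  " + drainsAfterFailedFlush(true));  // true
    }
}
```

With the leaky variant the counter stays at 1 after the failed flush, so a drain without a timeout would wait forever, which matches the hang seen in the stack trace.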