[
https://issues.apache.org/jira/browse/HBASE-13592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520238#comment-14520238
]
Hudson commented on HBASE-13592:
--------------------------------
FAILURE: Integrated in HBase-0.98 #968 (See
[https://builds.apache.org/job/HBase-0.98/968/])
HBASE-13592 RegionServer sometimes gets stuck during shutdown in case of cache
flush failures. (Vikas Vishwakarma) (larsh: rev
f6b418544e1174e960a188d6ac3eb0c0c2678af3)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
> RegionServer sometimes gets stuck during shutdown in case of cache flush
> failures
> ---------------------------------------------------------------------------------
>
> Key: HBASE-13592
> URL: https://issues.apache.org/jira/browse/HBASE-13592
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.10
> Reporter: Vikas Vishwakarma
> Assignee: Vikas Vishwakarma
> Fix For: 0.98.13
>
> Attachments: HBASE-13592-0.98.patch
>
>
> Observed that the RegionServer sometimes gets stuck during shutdown after
> cache flush failures. After adding a few debug logs and looking through the
> stack trace, the RegionServer process appears to be stuck in closeWAL ->
> hlog.close -> closeBarrier.stopAndDrainOps() during the shutdown sequence in
> the run method.
> From the RegionServer logs we see multiple attempts to flush the cache for a
> particular region. Each attempt increments the beginOp count in DrainBarrier,
> but every flush attempt fails somewhere in WAL sync, so the matching
> DrainBarrier endOp decrement never happens. When shutdown is later initiated,
> the RegionServer process is permanently stuck at this point.
> In this case hbase stop does not work either, and the RegionServer process
> has to be explicitly killed using kill -9.
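
The failure mode described above can be illustrated with a minimal,
self-contained sketch. It is not taken from the attached patch or from HBase
itself: SimpleBarrier is a hypothetical, simplified stand-in for a drain
barrier, failingSync simulates the WAL sync failure, and the drain uses a
timeout only so that the example terminates (the report describes a wait that
never returns). It shows why a begin/end pair that is not protected by
try/finally leaves the pending count above zero, and one common way to keep
the pair balanced.

```java
import java.io.IOException;

public class FlushBarrierSketch {

    /** Hypothetical, simplified drain barrier; not the HBase DrainBarrier class. */
    static final class SimpleBarrier {
        private int pendingOps = 0;
        private boolean stopped = false;

        /** Registers an operation; refuses new operations once draining has started. */
        synchronized boolean beginOp() {
            if (stopped) {
                return false;
            }
            pendingOps++;
            return true;
        }

        /** Marks one operation as finished and wakes up a waiting drainer. */
        synchronized void endOp() {
            pendingOps--;
            notifyAll();
        }

        /** Stops new operations and waits up to timeoutMs for pending ones to end. */
        synchronized boolean stopAndDrainOps(long timeoutMs) throws InterruptedException {
            stopped = true;
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (pendingOps > 0) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false; // timed out: some beginOp was never matched by endOp
                }
                wait(remaining);
            }
            return true;
        }
    }

    /** Simulated WAL sync that always fails (assumption for illustration). */
    static void failingSync() throws IOException {
        throw new IOException("simulated WAL sync failure");
    }

    /** Leaky pattern: the exception skips endOp, so the count is never decremented. */
    static void leakyFlush(SimpleBarrier barrier) throws IOException {
        if (!barrier.beginOp()) {
            return;
        }
        failingSync();
        barrier.endOp(); // never reached when failingSync throws
    }

    /** Balanced pattern: endOp runs in finally, even when the sync fails. */
    static void safeFlush(SimpleBarrier barrier) throws IOException {
        if (!barrier.beginOp()) {
            return;
        }
        try {
            failingSync();
        } finally {
            barrier.endOp();
        }
    }

    /** Runs several failing flushes, then reports whether the barrier drains. */
    static boolean drainsAfterFailedFlushes(boolean useFinally, int attempts) {
        SimpleBarrier barrier = new SimpleBarrier();
        for (int i = 0; i < attempts; i++) {
            try {
                if (useFinally) {
                    safeFlush(barrier);
                } else {
                    leakyFlush(barrier);
                }
            } catch (IOException expected) {
                // The flush failure itself is expected in this sketch.
            }
        }
        try {
            return barrier.stopAndDrainOps(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("leaky drains: " + drainsAfterFailedFlushes(false, 3)); // false
        System.out.println("safe drains: " + drainsAfterFailedFlushes(true, 3));   // true
    }
}
```

With an untimed wait in place of the timeout, the leaky variant would block
indefinitely during shutdown, which matches the symptom reported above.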
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)