[ https://issues.apache.org/jira/browse/HBASE-11902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225962#comment-14225962 ]
Qiang Tian commented on HBASE-11902:
------------------------------------
TestLogRolling creates an error scenario similar to this case.
The test case fails because of the code below:
{code}
// verify the written rows are there
assertTrue(loggedRows.contains("row1002"));
assertTrue(loggedRows.contains("row1003"));
assertTrue(loggedRows.contains("row1004"));
assertTrue(loggedRows.contains("row1005"));
// flush all regions
List<HRegion> regions = new ArrayList<HRegion>(server.getOnlineRegionsLocalContext());
for (HRegion r: regions) {
  r.flushcache(); // <=== the rethrown exception ends the test case
}
{code}
Adding a try/catch around the flushcache() call makes the test pass.
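A minimal, self-contained sketch of that workaround follows. It is not the actual test change: Region is a hypothetical stand-in for HRegion, and flushAll and the simulated failure are illustrative names only. The assumption is that flushcache() surfaces the failure as an IOException, and that the loop should continue past a failed flush rather than end on the first one.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class FlushSketch {
    /** Hypothetical stand-in for HRegion; only the flush call matters here. */
    interface Region {
        void flushcache() throws IOException;
    }

    /**
     * Flushes every region. A failed flush is recorded and the loop
     * continues, instead of the first exception ending the caller.
     */
    static List<IOException> flushAll(List<Region> regions) {
        List<IOException> failures = new ArrayList<>();
        for (Region r : regions) {
            try {
                r.flushcache();
            } catch (IOException e) {
                // Assumption: the test only needs to survive the failure.
                failures.add(e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(() -> { });
        regions.add(() -> { throw new IOException("simulated WAL failure"); });
        regions.add(() -> { });

        List<IOException> failures = flushAll(regions);
        System.out.println("failed flushes: " + failures.size());
    }
}
```

Collecting the exceptions rather than swallowing them lets the test still assert on which flushes failed, if that turns out to matter.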
> RegionServer was blocked while aborting
> ---------------------------------------
>
> Key: HBASE-11902
> URL: https://issues.apache.org/jira/browse/HBASE-11902
> Project: HBase
> Issue Type: Bug
> Components: regionserver, wal
> Affects Versions: 0.98.4
> Environment: hbase-0.98.4, hadoop-2.3.0-cdh5.1, jdk1.7
> Reporter: Victor Xu
> Assignee: Qiang Tian
> Attachments: hbase-hadoop-regionserver-hadoop461.cm6.log,
> hbase11902-master.patch, jstack_hadoop461.cm6.log
>
>
> Generally, a regionserver aborts automatically when isHealth() returns false.
> But it sometimes gets blocked while aborting. I saved the jstack and logs, and
> found that the cause was datanode failures. The "regionserver60020"
> thread was blocked while closing the WAL.
> This issue doesn't happen often, but when it does, it always leads
> to a large number of failed requests. The only remedy is kill -9.
> I think it's a bug, but I haven't found a decent solution. Has anyone else
> seen the same problem?