[ https://issues.apache.org/jira/browse/HBASE-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654288#action_12654288 ]
Andrew Purtell commented on HBASE-879:
--------------------------------------
I hit what I first thought was another instance of HBASE-1046.
Some regions were damaged during a write test last night: lease expired
exceptions, NotYetReplicated exceptions, "file does not exist" errors on
mapfile data files, and EOF exceptions while processing reconstruction logs.
Compactions and splits failed, leaving dead regions in both of my large tables
and making those tables impossible to scan.
Disabling and re-enabling the tables did not revive the dead regions. However,
after a full restart of all HBase daemons, all tables came fully back online.
This last point is interesting: if a restart is enough to recover, there should
be a way to bring the regions back from the dead without resorting to a
complete shutdown of all HBase daemons, correct? Could the regionservers just
reinitialize their DFS client when they start taking filesystem exceptions?
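
A minimal sketch of the "reinitialize on filesystem exceptions" idea, in plain
Java with no Hadoop dependency. This is not HBase code and not a proposed
patch: ReopeningClient, Factory, Op and maxAttempts are hypothetical names, and
the Closeable handle only stands in for whatever DFS client the regionserver
holds. It assumes the operation is safe to retry from scratch, which may not
hold for a half-finished write.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical wrapper: on IOException, discard the client handle and open a new one. */
class ReopeningClient<C extends Closeable> {

    /** Opens a fresh client handle (assumption: stands in for creating a DFS client). */
    interface Factory<C> {
        C open() throws IOException;
    }

    /** One filesystem operation run against the current handle. */
    interface Op<C, R> {
        R apply(C client) throws IOException;
    }

    private final Factory<C> factory;
    private final int maxAttempts;
    private C client; // null until first use, and again after a failure

    ReopeningClient(Factory<C> factory, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.factory = factory;
        this.maxAttempts = maxAttempts;
    }

    /** Runs op, reopening the client after each IOException, up to maxAttempts tries. */
    synchronized <R> R call(Op<C, R> op) throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (client == null) {
                    client = factory.open();
                }
                return op.apply(client);
            } catch (IOException e) {
                last = e;
                discard(); // the handle may hold stale state, so drop it
            }
        }
        throw last; // every attempt failed; surface the most recent error
    }

    /** Closes and forgets the current handle, ignoring errors from close(). */
    private void discard() {
        if (client != null) {
            try {
                client.close();
            } catch (IOException ignored) {
                // the handle is being thrown away regardless
            }
            client = null;
        }
    }
}
```

In a regionserver the factory would presumably wrap whatever call creates the
DFS client today, and each read or write against the filesystem would go
through call(). Whether a fresh client is actually enough to recover the
regions is exactly the open question above.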
> When dfs restarts or moves blocks around, hbase regionservers don't notice
> --------------------------------------------------------------------------
>
> Key: HBASE-879
> URL: https://issues.apache.org/jira/browse/HBASE-879
> Project: Hadoop HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.2.0
> Reporter: Michael Bieniosek
>
> Since the hbase regionservers use a DFSClient to keep handles open to the
> dfs, if the dfs blocks move around (typically because of a dfs restart, but
> can also happen if datanodes die or blocks get shuffled around), the
> regionserver will be unable to service the region. It would be nice if the
> DFSClient that the regionservers use could notice this case and refresh the
> block list.