[ https://issues.apache.org/jira/browse/HBASE-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602334#comment-13602334 ]
Time Less commented on HBASE-8105:
----------------------------------
The RS runs the whole time, so yes, it is still running when the ports are
re-opened. It definitely would lose its ZK connection. But I would expect
that when it begins communicating with ZK again, it would notice its "I've
been ejected from the cluster" status and either rejoin, or the RS process
would die, or something. An RS process that keeps running normally but is
not part of the cluster seems an erroneous state.
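
A minimal sketch of the behaviour I would expect, assuming the RS sees
something like ZooKeeper's Disconnected / SyncConnected / Expired session
states. This is not HBase's actual code; RejoinPolicy, SessionEvent and
Action are hypothetical names made up for illustration.

```java
/** Hypothetical sketch; none of these names exist in HBase or ZooKeeper. */
final class RejoinPolicy {

    /** Simplified session states, loosely modelled on ZooKeeper's client states. */
    enum SessionEvent { DISCONNECTED, RECONNECTED, EXPIRED }

    /** What the RegionServer process could do in response (assumed options). */
    enum Action { WAIT_FOR_RECONNECT, KEEP_SERVING, ABORT_OR_REJOIN }

    /** Maps a session event to the action argued for in this comment. */
    static Action onEvent(SessionEvent event) {
        switch (event) {
            case DISCONNECTED:
                // Connection lost but the session may still be valid: wait.
                return Action.WAIT_FOR_RECONNECT;
            case RECONNECTED:
                // Same session resumed before it timed out: carry on.
                return Action.KEEP_SERVING;
            case EXPIRED:
                // The cluster has already ejected this node, so running on
                // as if nothing happened is the erroneous state.
                return Action.ABORT_OR_REJOIN;
            default:
                throw new IllegalArgumentException("unknown event: " + event);
        }
    }
}
```

The point of the sketch is only that the expired case should lead to some
explicit action, whether that is a rejoin or a process exit.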
On Thu, Mar 14, 2013 at 8:06 AM, Jean-Marc Spaggiari (JIRA) wrote:
> RegionServer Doesn't Rejoin Cluster after Netsplit
> --------------------------------------------------
>
> Key: HBASE-8105
> URL: https://issues.apache.org/jira/browse/HBASE-8105
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.92.1
> Environment: Linux Ubuntu 10.04 LTS
> Reporter: philo vivero
>
> We are running a 15-node HBase cluster and testing various failure scenarios.
> We segregate one RegionServer (RS) from the cluster by firewalling off every
> port except SSH (because we need to be able to re-enable the node later).
> After the RS is automatically removed from the cluster, we re-enable all
> ports, but the RS never rejoins the cluster.
> I suspect this may be the desired behaviour, but haven't found proof
> so far. The code has no comment indicating that this is the desired
> behaviour:
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase/0.92.2/org/apache/hadoop/hbase/regionserver/HRegionServer.java/
> See the lines starting at 624, public void run(). Execution makes it through
> the first try/catch block, but then loops inside the second try/catch block.
> Our hypothesis is that it never leaves that loop on its own.
> If we bounce the RegionServer process, then it rejoins the cluster.
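
The hypothesis in the description can be illustrated with a toy model. This
is not the HRegionServer source: StuckLoopSketch, its fields and the
checkExpiry switch are hypothetical, and the loop is capped at maxIterations
only so that the demonstration terminates.

```java
/** Hypothetical toy model of the reported hypothesis; not HBase code. */
final class StuckLoopSketch {
    private volatile boolean stopped = false;        // set only by an explicit stop request
    private volatile boolean sessionExpired = false; // set when the cluster drops this node

    void expireSession() { sessionExpired = true; }

    void requestStop() { stopped = true; }

    /**
     * Runs the main loop for at most maxIterations rounds and returns how
     * many rounds of "normal work" were performed.
     */
    int runLoop(int maxIterations, boolean checkExpiry) {
        int iterations = 0;
        while (!stopped && iterations < maxIterations) {
            if (checkExpiry && sessionExpired) {
                // The exit path the report suspects is missing.
                requestStop();
                break;
            }
            iterations++; // stands in for one round of normal work
        }
        return iterations;
    }
}
```

With checkExpiry set to false, the loop keeps working after expireSession()
until the cap is reached, which mirrors the observed "running but not in the
cluster" state. With checkExpiry set to true, it stops at once.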