[ https://issues.apache.org/jira/browse/HBASE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jean-Daniel Cryans updated HBASE-6321:
--------------------------------------
Attachment: HBASE-6321-0.94.patch
Had a stab at this. What I figured is that the UUIDs were being fetched
outside of {{ReplicationZookeeper}}, so that code was missing the functionality
that class provides (you can also see the feature envy that was going on there).
I moved the ugly UUID code from {{ReplicationSource.run}} into
{{ReplicationZookeeper.getPeerUUID}}. Session expiration had to be handled
there, so I extracted that logic from another method into {{reconnectPeer}}.
With expiration handled, a null UUID was still possible if the peer wasn't
reachable, so I added a retry loop in {{ReplicationSource}}.
Finally I saw that we were doing the UUID dance in {{ReplicationSource.init}}
for the current cluster so I pushed that to
{{ReplicationZookeeper.getUUIDForCluster}} and refactored {{getPeerUUID}} to
use it.
The code should be clearer and more reliable.
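For readers without the patch at hand, here is a rough, self-contained sketch of the shape described above. It is not the patch itself: {{ClusterConnection}}, {{SessionExpiredSketchException}}, {{FlakyPeer}}, {{waitForPeerUUID}} and the attempt limit are hypothetical names and choices made up for illustration. Only {{getPeerUUID}} and {{getUUIDForCluster}} are named after the methods mentioned in this comment, and their signatures here are assumptions.

```java
import java.util.UUID;

/** Hypothetical stand-in for ZooKeeper's session-expired error (not the real KeeperException). */
class SessionExpiredSketchException extends Exception {
    SessionExpiredSketchException(String message) {
        super(message);
    }
}

/** Hypothetical minimal view of one cluster's ZooKeeper connection. */
interface ClusterConnection {
    /** Reads the cluster id; fails if the session has expired. */
    UUID readClusterId() throws SessionExpiredSketchException;

    /** Opens a fresh session to replace an expired one. */
    void reconnect();
}

/** Fake peer for the demo: the first few reads fail with an expired session. */
class FlakyPeer implements ClusterConnection {
    private final UUID clusterId;
    private int expirationsLeft;
    int reconnects = 0;

    FlakyPeer(UUID clusterId, int expirations) {
        this.clusterId = clusterId;
        this.expirationsLeft = expirations;
    }

    @Override
    public UUID readClusterId() throws SessionExpiredSketchException {
        if (expirationsLeft > 0) {
            expirationsLeft--;
            throw new SessionExpiredSketchException("Session expired");
        }
        return clusterId;
    }

    @Override
    public void reconnect() {
        reconnects++;
    }
}

class PeerUuidSketch {
    /** Shared helper: the same read works for the local cluster and for a peer. */
    static UUID getUUIDForCluster(ClusterConnection cluster) throws SessionExpiredSketchException {
        return cluster.readClusterId();
    }

    /** Returns the peer's UUID, or null if the session had to be reopened first. */
    static UUID getPeerUUID(ClusterConnection peer) {
        try {
            return getUUIDForCluster(peer);
        } catch (SessionExpiredSketchException e) {
            // Reopen the session instead of letting the source die; the caller retries.
            peer.reconnect();
            return null;
        }
    }

    /** Caller-side loop: retry until a UUID is available or the attempts run out. */
    static UUID waitForPeerUUID(ClusterConnection peer, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            UUID id = getPeerUUID(peer);
            if (id != null) {
                return id;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return null;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        UUID expected = UUID.fromString("00000000-0000-0000-0000-000000000001");
        FlakyPeer peer = new FlakyPeer(expected, 2);
        System.out.println(waitForPeerUUID(peer, 5, 10L));
    }
}
```

The point of the split is that the session handling lives next to the read, while the caller only has to cope with a null result. The bounded attempt count is purely a choice for this sketch; a real source would more likely keep looping for as long as it is still running.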
> ReplicationSource dies reading the peer's id
> --------------------------------------------
>
> Key: HBASE-6321
> URL: https://issues.apache.org/jira/browse/HBASE-6321
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.1, 0.94.0
> Reporter: Jean-Daniel Cryans
> Fix For: 0.92.2, 0.96.0, 0.94.2
>
> Attachments: HBASE-6321-0.94.patch
>
>
> This is what I saw:
> {noformat}
> 2012-07-01 05:04:01,638 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Closing source 8 because an error occurred: Could not read peer's cluster id
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /va1-backup/hbaseid
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
>     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:259)
>     at org.apache.hadoop.hbase.zookeeper.ClusterId.readClusterIdZNode(ClusterId.java:61)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:253)
> {noformat}
> The session should just be reopened.