Ashu Pachauri created HBASE-18282:
-------------------------------------
Summary: ReplicationLogCleaner can delete WALs not yet replicated
in case of a KeeperException
Key: HBASE-18282
URL: https://issues.apache.org/jira/browse/HBASE-18282
Project: HBase
Issue Type: Bug
Components: Replication
Affects Versions: 2.0.0-alpha-1, 1.1.11, 1.2.6, 1.3.1
Reporter: Ashu Pachauri
Assignee: Ashu Pachauri
Priority: Critical
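ReplicationStateZKBase#getListOfReplicators does not rethrow a KeeperException;
it calls abort and returns null instead. ReplicationLogCleaner treats that null
as "no replicators" and returns an empty set of WALs to protect, so every WAL
becomes eligible for deletion, including WALs that have not been replicated yet.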
ReplicationStateZKBase:
{code:java}
public List<String> getListOfReplicators() {
  List<String> result = null;
  try {
    result = ZKUtil.listChildrenNoWatch(this.zookeeper, this.queuesZNode);
  } catch (KeeperException e) {
    this.abortable.abort("Failed to get list of replicators", e);
  }
  return result;
}
{code}
ReplicationLogCleaner:
{code:java}
private Set<String> loadWALsFromQueues() throws KeeperException {
  for (int retry = 0; ; retry++) {
    int v0 = replicationQueues.getQueuesZNodeCversion();
    List<String> rss = replicationQueues.getListOfReplicators();
    if (rss == null) {
      LOG.debug("Didn't find any region server that replicates, won't prevent any deletions.");
      return ImmutableSet.of();
    }
    ...
{code}
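One possible direction, shown only as a minimal self-contained sketch and not as the actual patch: let the lookup failure propagate as an exception, and have the cleaner delete nothing for that run when the queue state is unknown. Every name below (FailSafeCleanerSketch, QueueLookupException, ReplicatorSource, healthySource, failingSource) is hypothetical and stands in for the HBase and ZooKeeper types, so the example runs without either library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not HBase code: all names below are illustrative.
class FailSafeCleanerSketch {

  /** Stand-in for ZooKeeper's KeeperException (assumption). */
  static class QueueLookupException extends Exception {
    QueueLookupException(String message) {
      super(message);
    }
  }

  /** Stand-in for the replication queues client (assumption). */
  interface ReplicatorSource {
    // Throws on failure instead of returning null.
    List<String> getListOfReplicators() throws QueueLookupException;

    List<String> getWals(String replicator) throws QueueLookupException;
  }

  /** Collects every WAL still referenced by a queue; lookup failures propagate. */
  static Set<String> loadWALsFromQueues(ReplicatorSource source) throws QueueLookupException {
    Set<String> wals = new HashSet<>();
    for (String replicator : source.getListOfReplicators()) {
      wals.addAll(source.getWals(replicator));
    }
    return wals;
  }

  /** Returns the candidates that are safe to delete; none if the lookup failed. */
  static List<String> getDeletableFiles(ReplicatorSource source, List<String> candidates) {
    Set<String> walsInQueues;
    try {
      walsInQueues = loadWALsFromQueues(source);
    } catch (QueueLookupException e) {
      // Queue state is unknown, so keep every file for this cleaner run.
      return Collections.emptyList();
    }
    List<String> deletable = new ArrayList<>();
    for (String wal : candidates) {
      if (!walsInQueues.contains(wal)) {
        deletable.add(wal);
      }
    }
    return deletable;
  }

  /** Test double backed by an in-memory map of replicator to queued WALs. */
  static ReplicatorSource healthySource(final Map<String, List<String>> queues) {
    return new ReplicatorSource() {
      @Override
      public List<String> getListOfReplicators() {
        return new ArrayList<>(queues.keySet());
      }

      @Override
      public List<String> getWals(String replicator) {
        return queues.getOrDefault(replicator, Collections.<String>emptyList());
      }
    };
  }

  /** Test double that fails every lookup, simulating a ZooKeeper error. */
  static ReplicatorSource failingSource() {
    return new ReplicatorSource() {
      @Override
      public List<String> getListOfReplicators() throws QueueLookupException {
        throw new QueueLookupException("simulated ZooKeeper failure");
      }

      @Override
      public List<String> getWals(String replicator) throws QueueLookupException {
        throw new QueueLookupException("simulated ZooKeeper failure");
      }
    };
  }

  public static void main(String[] args) {
    List<String> candidates = Arrays.asList("wal-1", "wal-2", "wal-3");
    Map<String, List<String>> queues = new HashMap<>();
    queues.put("rs1", Arrays.asList("wal-1", "wal-2"));

    System.out.println(getDeletableFiles(healthySource(queues), candidates)); // [wal-3]
    System.out.println(getDeletableFiles(failingSource(), candidates));       // []
  }
}
```

The point of the sketch is that "lookup failed" and "no replicators" are kept distinct: an empty replicator list still allows deletions, while an exception blocks them until the next cleaner run.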