Guanghao Zhang created HBASE-16947:
--------------------------------------
Summary: Some improvements for DumpReplicationQueues tool
Key: HBASE-16947
URL: https://issues.apache.org/jira/browse/HBASE-16947
Project: HBase
Issue Type: Improvement
Components: Replication
Reporter: Guanghao Zhang
Recently we ran into a problem with too many replication WALs in our production
cluster. We needed the DumpReplicationQueues tool to analyze the replication
queue info in ZooKeeper, so I backported HBASE-16450 to our 0.98-based branch
and made some improvements to it.
1. Show the dead regionservers under the replication/rs znode. When there are
too many WALs under a znode, they can't be transferred atomically to the new rs
znode, so the dead rs znode is left behind on ZooKeeper.
2. Summarize all the queues that belong to peers that have been deleted.
3. Aggregate the replication queue sizes of all regionservers. Currently each
regionserver reports ReplicationLoad to the master, but there is no aggregate
metric for replication.
4. Show how many WALs cannot be found on HDFS. The reason for the missing WALs
(WAL Not Found) needs more time to dig into.
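
To make the four items concrete, here is a minimal sketch of the kind of summary they describe. It is not the tool's implementation: it works on a hypothetical in-memory snapshot rather than reading ZooKeeper or HDFS, and every name in it (ReplicationQueueSummary, Queue, sampleQueues, the rs/peer/WAL names) is an assumption made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model; the real tool would read these values from
// ZooKeeper (queues, peers, live servers) and HDFS (WAL files).
class ReplicationQueueSummary {

    /** One replication queue: the regionserver znode it sits under, its peer, its WALs. */
    static final class Queue {
        final String regionServer;
        final String peerId;
        final List<String> wals;

        Queue(String regionServer, String peerId, String... wals) {
            this.regionServer = regionServer;
            this.peerId = peerId;
            this.wals = Arrays.asList(wals);
        }
    }

    /** Item 1: regionservers that still own a queue but are no longer live. */
    static Set<String> deadRegionServers(List<Queue> queues, Set<String> liveServers) {
        Set<String> dead = new TreeSet<>();
        for (Queue q : queues) {
            if (!liveServers.contains(q.regionServer)) {
                dead.add(q.regionServer);
            }
        }
        return dead;
    }

    /** Item 2: number of queues whose peer no longer exists. */
    static int queuesOfDeletedPeers(List<Queue> queues, Set<String> existingPeers) {
        int count = 0;
        for (Queue q : queues) {
            if (!existingPeers.contains(q.peerId)) {
                count++;
            }
        }
        return count;
    }

    /** Item 3: total number of queued WALs across all regionservers. */
    static int totalWals(List<Queue> queues) {
        int total = 0;
        for (Queue q : queues) {
            total += q.wals.size();
        }
        return total;
    }

    /** Item 4: WALs referenced by a queue but absent from the HDFS listing. */
    static int missingWals(List<Queue> queues, Set<String> walsOnHdfs) {
        int missing = 0;
        for (Queue q : queues) {
            for (String wal : q.wals) {
                if (!walsOnHdfs.contains(wal)) {
                    missing++;
                }
            }
        }
        return missing;
    }

    /** Made-up sample data: rs3 is dead, peer "2" has been deleted. */
    static List<Queue> sampleQueues() {
        return Arrays.asList(
            new Queue("rs1", "1", "wal-a", "wal-b"),
            new Queue("rs2", "2", "wal-c"),
            new Queue("rs3", "1", "wal-d", "wal-e"));
    }

    public static void main(String[] args) {
        List<Queue> queues = sampleQueues();
        Set<String> live = new HashSet<>(Arrays.asList("rs1", "rs2"));
        Set<String> peers = new HashSet<>(Arrays.asList("1"));
        Set<String> onHdfs = new HashSet<>(Arrays.asList("wal-a", "wal-b", "wal-c", "wal-d"));

        System.out.println("Dead regionservers: " + deadRegionServers(queues, live));
        System.out.println("Queues of deleted peers: " + queuesOfDeletedPeers(queues, peers));
        System.out.println("Total WALs queued: " + totalWals(queues));
        System.out.println("WALs not found on HDFS: " + missingWals(queues, onHdfs));
    }
}
```

The sketch counts WALs as the queue size; aggregating bytes instead would need file lengths from HDFS, which are left out here.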