apurtell commented on a change in pull request #3009:
URL: https://github.com/apache/hbase/pull/3009#discussion_r586818343
##########
File path:
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationGlobalSourceSource.java
##########
@@ -197,6 +205,20 @@ public void incrFailedRecoveryQueue() {
failedRecoveryQueue.incr(1L);
}
+ @Override
+ public void setPeerZkConnectionFailures(boolean success) {
Review comment:
Yes, add a separate method for set. Set and Increment is fine. Set(0) to
reset.
##########
File path:
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSource.java
##########
@@ -78,4 +77,6 @@
void incrFailedRecoveryQueue();
void setOldestWalAge(long age);
long getOldestWalAge();
+ void setPeerZkConnectionFailures(boolean success);
+ long getPeerZkConnectionFailures();
Review comment:
Getters are not needed by convention here, remove
##########
File path:
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSource.java
##########
@@ -78,4 +77,6 @@
void incrFailedRecoveryQueue();
void setOldestWalAge(long age);
long getOldestWalAge();
+ void setPeerZkConnectionFailures(boolean success);
Review comment:
This is weird. It should be an increment function. See above
incrFailedRecoveryQueue as example.
##########
File path:
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSource.java
##########
@@ -50,8 +50,7 @@
public static final String SOURCE_COMPLETED_LOGS = "source.completedLogs";
public static final String SOURCE_COMPLETED_RECOVERY_QUEUES =
"source.completedRecoverQueues";
public static final String SOURCE_FAILED_RECOVERY_QUEUES =
"source.failedRecoverQueues";
- /* Used to track the age of oldest wal in ms since its creation time */
- String OLDEST_WAL_AGE = "source.oldestWalAge";
+ public static final String SOURCE_PEER_ZK_CONNECTION_FAILURE =
"source.peerZkConnectionFailure";
Review comment:
Can we just call this peerConnectionFailure? ZK may or may not be the
reason, if not now, then in the future. What we want to count is connection
failures, let the naming reflect that. (We need to care about metric names
because it becomes part of operational compat.)
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
##########
@@ -293,9 +293,12 @@ public void run() {
while (this.isSourceActive() && this.peerClusterId == null) {
this.peerClusterId = replicationEndpoint.getPeerUUID();
if (this.isSourceActive() && this.peerClusterId == null) {
+ metrics.setPeerZkConnectionFailures(false);
Review comment:
Agree with @wchevreuil , metrics and logs are consumed differently by
operators and a reasonable request to add a log is not solved by emitting a
metric
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]