[ 
https://issues.apache.org/jira/browse/HBASE-11143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-11143.
-----------------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed

Committed to 0.94, 0.98, and trunk. Thanks J-D, Andy, and Stack.

> Improve replication metrics
> ---------------------------
>
>                 Key: HBASE-11143
>                 URL: https://issues.apache.org/jira/browse/HBASE-11143
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.99.0, 0.94.20, 0.98.3
>
>         Attachments: 11143-0.94-v2.txt, 11143-0.94-v3.txt, 11143-0.94.txt, 
> 11143-trunk.txt
>
>
> We are trying to report on replication lag and find that there is no good 
> single metric to do that.
> ageOfLastShippedOp is close, but unfortunately it is increased even when 
> there is nothing to ship on a particular RegionServer.
> I would like to discuss a few options here:
> Add a new metric: replicationQueueTime (or something) with the above meaning. 
> I.e. if we have something to ship we set the age of that last shipped edit, 
> if we fail we increment that last time (just like we do now). But if there is 
> nothing to replicate we set it to current time (and hence that metric is 
> reported as close to 0).
> Alternatively we could change the meaning of ageOfLastShippedOp to do 
> that. That might lead to surprises, but the current behavior is clearly weird 
> when there is nothing to replicate.
> Comments? [~jdcryans], [~stack].
> If the approach sounds good, I'll make a patch for all branches.
> Edit: Also adds a new shippedKBs metric to track the amount of data that is 
> shipped via replication.
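
The metric semantics proposed in the description above can be sketched roughly as follows. This is only an illustration of the three cases (shipped, failed, nothing to ship), not the committed patch: the class name ReplicationLagSketch and all of its method names are hypothetical, and the use of millisecond timestamps and of integer division by 1024 for shippedKBs are assumptions.

```java
// Hypothetical sketch of the proposed semantics; not HBase's actual metrics class.
class ReplicationLagSketch {
    // Write time (ms) of the last edit shipped, or the time the queue was last seen empty.
    private long lastTimestampMs;
    // Reported lag in milliseconds.
    private long ageMs;
    // Running total of data shipped, in KB (assumed unit handling).
    private long shippedKBs;

    ReplicationLagSketch(long startTimeMs) {
        this.lastTimestampMs = startTimeMs;
    }

    // Something was shipped: the age is that of the last shipped edit.
    void onShipped(long editWriteTimeMs, long batchSizeBytes, long nowMs) {
        lastTimestampMs = editWriteTimeMs;
        shippedKBs += batchSizeBytes / 1024;
        ageMs = nowMs - lastTimestampMs;
    }

    // Shipping failed: keep the old timestamp so the reported age keeps growing.
    void onShipFailed(long nowMs) {
        ageMs = nowMs - lastTimestampMs;
    }

    // Nothing to replicate: move the timestamp to "now" so the age reads as 0.
    void onNothingToShip(long nowMs) {
        lastTimestampMs = nowMs;
        ageMs = 0;
    }

    long getAgeMs() {
        return ageMs;
    }

    long getShippedKBs() {
        return shippedKBs;
    }
}
```

With this shape, an idle RegionServer reports an age of 0 instead of an ever-increasing value, while a RegionServer that repeatedly fails to ship still shows a growing lag.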



--
This message was sent by Atlassian JIRA
(v6.2#6252)
