[
https://issues.apache.org/jira/browse/HBASE-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888980#action_12888980
]
ryan rawson commented on HBASE-2838:
------------------------------------
You can publish the other stats via Hadoop metrics as well. Don't
publish the raw long for how old the oldest one is; publish the delay
time instead, i.e. the time difference. On a graph this will normally
hover near 0, but during times of trouble it may climb, giving a clear
indication that something is wrong.
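A minimal sketch of the delay-time idea, assuming the delay is taken as
the difference between when an edit was written and when it was shipped.
ReplicationDelayGauge and its method names are hypothetical and not part
of HBase or Hadoop metrics; the value returned by delayMs() is what would
be handed to whatever metrics system is in use.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration; not an HBase or Hadoop metrics class.
final class ReplicationDelayGauge {
    // Delay observed for the most recently shipped edit, in milliseconds.
    // Stays at 0 until the first edit has been shipped.
    private final AtomicLong lastDelayMs = new AtomicLong(0L);

    // Record that an edit written at editWriteTimeMs was shipped at shippedAtMs.
    // Negative differences (e.g. from clock skew) are clamped to 0.
    void editShipped(long editWriteTimeMs, long shippedAtMs) {
        lastDelayMs.set(Math.max(0L, shippedAtMs - editWriteTimeMs));
    }

    // The value to publish: a time difference, not an absolute timestamp.
    long delayMs() {
        return lastDelayMs.get();
    }
}
```

Publishing the difference rather than the timestamp keeps the graph near
0 while replication keeps up, so a climb stands out.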
Another metric you can track is queue linger time: how long items
remain in the various queues before being processed. You'd probably have
to track and average this.
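One way the linger-time tracking could look, as a rough sketch: stamp
each item when it is enqueued and accumulate the wait when it is taken
off. LingerTrackingQueue and all of its methods are hypothetical names,
and a plain running average is an assumption; a windowed average may
suit a graph better.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical wrapper for illustration; not an HBase class.
final class LingerTrackingQueue<T> {
    // An item together with the time it entered the queue.
    private static final class Entry<E> {
        final E item;
        final long enqueuedAtMs;

        Entry(E item, long enqueuedAtMs) {
            this.item = item;
            this.enqueuedAtMs = enqueuedAtMs;
        }
    }

    private final Queue<Entry<T>> queue = new ArrayDeque<>();
    private long totalLingerMs = 0L; // sum of linger times of all taken items
    private long takenCount = 0L;    // number of items taken so far

    // Enqueue an item, remembering when it arrived.
    synchronized void put(T item, long nowMs) {
        queue.add(new Entry<>(item, nowMs));
    }

    // Dequeue the oldest item and record how long it lingered.
    // Returns null if the queue is empty.
    synchronized T take(long nowMs) {
        Entry<T> e = queue.poll();
        if (e == null) {
            return null;
        }
        totalLingerMs += Math.max(0L, nowMs - e.enqueuedAtMs);
        takenCount++;
        return e.item;
    }

    // Average linger time in ms over all items taken so far (0 if none).
    synchronized double averageLingerMs() {
        return takenCount == 0 ? 0.0 : (double) totalLingerMs / takenCount;
    }
}
```

Passing the clock value in as a parameter is only for easy testing; a
real caller would presumably use System.currentTimeMillis().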
On Thu, Jul 15, 2010 at 5:42 PM, Jean-Daniel Cryans (JIRA)
> Replication metrics
> -------------------
>
> Key: HBASE-2838
> URL: https://issues.apache.org/jira/browse/HBASE-2838
> Project: HBase
> Issue Type: Sub-task
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Fix For: 0.90.0
>
>
> Replication needs to publish metrics about its performance:
> - WALEdits read, filtered, sent to slave clusters, applied on slaves
> - size of batches sent/received
> - ms spent on reading, sending, applying edits
> This can be done using HadoopMetrics.
> Also we need to publish information not related to performance:
> - size of each HLog queue
> - age of the last replicated edit in each queue
> - time of last successful replication
> This information can hardly be graphed, but we still need to represent it
> somehow. It has to be accessible by web UI, shell, and other tools in
> general. I don't feel strongly about creating a new public method on HRS's
> interface, and I'm not sure publishing those in ZooKeeper is a good idea
> either (why add another indirection?). Still wondering about a better
> solution.