[
https://issues.apache.org/jira/browse/HBASE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746788#comment-13746788
]
stack commented on HBASE-9286:
------------------------------
[~posix4e] 0.95/0.96 is the focus, yes, but if this is a fix for a 0.94 metric,
fellas will be interested.
Patch looks good to me. Any of the replication heads want to ok it?
> ageOfLastShippedOp replication metric doesn't update if the slave
> regionserver is stalled
> -----------------------------------------------------------------------------------------
>
> Key: HBASE-9286
> URL: https://issues.apache.org/jira/browse/HBASE-9286
> Project: HBase
> Issue Type: Bug
> Reporter: Alex Newman
> Assignee: Alex Newman
> Attachments:
> 0001-HBASE-9286.-ageOfLastShippedOp-replication-metric-do.patch
>
>
> In replicationmanager:
>   HRegionInterface rrs = getRS();
>   rrs.replicateLogEntries(Arrays.copyOf(this.entriesArray, currentNbEntries));
>   ....
>   this.metrics.setAgeOfLastShippedOp(
>       this.entriesArray[currentNbEntries-1].getKey().getWriteTime());
>   break;
> This makes sense, but it is wrong. The problem is that rrs.replicateLogEntries
> will block for a very long time if the slave server is suspended or
> unavailable but not down. While that call blocks, setAgeOfLastShippedOp is
> never reached, so the metric stops updating.
> However, this is easy to fix. We just need to call
> refreshAgeOfLastShippedOp();
> on a regular basis, in a different thread. I've attached a patch which fixes
> this for cdh4. I can make one for trunk and the like as well if you need me
> to, but it's a small change.
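
For readers following along, here is a minimal sketch of the idea in the description: recompute the age from the last recorded write time on a timer, independently of the thread that ships edits. It is not the attached patch and it does not use the HBase classes. AgeMetricSketch, startRefresher, the injected clock and the refresh period are all hypothetical names chosen for illustration; only setAgeOfLastShippedOp and refreshAgeOfLastShippedOp mirror method names mentioned above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the replication metrics object (not the HBase class). */
class AgeMetricSketch {
    // Clock is injected so the behaviour can be checked without sleeping.
    private final LongSupplier clock;
    // Write time (ms) of the last edit that was shipped successfully.
    private final AtomicLong lastShippedWriteTime;
    // Most recently published age (ms); this is what a metrics reader would see.
    private final AtomicLong ageOfLastShippedOp = new AtomicLong(0);

    AgeMetricSketch(LongSupplier clock) {
        this.clock = clock;
        this.lastShippedWriteTime = new AtomicLong(clock.getAsLong());
    }

    /** Called by the shipping thread after a batch has been replicated. */
    void setAgeOfLastShippedOp(long writeTime) {
        lastShippedWriteTime.set(writeTime);
        refreshAgeOfLastShippedOp();
    }

    /** Recomputes the age from the last recorded write time; safe to call from any thread. */
    void refreshAgeOfLastShippedOp() {
        ageOfLastShippedOp.set(clock.getAsLong() - lastShippedWriteTime.get());
    }

    long getAgeOfLastShippedOp() {
        return ageOfLastShippedOp.get();
    }

    /** Starts a daemon thread that refreshes the age every periodMs milliseconds. */
    ScheduledExecutorService startRefresher(long periodMs) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "age-of-last-shipped-op-refresher");
            t.setDaemon(true); // do not keep the JVM alive just for the metric
            return t;
        });
        ses.scheduleAtFixedRate(this::refreshAgeOfLastShippedOp,
                periodMs, periodMs, TimeUnit.MILLISECONDS);
        return ses;
    }
}
```

With something like new AgeMetricSketch(System::currentTimeMillis).startRefresher(1000), the published age keeps growing while the shipping thread is stuck in a blocking call, which is the behaviour the description asks for. The returned executor should be shut down when the source stops.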