[ https://issues.apache.org/jira/browse/HBASE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alex Newman updated HBASE-9286:
-------------------------------
Description:
In replicationmanager
HRegionInterface rrs = getRS();
rrs.replicateLogEntries(Arrays.copyOf(this.entriesArray,
currentNbEntries));
....
this.metrics.setAgeOfLastShippedOp(
this.entriesArray[currentNbEntries-1].getKey().getWriteTime());
break;
which makes sense, but is wrong. The problem is that rrs.replicateLogEntries
will block for a very long time if the slave server is, for instance, suspended.
While it is blocked, setAgeOfLastShippedOp is never reached, so the metric
stops updating. However, this is easy to fix. We just need to call
refreshAgeOfLastShippedOp();
on a regular basis, in a different thread. I've attached a patch which fixes
this for cdh4. I can make one for trunk and the like as well if you need me
to, but it's a small change.
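To illustrate the idea (this is not the attached patch), here is a minimal sketch of refreshing the metric from a separate thread. Everything in it is an assumption made for illustration: the AgeRefreshSketch and Metrics classes, the injectable clock, the startRefresher helper and the refresh period are hypothetical stand-ins, not the HBase classes, and it assumes Java 8 or later.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Sketch only: these classes are hypothetical stand-ins, not the HBase source.
class AgeRefreshSketch {

    /** Hypothetical stand-in for the replication metrics object. */
    static final class Metrics {
        private final LongSupplier clock; // injectable clock (assumption, eases testing)
        private final AtomicLong lastWriteTime = new AtomicLong(0L);
        private final AtomicLong ageOfLastShippedOp = new AtomicLong(0L);

        Metrics(LongSupplier clock) {
            this.clock = clock;
        }

        /** Called by the shipping thread after a batch has been replicated. */
        void setAgeOfLastShippedOp(long writeTime) {
            lastWriteTime.set(writeTime);
            ageOfLastShippedOp.set(clock.getAsLong() - writeTime);
        }

        /** Recomputes the age from the last shipped write time, if there is one. */
        void refreshAgeOfLastShippedOp() {
            long writeTime = lastWriteTime.get();
            if (writeTime != 0L) {
                ageOfLastShippedOp.set(clock.getAsLong() - writeTime);
            }
        }

        long getAgeOfLastShippedOp() {
            return ageOfLastShippedOp.get();
        }
    }

    /** Starts a daemon thread that refreshes the metric every periodMs milliseconds. */
    static ScheduledExecutorService startRefresher(Metrics metrics, long periodMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "ageOfLastShippedOp-refresher");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleAtFixedRate(
            metrics::refreshAgeOfLastShippedOp, periodMs, periodMs, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        Metrics metrics = new Metrics(System::currentTimeMillis);
        ScheduledExecutorService refresher = startRefresher(metrics, 100L);
        metrics.setAgeOfLastShippedOp(System.currentTimeMillis());
        Thread.sleep(500L); // stands in for a blocked replicateLogEntries call
        System.out.println("age (ms): " + metrics.getAgeOfLastShippedOp());
        refresher.shutdownNow();
    }
}
```

With this shape the shipping thread can stay blocked in the RPC while the reported age keeps growing, which is what an operator watching the metric would expect to see when the slave is stalled.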
was:
In replicationmanager
HRegionInterface rrs = getRS();
rrs.replicateLogEntries(Arrays.copyOf(this.entriesArray,
currentNbEntries));
....
this.metrics.setAgeOfLastShippedOp(
this.entriesArray[currentNbEntries-1].getKey().getWriteTime());
break;
which makes but is wrong. The problem is that rrs.replicateLogEntries will
block for a very long time if the slave server is for instance suspended.
However this is easy to fix. We just need to call
refreshAgeOfLastShippedOp();
on a regular basis, in a different thread. I've attached a patch which fixed
this for cdh4. I can make one for trunk and the like as well if you need me to
do but it's a small change.
> ageOfLastShippedOp replication metric doesn't update if the slave
> regionserver is stalled
> -----------------------------------------------------------------------------------------
>
> Key: HBASE-9286
> URL: https://issues.apache.org/jira/browse/HBASE-9286
> Project: HBase
> Issue Type: Bug
> Reporter: Alex Newman
> Assignee: Alex Newman
> Attachments:
> 0001-HBASE-9286.-ageOfLastShippedOp-replication-metric-do.patch