[ https://issues.apache.org/jira/browse/HDFS-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833954#comment-16833954 ]
Erik Krogen commented on HDFS-12979:
------------------------------------
Hey [~vagarychen], though I agree this fixes things on the Observer side, I
think we need to update the logic within {{StandbyCheckpointer}} as well. For
starters, we have a field called {{activeNNAddresses}}, but it will really
contain the active, observer, and other standby nodes (it seems it should have
been renamed when HDFS-6440 was completed). More importantly, today once a
standby NN succeeds in uploading to a single NN, it will stop:
{code:java,name=StandbyCheckpointer}
for (; i < uploads.size(); i++) {
  Future<TransferFsImage.TransferResult> upload = uploads.get(i);
  try {
    // TODO should there be some smarts here about retries nodes that are
    // not the active NN?
    if (upload.get() == TransferFsImage.TransferResult.SUCCESS) {
      success = true;
      // avoid getting the rest of the results - we don't care since we had
      // a successful upload
      break;
    }
  } catch (ExecutionException e) {
    ioe = new IOException("Exception during image upload", e);
    break;
  } catch (InterruptedException e) {
    ie = e;
    break;
  }
}
{code}
We need to modify this to continue to monitor the success of all uploads, since
a single Standby NN may need to upload to multiple locations.
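To make the suggestion concrete, here is a rough, standalone sketch of a loop that waits on every upload rather than breaking at the first success. It is not the actual {{StandbyCheckpointer}} code: the class {{UploadAllSketch}}, the method {{awaitAllUploads}}, and the local {{TransferResult}} enum (including its {{FAILURE}} value) are hypothetical stand-ins, and the policy of throwing only when no upload succeeded is an assumption that would need discussion.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical standalone sketch; not part of Hadoop.
class UploadAllSketch {

  // Stand-in for TransferFsImage.TransferResult. Only SUCCESS appears in the
  // quoted snippet; FAILURE is an assumed placeholder for any other outcome.
  enum TransferResult { SUCCESS, FAILURE }

  /**
   * Checks the result of every upload instead of stopping at the first
   * success. Returns the number of uploads that reported SUCCESS.
   * Assumed policy: rethrow the last failure only if nothing succeeded.
   */
  static int awaitAllUploads(List<Future<TransferResult>> uploads)
      throws IOException, InterruptedException {
    int successes = 0;
    IOException lastFailure = null;
    for (Future<TransferResult> upload : uploads) {
      try {
        if (upload.get() == TransferResult.SUCCESS) {
          successes++;
        }
      } catch (ExecutionException e) {
        // Remember the failure, but keep checking the remaining targets.
        lastFailure = new IOException("Exception during image upload", e);
      }
    }
    if (successes == 0 && lastFailure != null) {
      throw lastFailure;
    }
    return successes;
  }

  public static void main(String[] args) throws Exception {
    // Simulate three targets: two succeed, one fails.
    List<Future<TransferResult>> uploads = new ArrayList<>();
    uploads.add(CompletableFuture.completedFuture(TransferResult.SUCCESS));
    CompletableFuture<TransferResult> failed = new CompletableFuture<>();
    failed.completeExceptionally(new IOException("simulated failure"));
    uploads.add(failed);
    uploads.add(CompletableFuture.completedFuture(TransferResult.SUCCESS));
    System.out.println("successful uploads: " + awaitAllUploads(uploads));
  }
}
```

The point is only that a failure or success on one target no longer hides the outcome for the others; how per-target failures should be reported or retried is left open here.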
> StandbyNode should upload FsImage to ObserverNode after checkpointing.
> ----------------------------------------------------------------------
>
> Key: HDFS-12979
> URL: https://issues.apache.org/jira/browse/HDFS-12979
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs
> Reporter: Konstantin Shvachko
> Assignee: Chen Liang
> Priority: Major
> Attachments: HDFS-12979.001.patch
>
>
> ObserverNode does not create checkpoints. So its fsimage file can get very
> old, making bootstrap of ObserverNode too long. A StandbyNode should copy
> latest fsimage to ObserverNode(s) along with ANN.