[ https://issues.apache.org/jira/browse/HDFS-9787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142240#comment-15142240 ]
Vinayakumar B commented on HDFS-9787:
-------------------------------------
{quote}The original solution was an attempt to avoid flooding the NN with
checkpoint requests. Instead, maybe the better solution would be to do a small
RPC to see when the latest image was uploaded. If it was uploaded more than
quietMultiplier times the checkpoint period ago, then we attempt to upload the
checkpoint.
It's a bit more work, but I think this more clearly lays out the intentions in
the code, and it obtains the same effect without the overhead of actually
sending the checkpoint along each time we want to find out if it's
behind.{quote}
Yes, that's required to optimize the current approach, but I feel it could be
done in a follow-up Jira.
First, let's fix the current bug. Agree?
I see that the patch fixes the issue mentioned in this Jira.
+1 for the patch.
> SNNs stop uploading FSImage to ANN once isPrimaryCheckPointer changed to
> false.
> -------------------------------------------------------------------------------
>
> Key: HDFS-9787
> URL: https://issues.apache.org/jira/browse/HDFS-9787
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: ha
> Affects Versions: 3.0.0
> Reporter: Guocui Mi
> Assignee: Guocui Mi
> Attachments: HDFS-9786-v000.patch
>
>
> SNNs stop uploading FSImage to ANN once isPrimaryCheckPointer becomes false.
> Here is the logic that decides whether to upload the FSImage,
> in StandbyCheckpointer.java:
> boolean sendRequest = isPrimaryCheckPointer || secsSinceLast >=
> checkpointConf.getQuietPeriod();
> doCheckpoint(sendRequest);
> sendRequest is always false when isPrimaryCheckPointer is false, because
> secsSinceLast (~checkpointPeriod) >= checkpointConf.getQuietPeriod()
> (checkpointPeriod * this.quietMultiplier, default value 1.5) always evaluates
> to false.
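
To make the arithmetic in the description concrete, here is a minimal Java
sketch of the check. It is a simplified model, not the actual
StandbyCheckpointer code: the class name SendRequestSketch, the helper
sendRequest(), and the 3600-second checkpoint period are assumptions chosen
only for illustration. The 1.5 multiplier is the default quoted above.

```java
// Simplified model of the check quoted in the description.
// SendRequestSketch and sendRequest() are hypothetical names, not HDFS APIs.
public class SendRequestSketch {

    // Mirrors: isPrimaryCheckPointer || secsSinceLast >= checkpointConf.getQuietPeriod()
    static boolean sendRequest(boolean isPrimaryCheckPointer,
                               long secsSinceLast,
                               long quietPeriodSecs) {
        return isPrimaryCheckPointer || secsSinceLast >= quietPeriodSecs;
    }

    public static void main(String[] args) {
        long checkpointPeriodSecs = 3600;  // assumed value, for illustration only
        double quietMultiplier = 1.5;      // default value per the description
        long quietPeriodSecs = (long) (checkpointPeriodSecs * quietMultiplier);

        // A checkpoint runs about once per period, so by the time the check is
        // evaluated secsSinceLast is roughly checkpointPeriodSecs.
        long secsSinceLast = checkpointPeriodSecs;

        // Primary checkpointer: the upload request is always sent.
        System.out.println(sendRequest(true, secsSinceLast, quietPeriodSecs));

        // Non-primary checkpointer: 3600 >= 5400 is false, so no upload is sent.
        System.out.println(sendRequest(false, secsSinceLast, quietPeriodSecs));
    }
}
```

Under these assumptions the non-primary case can only succeed if secsSinceLast
grows to at least 1.5 times the checkpoint period, which does not happen when a
checkpoint is attempted every period.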