[ 
https://issues.apache.org/jira/browse/HDFS-9787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142240#comment-15142240
 ] 

Vinayakumar B commented on HDFS-9787:
-------------------------------------

{quote}The original solution was an attempt to avoid flooding the NN with 
checkpoint requests. Instead, maybe the better solution would be to do a small 
RPC to see when the latest image was uploaded. If the latest upload is older 
than quietMultiplier times the checkpoint period, then we attempt to upload the 
checkpoint.
It's a bit more work, but I think this lays out the intentions in the code more 
clearly and obtains the same effect without the overhead of actually sending 
the checkpoint along each time we want to find out if it's behind.{quote}
Yes, that's required to optimize the current approach, but I feel it could be 
done in a follow-up Jira.

First, let's fix the current bug. Agree?

I see that the patch fixes the issue mentioned in this Jira.
+1 for the patch.
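
For reference, the follow-up idea in the quote above could be sketched roughly 
as below. This is only an illustration: the class name, the method name and the 
lastImageUploadSecs parameter (standing in for the result of the proposed 
"small RPC") are hypothetical and do not exist in StandbyCheckpointer today.

```java
class UploadDecision {
    /**
     * Hypothetical helper: decide whether a standby should upload its image.
     * lastImageUploadSecs stands in for the value the proposed small RPC
     * would return (time of the latest image upload seen by the NN).
     */
    static boolean shouldUpload(boolean isPrimaryCheckPointer,
                                long nowSecs,
                                long lastImageUploadSecs,
                                long checkpointPeriodSecs,
                                double quietMultiplier) {
        long quietPeriodSecs = (long) (checkpointPeriodSecs * quietMultiplier);
        long secsSinceLastUpload = nowSecs - lastImageUploadSecs;
        // Upload if we are the primary checkpointer, or if nobody has
        // uploaded an image for longer than the quiet period.
        return isPrimaryCheckPointer || secsSinceLastUpload >= quietPeriodSecs;
    }
}
```

The difference from the current check is that the elapsed time is measured 
from the last upload seen by the NN, not from the last local checkpoint.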

> SNNs stop uploading FSImage to ANN once isPrimaryCheckPointer changed to 
> false.
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-9787
>                 URL: https://issues.apache.org/jira/browse/HDFS-9787
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 3.0.0
>            Reporter: Guocui Mi
>            Assignee: Guocui Mi
>         Attachments: HDFS-9786-v000.patch
>
>
> SNNs stop uploading the FSImage to the ANN once isPrimaryCheckPointer becomes 
> false. Here is the logic in StandbyCheckpointer.java that decides whether to 
> upload the FSImage:
> boolean sendRequest = isPrimaryCheckPointer || secsSinceLast >= 
> checkpointConf.getQuietPeriod();
>             doCheckpoint(sendRequest);
> sendRequest is always false when isPrimaryCheckPointer is false, because 
> secsSinceLast (~checkpointPeriod) >= checkpointConf.getQuietPeriod() 
> (checkpointPeriod * this.quietMultiplier, default value 1.5) always evaluates 
> to false.
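
The arithmetic in the description can be checked with a small standalone 
sketch. It only mirrors the quoted condition; the class name, the helper method 
and the 3600-second checkpoint period are assumptions chosen for illustration, 
not code from the patch.

```java
class QuietPeriodCheck {
    // Mirrors the condition quoted from StandbyCheckpointer.java.
    static boolean shouldSendRequest(boolean isPrimaryCheckPointer,
                                     long secsSinceLast,
                                     long quietPeriodSecs) {
        return isPrimaryCheckPointer || secsSinceLast >= quietPeriodSecs;
    }

    public static void main(String[] args) {
        long checkpointPeriod = 3600;  // assumed example value, in seconds
        double quietMultiplier = 1.5;  // default value named in the description
        long quietPeriod = (long) (checkpointPeriod * quietMultiplier); // 5400

        // Per the description, secsSinceLast is roughly checkpointPeriod
        // whenever the check runs.
        long secsSinceLast = checkpointPeriod;

        // Primary checkpointer: the first operand short-circuits to true.
        System.out.println(shouldSendRequest(true, secsSinceLast, quietPeriod));  // true
        // Non-primary: 3600 >= 5400 is false, so no upload is ever requested.
        System.out.println(shouldSendRequest(false, secsSinceLast, quietPeriod)); // false
    }
}
```

With any multiplier greater than 1, a non-primary checkpointer whose 
secsSinceLast stays near checkpointPeriod never reaches the quiet period.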



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
