[ 
https://issues.apache.org/jira/browse/HDFS-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793261#comment-16793261
 ] 

star commented on HDFS-14361:
-----------------------------

Let's go back to the original design doc [^Multiple-Standby-NameNodes_V1.pdf]. 
An SNN gets an HttpServletResponse.SC_CONFLICT status when the ANN has already 
downloaded the image file from another SNN.

So it is not a serious issue if all SNNs send checkpoint requests to the ANN.
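
To illustrate why redundant uploads are harmless, here is a minimal sketch of a conflict check on the ANN side. The class and method names (ImageUploadGate, tryAccept) are hypothetical and are not taken from Hadoop. The sketch assumes the ANN rejects an upload for a transaction ID it has already received, and it uses HttpURLConnection.HTTP_CONFLICT (409) as a stand-in for HttpServletResponse.SC_CONFLICT so that it runs without the servlet API.

```java
import java.net.HttpURLConnection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the ANN-side check; not the real Hadoop servlet. */
final class ImageUploadGate {
    // Transaction IDs for which an image has already been accepted (assumption).
    private final Set<Long> receivedTxIds = new HashSet<>();

    /** Returns the HTTP status the ANN would answer with for an upload at txId. */
    synchronized int tryAccept(long txId) {
        // Set.add returns false when the txId was already present.
        if (!receivedTxIds.add(txId)) {
            return HttpURLConnection.HTTP_CONFLICT; // same value as SC_CONFLICT
        }
        return HttpURLConnection.HTTP_OK;
    }

    public static void main(String[] args) {
        ImageUploadGate ann = new ImageUploadGate();
        // Two SNNs try to upload a checkpoint for the same transaction ID.
        System.out.println("SNN1 -> " + ann.tryAccept(1000L));
        System.out.println("SNN2 -> " + ann.tryAccept(1000L));
    }
}
```

In this sketch the first SNN gets 200 and the second gets 409, so a duplicate request costs one rejected call rather than a second image transfer.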

Furthermore, according to the comment above line 429, an SNN sends the download 
image request in 3 cases:
 * there is a rollback request
 * it is the checkpointer
 * it is outside the quiet period

But with the patch, an SNN sends the download request only in the latter two 
cases. I think this is what causes HDFS-12248.

 
{code:java}
if (needCheckpoint) {
  // on all nodes, we build the checkpoint. However, we only ship the
  // checkpoint if have a rollback request, are the checkpointer, are
  // outside the quiet period.
  final long secsSinceLastUpload = (now - lastUploadTime) / 1000;
  boolean sendRequest = isPrimaryCheckPointer
      || secsSinceLastUpload >= checkpointConf.getQuietPeriod();
  doCheckpoint(sendRequest);
  ...
}{code}
I agree to move the isPrimaryCheckPointer assignment outside of the 'if' block 
to avoid an inconsistent state in which more than one SNN has 
isPrimaryCheckPointer = true, though that state will not break anything.
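
As a rough sketch of the difference, the following compares updating the flag inside versus outside the 'if' block. The class PrimaryFlagTracker and its methods are hypothetical, and the two exception variables (ie, ioe) from the issue description are collapsed into a single error argument for brevity.

```java
import java.io.IOException;

/** Hypothetical model of the flag update; not the real checkpointer code. */
final class PrimaryFlagTracker {
    private boolean isPrimaryCheckPointer;
    private long lastUploadTime;

    PrimaryFlagTracker(boolean initiallyPrimary) {
        this.isPrimaryCheckPointer = initiallyPrimary;
    }

    /** Behavior in the issue description: the flag changes only when no exception occurred. */
    void recordInsideIf(boolean success, Exception error, long now) {
        if (error == null) {
            lastUploadTime = now;
            isPrimaryCheckPointer = success;
        }
    }

    /** Suggested behavior: the flag always follows the outcome of the upload. */
    void recordOutsideIf(boolean success, Exception error, long now) {
        if (error == null) {
            lastUploadTime = now;
        }
        isPrimaryCheckPointer = success;
    }

    boolean isPrimary() {
        return isPrimaryCheckPointer;
    }

    public static void main(String[] args) {
        Exception failure = new IOException("upload failed");

        PrimaryFlagTracker inside = new PrimaryFlagTracker(true);
        inside.recordInsideIf(false, failure, 1L);
        System.out.println("inside if:  " + inside.isPrimary());

        PrimaryFlagTracker outside = new PrimaryFlagTracker(true);
        outside.recordOutsideIf(false, failure, 1L);
        System.out.println("outside if: " + outside.isPrimary());
    }
}
```

With the assignment inside the 'if', a former primary whose upload fails with an exception keeps its flag, so two SNNs can both consider themselves primary. With the assignment outside, the flag is cleared on failure.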

As for HDFS-12248, I think we may change sendRequest as follows:
{code:java}
boolean sendRequest = needRollbackCheckpoint || isPrimaryCheckPointer
    || secsSinceLastUpload >= checkpointConf.getQuietPeriod();
{code}
Thus all SNNs will send a request every time a rollback checkpoint is 
triggered. Otherwise, we should fix the comments.
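
To make the difference concrete, here is a small self-contained sketch of the two sendRequest expressions. The class SendRequestPolicy and its method names are hypothetical; the boolean expressions are the ones quoted above, with the quiet period passed in as a plain number of seconds instead of being read from checkpointConf.

```java
/** Hypothetical helper comparing the two sendRequest expressions. */
final class SendRequestPolicy {
    /** Expression from the patch: a rollback request is not considered. */
    static boolean patched(boolean isPrimaryCheckPointer,
                           long secsSinceLastUpload, long quietPeriodSecs) {
        return isPrimaryCheckPointer || secsSinceLastUpload >= quietPeriodSecs;
    }

    /** Suggested expression: a rollback request always triggers an upload. */
    static boolean proposed(boolean needRollbackCheckpoint, boolean isPrimaryCheckPointer,
                            long secsSinceLastUpload, long quietPeriodSecs) {
        return needRollbackCheckpoint || isPrimaryCheckPointer
            || secsSinceLastUpload >= quietPeriodSecs;
    }

    public static void main(String[] args) {
        // A non-primary SNN, still inside the quiet period, with a rollback request.
        long secsSinceLastUpload = 10;
        long quietPeriodSecs = 60; // illustrative value, not a Hadoop default
        System.out.println("patched:  " + patched(false, secsSinceLastUpload, quietPeriodSecs));
        System.out.println("proposed: "
            + proposed(true, false, secsSinceLastUpload, quietPeriodSecs));
    }
}
```

For that input the patched expression yields false and the suggested one yields true, which is the rollback case the code comment describes.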

 

 

> SNN will always upload fsimage
> ------------------------------
>
>                 Key: HDFS-14361
>                 URL: https://issues.apache.org/jira/browse/HDFS-14361
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, namenode
>    Affects Versions: 3.2.0
>            Reporter: hunshenshi
>            Priority: Major
>             Fix For: 3.2.0
>
>
> Related to -HDFS-12248.-
> {code:java}
> boolean sendRequest = isPrimaryCheckPointer
>     || secsSinceLastUpload >= checkpointConf.getQuietPeriod();
> doCheckpoint(sendRequest);
> {code}
> If sendRequest is true, the SNN will upload the fsimage. But 
> isPrimaryCheckPointer is always true,
> {code:java}
> if (ie == null && ioe == null) {
>   //Update only when response from remote about success or
>   lastUploadTime = monotonicNow();
>   // we are primary if we successfully updated the ANN
>   this.isPrimaryCheckPointer = success;
> }
> {code}
> The assignment to isPrimaryCheckPointer should be outside the if condition.
> If the ANN update was not successful, then isPrimaryCheckPointer should be 
> set to false.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
