[ 
https://issues.apache.org/jira/browse/HDFS-15640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218221#comment-17218221
 ] 

Yiqun Lin commented on HDFS-15640:
----------------------------------

{quote}
Only a little problem is it might not be easy to know how much time will the 
diffs cost.
{quote}
Actually, the current logic already gets the latest snapshot diff, so we can 
just reuse that result. It won't add any extra cost compared with the current 
logic.
{code}
  /**
   * Verify whether the src has changed since CURRENT_SNAPSHOT_NAME snapshot.
   *
   * @return true if the src has changed.
   */
  private boolean verifyDiff() throws IOException {
    SnapshotDiffReport diffReport =
        srcFs.getSnapshotDiffReport(src, CURRENT_SNAPSHOT_NAME, "");
    return diffReport.getDiffList().size() > 0;
  }
{code}

Depending only on the execution time of the last 3 consecutive distcp runs is 
not 100% accurate. In an extreme case, for example, the final distcp should run 
very fast but actually finishes slowly because of something unexpected, such as 
an abnormal node. So I still prefer to use the number of diffs.
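
A rough sketch of what a diff-count check could look like is below. It is only 
an illustration, not the patch: the class name, the method name 
canEnterFinalStage and the diffThreshold parameter are all hypothetical, and a 
plain List stands in for the result of diffReport.getDiffList() so the example 
runs without Hadoop on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical sketch: decide whether to enter the final distcp stage based
 * on the number of snapshot diff entries instead of requiring zero diffs.
 */
public class DiffThresholdSketch {

  /**
   * @param diffEntries stand-in for diffReport.getDiffList() (assumption:
   *                    the list from the existing snapshot diff is reused).
   * @param diffThreshold hypothetical setting; 0 keeps the current strict
   *                      "no diff" behavior.
   * @return true if the number of diffs is small enough to go to the final
   *         distcp stage.
   */
  static boolean canEnterFinalStage(List<?> diffEntries, int diffThreshold) {
    if (diffThreshold < 0) {
      throw new IllegalArgumentException("diffThreshold must be >= 0");
    }
    return diffEntries.size() <= diffThreshold;
  }

  public static void main(String[] args) {
    List<String> noDiff = Collections.emptyList();
    List<String> twoDiffs = Arrays.asList("M ./dir", "+ ./dir/file");

    // Threshold 0: only an empty diff list allows the final stage.
    System.out.println(canEnterFinalStage(noDiff, 0));
    System.out.println(canEnterFinalStage(twoDiffs, 0));
    // Threshold 2: a small remaining diff is accepted.
    System.out.println(canEnterFinalStage(twoDiffs, 2));
  }
}
```

With a threshold of 0 this behaves the same as the existing verifyDiff() check, 
so the strict behavior could stay as the default.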

> RBF: Add fast distcp threshold to FedBalance.
> ---------------------------------------------
>
>                 Key: HDFS-15640
>                 URL: https://issues.apache.org/jira/browse/HDFS-15640
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>         Attachments: HDFS-15640.001.patch
>
>
> Currently the DistCpProcedure must submit distcp round by round until there 
> is no diff before it can go to the final distcp stage. This condition is very 
> strict. If the distcp can finish within an acceptable period, we don't need 
> to wait for no diff. For example, if 3 consecutive distcp jobs all finish 
> within 10 minutes, we can predict that the final distcp will also finish 
> within 10 minutes, so we can start the final distcp directly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
