[ 
https://issues.apache.org/jira/browse/HDFS-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875934#comment-16875934
 ] 

ludun commented on HDFS-14621:
------------------------------

/** {@inheritDoc} */
@Override
public void commitJob(JobContext jobContext) throws IOException {
  Configuration conf = jobContext.getConfiguration();
  syncFolder = conf.getBoolean(DistCpConstants.CONF_LABEL_SYNC_FOLDERS, false);
  overwrite = conf.getBoolean(DistCpConstants.CONF_LABEL_OVERWRITE, false);
  targetPathExists =
      conf.getBoolean(DistCpConstants.CONF_LABEL_TARGET_PATH_EXISTS, true);

  super.commitJob(jobContext);

  cleanupTempFiles(jobContext);

  String attributes = conf.get(DistCpConstants.CONF_LABEL_PRESERVE_STATUS);
  final boolean preserveRawXattrs =
      conf.getBoolean(DistCpConstants.CONF_LABEL_PRESERVE_RAWXATTRS, false);
  if ((attributes != null && !attributes.isEmpty()) || preserveRawXattrs) {
    preserveFileAttributesForDirectories(conf);
  }

  try {
    if (conf.getBoolean(DistCpConstants.CONF_LABEL_DELETE_MISSING, false)) {
      deleteMissing(conf);
    } else if (conf.getBoolean(DistCpConstants.CONF_LABEL_ATOMIC_COPY, false)) {
      commitData(conf);
    }
    taskAttemptContext.setStatus("Commit Successful");
  } finally {
    cleanup(conf);
  }
}
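
In the commitJob code above, preserveFileAttributesForDirectories runs before deleteMissing. The sketch below is only a local-filesystem analogy of why that order matters. It uses java.nio.file rather than HDFS or DistCp, and the names CommitOrderDemo, SOURCE_MTIME and mtimeAfter are hypothetical. It assumes a filesystem where removing an entry updates the parent directory's modification time, which is the usual behaviour on local POSIX filesystems.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

public class CommitOrderDemo {
    // Stand-in for the source directory's timestamp (whole seconds, well in the past).
    static final FileTime SOURCE_MTIME = FileTime.fromMillis(1_000_000_000_000L);

    /**
     * Creates a temp directory with one stale file, then "preserves" the
     * directory mtime and deletes the stale file in the requested order.
     * Returns the directory's final mtime in milliseconds.
     */
    static long mtimeAfter(boolean deleteFirst) {
        try {
            Path dir = Files.createTempDirectory("distcp-demo");
            Path stale = Files.createFile(dir.resolve("stale.txt"));
            if (deleteFirst) {
                Files.delete(stale);                          // analogous to -delete
                Files.setLastModifiedTime(dir, SOURCE_MTIME); // analogous to preserving times
            } else {
                Files.setLastModifiedTime(dir, SOURCE_MTIME); // preserve first ...
                Files.delete(stale);                          // ... then the delete touches the directory
            }
            long result = Files.getLastModifiedTime(dir).toMillis();
            Files.delete(dir); // clean up the now-empty temp directory
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long expected = SOURCE_MTIME.toMillis();
        System.out.println("preserve, then delete keeps mtime: " + (mtimeAfter(false) == expected));
        System.out.println("delete, then preserve keeps mtime: " + (mtimeAfter(true) == expected));
    }
}
```

Under that assumption, only the delete-first order leaves the directory with the preserved timestamp.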

> Distcp can not preserve timestamp with -delete  option
> ------------------------------------------------------
>
>                 Key: HDFS-14621
>                 URL: https://issues.apache.org/jira/browse/HDFS-14621
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: distcp
>    Affects Versions: 2.7.7, 3.1.2
>            Reporter: ludun
>            Priority: Major
>
> Use distcp with -prbugpcaxt and -delete to copy data between clusters:
> hadoop distcp -Dmapreduce.job.queuename="QueueA" -prbugpcaxt -update -delete
> hdfs://sourcecluster/user/hive/warehouse/sum.db
> hdfs://destcluster/user/hive/warehouse/sum.db
> After distcp finished, we found that timestamps on the destination differed
> from the source, and some directories carried the time at which distcp ran.
> Looking at the distcp code, the committer preserves timestamps first and then
> processes the -delete option, which changes the timestamps of the destination
> directories. So we should process the -delete option first.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
