[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7287?focusedWorklogId=576482&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-576482
 ]

ASF GitHub Bot logged work on MAPREDUCE-7287:
---------------------------------------------

                Author: ASF GitHub Bot
            Created on: 03/Apr/21 20:07
            Start Date: 03/Apr/21 20:07
    Worklog Time Spent: 10m 
      Work Description: ayushtkn commented on pull request #2852:
URL: https://github.com/apache/hadoop/pull/2852#issuecomment-812918934


   Apart from the checkstyle issue, changes LGTM. 
    Removal of code shouldn't bother, it is used in ``validatePaths`` which 
won't break by passing the actual dest.
   
   In the present code we were passing ``/NONE`` as target which actually 
didn't exist considering we would be least bothered with the target path, when 
building listing, but  the problem seems to happen at 
```SimpleCopyListing#computeSourceRootPath```  when the source is a ``solitary 
file``, It has check which relies on target:
   
   ```
       if (solitaryFile) {
         if (!targetPathExists || targetFS.isFile(target)) {
           return sourceStatus.getPath();
         } else {
           return sourceStatus.getPath().getParent();
         }
   ```
   
   So, here the ``targetFS.isFile(target)`` returns `false` due to ``FNF`` on 
``/NONE`` which wasn't in the case when building listing for ``source``, so for 
source the listing is built with  ``sourceStatus.getPath();`` and for target 
with ``sourceStatus.getPath().getParent()``, hence on comparison in 
``CopyCommitter`` file gets deleted.
   
   So, this passing `/NONE` wasn't a good idea, The present fix to pass the 
actual destination seems fair enough. Just give a check, guess these constants 
can be removed:
   These constants can be removed now?
   ```
     /**
      * Constants for NONE file deletion
      */
     public static final String NONE_PATH_NAME = "/NONE";
     public static final Path NONE_PATH = new Path(NONE_PATH_NAME);
     public static final Path RAW_NONE_PATH = new Path(
         DistCpConstants.HDFS_RESERVED_RAW_DIRECTORY_NAME + NONE_PATH_NAME);
   
   ```
   Nice Catch!!!
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 576482)
    Time Spent: 40m  (was: 0.5h)

> Distcp will delete exists file ,  If we use "--delete and --update" options 
> and distcp file.
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7287
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7287
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distcp
>    Affects Versions: 3.2.1
>            Reporter: zhengchenyu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.3.1
>
>         Attachments: MAPREDUCE-7287.001.patch
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> hdfs://ns1/tmp/a is an existing file, hdfs://ns2/tmp/a is also an existing 
> file.
> When I run this command, 
> {code:java}
> hadoop distcp -delete -update hdfs://ns1/tmp/a hdfs://ns2/tmp/a
> {code}
> I Found hdfs://ns2/tmp/a is deleted unpectectedly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org

Reply via email to