[ 
https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767129#action_12767129
 ] 

Aaron Kimball commented on MAPREDUCE-972:
-----------------------------------------

As discussed earlier, the FileSystem API does not provide a means for 
operations such as rename() to get access to a Progressable. I do not see a 
straightforward way to improve the S3FS / S3N implementations without extending 
the FileSystem API to add operations such as {{rename(src, dst, progress)}}.  
Are you +1 on doing that?
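
To make the question concrete, here is a rough sketch of the shape such an overload might take. It is not an existing FileSystem method: {{ToyCopyDeleteStore}} is a hypothetical in-memory stand-in, {{Progressable}} is redeclared locally so the snippet compiles without Hadoop on the classpath, and the chunk size is arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

/** Local stand-in for org.apache.hadoop.util.Progressable (assumption: same single method). */
interface Progressable {
  void progress();
}

/** Hypothetical toy store whose rename is copy + delete, as described for S3. */
class ToyCopyDeleteStore {
  private static final int CHUNK = 4; // bytes copied between progress callbacks (arbitrary)
  private final Map<String, byte[]> objects = new HashMap<>();

  void put(String key, byte[] data) {
    objects.put(key, data.clone());
  }

  boolean exists(String key) {
    return objects.containsKey(key);
  }

  /** Proposed shape only: rename(src, dst, progress). */
  boolean rename(String src, String dst, Progressable progress) {
    byte[] in = objects.get(src);
    if (in == null) {
      return false; // nothing to rename
    }
    byte[] out = new byte[in.length];
    for (int off = 0; off < in.length; off += CHUNK) {
      int len = Math.min(CHUNK, in.length - off);
      System.arraycopy(in, off, out, off, len);
      progress.progress(); // signal that the task is still alive
    }
    objects.put(dst, out); // copy finished
    objects.remove(src);   // then delete the source
    return true;
  }
}
```

With an overload like this, the caller would pass its task reporter straight through and no helper thread would be needed for rename().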

Either way, I agree with your criticisms of the progress thread implementation. 
I have the following plan for improving this:
* Make the progress thread's lifetime equal to that of the mapper. The first 
rename() operation starts it, and the join() moves to close().
* Keep the progress thread active only while a rename() operation is underway, 
using a volatile boolean to track this state. Otherwise it just sleeps.
* Use {{Thread.interrupt()}} / {{isInterrupted()}} to interrupt the sleep in 
the main loop, so that we don't have to wait the full three seconds before the 
thread exits.
* Add {{distcp.rename.timeout}} as a parameter which sets a max lifetime for 
the inner loop of the progress thread. The default value will be 10 seconds, 
but if the destination filesystem is detected to be s3n:// or s3fs://, the 
default is raised to fifteen minutes.
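
A minimal sketch of how the plan above could fit together, not the patch itself. The class name {{RenameProgressThread}}, its method names, and the use of a plain {{Runnable}} in place of the task's progress reporter are all assumptions made for illustration. The scheme strings and the two timeout values are taken from the list above.

```java
import java.net.URI;

/** Hypothetical helper; one instance would live as long as the mapper. */
class RenameProgressThread extends Thread {
  static final long DEFAULT_TIMEOUT_MS = 10_000L;    // 10 seconds
  static final long S3_TIMEOUT_MS = 15 * 60 * 1000L; // fifteen minutes

  private final Runnable reportProgress; // stand-in for the task's progress callback
  private final long intervalMs;         // sleep between progress reports
  private final long timeoutMs;          // cap per rename, cf. distcp.rename.timeout
  private volatile boolean renameActive = false;
  private volatile long renameStartMs = 0L;

  RenameProgressThread(Runnable reportProgress, long intervalMs, long timeoutMs) {
    this.reportProgress = reportProgress;
    this.intervalMs = intervalMs;
    this.timeoutMs = timeoutMs;
    setDaemon(true);
  }

  /** Picks the default timeout from the destination URI's scheme. */
  static long defaultTimeoutMs(String destUri) {
    String scheme = URI.create(destUri).getScheme();
    boolean s3 = "s3n".equals(scheme) || "s3fs".equals(scheme);
    return s3 ? S3_TIMEOUT_MS : DEFAULT_TIMEOUT_MS;
  }

  /** Call just before rename(), from the mapper thread only; starts the thread on first use. */
  void beginRename() {
    renameStartMs = System.currentTimeMillis();
    renameActive = true;
    if (getState() == Thread.State.NEW) {
      start();
    }
  }

  /** Call right after rename() returns. */
  void endRename() {
    renameActive = false;
  }

  /** Call from the mapper's close(): interrupt the sleep, then join. */
  void shutdown() {
    interrupt();
    try {
      join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the caller's interrupt status
    }
  }

  @Override
  public void run() {
    while (!isInterrupted()) {
      boolean withinTimeout = System.currentTimeMillis() - renameStartMs < timeoutMs;
      if (renameActive && withinTimeout) {
        reportProgress.run(); // only report while a rename is underway
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        return; // interrupted during sleep: exit without waiting out the interval
      }
    }
  }
}
```

Once the timeout elapses the thread stops reporting, so a rename that is truly hung would still be caught by the normal task timeout.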

- Aaron

> distcp can timeout during rename operation to s3
> ------------------------------------------------
>
>                 Key: MAPREDUCE-972
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-972
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distcp
>    Affects Versions: 0.20.1
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, 
> MAPREDUCE-972.4.patch, MAPREDUCE-972.5.patch, MAPREDUCE-972.patch
>
>
> rename() in S3 is implemented as copy + delete. The S3 copy operation can 
> perform very slowly, which may cause task timeout.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
