[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767129#action_12767129 ]
Aaron Kimball commented on MAPREDUCE-972:
-----------------------------------------

As discussed earlier, the FileSystem API does not provide a means for operations such as rename() to get access to a Progressable. I do not see a straightforward way to improve the S3FS / S3N implementations without extending the FileSystem API to add operations such as {{rename(src, dst, progress)}}. Are you +1 on doing that?

Either way, I agree with your criticisms of the progress thread implementation. I have the following plan for improving this:

* Make the progress thread's lifetime equal to that of the mapper. The first rename() operation starts it, and the join() moves to close().
* The progress thread is only active when a rename() operation is underway. Use a volatile boolean to track this state. Otherwise it just sleeps.
* Use {{Thread.interrupt()}} / {{isInterrupted()}} to interrupt the sleep in the main loop, so that we don't have to wait the full three seconds before the thread exits.
* Add {{distcp.rename.timeout}} as a parameter which sets a max lifetime for the inner loop of the progress thread. The default value will be 10 seconds, but if the destination filesystem is detected to be s3n:// or s3fs://, the default is raised to fifteen minutes.

- Aaron

> distcp can timeout during rename operation to s3
> ------------------------------------------------
>
>                 Key: MAPREDUCE-972
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-972
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distcp
>    Affects Versions: 0.20.1
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, MAPREDUCE-972.4.patch, MAPREDUCE-972.5.patch, MAPREDUCE-972.patch
>
>
> rename() in S3 is implemented as copy + delete. The S3 copy operation can perform very slowly, which may cause task timeout.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
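
For readers following the thread, the four-point progress-thread plan in the comment above could be sketched roughly as follows. This is a minimal illustration, not the code from any MAPREDUCE-972 patch. The class name RenameProgressReporter, its methods, and the helper defaultTimeoutMillis() are hypothetical, a plain Runnable stands in for Hadoop's Progressable, and the reading of {{distcp.rename.timeout}} as "stop reporting progress once a single rename has run this long" is an assumption.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the plan in the comment; all names are illustrative. */
class RenameProgressReporter implements AutoCloseable {
  private final Runnable progress;          // stand-in for Progressable.progress()
  private final long sleepMillis;           // the comment mentions a three-second sleep
  private final long renameTimeoutMillis;   // assumed meaning of distcp.rename.timeout

  // Volatile state shared between the mapper thread and the progress thread.
  private volatile boolean renameInProgress = false;
  private volatile long renameStartNanos = 0L;

  private Thread thread;                    // started lazily by the first rename

  RenameProgressReporter(Runnable progress, long sleepMillis, long renameTimeoutMillis) {
    this.progress = progress;
    this.sleepMillis = sleepMillis;
    this.renameTimeoutMillis = renameTimeoutMillis;
  }

  /** Default timeout: 10 seconds, or fifteen minutes for the S3 schemes named in the comment. */
  static long defaultTimeoutMillis(String scheme) {
    boolean s3 = "s3n".equals(scheme) || "s3fs".equals(scheme);
    return s3 ? TimeUnit.MINUTES.toMillis(15) : TimeUnit.SECONDS.toMillis(10);
  }

  /** Call just before rename(); the first call starts the thread. */
  synchronized void renameStarted() {
    renameStartNanos = System.nanoTime();
    renameInProgress = true;
    if (thread == null) {
      thread = new Thread(this::run, "rename-progress");
      thread.setDaemon(true);
      thread.start();
    }
  }

  /** Call right after rename() returns or throws. */
  void renameFinished() {
    renameInProgress = false;
  }

  private void run() {
    while (!Thread.currentThread().isInterrupted()) {
      if (renameInProgress) {
        long elapsedNanos = System.nanoTime() - renameStartNanos;
        // Report progress only while the current rename is within its timeout.
        if (elapsedNanos < TimeUnit.MILLISECONDS.toNanos(renameTimeoutMillis)) {
          progress.run();
        }
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        return; // close() interrupted the sleep; exit without waiting it out
      }
    }
  }

  /** Call from the mapper's close(): interrupt the sleep, then join. */
  @Override
  public synchronized void close() throws InterruptedException {
    if (thread != null) {
      thread.interrupt();
      thread.join();
      thread = null;
    }
  }
}
```

In this sketch a mapper would wrap each rename in renameStarted() / renameFinished() (ideally in a try/finally) and call close() once when the mapper shuts down, so the thread lives as long as the mapper but only reports progress while a rename is underway.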