[
https://issues.apache.org/jira/browse/HADOOP-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291809#comment-14291809
]
Thomas Demoor commented on HADOOP-11488:
----------------------------------------
My pleasure, Daisuke.
Steve, that seems sensible.
Purging uses abortMultiPartUploads, which is a list operation followed by a
delete operation per listed upload. It happens synchronously at the end of
fs.initialize(). The test might time out if the bucket has too many (thousands?)
in-progress uploads that need to be aborted. However, the s3a docs explicitly
state that a dedicated test bucket should be used, and the current test classes
do not perform multiple concurrent multipart uploads, so typically 0 and at most
1 upload will be purged. Consequently, we can safely treat "too many
in-progress uploads" as a "should not happen" case during testing.
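To make the cost argument concrete, here is a minimal sketch of the purge pattern described above. It does not call the AWS SDK: `Upload`, `purge`, and the in-memory list are hypothetical stand-ins, and the only behaviour assumed is one listing followed by one abort per upload older than the cutoff.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class PurgeSketch {
    /** Hypothetical stand-in for an in-progress multipart upload. */
    static final class Upload {
        final String key;
        final Instant initiated;

        Upload(String key, Instant initiated) {
            this.key = key;
            this.initiated = initiated;
        }
    }

    /**
     * Aborts every upload initiated before (now - age) and returns the number
     * of simulated requests: one listing plus one abort per purged upload.
     */
    static int purge(List<Upload> inProgress, Duration age, Instant now) {
        Instant cutoff = now.minus(age);
        int requests = 1; // the list operation (pagination ignored in this sketch)
        for (Iterator<Upload> it = inProgress.iterator(); it.hasNext();) {
            if (it.next().initiated.isBefore(cutoff)) {
                it.remove(); // stands in for the per-upload abort call
                requests++;
            }
        }
        return requests;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        List<Upload> uploads = new ArrayList<>();
        uploads.add(new Upload("test/leftover.bin", now.minusSeconds(60)));
        // A purge age of 0 seconds aborts anything initiated before "now".
        System.out.println(purge(uploads, Duration.ZERO, now) + " requests, "
            + uploads.size() + " uploads left");
    }
}
```

With 0 or 1 leftover uploads this is one or two requests, which is why a timeout only becomes a concern when thousands of uploads have piled up.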
I see 2 ways to implement this:
1. Add purge = true, purgetime = 0 secs to the conf in
S3ATestUtils.createTestFileSystem
+: 2 lines of code
-: cleanup only happens on the next test run
2. In tearDown() of each test class: run S3ATestUtils.createTestFileSystem
with a config that has purge = true, purgetime = 0 secs
+: immediate cleanup at end of test (unless the test is interrupted, e.g. ctrl-c)
-: each test class has to explicitly include the code path
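Both options come down to the same two settings. A minimal sketch, assuming the property names are `fs.s3a.multipart.purge` and `fs.s3a.multipart.purge.age` (worth verifying against the S3A constants on the target branch), and using `java.util.Properties` as a stand-in for Hadoop's `Configuration` so that it runs without Hadoop on the classpath. `withImmediatePurge` is a hypothetical helper name.

```java
import java.util.Properties;

public class PurgeConfSketch {
    // Assumed S3A keys; verify against the constants in hadoop-aws.
    static final String PURGE = "fs.s3a.multipart.purge";
    static final String PURGE_AGE = "fs.s3a.multipart.purge.age";

    /** Returns a copy of base with purging enabled and a purge age of 0 seconds. */
    static Properties withImmediatePurge(Properties base) {
        Properties conf = new Properties();
        conf.putAll(base);
        conf.setProperty(PURGE, "true");
        conf.setProperty(PURGE_AGE, "0");
        return conf;
    }

    public static void main(String[] args) {
        Properties conf = withImmediatePurge(new Properties());
        System.out.println(PURGE + "=" + conf.getProperty(PURGE));
        System.out.println(PURGE_AGE + "=" + conf.getProperty(PURGE_AGE));
    }
}
```

With a real `Configuration` the equivalent calls would be `setBoolean` and `setInt`. For option 1 they would sit in the conf built by S3ATestUtils.createTestFileSystem; for option 2 each tearDown() would create a filesystem from such a conf so that the purge runs in initialize().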
Let me know what you prefer and I'll create an issue addressing this.
> Difference in default connection timeout for S3A FS
> ---------------------------------------------------
>
> Key: HADOOP-11488
> URL: https://issues.apache.org/jira/browse/HADOOP-11488
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.6.0
> Reporter: Harsh J
> Assignee: Daisuke Kobayashi
> Priority: Minor
> Attachments: HADOOP-11488.patch, HADOOP-11488.patch
>
>
> The core-default.xml defines fs.s3a.connection.timeout as 5000, and the code
> under hadoop-tools/hadoop-aws defines it as 50000.
> We should update the former to 50s so that the intended default takes effect,
> as we're also noticing that 5s is often too low, especially in cases such as
> large DistCp operations (which fail with {{Read timed out}} errors from the
> S3 service).