[jira] [Commented] (HADOOP-13560) S3ABlockOutputStream to support huge (many GB) file writes

Chris Nauroth (JIRA) Tue, 27 Sep 2016 22:47:13 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15528494#comment-15528494 ]


Chris Nauroth commented on HADOOP-13560:
----------------------------------------

Thank you, Steve.  I have started reviewing patch revision 006.  I haven't read 
through all of it yet, but here is my feedback so far.

This patch does not apply to current trunk, so we'll eventually need a 
different patch for trunk.

All access to {{S3ABlockOutputStream#closed}} happens through {{synchronized}} 
methods.  Would it be simpler to change the data type to straight {{boolean}}, 
or do you prefer to stick with {{AtomicBoolean}}?
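
For illustration, here is a minimal sketch of the plain {{boolean}} option. It assumes a hypothetical class, {{ClosedFlagSketch}}, standing in for the stream; the class, its fields, and its methods are invented for this example and are not taken from the patch. The point it shows is that when every read and write of the flag holds the object's monitor, the monitor already provides the visibility and atomicity that {{AtomicBoolean}} would otherwise supply.

```java
import java.io.IOException;
import java.io.OutputStream;

/**
 * Hypothetical stand-in for a stream whose state is guarded by its own
 * monitor. Not the real S3ABlockOutputStream.
 */
class ClosedFlagSketch extends OutputStream {
  // Plain boolean: safe only because every access below is synchronized.
  private boolean closed;
  private long bytesWritten;

  @Override
  public synchronized void write(int b) throws IOException {
    if (closed) {
      throw new IOException("Stream is closed");
    }
    bytesWritten++;
  }

  @Override
  public synchronized void close() {
    if (closed) {
      return; // second and later calls are no-ops
    }
    closed = true;
  }

  synchronized boolean isClosed() {
    return closed;
  }

  synchronized long getBytesWritten() {
    return bytesWritten;
  }
}
```

The trade-off is that this stays correct only while every access remains synchronized. If an unsynchronized reader of the flag is added later, {{AtomicBoolean}} (or {{volatile}}) becomes the safer choice.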

{{S3ABlockOutputStream#now}} returns time in milliseconds, but the JavaDocs 
state nanoseconds.  Did you want {{System#nanoTime}} or possibly 
{{org.apache.hadoop.util.Time#monotonicNow}} for a millisecond measurement 
that's safe against system clock changes?
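
To make the two options concrete, here is a JDK-only sketch. It assumes a hypothetical helper class, {{ElapsedTimeSketch}}, whose names are invented for this example; it does not call Hadoop's {{Time}} class. Both methods read {{System#nanoTime}}, which is meant for measuring elapsed time and is not tied to the wall clock, and they differ only in the unit they return.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; not part of the patch or of Hadoop. */
class ElapsedTimeSketch {

  /** Monotonic timestamp in nanoseconds; only differences are meaningful. */
  static long nowNanos() {
    return System.nanoTime();
  }

  /** Monotonic timestamp in milliseconds, derived from the same source. */
  static long monotonicNowMillis() {
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
  }

  /** Elapsed milliseconds between two nanoTime readings (truncating). */
  static long elapsedMillis(long startNanos, long endNanos) {
    return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
  }
}
```

Whichever unit is chosen, the JavaDoc should name the same unit that the method actually returns.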

Can ITestS3AHuge* be made to run in parallel instead of sequentially?  It appears 
these tests are already sufficiently isolated from one another.  They call 
{{S3AScaleTestBase#getTestPath}}, so they are guaranteed to operate on isolated 
paths within the bucket.  They also disable the multi-part upload purge in 
{{S3AScaleTestBase#setUp}}.  Is there another isolation problem I missed, or is 
the idea more that you don't want activity from another test running in 
parallel to pollute metrics reported from the scale tests due to bandwidth 
limitations or throttling?

> S3ABlockOutputStream to support huge (many GB) file writes
> ----------------------------------------------------------
>
>                 Key: HADOOP-13560
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13560
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>         Attachments: HADOOP-13560-branch-2-001.patch, 
> HADOOP-13560-branch-2-002.patch, HADOOP-13560-branch-2-003.patch, 
> HADOOP-13560-branch-2-004.patch
>
>
> An AWS SDK [issue|https://github.com/aws/aws-sdk-java/issues/367] highlights 
> that metadata isn't copied on large copies.
> 1. Add a test to do that large copy/rename and verify that the copy really 
> works
> 2. Verify that metadata makes it over.
> Verifying large file rename is important on its own, as it is needed for very 
> large commit operations for committers using rename



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

