[
https://issues.apache.org/jira/browse/HADOOP-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15647718#comment-15647718
]
Steve Loughran commented on HADOOP-13651:
-----------------------------------------
LGTM
bq. All integration tests pass (except for a couple of unrelated,
sometimes-flaky tests).
which tests? I'd encourage you to still declare them in the "tests worked"
report, just so we can track their reliability
h3. S3ABlockOutputStream
{code}
/** Total bytes for downloads submitted so far. */
private int bytesSubmitted;
{code}
best to change the comment, make a long as we support (tested) uploads > 4GB. I
Think {{putObject()}} can still return an int, but safest to make
that a long too.
h3. S3AFileSystem
* good to see you counting errors ignored on MD updates. I was going to suggest
that, but looked closer and you'd already done it. Nice. You may want to make
it a separate metric, "metadata update failure", so they can be monitored
—they're more serious than a recoverable read error on a stream, which is what
that counter was put in to track.
* as/when I move to parallel renaming, this code is going to break again. I
don't see an easy workaround, given that patch doesn't exist. FWIW it will
involve taking the recursive directory listing, sorting by size and then
submitting to the same thread pool which supported writes, ("slow IO
operations").
h3. DirListingMetadata
* {{prettyPrint}} s/Authorit/r/Authoritive; break up append with + inside into
an append chain.
h3. S3Guard
regarding that race condition between listStatus and delete, we always warn
that it always exists in a DFS: there's no guarantee a file returned in a list
is there by thte time you come to use it, and if you use one of the remote
iterator methods, the risk of that or more dramatic things like parent dir
deletion increase. There's not even any guarantee that a an iterated listing is
consistent; even in HDFS you could rename a a file "z9999" to "a0000" and have
it never get found in an ongoing list operation.
So: no need ta havesnapshot consistency between list and use, just go for "as
good as the spec says you need to".
h3. MetadataStoreTestBase
L285: please use a different filename.
> S3Guard: S3AFileSystem Integration with MetadataStore
> -----------------------------------------------------
>
> Key: HADOOP-13651
> URL: https://issues.apache.org/jira/browse/HADOOP-13651
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Aaron Fabbri
> Assignee: Aaron Fabbri
> Attachments: HADOOP-13651-HADOOP-13345.001.patch,
> HADOOP-13651-HADOOP-13345.002.patch, HADOOP-13651-HADOOP-13345.003.patch,
> HADOOP-13651-HADOOP-13345.004.patch, HADOOP-13651-HADOOP-13345.005.patch
>
>
> Modify S3AFileSystem et al. to optionally use a MetadataStore for metadata
> consistency and caching.
> Implementation should have minimal overhead when no MetadataStore is
> configured.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]