[
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15928203#comment-15928203
]
Steve Loughran commented on HADOOP-13786:
-----------------------------------------
For metrics, S3A now collects a very detailed set, covering low-level HTTP verbs,
stream reads (including aborts and seek lengths), uploaded data, and errors seen:
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Statistic.java
S3A's toString() prints them; the input and output streams also print theirs. All
2.8+ filesystems support a new method, {{getStorageStatistics()}}, which returns
a map of name -> counter value; HDFS and S3A implement this and share common keys
where possible.
For a 2.8+ committer only, I plan to grab the FS details from every task and,
as you say, copy them back to the job for printing. Spark itself could be
extended there: while moving Spark to a 2.8+ dependency isn't going to happen,
if there were a way for every job to provide a key -> long map, that 2.8+
committer could return the stats.
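As a minimal sketch of the task-to-job flow described above (plain Java, with a hypothetical class name and illustrative counter keys; the real keys would be the shared names that HDFS and S3A expose via {{getStorageStatistics()}}): each task would report its key -> long map, and the job side would merge them by summing counters.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical job-side aggregator for per-task key -> counter maps. */
public class TaskStatsAggregator {
    private final Map<String, Long> totals = new HashMap<>();

    /** Merge one task's statistics into the job-level totals, summing counters. */
    public void accept(Map<String, Long> taskStats) {
        for (Map.Entry<String, Long> e : taskStats.entrySet()) {
            totals.merge(e.getKey(), e.getValue(), Long::sum);
        }
    }

    /** The aggregated job-level view, suitable for printing at job commit. */
    public Map<String, Long> totals() {
        return totals;
    }

    public static void main(String[] args) {
        TaskStatsAggregator agg = new TaskStatsAggregator();
        // Illustrative per-task maps; key names here are examples only.
        Map<String, Long> task1 = new HashMap<>();
        task1.put("stream_read_bytes", 1024L);
        task1.put("object_put_requests", 2L);
        Map<String, Long> task2 = new HashMap<>();
        task2.put("stream_read_bytes", 2048L);
        agg.accept(task1);
        agg.accept(task2);
        System.out.println(agg.totals());
    }
}
```

The point of summing rather than replacing is that every task contributes independently; a committer returning such a map per task attempt is all the job side would need.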
Side issue: filesystem instances are cached per user, so multiple executors
sharing the same FS will pick up shared values. We could have the committer
independently track the bytes uploaded in the commit, that being the key delay.
Actually, that's implicitly in there in the object length field.
I like the idea of serializing it in the pending commit data, though that doesn't
count the details of uncommitted IO, does it? We'd really want that too,
somehow. It's still work, just wasted work.
Incidentally, Tez already grabs and logs the storage stats and uploads them to ATS.
> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> ------------------------------------------------------------------------
>
> Key: HADOOP-13786
> URL: https://issues.apache.org/jira/browse/HADOOP-13786
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs/s3
> Affects Versions: HADOOP-13345
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-13786-HADOOP-13345-001.patch,
> HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch,
> HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch,
> HADOOP-13786-HADOOP-13345-006.patch,
> HADOOP-13786-HADOOP-13345-007.patch, HADOOP-13786-HADOOP-13345-009.patch,
> HADOOP-13786-HADOOP-13345-010.patch, HADOOP-13786-HADOOP-13345-011.patch,
> HADOOP-13786-HADOOP-13345-012.patch, HADOOP-13786-HADOOP-13345-013.patch,
> s3committer-master.zip
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the
> presence of failures". Implement it, including whatever is needed to
> demonstrate the correctness of the algorithm. (that is, assuming that s3guard
> provides a consistent view of the presence/absence of blobs, show that we can
> commit directly).
> I consider ourselves free to expose the blobstore-ness of the s3 output
> streams (ie. not visible until the close()), if we need to use that to allow
> us to abort commit operations.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)