[
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280079#comment-15280079
]
Steve Loughran commented on HADOOP-13028:
-----------------------------------------
For some more detail, here's a spark-cloud module (WiP) test run against 2.7.1;
duration measured in tests, stream info printed as test goes alone. There's no
meaningful string value.
{code}
===== TEST OUTPUT FOR o.a.s.cloud.s3.S3aIOSuite: 'SeekReadFully: Cost of seek
and ReadFully' =====
2016-05-11 13:54:44,462 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of stat = 189,933,000 ns
2016-05-11 13:54:44,652 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of open = 189,144,000 ns
2016-05-11 13:54:44,652 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:45,099 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of read() [pos = 0] = 446,564,000 ns
2016-05-11 13:54:45,100 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:45,101 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:46,052 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of seek(256) [pos = 1] = 950,677,000 ns
2016-05-11 13:54:46,053 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:46,053 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:46,054 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of seek(256) [pos = 256] = 22,000 ns
2016-05-11 13:54:46,054 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:46,055 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:47,010 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of seek(EOF-2) [pos = 256] = 954,645,000 ns
2016-05-11 13:54:47,010 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:47,011 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:47,012 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of read() [pos = 21203389] = 397,000 ns
2016-05-11 13:54:47,012 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:47,013 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:49,213 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(1, byte[1]) [pos = 21203390] = 2,199,571,000 ns
2016-05-11 13:54:49,213 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:49,214 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:52,487 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(1, byte[256]) [pos = 21203390] = 3,272,746,000 ns
2016-05-11 13:54:52,487 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:52,488 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:55,092 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(260, byte[256]) [pos = 21203390] = 2,604,062,000 ns
2016-05-11 13:54:55,092 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:55,093 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:56,825 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(1024, byte[256]) [pos = 21203390] = 1,731,421,000 ns
2016-05-11 13:54:56,825 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:56,825 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:58,486 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(1536, byte[256]) [pos = 21203390] = 1,660,882,000 ns
2016-05-11 13:54:58,487 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:58,487 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:55:00,635 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(8192, byte[1024]) [pos = 21203390] = 2,147,589,000 ns
2016-05-11 13:55:00,635 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:55:00,636 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:55:02,333 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(9728, byte[1024]) [pos = 21203390] = 1,697,169,000 ns
2016-05-11 13:55:02,334 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:55:02,334 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:55:02,334 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of seek(getPos) [pos = 21203390] = 14,000 ns
2016-05-11 13:55:02,334 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:55:02,335 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:55:02,335 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of read() [pos = 21203390] = 197,000 ns
2016-05-11 13:55:02,335 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:55:02,336 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of close() = 47,000 ns
2016-05-11 13:55:02,336 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Statistics 3076 bytes read, 0 bytes written, 2 read ops, 0 large read ops, 0
write ops
2016-05-11 13:55:02,337 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
===== FINISHED o.a.s.cloud.s3.S3aIOSuite: 'SeekReadFully: Cost of seek and
ReadFully' =====
- SeekReadFully: Cost of seek and ReadFully
+ Cleaning s3a://tests3neu/test
2016-05-11 13:55:03,693 WARN s3.S3aIOSuite (Logging.scala:logWarning(66)) -
Deleting s3a://tests3neu/test returned false
- Read Bytes !!! IGNORED !!!
Run completed in 24 seconds, 16 milliseconds.
Total number of tests run: 1
Suites: completed 2, aborted 0
Tests: succeeded 1, failed 0, canceled 0, ignored 6, pending 0
All tests passed.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
{code}
Now, here's a run against patch 011 here:
{code}
==== TEST OUTPUT FOR o.a.s.cloud.s3.S3aIOSuite: 'SeekReadFully: Cost of seek
and ReadFully' =====
2016-05-11 13:58:42,171 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of stat = 211,846,000 ns
2016-05-11 13:58:42,364 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of open = 192,179,000 ns
2016-05-11 13:58:42,364 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:42,812 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of read() [pos = 0] = 447,179,000 ns
2016-05-11 13:58:42,812 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=1
contentLength=21203391 StreamStatistics{OpenOperations=1, CloseOperations=0,
Closed=0, Aborted=0, SeekOperations=0, ReadExceptions=0,
ForwardSeekOperations=0, BackwardSeekOperations=0, BytesSkippedOnSeek=0,
BytesBackwardsOnSeek=0, BytesRead=1, BytesRead excluding skipped=1,
ReadOperations=0, ReadFullyOperations=0, ReadsIncomplete=0}}
2016-05-11 13:58:42,812 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:42,813 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of seek(256) [pos = 1] = 24,000 ns
2016-05-11 13:58:42,813 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=256
contentLength=21203391 StreamStatistics{OpenOperations=1, CloseOperations=0,
Closed=0, Aborted=0, SeekOperations=0, ReadExceptions=0,
ForwardSeekOperations=0, BackwardSeekOperations=0, BytesSkippedOnSeek=0,
BytesBackwardsOnSeek=0, BytesRead=1, BytesRead excluding skipped=1,
ReadOperations=0, ReadFullyOperations=0, ReadsIncomplete=0}}
2016-05-11 13:58:42,813 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:42,814 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of seek(256) [pos = 256] = 13,000 ns
2016-05-11 13:58:42,814 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=256
contentLength=21203391 StreamStatistics{OpenOperations=1, CloseOperations=0,
Closed=0, Aborted=0, SeekOperations=0, ReadExceptions=0,
ForwardSeekOperations=0, BackwardSeekOperations=0, BytesSkippedOnSeek=0,
BytesBackwardsOnSeek=0, BytesRead=1, BytesRead excluding skipped=1,
ReadOperations=0, ReadFullyOperations=0, ReadsIncomplete=0}}
2016-05-11 13:58:42,815 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:42,816 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of seek(EOF-2) [pos = 256] = 12,000 ns
2016-05-11 13:58:42,816 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=21203389
contentLength=21203391 StreamStatistics{OpenOperations=1, CloseOperations=0,
Closed=0, Aborted=0, SeekOperations=0, ReadExceptions=0,
ForwardSeekOperations=0, BackwardSeekOperations=0, BytesSkippedOnSeek=0,
BytesBackwardsOnSeek=0, BytesRead=1, BytesRead excluding skipped=1,
ReadOperations=0, ReadFullyOperations=0, ReadsIncomplete=0}}
2016-05-11 13:58:42,816 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:43,913 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of read() [pos = 21203389] = 1,096,521,000 ns
2016-05-11 13:58:43,913 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=21203390
nextReadPos=21203390 contentLength=21203391 StreamStatistics{OpenOperations=2,
CloseOperations=1, Closed=0, Aborted=1, SeekOperations=0, ReadExceptions=0,
ForwardSeekOperations=0, BackwardSeekOperations=0, BytesSkippedOnSeek=0,
BytesBackwardsOnSeek=0, BytesRead=2, BytesRead excluding skipped=2,
ReadOperations=0, ReadFullyOperations=0, ReadsIncomplete=0}}
2016-05-11 13:58:43,914 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:44,292 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(1, byte[1]) [pos = 21203390] = 378,296,000 ns
2016-05-11 13:58:44,293 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=2 nextReadPos=21203390
contentLength=21203391 StreamStatistics{OpenOperations=3, CloseOperations=2,
Closed=1, Aborted=1, SeekOperations=1, ReadExceptions=0,
ForwardSeekOperations=0, BackwardSeekOperations=1, BytesSkippedOnSeek=0,
BytesBackwardsOnSeek=21203389, BytesRead=3, BytesRead excluding skipped=3,
ReadOperations=1, ReadFullyOperations=1, ReadsIncomplete=0}}
2016-05-11 13:58:44,293 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:45,453 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(1, byte[256]) [pos = 21203390] = 1,159,621,000 ns
2016-05-11 13:58:45,454 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=257 nextReadPos=21203390
contentLength=21203391 StreamStatistics{OpenOperations=4, CloseOperations=3,
Closed=1, Aborted=2, SeekOperations=2, ReadExceptions=0,
ForwardSeekOperations=0, BackwardSeekOperations=2, BytesSkippedOnSeek=0,
BytesBackwardsOnSeek=21203390, BytesRead=259, BytesRead excluding skipped=259,
ReadOperations=2, ReadFullyOperations=2, ReadsIncomplete=0}}
2016-05-11 13:58:45,454 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:45,454 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(260, byte[256]) [pos = 21203390] = 93,000 ns
2016-05-11 13:58:45,455 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=516 nextReadPos=21203390
contentLength=21203391 StreamStatistics{OpenOperations=4, CloseOperations=3,
Closed=1, Aborted=2, SeekOperations=3, ReadExceptions=0,
ForwardSeekOperations=1, BackwardSeekOperations=2, BytesSkippedOnSeek=3,
BytesBackwardsOnSeek=21203390, BytesRead=518, BytesRead excluding skipped=515,
ReadOperations=3, ReadFullyOperations=3, ReadsIncomplete=0}}
2016-05-11 13:58:45,455 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:45,455 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(1024, byte[256]) [pos = 21203390] = 41,000 ns
2016-05-11 13:58:45,456 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1280 nextReadPos=21203390
contentLength=21203391 StreamStatistics{OpenOperations=4, CloseOperations=3,
Closed=1, Aborted=2, SeekOperations=4, ReadExceptions=0,
ForwardSeekOperations=2, BackwardSeekOperations=2, BytesSkippedOnSeek=511,
BytesBackwardsOnSeek=21203390, BytesRead=1282, BytesRead excluding skipped=771,
ReadOperations=4, ReadFullyOperations=4, ReadsIncomplete=0}}
2016-05-11 13:58:45,456 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:45,456 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(1536, byte[256]) [pos = 21203390] = 34,000 ns
2016-05-11 13:58:45,457 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1792 nextReadPos=21203390
contentLength=21203391 StreamStatistics{OpenOperations=4, CloseOperations=3,
Closed=1, Aborted=2, SeekOperations=5, ReadExceptions=0,
ForwardSeekOperations=3, BackwardSeekOperations=2, BytesSkippedOnSeek=767,
BytesBackwardsOnSeek=21203390, BytesRead=1794, BytesRead excluding
skipped=1027, ReadOperations=5, ReadFullyOperations=5, ReadsIncomplete=0}}
2016-05-11 13:58:45,457 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:45,457 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(8192, byte[1024]) [pos = 21203390] = 54,000 ns
2016-05-11 13:58:45,458 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=9216 nextReadPos=21203390
contentLength=21203391 StreamStatistics{OpenOperations=4, CloseOperations=3,
Closed=1, Aborted=2, SeekOperations=6, ReadExceptions=0,
ForwardSeekOperations=4, BackwardSeekOperations=2, BytesSkippedOnSeek=7167,
BytesBackwardsOnSeek=21203390, BytesRead=9218, BytesRead excluding
skipped=2051, ReadOperations=7, ReadFullyOperations=6, ReadsIncomplete=1}}
2016-05-11 13:58:45,458 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:45,458 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of readFully(9728, byte[1024]) [pos = 21203390] = 41,000 ns
2016-05-11 13:58:45,458 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=10752 nextReadPos=21203390
contentLength=21203391 StreamStatistics{OpenOperations=4, CloseOperations=3,
Closed=1, Aborted=2, SeekOperations=7, ReadExceptions=0,
ForwardSeekOperations=5, BackwardSeekOperations=2, BytesSkippedOnSeek=7679,
BytesBackwardsOnSeek=21203390, BytesRead=10754, BytesRead excluding
skipped=3075, ReadOperations=8, ReadFullyOperations=7, ReadsIncomplete=1}}
2016-05-11 13:58:45,459 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:45,459 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of seek(getPos) [pos = 21203390] = 14,000 ns
2016-05-11 13:58:45,459 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=10752 nextReadPos=21203390
contentLength=21203391 StreamStatistics{OpenOperations=4, CloseOperations=3,
Closed=1, Aborted=2, SeekOperations=7, ReadExceptions=0,
ForwardSeekOperations=5, BackwardSeekOperations=2, BytesSkippedOnSeek=7679,
BytesBackwardsOnSeek=21203390, BytesRead=10754, BytesRead excluding
skipped=3075, ReadOperations=8, ReadFullyOperations=7, ReadsIncomplete=1}}
2016-05-11 13:58:45,460 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:58:46,416 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of read() [pos = 21203390] = 955,773,000 ns
2016-05-11 13:58:46,416 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
S3AInputStream{s3a://landsat-pds/scene_list.gz pos=21203391
nextReadPos=21203391 contentLength=21203391 StreamStatistics{OpenOperations=5,
CloseOperations=4, Closed=1, Aborted=3, SeekOperations=7, ReadExceptions=0,
ForwardSeekOperations=5, BackwardSeekOperations=2, BytesSkippedOnSeek=7679,
BytesBackwardsOnSeek=21203390, BytesRead=10755, BytesRead excluding
skipped=3076, ReadOperations=8, ReadFullyOperations=7, ReadsIncomplete=1}}
2016-05-11 13:58:46,417 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Duration of close() = 102,000 ns
2016-05-11 13:58:46,418 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
Statistics 10755 bytes read, 0 bytes written, 2 read ops, 0 large read ops, 0
write ops
2016-05-11 13:58:46,419 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
===== FINISHED o.a.s.cloud.s3.S3aIOSuite: 'SeekReadFully: Cost of seek and
ReadFully' =====
- SeekReadFully: Cost of seek and ReadFully
+ Cleaning s3a://tests3neu/test
2016-05-11 13:58:47,120 WARN s3.S3aIOSuite (Logging.scala:logWarning(66)) -
Deleting s3a://tests3neu/test returned false
- Read Bytes !!! IGNORED !!!
Total number of tests run: 1
{code}
We've cut the test run time in half. That's from lazy seek, readahead *and
understanding whats going on inside the stream during the execution of work*.
We can look at the logs and see when the connection was being closed vs
aborted, when there was a backwards vs. forward seek, how many bytes in the
stream were read overall, and how many were actually returned in read() calls,
how often a read request had to wait for more data arrive, etc. This is visible
in the tests without having to hard-code the module to Hadoop 2.8+, which is
good, as the module doesn't need Hadoop 2.8. All it needs is Hadoop 2.6+ for
s3a (2.7.x for s3a to work best), 2.8 is for performance.
This is why we need the information. I'm not trying to machine parse it, that's
not what these metrics *performance metrics* are about. It's about providing
the information for people to tune their code downstream, not automated health
checking.
> add low level counter metrics for S3A; use in read performance tests
> --------------------------------------------------------------------
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3, metrics
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch,
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch,
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch,
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch,
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch,
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt,
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive,
> closing connections may be expensive (a sign of a regression).
> S3A FS and individual input streams should have counters of the # of
> open/close/failure+reconnect operations, timers of how long things take. This
> can be used downstream to measure efficiency of the code (how often
> connections are being made), connection reliability, etc.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]