[
https://issues.apache.org/jira/browse/HADOOP-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216638#comment-16216638
]
Steve Loughran commented on HADOOP-14973:
-----------------------------------------
First, Sean: tag versions, give the title a hint that it's for S3, mark it as an
improvement, and move it under HADOOP-14831 so it can be tracked for Hadoop 3.1.
Second, you haven't called FileSystem.toString() for a while, have you? Or
FSDataInputStream.toString()? Because it prints all this stuff. How else do you
think all the seek optimisation work was debugged?
{code}
2017-10-10 16:23:47,050 [ScalaTest-main-running-S3ADataFrameSuite] INFO
s3.S3ADataFrameSuite (Logging.scala:logInfo(54)) - Duration of scan result list
= 2,118,450 nS
2017-10-10 16:23:47,050 [ScalaTest-main-running-S3ADataFrameSuite] INFO
s3.S3ADataFrameSuite (Logging.scala:logInfo(54)) - FileSystem
S3AFileSystem{uri=s3a://hwdev-steve-ireland-new,
workingDir=s3a://hwdev-steve-ireland-new/user/stevel, inputPolicy=random,
partSize=8388608, enableMultiObjectsDelete=true, maxKeys=5000,
readAhead=262144, blockSize=1048576, multiPartThreshold=2147483647,
serverSideEncryptionAlgorithm='NONE',
blockFactory=org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory@64f6964f,
metastore=NullMetadataStore, authoritative=false, useListV1=false,
boundedExecutor=BlockingThreadPoolExecutorService{SemaphoredDelegatingExecutor{permitCount=25,
available=25, waiting=0}, activeCount=0},
unboundedExecutor=java.util.concurrent.ThreadPoolExecutor@60291e59[Running,
pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0],
statistics {182521443 bytes read, 39004 bytes written, 207 read ops, 0 large
read ops, 76 write ops}, metrics {{Context=S3AFileSystem}
{FileSystemId=e62eeb1a-cced-473b-95f3-06c9910604ad-hwdev-steve-ireland-new}
{fsURI=s3a://hwdev-steve-ireland-new} {files_created=0} {files_copied=0}
{files_copied_bytes=0} {files_deleted=0} {fake_directories_deleted=0}
{directories_created=0} {directories_deleted=0} {ignored_errors=0}
{op_copy_from_local_file=0} {op_exists=0} {op_get_file_status=1}
{op_glob_status=0} {op_is_directory=0} {op_is_file=0} {op_list_files=1}
{op_list_located_status=0} {op_list_status=0} {op_mkdirs=0} {op_rename=0}
{object_copy_requests=0} {object_delete_requests=0} {object_list_requests=2}
{object_continue_list_requests=0} {object_metadata_requests=2}
{object_multipart_aborted=0} {object_put_bytes=0} {object_put_requests=0}
{object_put_requests_completed=0} {stream_write_failures=0}
{stream_write_block_uploads=0} {stream_write_block_uploads_committed=0}
{stream_write_block_uploads_aborted=0} {stream_write_total_time=0}
{stream_write_total_data=0} {committer_commits_created=0}
{committer_commits_completed=0} {committer_jobs_completed=0}
{committer_jobs_failed=0} {committer_tasks_completed=0}
{committer_tasks_failed=0} {committer_bytes_committed=0}
{committer_bytes_uploaded=0} {committer_commits_failed=0}
{committer_commits_aborted=0} {committer_commits_reverted=0}
{s3guard_metadatastore_put_path_request=1}
{s3guard_metadatastore_initialization=0} {s3guard_metadatastore_retry=0}
{s3guard_metadatastore_throttled=0} {store_io_throttled=0}
{object_put_requests_active=0} {object_put_bytes_pending=0}
{stream_write_block_uploads_active=0} {stream_write_block_uploads_pending=0}
{stream_write_block_uploads_data_pending=0}
{S3guard_metadatastore_put_path_latencyNumOps=0}
{S3guard_metadatastore_put_path_latency50thPercentileLatency=0}
{S3guard_metadatastore_put_path_latency75thPercentileLatency=0}
{S3guard_metadatastore_put_path_latency90thPercentileLatency=0}
{S3guard_metadatastore_put_path_latency95thPercentileLatency=0}
{S3guard_metadatastore_put_path_latency99thPercentileLatency=0}
{S3guard_metadatastore_throttle_rateNumEvents=0}
{S3guard_metadatastore_throttle_rate50thPercentileFrequency (Hz)=0}
{S3guard_metadatastore_throttle_rate75thPercentileFrequency (Hz)=0}
{S3guard_metadatastore_throttle_rate90thPercentileFrequency (Hz)=0}
{S3guard_metadatastore_throttle_rate95thPercentileFrequency (Hz)=0}
{S3guard_metadatastore_throttle_rate99thPercentileFrequency (Hz)=0}
{stream_read_fully_operations=0} {stream_opened=0}
{stream_bytes_skipped_on_seek=0} {stream_closed=0}
{stream_bytes_backwards_on_seek=0} {stream_bytes_read=0}
{stream_read_operations_incomplete=0} {stream_bytes_discarded_in_abort=0}
{stream_close_operations=0} {stream_read_operations=0} {stream_aborted=0}
{stream_forward_seek_operations=0} {stream_backward_seek_operations=0}
{stream_seek_operations=0} {stream_bytes_read_in_close=0}
{stream_read_exceptions=0} }}
- DataFrames
2017-10-10 16:23:47,051 [ScalaTest-main-running-S3ADataFrameSuite] INFO
s3.S3ADataFrameSuite (Logging.scala:logInfo(54)) - Cleaning
s3a://hwdev-steve-ireland-new/cloud-integration/DELAY_LISTING_ME/S3ADataFrameSuite
S3AOrcRelationSuite:
{code}
See? That's from a Spark {{logInfo(s"Stats $filesystem")}} call, with no
changes made to the Spark codebase at all.
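For anyone who wants the same trick in their own store client, the pattern is simply that the filesystem's toString() walks its live counters. Here's a toy, self-contained sketch of that idea; the class and counter names are illustrative, not the real o.a.h.fs.s3a.S3AFileSystem code:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Toy sketch of the S3A pattern: a store-like object whose toString()
 * embeds its live counters, so a plain log statement surfaces the stats.
 */
class StatsBearingStore {
    // TreeMap keeps counter output in a stable, sorted order.
    private final Map<String, AtomicLong> counters = new TreeMap<>();

    void increment(String name, long delta) {
        counters.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("StatsBearingStore{");
        counters.forEach((k, v) ->
            sb.append('{').append(k).append('=').append(v.get()).append("} "));
        return sb.append('}').toString();
    }
}
```

With that in place, any {{LOG.info("FileSystem {}", fs)}} style statement dumps the counters for free, which is exactly why the log excerpt above is so verbose.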
Regarding the broader stats: what is needed is aggregate collection of stats
from the executors, where the result for a specific executor contains the stats
for that task, rather than the statistics summary for the entire life of the
shared process. The same applies to Tez, I expect.
* the _SUCCESS file in the HADOOP-13786 patch collects the VM stats and
aggregates them; it doesn't do what is needed, which is per-thread
collection/diff.
* There's been discussion in Spark PRs about improving how executor stats are
collected (currently it just does a {{listFiles(task-output-dir,
true).map(status => status.getLen).sum()}}). Tasks should be able to return a
full map of String -> Long of that task's stats, and the driver should
aggregate them.
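The per-task diff plus driver-side merge could look something like this minimal sketch; the class and method names here are hypothetical, not an existing Hadoop or Spark API:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative sketch: per-task storage statistics as a Map<String, Long>,
 * computed as a diff of process-wide counters, then summed on the driver.
 */
class TaskStatsAggregator {

    /**
     * Diff: counters at task end minus counters at task start, so a task
     * reports only its own work, not the lifetime totals of the shared JVM.
     */
    static Map<String, Long> diff(Map<String, Long> after,
                                  Map<String, Long> before) {
        Map<String, Long> result = new HashMap<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            result.put(e.getKey(),
                e.getValue() - before.getOrDefault(e.getKey(), 0L));
        }
        return result;
    }

    /** Merge one task's stats into the running aggregate by summing counters. */
    static void mergeInto(Map<String, Long> aggregate,
                          Map<String, Long> taskStats) {
        for (Map.Entry<String, Long> e : taskStats.entrySet()) {
            aggregate.merge(e.getKey(), e.getValue(), Long::sum);
        }
    }
}
```

The key design point is the snapshot-and-diff: without it, every task in a long-lived executor would report the cumulative counters of the whole process.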
This is broader than just S3; it needs to cover all stores, plus let committers
& executors add more data.
[~liuml07] has done some of the initial work on chaining up StorageStats.
Anyway, if all you want is logging s3a stats, toString() does it, so I'd close
it as a WORKSFORME. However, we do need to glue together the entire storage
stats mechanism, finishing off Mingliang's work. Well volunteered!
> Log StorageStatistics
> ---------------------
>
> Key: HADOOP-14973
> URL: https://issues.apache.org/jira/browse/HADOOP-14973
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Reporter: Sean Mackrory
> Assignee: Sean Mackrory
>
> S3A is currently storing much more detailed metrics via StorageStatistics
> than are logged in a MapReduce job. Eventually, it would be nice to get
> Spark, MapReduce and other workloads to retrieve and store these metrics, but
> it may be some time before they all do that. I'd like to consider having S3A
> publish the metrics itself in some form. This is tricky, as S3A has no daemon
> but lives inside various other processes.
> Perhaps writing to a log file at some configurable interval and on close()
> would be the best we could do. Other ideas would be welcome.
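The log-at-an-interval-and-on-close() idea from the description above could be sketched roughly as follows; all names here are purely illustrative, not an existing Hadoop class, and the stats source and log sink are injected since S3A has no daemon of its own:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: log a stats snapshot at a configurable interval,
 * and once more on close(), from inside whatever process hosts the FS.
 */
class PeriodicStatsLogger implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "stats-logger");
            t.setDaemon(true);   // never keep the host process alive
            return t;
        });
    private final Supplier<String> statsSource;  // e.g. fs::toString
    private final Consumer<String> sink;         // e.g. LOG::info in real code

    PeriodicStatsLogger(Supplier<String> statsSource, Consumer<String> sink,
                        long intervalSeconds) {
        this.statsSource = statsSource;
        this.sink = sink;
        scheduler.scheduleAtFixedRate(this::emit, intervalSeconds,
                                      intervalSeconds, TimeUnit.SECONDS);
    }

    private void emit() {
        sink.accept(statsSource.get());
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
        emit();   // final snapshot on close()
    }
}
```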
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)