[ https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=791114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791114 ]
ASF GitHub Bot logged work on HADOOP-17461:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Jul/22 19:49
            Start Date: 14/Jul/22 19:49
    Worklog Time Spent: 10m
      Work Description: steveloughran commented on PR #4566:
URL: https://github.com/apache/hadoop/pull/4566#issuecomment-1184835595

   you are going to love this. Full end to end stats collection of IO during job commit of terasort.

   ```
   2022-07-14 20:37:38,332 [JUnit] INFO s3a.AbstractS3ATestBase (AbstractS3ATestBase.java:dumpFileSystemIOStatistics(137)) - Aggregate FileSystem Statistics
   counters=((action_executor_acquired=2) (action_file_opened=26) (action_http_get_request=26) (action_http_head_request=112) (audit_request_execution=250) (audit_span_creation=144) (directories_created=14) (directories_deleted=14) (fake_directories_deleted=4) (files_created=2) (files_deleted=16) (ignored_errors=8) (object_bulk_delete_request=4) (object_delete_objects=34) (object_delete_request=14) (object_list_request=74) (object_metadata_request=112) (object_put_request=16) (object_put_request_completed=16) (op_create=2) (op_delete=16) (op_exists=2) (op_exists.failures=2) (op_get_file_status=40) (op_glob_status=4) (op_list_files=6) (op_list_located_status=4) (op_list_status=4) (op_list_status.failures=2) (op_mkdirs=14) (op_open=30) (store_io_request=251) (stream_read_bytes=232084) (stream_read_close_operations=26) (stream_read_closed=26) (stream_read_fully_operations=10) (stream_read_opened=26) (stream_read_operations=41) (stream_read_operations_incomplete=23) (stream_read_remote_stream_drain=26) (stream_read_seek_policy_changed=26) (stream_read_total_bytes=232084) (stream_write_block_uploads=2));
   gauges=((stream_write_block_uploads_pending=2));
   minimums=((action_executor_acquired.min=0) (action_file_opened.min=49) (action_http_get_request.min=63) (action_http_head_request.min=40) (object_bulk_delete_request.min=173) (object_delete_request.min=51) (object_list_request.min=48) (object_put_request.min=82) (op_create.min=62) (op_delete.min=52) (op_exists.failures.min=110) (op_get_file_status.min=53) (op_glob_status.min=102) (op_list_files.min=176) (op_list_status.failures.min=172) (op_list_status.min=62) (op_mkdirs.min=313) (stream_read_remote_stream_drain.min=0));
   maximums=((action_executor_acquired.max=0) (action_file_opened.max=100) (action_http_get_request.max=139) (action_http_head_request.max=206) (object_bulk_delete_request.max=454) (object_delete_request.max=70) (object_list_request.max=667) (object_put_request.max=123) (op_create.max=75) (op_delete.max=685) (op_exists.failures.max=120) (op_get_file_status.max=178) (op_glob_status.max=126) (op_list_files.max=285) (op_list_status.failures.max=184) (op_list_status.max=68) (op_mkdirs.max=847) (stream_read_remote_stream_drain.max=1));
   means=((action_executor_acquired.mean=(samples=2, sum=0, mean=0.0000)) (action_file_opened.mean=(samples=26, sum=1515, mean=58.2692)) (action_http_get_request.mean=(samples=26, sum=1997, mean=76.8077)) (action_http_head_request.mean=(samples=112, sum=7160, mean=63.9286)) (object_bulk_delete_request.mean=(samples=4, sum=997, mean=249.2500)) (object_delete_request.mean=(samples=14, sum=810, mean=57.8571)) (object_list_request.mean=(samples=74, sum=7647, mean=103.3378)) (object_put_request.mean=(samples=16, sum=1497, mean=93.5625)) (op_create.mean=(samples=2, sum=137, mean=68.5000)) (op_delete.mean=(samples=16, sum=1935, mean=120.9375)) (op_exists.failures.mean=(samples=2, sum=230, mean=115.0000)) (op_get_file_status.mean=(samples=40, sum=3283, mean=82.0750)) (op_glob_status.mean=(samples=4, sum=459, mean=114.7500)) (op_list_files.mean=(samples=6, sum=1280, mean=213.3333)) (op_list_status.failures.mean=(samples=2, sum=356, mean=178.0000)) (op_list_status.mean=(samples=2, sum=130, mean=65.0000)) (op_mkdirs.mean=(samples=14, sum=5303, mean=378.7857)) (stream_read_remote_stream_drain.mean=(samples=26, sum=3, mean=0.1154)));

   Process finished with exit code 0
   ```

Issue Time Tracking
-------------------

    Worklog Id:     (was: 791114)
    Time Spent: 4h 20m  (was: 4h 10m)

> Add thread-level IOStatistics Context
> -------------------------------------
>
>                 Key: HADOOP-17461
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17461
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs, fs/azure, fs/s3
>    Affects Versions: 3.3.1
>            Reporter: Steve Loughran
>            Assignee: Mehakmeet Singh
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we
> need a thread-level context which IO components update.
> * this context needs to be passed in to background threads performing work on
>   behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context's
>   statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the
> FileSystem API Calls. This context MUST be passed into the background threads
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators &c to do the updating as there is more
> risk of double counting. However, we need to see their statistics if we want
> to know things like "bytes discarded in backwards seeks". And I don't want to
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the
> FS is sufficient.
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to
>   S3AInstrumentation)
> * excluding those we know the FS already collects.
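
The stream-level proposal in the issue description (cache the context in the constructor, publish only in close(), and hand the task's context to background threads) could be sketched roughly as below. This is an illustration under stated assumptions, not the Hadoop implementation: `ThreadContextSketch`, `StatsContext`, `withContext` and `CountingStream` are hypothetical names invented for the example, and only the counter name `stream_read_bytes` is taken from the statistics dump in this message.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch only; none of these types are Hadoop classes. */
public class ThreadContextSketch {

  /** Counters for one task; thread-safe so worker threads can share it. */
  static final class StatsContext {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    void increment(String key, long delta) {
      counters.computeIfAbsent(key, k -> new LongAdder()).add(delta);
    }

    long get(String key) {
      LongAdder adder = counters.get(key);
      return adder == null ? 0L : adder.sum();
    }
  }

  /** Every thread starts with its own private context. */
  private static final ThreadLocal<StatsContext> CURRENT =
      ThreadLocal.withInitial(StatsContext::new);

  static StatsContext current() {
    return CURRENT.get();
  }

  /** Wrap work so a pool thread reports into the submitting task's context. */
  static Runnable withContext(StatsContext taskContext, Runnable work) {
    return () -> {
      StatsContext previous = CURRENT.get();
      CURRENT.set(taskContext);
      try {
        work.run();
      } finally {
        CURRENT.set(previous); // do not leak the context to the next task
      }
    };
  }

  /** Caches the context in the constructor and publishes once, in close(). */
  static final class CountingStream implements AutoCloseable {
    private final StatsContext context = current();
    private long bytesRead;
    private boolean closed;

    void read(int n) {
      bytesRead += n; // hot path touches a local field, not the shared context
    }

    @Override
    public void close() {
      if (!closed) { // guard so a second close() cannot double count
        closed = true;
        context.increment("stream_read_bytes", bytesRead);
      }
    }
  }

  /** Two background reads of 150 bytes each, aggregated into the caller's context. */
  static long demo() {
    StatsContext task = current();
    ExecutorService pool = Executors.newFixedThreadPool(2);
    for (int i = 0; i < 2; i++) {
      pool.submit(withContext(task, () -> {
        try (CountingStream in = new CountingStream()) {
          in.read(100);
          in.read(50);
        }
      }));
    }
    pool.shutdown();
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return task.get("stream_read_bytes");
  }

  public static void main(String[] args) {
    System.out.println("stream_read_bytes=" + demo());
  }
}
```

Without the `withContext` wrapper each pool thread would update its own thread-local context and the task would see no bytes read, which is the aggregation problem the description calls out. Publishing once from close() keeps the per-read cost to a field increment, at the price of statistics being invisible until the stream is closed.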