[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795596=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795596
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 27/Jul/22 10:23
Start Date: 27/Jul/22 10:23
Worklog Time Spent: 10m 
  Work Description: steveloughran merged PR #4639:
URL: https://github.com/apache/hadoop/pull/4639




Issue Time Tracking
---

Worklog Id: (was: 795596)
Time Spent: 11h 20m  (was: 11h 10m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795594=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795594
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 27/Jul/22 10:18
Start Date: 27/Jul/22 10:18
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4639:
URL: https://github.com/apache/hadoop/pull/4639#issuecomment-1196541774

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   7m 40s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ branch-3.3 Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 13s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 48s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |  21m 19s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   3m 46s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   3m 35s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   2m 33s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   5m  0s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  26m 55s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 50s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 25s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  17m 25s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 13s |  |  root: The patch generated 
0 new + 97 unchanged - 1 fixed = 97 total (was 98)  |
   | +1 :green_heart: |  mvnsite  |   3m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 37s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   5m  4s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 25s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 16s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 12s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 34s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 202m 11s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4639/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4639 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 60b0e0809438 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 
23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / 6483021fd68e830862c9c85f5be6e1f41b696afe |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4639/1/testReport/ |
   | Max. process+thread count | 1674 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws 
U: . |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4639/1/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 795594)
Time Spent: 11h 10m  (was: 11h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795535=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795535
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 27/Jul/22 06:55
Start Date: 27/Jul/22 06:55
Worklog Time Spent: 10m 
  Work Description: mehakmeet opened a new pull request, #4639:
URL: https://github.com/apache/hadoop/pull/4639

   This adds a thread-level collector of IOStatistics, IOStatisticsContext,
   which can be:
   * Retrieved for a thread and cached for access from other
 threads.
   * reset() to record new statistics.
   * Queried for live statistics through the
 IOStatisticsSource.getIOStatistics() method.
   * Queries for a statistics aggregator for use in instrumented
 classes.
   * Asked to create a serializable copy in snapshot()
   
   The goal is to make it possible for applications with multiple
   threads performing different work items simultaneously
   to be able to collect statistics on the individual threads,
   and so generate aggregate reports on the total work performed
   for a specific job, query or similar unit of work.
   
   Some changes in IOStatistics-gathering classes are needed for 
   this feature
   * Caching the active context's aggregator in the object's
 constructor
   * Updating it in close()
   
   Slightly more work is needed in multithreaded code,
   such as the S3A committers, which collect statistics across
   all threads used in task and job commit operations.
   
   Currently the IOStatisticsContext-aware classes are:
   * The S3A input stream, output stream and list iterators.
   * RawLocalFileSystem's input and output streams.
   * The S3A committers.
   * The TaskPool class in hadoop-common, which propagates
 the active context into scheduled worker threads.
   
   Collection of statistics in the IOStatisticsContext
   is disabled process-wide by default until the feature 
   is considered stable.
   
   To enable the collection, set the option
   fs.thread.level.iostatistics.enabled
   to "true" in core-site.xml;

   Contributed by Mehakmeet Singh and Steve Loughran
   
   
   
   ### Description of PR
   
   
   ### How was this patch tested?
   Region: `ap-south-1`
   All tests ran successfully.
   
   ### For code changes:
   
   - [X] Does the title or this PR starts with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [X] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [X] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [X] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




Issue Time Tracking
---

Worklog Id: (was: 795535)
Time Spent: 10h 50m  (was: 10h 40m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795536=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795536
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 27/Jul/22 06:55
Start Date: 27/Jul/22 06:55
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on PR #4639:
URL: https://github.com/apache/hadoop/pull/4639#issuecomment-1196335276

   backports: #4352.
   CC: @steveloughran 




Issue Time Tracking
---

Worklog Id: (was: 795536)
Time Spent: 11h  (was: 10h 50m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795416=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795416
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 26/Jul/22 20:31
Start Date: 26/Jul/22 20:31
Worklog Time Spent: 10m 
  Work Description: steveloughran closed pull request #4566: HADOOP-17461. 
IOStatisticsContext + committer integration
URL: https://github.com/apache/hadoop/pull/4566




Issue Time Tracking
---

Worklog Id: (was: 795416)
Time Spent: 10h 40m  (was: 10.5h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795401=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795401
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 26/Jul/22 19:41
Start Date: 26/Jul/22 19:41
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1195904127

   ok, merged. Can you do a PR/cherrypick into branch-3.3 now?




Issue Time Tracking
---

Worklog Id: (was: 795401)
Time Spent: 10.5h  (was: 10h 20m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795400=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795400
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 26/Jul/22 19:41
Start Date: 26/Jul/22 19:41
Worklog Time Spent: 10m 
  Work Description: steveloughran merged PR #4352:
URL: https://github.com/apache/hadoop/pull/4352




Issue Time Tracking
---

Worklog Id: (was: 795400)
Time Spent: 10h 20m  (was: 10h 10m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795343=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795343
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 26/Jul/22 16:51
Start Date: 26/Jul/22 16:51
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1195733167

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 44s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 39s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 30s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m  9s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 44s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 44s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m  1s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 42s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  4s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 11s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m 42s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 31s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 17s |  |  root: The patch generated 
0 new + 97 unchanged - 1 fixed = 97 total (was 98)  |
   | +1 :green_heart: |  mvnsite  |   3m 24s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 21s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 45s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 30s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 36s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m  7s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 26s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 241m 14s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/17/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux fca9fdd4589a 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 0db1a76e21be9996a6d3f257630ba1ce0ca8b745 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795293=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795293
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 26/Jul/22 14:18
Start Date: 26/Jul/22 14:18
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1195545593

   +1 pending yetus




Issue Time Tracking
---

Worklog Id: (was: 795293)
Time Spent: 10h  (was: 9h 50m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795214=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795214
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 26/Jul/22 10:58
Start Date: 26/Jul/22 10:58
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1195331924

   (the streams need to cache the aggregator and update in close)




Issue Time Tracking
---

Worklog Id: (was: 795214)
Time Spent: 9h 50m  (was: 9h 40m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795213=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795213
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 26/Jul/22 10:57
Start Date: 26/Jul/22 10:57
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1195331465

   -1 to the last patch. pick up 9dd221dda28 from my PR




Issue Time Tracking
---

Worklog Id: (was: 795213)
Time Spent: 9h 40m  (was: 9.5h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795208=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795208
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 26/Jul/22 10:16
Start Date: 26/Jul/22 10:16
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1195291441

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 37s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 13s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 27s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 25s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  21m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 48s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 35s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 54s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 28s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  3s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  8s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m 40s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 29s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 45s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 45s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 17s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 44s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 50s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 43s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 20s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 40s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 19s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 243m  1s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/16/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 18348d0e00b5 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 
10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / de12dbe2a061edf49f147e634c8cee29a31b09ed |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795074=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795074
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 25/Jul/22 23:02
Start Date: 25/Jul/22 23:02
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4566:
URL: https://github.com/apache/hadoop/pull/4566#issuecomment-1194744613

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  6s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 20s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  26m  1s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  22m 20s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 32s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 18s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 25s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  6s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 47s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m  7s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  25m 37s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 43s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  25m  2s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  25m  2s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 35s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  22m 35s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 29s |  |  root: The patch generated 
0 new + 97 unchanged - 1 fixed = 97 total (was 98)  |
   | +1 :green_heart: |  mvnsite  |   3m 13s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 17s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 54s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 37s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 33s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m  2s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 20s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 254m 31s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4566 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 950ed0434155 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9dd221dda28fd9e7d7ffb5d4605e6b3d89774fef |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=795019=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795019
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 25/Jul/22 19:00
Start Date: 25/Jul/22 19:00
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1194488631

   ok, one more change, as I write up something on integration. The RawLocal 
streams should support IOContext too.
   
   why so? makes testing integration easier, e.g. for distcp, spark etc. no 
need to wait until object store tests.
   
   This also lines up for moving ITestS3AIOStatisticsContext into hadoop common 
unit tests as a contract test.
   
   I'm not going to make that a requirement of this PR, but for adding abfs in 
we should do that




Issue Time Tracking
---

Worklog Id: (was: 795019)
Time Spent: 9h 10m  (was: 9h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=794943=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794943
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 25/Jul/22 15:16
Start Date: 25/Jul/22 15:16
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1194187132

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 48s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 47s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 18s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  21m 53s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 32s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 13s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 23s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  5s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 39s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 44s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  25m 11s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 26s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 25s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  24m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 51s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 51s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 24s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/15/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 1 new + 77 unchanged - 0 fixed = 78 total (was 
77)  |
   | +1 :green_heart: |  mvnsite  |   3m 11s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 15s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  8s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 25s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  27m  1s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 56s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m  8s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 18s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 253m 20s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/15/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 76bd37db2c7b 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / cdc9915d9c52644d35057844c101653858ed10eb |
   | 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=794893=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794893
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 25/Jul/22 13:33
Start Date: 25/Jul/22 13:33
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1194055457

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 43s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 20s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 48s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  24m 51s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  21m 56s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 38s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 30s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 23s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  5s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 31s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  3s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m 33s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 35s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 49s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  23m 53s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  23m 53s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m  1s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  22m  1s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 23s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/14/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 2 new + 77 unchanged - 0 fixed = 79 total (was 
77)  |
   | +1 :green_heart: |  mvnsite  |   3m 46s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 20s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  6s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 12s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 51s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 59s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 34s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 245m 27s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/14/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux e41385851cf2 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 81c0480a7d0eabd171c9e7cdaccbb3939ec85559 |
   | 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=794813=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794813
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 25/Jul/22 10:17
Start Date: 25/Jul/22 10:17
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r928703332


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java:
##
@@ -1351,6 +1351,7 @@ public boolean hasCapability(String capability) {
 case StreamCapabilities.READAHEAD:
 case StreamCapabilities.UNBUFFER:
 case StreamCapabilities.VECTOREDIO:
+case StreamCapabilities.IOSTATISTICS_CONTEXT:

Review Comment:
   put it in alphabetical order



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContextIntegration.java:
##
@@ -116,13 +116,17 @@ public static IOStatisticsContext 
getCurrentIOStatisticsContext() {
   /**
* Set the IOStatisticsContext for the current thread.
* @param statisticsContext IOStatistics context instance for the
-   * current thread.
+   * current thread. If null, the context is reset.
*/
   public static void setThreadIOStatisticsContext(
   IOStatisticsContext statisticsContext) {
-if (IS_THREAD_IOSTATS_ENABLED &&
-ACTIVE_IOSTATS_CONTEXT.getForCurrentThread() != statisticsContext) {
-  ACTIVE_IOSTATS_CONTEXT.setForCurrentThread(statisticsContext);
+if (IS_THREAD_IOSTATS_ENABLED) {

Review Comment:
   we should have a test to verify this doesn't trigger an NPE and removes the 
context
   
   test
   * get thread context
   * set the current context to null
   * get the thread context again
   * verify the ID values don't match, because a new context was generated



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContextIntegration.java:
##
@@ -152,4 +152,13 @@ public static IOStatisticsContext 
getThreadSpecificIOStatisticsContext(long test
 }
 return null;
   }
+
+  /**
+   * A method to enable IOStatisticsContext to override if set otherwise in
+   * the configurations for tests.
+   */
+  @VisibleForTesting
+  public static void enableIOStatisticsContext() {
+IS_THREAD_IOSTATS_ENABLED = true;

Review Comment:
   how about, if it wasn't already true, log at info that it is being set



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContextIntegration.java:
##
@@ -53,7 +53,7 @@ public final class IOStatisticsContextIntegration {
   /**
* Is thread-level IO Statistics enabled?
*/
-  private static final boolean IS_THREAD_IOSTATS_ENABLED;
+  private static boolean IS_THREAD_IOSTATS_ENABLED;

Review Comment:
   needs to become lower case





Issue Time Tracking
---

Worklog Id: (was: 794813)
Time Spent: 8h 40m  (was: 8.5h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=794220=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794220
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 22/Jul/22 12:54
Start Date: 22/Jul/22 12:54
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1192543668

   good point. 
   
   how about adding a static method to enable it in the integration class?
   
   disabling it would be harder...what if contexts had already been generated? 
easier to only allow the caller to go from off to on.
   
   also, had an idea:
   
   add a stream capability `fs.capability.iocontext.supported` for streams 
which update iocontext.
   
   then add asserts to the test to verify this
   
   1. lines up for making the tests contract tests
   2. if the prefetching stream isn't (yet) context aware, we will know why the 
tests are failing




Issue Time Tracking
---

Worklog Id: (was: 794220)
Time Spent: 8.5h  (was: 8h 20m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=794188=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794188
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 22/Jul/22 11:42
Start Date: 22/Jul/22 11:42
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1192486376

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 41s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 10s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 44s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 40s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  22m 22s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 10s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  8s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 28s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 17s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m  8s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  23m 33s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 33s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  24m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m  6s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  22m  6s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/13/artifact/out/blanks-eol.txt)
 |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   4m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 15s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 33s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  6s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 30s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 56s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m  3s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m  8s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 29s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 245m 55s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/13/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 0377eb935cf3 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 
10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 359206d7a5726bde59ad77997f698989f18e18cc |
   | Default 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=794121=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794121
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 22/Jul/22 08:25
Start Date: 22/Jul/22 08:25
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1192316937

   Pushed the changes, makes sense in case of null IOStatisticsContext. There 
is one more issue, in case we have `fs.thread.level.iostatistics.enabled=false` 
which means empty counters, hence the test getting failed in the test in this 
PR only, since it requires thread-level IOstatsitics to be enabled, now we do 
the `removeBaseAndBucketOverrides(configuration,
   THREAD_LEVEL_IOSTATISTICS_ENABLED);
   configuration.setBoolean(THREAD_LEVEL_IOSTATISTICS_ENABLED, true);`
   But, since the way this is toggled is in a static initializer for the 
`IOStatisticsContextIntegration` utility class, it does it's calculation before 
this, and setting it afterwards doesn't have an affect, how do we ensure that 
this property is enabled for this test always? Reload the class using 
classLoaders(not sure if this works since I tried using the 
Thread.currentThread.getClassLoaderContext)? 
   
   Also
   > Contributed by Mehakmeet Singh
   
   *and Steve Loughran




Issue Time Tracking
---

Worklog Id: (was: 794121)
Time Spent: 8h 10m  (was: 8h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=793785=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793785
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 21/Jul/22 14:57
Start Date: 21/Jul/22 14:57
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4566:
URL: https://github.com/apache/hadoop/pull/4566#issuecomment-1191589460

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 17s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 15s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 14s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  21m 50s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 34s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 12s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 24s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  3s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 39s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 28s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  24m 55s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 19s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  24m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 51s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 51s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/5/artifact/out/blanks-eol.txt)
 |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   4m 26s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 10s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 16s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 51s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 56s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  20m  2s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 16s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 16s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 252m 47s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4566 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 0af63939de89 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / ac6a3fa248e00111cbeb94c6b767b7c2b80641c7 |
   | Default 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=793770=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793770
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 21/Jul/22 14:35
Start Date: 21/Jul/22 14:35
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4566:
URL: https://github.com/apache/hadoop/pull/4566#issuecomment-1191563524

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 43s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 46s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 11s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 54s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 19s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 45s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 57s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 27s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  0s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  8s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m 40s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 23s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 59s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 59s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/4/artifact/out/blanks-eol.txt)
 |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   4m 14s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 40s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 53s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 25s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  7s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 40s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 45s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 13s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 242m 35s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4566 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 16cb6c89e5fb 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 40ca262a7183799ae076cef877511d597a1da3c3 |
   | Default 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=793670=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793670
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 21/Jul/22 10:38
Start Date: 21/Jul/22 10:38
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1191326991

   This is my draft commit message btw
   
   
   
   
   Adds a new IOStatistics class IOStatisticsContext.
   
   This is the active collector of thread-level statistics for
   The current thread.
   
   The S3A Filesystem's input and output streams, and listing
   operations, all update this context when close() is called on
   them (and not before!), so there is effectively automatic
   aggregation of all IO statistics performed by a single thread.
   
   The IOStatisticsContext of a thread can be retrieved and
   cached for invocation in other threads. Holding such a
   reference also ensures that the context will not be garbage
   collected.
   
   To collect statistics on a thread:
   
   1. Retrieve the active context with a call to
  IOStatisticsContext.getCurrentIOStatisticsContext()
   2. Call IOStatisticsContext.reset() to reset all statistics
   3. Call getIOStatistics() on it for the latest values, or
  snapshot() for a snapshot of them.
 
   To instrument filesystem objects for thread-level
   IOStatistics
   
   1. Cache the current IOStatisticsContext context or just its
  aggregator in the object constructor.
   2. In the close() operation, aggregate() the object's own statistics.
   3. Pass the context into worker threads performing work
  on behalf of this thread, through
 IOStatisticsContext.setThreadIOStatisticsContext();
 set it to null afterwards.
   
   TaskPool does the context propagation and reset
   automatically.
   
   Contributed by Mehakmeet Singh
   
   ---




Issue Time Tracking
---

Worklog Id: (was: 793670)
Time Spent: 7h 40m  (was: 7.5h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=793665=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793665
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 21/Jul/22 10:33
Start Date: 21/Jul/22 10:33
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1191322172

   i was writing a commit message with instructions on use, realised that the 
task pools sharing meant we needed a way to reset the context, then that the 
pool should do this...adding that and I concluded that the TaskPool should 
automatically pick up and propagate the context. Which it now does.
   
   no new tests, not tested at all...




Issue Time Tracking
---

Worklog Id: (was: 793665)
Time Spent: 7.5h  (was: 7h 20m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=793659=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793659
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 21/Jul/22 10:14
Start Date: 21/Jul/22 10:14
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r926503951


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContextIntegration.java:
##
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.statistics.impl;
+
+import java.lang.ref.WeakReference;
+import java.util.concurrent.atomic.AtomicLong;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.impl.WeakReferenceThreadMap;
+import org.apache.hadoop.fs.statistics.IOStatisticsContext;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED;
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED_DEFAULT;
+
+/**
+ * A Utility class for IOStatisticsContext, which helps in creating and
+ * getting the current active context. Static methods in this class allows to
+ * get the current context to start aggregating the IOStatistics.
+ *
+ * Static initializer is used to work out if the feature to collect
+ * thread-level IOStatistics is enabled or not and the corresponding
+ * implementation class is called for it.
+ *
+ * Weak Reference thread map to be used to keep track of different context's
+ * to avoid long-lived memory leakages as these references would be cleaned
+ * up at GC.
+ */
+public final class IOStatisticsContextIntegration {
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(IOStatisticsContextIntegration.class);
+
+  /**
+   * Is thread-level IO Statistics enabled?
+   */
+  private static final boolean IS_THREAD_IOSTATS_ENABLED;
+
+  /**
+   * ID for next instance to create.
+   */
+  public static final AtomicLong INSTANCE_ID = new AtomicLong(1);
+
+  /**
+   * Active IOStatistics Context containing different worker thread's
+   * statistics. Weak Reference so that it gets cleaned up during GC and we
+   * avoid any memory leak issues due to long lived references.
+   */
+  private static final WeakReferenceThreadMap
+  ACTIVE_IOSTATS_CONTEXT =
+  new WeakReferenceThreadMap<>(
+  IOStatisticsContextIntegration::createNewInstance,
+  IOStatisticsContextIntegration::referenceLostContext
+  );
+
+  static {
+// Work out if the current context has thread level IOStatistics enabled.
+final Configuration configuration = new Configuration();
+IS_THREAD_IOSTATS_ENABLED =
+configuration.getBoolean(THREAD_LEVEL_IOSTATISTICS_ENABLED,
+THREAD_LEVEL_IOSTATISTICS_ENABLED_DEFAULT);
+  }
+
+  /**
+   * Private constructor for a utility class to be used in IOStatisticsContext.
+   */
+  private IOStatisticsContextIntegration() {}
+
+  /**
+   * Creating a new IOStatisticsContext instance for a FS to be used.
+   * @param key Thread ID that represents which thread the context belongs to.
+   * @return an instance of IOStatisticsContext.
+   */
+  private static IOStatisticsContext createNewInstance(Long key) {
+return new IOStatisticsContextImpl(key, INSTANCE_ID.getAndIncrement());
+  }
+
+  /**
+   * In case of reference loss for IOStatisticsContext.
+   * @param key ThreadID.
+   */
+  private static void referenceLostContext(Long key) {
+LOG.debug("Reference lost for threadID for the context: {}", key);
+  }
+
+  /**
+   * Get the current thread's IOStatisticsContext instance. If no instance is
+   * present for this thread ID, create one using the factory.
+   * @return instance of IOStatisticsContext.
+   */
+  public static IOStatisticsContext getCurrentIOStatisticsContext() {
+return 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=793658=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793658
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 21/Jul/22 10:12
Start Date: 21/Jul/22 10:12
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r926485594


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContextIntegration.java:
##
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.statistics.impl;
+
+import java.lang.ref.WeakReference;
+import java.util.concurrent.atomic.AtomicLong;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.impl.WeakReferenceThreadMap;
+import org.apache.hadoop.fs.statistics.IOStatisticsContext;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED;
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED_DEFAULT;
+
+/**
+ * A Utility class for IOStatisticsContext, which helps in creating and
+ * getting the current active context. Static methods in this class allows to
+ * get the current context to start aggregating the IOStatistics.
+ *
+ * Static initializer is used to work out if the feature to collect
+ * thread-level IOStatistics is enabled or not and the corresponding
+ * implementation class is called for it.
+ *
+ * Weak Reference thread map to be used to keep track of different context's
+ * to avoid long-lived memory leakages as these references would be cleaned
+ * up at GC.
+ */
+public final class IOStatisticsContextIntegration {
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(IOStatisticsContextIntegration.class);
+
+  /**
+   * Is thread-level IO Statistics enabled?
+   */
+  private static final boolean IS_THREAD_IOSTATS_ENABLED;
+
+  /**
+   * ID for next instance to create.
+   */
+  public static final AtomicLong INSTANCE_ID = new AtomicLong(1);
+
+  /**
+   * Active IOStatistics Context containing different worker thread's
+   * statistics. Weak Reference so that it gets cleaned up during GC and we
+   * avoid any memory leak issues due to long lived references.
+   */
+  private static final WeakReferenceThreadMap
+  ACTIVE_IOSTATS_CONTEXT =
+  new WeakReferenceThreadMap<>(
+  IOStatisticsContextIntegration::createNewInstance,
+  IOStatisticsContextIntegration::referenceLostContext
+  );
+
+  static {
+// Work out if the current context has thread level IOStatistics enabled.
+final Configuration configuration = new Configuration();
+IS_THREAD_IOSTATS_ENABLED =
+configuration.getBoolean(THREAD_LEVEL_IOSTATISTICS_ENABLED,
+THREAD_LEVEL_IOSTATISTICS_ENABLED_DEFAULT);
+  }
+
+  /**
+   * Private constructor for a utility class to be used in IOStatisticsContext.
+   */
+  private IOStatisticsContextIntegration() {}
+
+  /**
+   * Creating a new IOStatisticsContext instance for a FS to be used.
+   * @param key Thread ID that represents which thread the context belongs to.
+   * @return an instance of IOStatisticsContext.
+   */
+  private static IOStatisticsContext createNewInstance(Long key) {
+return new IOStatisticsContextImpl(key, INSTANCE_ID.getAndIncrement());
+  }
+
+  /**
+   * In case of reference loss for IOStatisticsContext.
+   * @param key ThreadID.
+   */
+  private static void referenceLostContext(Long key) {
+LOG.debug("Reference lost for threadID for the context: {}", key);
+  }
+
+  /**
+   * Get the current thread's IOStatisticsContext instance. If no instance is
+   * present for this thread ID, create one using the factory.
+   * @return instance of IOStatisticsContext.
+   */
+  public static IOStatisticsContext getCurrentIOStatisticsContext() {
+return 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=793463=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793463
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 20/Jul/22 23:02
Start Date: 20/Jul/22 23:02
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1190851834

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 31s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 34s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 11s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 51s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 50s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  2s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 52s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  24m 24s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 35s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 45s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 24s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 50s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 50s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 19s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 42s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 44s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 41s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 12s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 19s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 51s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 20s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 39s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 244m 18s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/12/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 747c1495282e 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 
10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 7b3c3356cd52f5378c4d68b6ddfed548d074ddf6 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=793393=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793393
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 20/Jul/22 18:58
Start Date: 20/Jul/22 18:58
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1190642189

   added your latest commit + Testing for full suite done for both enabled and 
disabled `fs.s3a.committer.experimental.collect.iostatistics` too.




Issue Time Tracking
---

Worklog Id: (was: 793393)
Time Spent: 6h 50m  (was: 6h 40m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=793390=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793390
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 20/Jul/22 18:52
Start Date: 20/Jul/22 18:52
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4566:
URL: https://github.com/apache/hadoop/pull/4566#issuecomment-1190636833

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m 15s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 12s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 46s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m  9s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  21m 52s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 29s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 13s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 24s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  6s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 38s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 37s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  25m  4s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 20s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  24m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 49s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 49s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 22s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/3/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 2 new + 77 unchanged - 0 fixed = 79 total (was 
77)  |
   | +1 :green_heart: |  mvnsite  |   3m 11s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 24s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/3/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  
hadoop-common-project_hadoop-common-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   | +1 :green_heart: |  javadoc  |   2m  4s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 50s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 44s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 42s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 10s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 18s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 250m 50s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/3/artifact/out/Dockerfile
 |
   | GITHUB PR | 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=793158=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793158
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 20/Jul/22 10:34
Start Date: 20/Jul/22 10:34
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1190110412

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 42s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m  5s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 21s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  24m  5s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  22m  6s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 35s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 59s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 43s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  3s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 21s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m 54s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 35s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 45s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 25s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 50s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 50s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 14s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 33s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 45s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 40s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 11s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m  8s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 10s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 20s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 244m 37s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/11/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux bd453dad2670 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 
10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 6f7e72ee80015a6f9396a3ab1d8896a142fef464 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=792888=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-792888
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 19/Jul/22 19:48
Start Date: 19/Jul/22 19:48
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r924896943


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContext.java:
##
@@ -31,34 +31,34 @@
  * 
  * The {@link #snapshot()} call creates a snapshot of the statistics;
  * 
- * The {@link #reset()} call resets the statistics in the current thread so
+ * The {@link #reset()} call resets the statistics in the context so
  * that later snapshots will get the incremental data.
  */
 public interface IOStatisticsContext extends IOStatisticsSource {

Review Comment:
   this needs to get moved to 
org.apache.hadoop.fs.statistics.IOStatisticsContext as it is the public api the 
apps need; the impl stuff is kept private for the filesystems.
   





Issue Time Tracking
---

Worklog Id: (was: 792888)
Time Spent: 6h 20m  (was: 6h 10m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=792887=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-792887
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 19/Jul/22 19:42
Start Date: 19/Jul/22 19:42
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1189482342

   
   I now have a branch of spark set up to use this, albeit not production ready
   
   
https://github.com/steveloughran/hadoop/tree/s3/HADOOP-17461-iostatisticsContext
   
   * TaskMetrics includes an IOStatisticsSnapshot in its serialized data
   * IOStatisticsContext is retrieved and reset at start of read/write work; 
updated at end.
   
   Based on where the read bytes/write bytes counters were read.
   
   Having done that, I've realised that the rdds and writers are not quite the 
correct place as the IOStats collect all IO on the thread, and are both the 
readers and writers will be working on that thread.
   
   The place to do it is actually in the ResultTask, with IOStatisticsSnapshot 
being one of the accumulators which is sent back. the spark driver would keep 
the core IOStatisticsSnapshot up to date with results from success and failure 
tasks.




Issue Time Tracking
---

Worklog Id: (was: 792887)
Time Spent: 6h 10m  (was: 6h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=792549=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-792549
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 19/Jul/22 09:51
Start Date: 19/Jul/22 09:51
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1188844927

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 41s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  44m 47s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m 25s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 19s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  22m  3s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 15s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 24s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 38s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 11s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  25m 12s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 54s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 52s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  24m 52s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 57s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  22m 57s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m  2s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 44s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 24s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 25s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 19s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m 30s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 16s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 32s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 283m 38s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/10/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 2607ccafc906 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 
10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f54ad18a9e2230191dc4c0c504d332772bceb23e |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=792297=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-792297
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 18/Jul/22 18:13
Start Date: 18/Jul/22 18:13
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4566:
URL: https://github.com/apache/hadoop/pull/4566#issuecomment-1188020315

   exactly. distcp, good thought. let's not worry about collecting results there




Issue Time Tracking
---

Worklog Id: (was: 792297)
Time Spent: 5h 50m  (was: 5h 40m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=792244=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-792244
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 18/Jul/22 16:50
Start Date: 18/Jul/22 16:50
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1187742820

   yeah, reviewed javadocs and code. criticised my own work.  checkstyles
   ```
   
./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContextIntegration.java:46:public
 class IOStatisticsContextIntegration {:1: Utility classes should not have a 
public or default constructor. [HideUtilityClassConstructor]
   
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/impl/CommitContext.java:370:
  /**: First sentence should end with a period. [JavadocStyle]
   
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:135:import
 org.apache.hadoop.fs.statistics.impl.IOStatisticsContextImpl;:8: Unused import 
- org.apache.hadoop.fs.statistics.impl.IOStatisticsContextImpl. [UnusedImports]
   
./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/commit/AbstractCommitITest.java:73:
  /**: First sentence should end with a period. [JavadocStyle]
   
./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AIOStatisticsContext.java:168:
final ExecutorService executor =:27: 'executor' hides a field. [HiddenField]
   ```




Issue Time Tracking
---

Worklog Id: (was: 792244)
Time Spent: 5h 40m  (was: 5.5h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=792243=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-792243
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 18/Jul/22 16:48
Start Date: 18/Jul/22 16:48
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r920337912


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContextImpl.java:
##
@@ -0,0 +1,224 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.statistics.impl;
+
+import java.lang.ref.WeakReference;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.impl.WeakReferenceThreadMap;
+import org.apache.hadoop.fs.statistics.IOStatistics;
+import org.apache.hadoop.fs.statistics.IOStatisticsAggregator;
+import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED;
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED_DEFAULT;
+
+/**
+ * Implementing the IOStatisticsContext.
+ *
+ * A Context defined for IOStatistics collection per thread which captures
+ * each worker thread's work in FS streams and stores it in the form of
+ * IOStatisticsSnapshot if thread level aggregation is enabled else returning
+ * an instance of EmptyIOStatisticsStore for the FS. An active instance of
+ * the IOStatisticsContext can be used to collect the statistics.
+ *
+ * For the current thread the IOStatisticsSnapshot can be used as a way to
+ * move the IOStatistics data between applications using the Serializable
+ * nature of the class.
+ */
+public class IOStatisticsContextImpl implements IOStatisticsContext {
+  private static final Logger LOG =
+  LoggerFactory.getLogger(IOStatisticsContextImpl.class);
+  private static final boolean IS_THREAD_IOSTATS_ENABLED;
+
+  /**
+   * Active IOStatistics Context containing different worker thread's
+   * statistics. Weak Reference so that it gets cleaned up during GC and we
+   * avoid any memory leak issues due to long lived references.
+   */
+  private static final WeakReferenceThreadMap
+  ACTIVE_IOSTATS_CONTEXT =
+  new WeakReferenceThreadMap<>(IOStatisticsContextImpl::createNewInstance,
+  IOStatisticsContextImpl::referenceLostContext
+  );
+
+  /**
+   * Collecting IOStatistics per thread.
+   */
+  private final WeakReferenceThreadMap
+  threadLevelIOStatisticsMap =
+  new WeakReferenceThreadMap<>(this::getIOStatisticsSnapshotFactory,
+  this::referenceLost);
+
+  static {
+// Work out if the current context has thread level IOStatistics enabled.
+final Configuration configuration = new Configuration();
+IS_THREAD_IOSTATS_ENABLED =
+configuration.getBoolean(THREAD_LEVEL_IOSTATISTICS_ENABLED,
+THREAD_LEVEL_IOSTATISTICS_ENABLED_DEFAULT);
+  }
+
+  /**
+   * Creating a new IOStatisticsContext instance for a FS to be used. If

Review Comment:
   cropped sentence at end



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContextImpl.java:
##
@@ -0,0 +1,224 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=791919=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791919
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 18/Jul/22 09:08
Start Date: 18/Jul/22 09:08
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1186951915

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 39s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 44s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m  3s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m  8s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 34s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 25s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 34s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 58s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 43s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  3s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  9s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m 40s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 45s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  23m  8s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  23m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 52s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 52s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 17s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/9/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 5 new + 76 unchanged - 0 fixed = 81 total (was 
76)  |
   | +1 :green_heart: |  mvnsite  |   3m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 51s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 43s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 45s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 59s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 19s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 30s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 242m 55s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/9/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 6932bdb96253 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 
10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / bf84de68fd815ca9cc2f33f830a81f1c3a0263c0 |
   | 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=791846=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791846
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 18/Jul/22 05:07
Start Date: 18/Jul/22 05:07
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1186772913

   Cherry-picked S3A committers integration and lining up with spark build. 
Some javadocs would need modification, adding that in a commit. Need to see how 
to test this on a spark build locally, since spark might not be using the 
correct hadoop version iirc, then we could get this in




Issue Time Tracking
---

Worklog Id: (was: 791846)
Time Spent: 5h 10m  (was: 5h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=791834=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791834
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 18/Jul/22 04:32
Start Date: 18/Jul/22 04:32
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on PR #4566:
URL: https://github.com/apache/hadoop/pull/4566#issuecomment-1186757937

   > should these committers reset the context before task/job commit.
   
   In spark we would be resetting before the execution begins I assume, so 
having it reset in the committers could mess things up? Was thinking about the 
case like a disctp job, how do we reset the stats in that case? 




Issue Time Tracking
---

Worklog Id: (was: 791834)
Time Spent: 5h  (was: 4h 50m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=791552=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791552
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 15/Jul/22 18:54
Start Date: 15/Jul/22 18:54
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4566:
URL: https://github.com/apache/hadoop/pull/4566#issuecomment-1185819028

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 56s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 24s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m  8s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m  4s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 41s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 44s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m  0s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 41s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 56s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 13s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m 43s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 45s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 17s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 48s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 48s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 13s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/2/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 5 new + 76 unchanged - 0 fixed = 81 total (was 
76)  |
   | +1 :green_heart: |  mvnsite  |   3m 43s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 50s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 43s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 16s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 53s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 26s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 241m 49s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4566 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 9bb736781002 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 
17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / bf6ee6f5eb507d306691088906e8e5623becc1a5 |
   | 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=791466=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791466
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 15/Jul/22 14:58
Start Date: 15/Jul/22 14:58
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4566:
URL: https://github.com/apache/hadoop/pull/4566#issuecomment-1185628638

   
   Here are the stats for a magic run where we also measure list calls.
   
   The fact that hsync is being called shows that the terasort output is also 
being picked up. not intentional.
   
   raises a big issue: should these committers reset the context before 
task/job commit.
   
   ```
commit.AbstractCommitITest (AbstractCommitITest.java:printStatistics(118)) 
- Aggregate job statistics counters=((action_executor_acquired=7)
   (committer_bytes_committed=800042)
   (committer_commit_job=10)
   (committer_commits_completed=14)
   (committer_jobs_completed=10)
   (committer_magic_marker_put=7)
   (committer_materialize_file=14)
   (object_list_request=17)
   (object_multipart_initiated=7)
   (op_hsync=6)
   (stream_write_block_uploads=7)
   (stream_write_bytes=400021)
   (stream_write_total_data=400021));
   
   gauges=();
   
   minimums=((action_executor_acquired.min=0)
   (committer_commit_job.min=301)
   (committer_magic_marker_put.min=70)
   (committer_materialize_file.min=129)
   (object_list_request.min=64)
   (object_multipart_initiated.min=74));
   
   maximums=((action_executor_acquired.max=11)
   (committer_commit_job.max=643)
   (committer_magic_marker_put.max=127)
   (committer_materialize_file.max=478)
   (object_list_request.max=319)
   (object_multipart_initiated.max=97));
   
   means=((action_executor_acquired.mean=(samples=14, sum=65, mean=4.6429))
   (committer_commit_job.mean=(samples=10, sum=4974, mean=497.4000))
   (committer_magic_marker_put.mean=(samples=7, sum=641, mean=91.5714))
   (committer_materialize_file.mean=(samples=14, sum=3171, mean=226.5000))
   (object_list_request.mean=(samples=17, sum=2098, mean=123.4118))
   (object_multipart_initiated.mean=(samples=7, sum=598, mean=85.4286)));
   ```
   
   this is going to overreport, but i don't think the committers should be 
interfering with the context which should really be managed by the app




Issue Time Tracking
---

Worklog Id: (was: 791466)
Time Spent: 4h 40m  (was: 4.5h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=791185=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791185
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 14/Jul/22 23:52
Start Date: 14/Jul/22 23:52
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4566:
URL: https://github.com/apache/hadoop/pull/4566#issuecomment-1185033366

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 57s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 43s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m  7s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 14s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 44s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 22s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 47s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m  0s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 43s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  6s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 31s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  23m  3s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 47s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 24s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 33s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 33s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 15s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/1/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 5 new + 75 unchanged - 0 fixed = 80 total (was 
75)  |
   | +1 :green_heart: |  mvnsite  |   3m 44s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 52s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 41s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 15s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 19s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 47s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 23s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 241m 58s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4566/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4566 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 8b0e37433c5b 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 
17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 1e9e6600d862bd7c33f627a670509bcf47cfeaa7 |
   | 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=791113=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791113
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 14/Jul/22 19:49
Start Date: 14/Jul/22 19:49
Worklog Time Spent: 10m 
  Work Description: steveloughran opened a new pull request, #4566:
URL: https://github.com/apache/hadoop/pull/4566

   This is PR #4352 with a new patch to integrate with the MR committers, at 
least
   as far as collecting read stats from job commits and including in _SUCCESS
   the committer ITests collect these and print them.
   
   * move reference map and lookup to a IOStatisticsContextIntegration class
   * static method in IOStatisticsContext to relay lookup
   * add method to switch a thread's context; needed to aggregate worker thread
 IO in threads doing work for committers without the need to explicitly
 collect and pass back the stats
   * production code moves to the new methods
   * tests move to this and away from looking up the fields in the streams
   * stats are reset in s3a test setup
   * s3a committers collect data read stats during job commit and include
 in summary statistics. This is only the stats when reading manifest
 files, not the actual work.
   * tests to print the aggregate of all loaded success files in the run.
   
   ### How was this patch tested?
   
   new/modified tests
   
   ### For code changes:
   
   - [ ] Does the title or this PR starts with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




Issue Time Tracking
---

Worklog Id: (was: 791113)
Time Spent: 4h 10m  (was: 4h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=791114=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791114
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 14/Jul/22 19:49
Start Date: 14/Jul/22 19:49
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4566:
URL: https://github.com/apache/hadoop/pull/4566#issuecomment-1184835595

   
   you are going to love this. Full end to end stats collection of IO during 
job commit of terasort.
   
   ```
   
   2022-07-14 20:37:38,332 [JUnit] INFO  s3a.AbstractS3ATestBase 
(AbstractS3ATestBase.java:dumpFileSystemIOStatistics(137)) - Aggregate 
FileSystem Statistics counters=((action_executor_acquired=2)
   (action_file_opened=26)
   (action_http_get_request=26)
   (action_http_head_request=112)
   (audit_request_execution=250)
   (audit_span_creation=144)
   (directories_created=14)
   (directories_deleted=14)
   (fake_directories_deleted=4)
   (files_created=2)
   (files_deleted=16)
   (ignored_errors=8)
   (object_bulk_delete_request=4)
   (object_delete_objects=34)
   (object_delete_request=14)
   (object_list_request=74)
   (object_metadata_request=112)
   (object_put_request=16)
   (object_put_request_completed=16)
   (op_create=2)
   (op_delete=16)
   (op_exists=2)
   (op_exists.failures=2)
   (op_get_file_status=40)
   (op_glob_status=4)
   (op_list_files=6)
   (op_list_located_status=4)
   (op_list_status=4)
   (op_list_status.failures=2)
   (op_mkdirs=14)
   (op_open=30)
   (store_io_request=251)
   (stream_read_bytes=232084)
   (stream_read_close_operations=26)
   (stream_read_closed=26)
   (stream_read_fully_operations=10)
   (stream_read_opened=26)
   (stream_read_operations=41)
   (stream_read_operations_incomplete=23)
   (stream_read_remote_stream_drain=26)
   (stream_read_seek_policy_changed=26)
   (stream_read_total_bytes=232084)
   (stream_write_block_uploads=2));
   
   gauges=((stream_write_block_uploads_pending=2));
   
   minimums=((action_executor_acquired.min=0)
   (action_file_opened.min=49)
   (action_http_get_request.min=63)
   (action_http_head_request.min=40)
   (object_bulk_delete_request.min=173)
   (object_delete_request.min=51)
   (object_list_request.min=48)
   (object_put_request.min=82)
   (op_create.min=62)
   (op_delete.min=52)
   (op_exists.failures.min=110)
   (op_get_file_status.min=53)
   (op_glob_status.min=102)
   (op_list_files.min=176)
   (op_list_status.failures.min=172)
   (op_list_status.min=62)
   (op_mkdirs.min=313)
   (stream_read_remote_stream_drain.min=0));
   
   maximums=((action_executor_acquired.max=0)
   (action_file_opened.max=100)
   (action_http_get_request.max=139)
   (action_http_head_request.max=206)
   (object_bulk_delete_request.max=454)
   (object_delete_request.max=70)
   (object_list_request.max=667)
   (object_put_request.max=123)
   (op_create.max=75)
   (op_delete.max=685)
   (op_exists.failures.max=120)
   (op_get_file_status.max=178)
   (op_glob_status.max=126)
   (op_list_files.max=285)
   (op_list_status.failures.max=184)
   (op_list_status.max=68)
   (op_mkdirs.max=847)
   (stream_read_remote_stream_drain.max=1));
   
   means=((action_executor_acquired.mean=(samples=2, sum=0, mean=0.))
   (action_file_opened.mean=(samples=26, sum=1515, mean=58.2692))
   (action_http_get_request.mean=(samples=26, sum=1997, mean=76.8077))
   (action_http_head_request.mean=(samples=112, sum=7160, mean=63.9286))
   (object_bulk_delete_request.mean=(samples=4, sum=997, mean=249.2500))
   (object_delete_request.mean=(samples=14, sum=810, mean=57.8571))
   (object_list_request.mean=(samples=74, sum=7647, mean=103.3378))
   (object_put_request.mean=(samples=16, sum=1497, mean=93.5625))
   (op_create.mean=(samples=2, sum=137, mean=68.5000))
   (op_delete.mean=(samples=16, sum=1935, mean=120.9375))
   (op_exists.failures.mean=(samples=2, sum=230, mean=115.))
   (op_get_file_status.mean=(samples=40, sum=3283, mean=82.0750))
   (op_glob_status.mean=(samples=4, sum=459, mean=114.7500))
   (op_list_files.mean=(samples=6, sum=1280, mean=213.))
   (op_list_status.failures.mean=(samples=2, sum=356, mean=178.))
   (op_list_status.mean=(samples=2, sum=130, mean=65.))
   (op_mkdirs.mean=(samples=14, sum=5303, mean=378.7857))
   (stream_read_remote_stream_drain.mean=(samples=26, sum=3, mean=0.1154)));
   
   
   Process finished with exit code 0
   ```




Issue Time Tracking
---

Worklog Id: (was: 791114)
Time Spent: 4h 20m  (was: 4h 10m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>   

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=790929=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790929
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 14/Jul/22 13:03
Start Date: 14/Jul/22 13:03
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1184422633

   ok, i have some changes but I will add them as a commit for you to 
review/cherrypick...i want to make sure they line up with a local spark build.
   
   key point: interfaces now support static methods, so looking up the current 
context can/should be a static method in IOStatisticsContext




Issue Time Tracking
---

Worklog Id: (was: 790929)
Time Spent: 4h  (was: 3h 50m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=790596=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790596
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 13/Jul/22 20:56
Start Date: 13/Jul/22 20:56
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1183670262

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 40s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  44m 22s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 13s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 12s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 26s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 47s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 58s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 42s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  0s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  5s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m 38s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m  7s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 40s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 40s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 13s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 43s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 52s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 41s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m  9s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 53s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m  9s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 270m 13s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/8/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux dca9ac835711 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / c2dcad142b639640681fa17960202b00befef2f4 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=790113=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790113
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 12/Jul/22 18:33
Start Date: 12/Jul/22 18:33
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1182217571

   see https://github.com/steveloughran/spark/tree/HADOOP-17461-iostatistics
   
   retrieval
   
https://github.com/steveloughran/spark/blob/HADOOP-17461-iostatistics/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L211
   
   task metrics
   
https://github.com/steveloughran/spark/blob/HADOOP-17461-iostatistics/core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala
   
   and where we'd be doing the collection for the spark RDDs (not the sql code, 
which is more significant but harder to get started on)
   
https://github.com/steveloughran/spark/blob/HADOOP-17461-iostatistics/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala#L172




Issue Time Tracking
---

Worklog Id: (was: 790113)
Time Spent: 3h 40m  (was: 3.5h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=790111=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790111
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 12/Jul/22 18:17
Start Date: 12/Jul/22 18:17
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r919262464


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContext.java:
##
@@ -0,0 +1,238 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.statistics.impl;
+
+import java.lang.ref.WeakReference;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.impl.WeakReferenceThreadMap;
+import org.apache.hadoop.fs.statistics.IOStatistics;
+import org.apache.hadoop.fs.statistics.IOStatisticsAggregator;
+import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED;
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED_DEFAULT;
+
+/**
+ * Implementing the IOStatisticsContext.
+ *
+ * A Context defined for IOStatistics collection per thread which captures
+ * each worker thread's work in FS streams and stores it in the form of
+ * IOStatisticsSnapshot if thread level aggregation is enabled else returning
+ * an instance of EmptyIOStatisticsStore for the FS. An active instance of
+ * the IOStatisticsContext can be used to collect the statistics.
+ *
+ * For the current thread the IOStatisticsSnapshot can be used as a way to
+ * move the IOStatistics data between applications using the Serializable
+ * nature of the class.
+ */
+public class IOStatisticsContext {
+  private static final Logger LOG =
+  LoggerFactory.getLogger(IOStatisticsContext.class);
+  private static final boolean IS_THREAD_IOSTATS_ENABLED;
+
+  /**
+   * Active IOStatistics Context containing different worker thread's
+   * statistics. Weak Reference so that it gets cleaned up during GC and we
+   * avoid any memory leak issues due to long lived references.
+   */
+  private static final WeakReferenceThreadMap
+  ACTIVE_IOSTATS_CONTEXT =
+  new WeakReferenceThreadMap<>(IOStatisticsContext::createNewInstance,
+  IOStatisticsContext::referenceLostContext
+  );
+
+  /**
+   * Collecting IOStatistics per thread.
+   */
+  private final WeakReferenceThreadMap
+  threadIOStatsContext =
+  new WeakReferenceThreadMap<>(this::getIOStatisticsSnapshotFactory,
+  this::referenceLost);
+
+  static {
+// Work out if the current context has thread level IOStatistics enabled.
+final Configuration configuration = new Configuration();
+IS_THREAD_IOSTATS_ENABLED =
+configuration.getBoolean(THREAD_LEVEL_IOSTATISTICS_ENABLED,
+THREAD_LEVEL_IOSTATISTICS_ENABLED_DEFAULT);
+  }
+
+  /**
+   * Creating a new IOStatisticsContext instance for a FS to be used.
+   *
+   * @param key Thread ID that represents which thread the context belongs to.
+   * @return an instance of IOStatisticsContext.
+   */
+  private static IOStatisticsContext createNewInstance(Long key) {
+return new IOStatisticsContext();
+  }
+
+  /**
+   * Get the current IOStatisticsContext.
+   *
+   * @return current IOStatisticsContext instance.
+   */
+  public static IOStatisticsContext currentIOStatisticsContext() {
+return ACTIVE_IOSTATS_CONTEXT.get(Thread.currentThread().getId());
+  }
+
+  /**
+   * A Method to act as an IOStatisticsSnapshot factory, in a
+   * WeakReferenceThreadMap.
+   *
+   * @param key ThreadID.
+   * @return an Instance of IOStatisticsSnapshot.
+   */
+  private IOStatisticsSnapshot getIOStatisticsSnapshotFactory(Long key) {
+return new IOStatisticsSnapshot();
+  }
+
+  /**
+   * In case of reference loss.
+   *
+   * 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=790099=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790099
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 12/Jul/22 17:24
Start Date: 12/Jul/22 17:24
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1182052415

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 41s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 35s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m  8s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 11s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 22s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 45s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m  0s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 41s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 57s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 28s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  23m  0s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 48s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 14s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 49s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 49s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 15s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 43s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 51s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 42s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  8s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 20s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 40s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 20s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 241m 55s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/7/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux d849f9acb1b6 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 
10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 82bc205d749d727d03750a1698adf34f7c2a88fd |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=790033=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790033
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 12/Jul/22 13:26
Start Date: 12/Jul/22 13:26
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1181759094

   Changed the way streams are accessing the thread IOStatistics, now we would 
directly get them from the current active context in the stream's constructor 
rather than pass them around through the builders, as it didn't seem to add 
anything if we can directly get it due to static nature of the context. Also, 
made the weakRef as IOstatisticsSnapshot rather than an aggregator and then 
cast as discussed above.




Issue Time Tracking
---

Worklog Id: (was: 790033)
Time Spent: 3h 10m  (was: 3h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=789485=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-789485
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 11/Jul/22 11:33
Start Date: 11/Jul/22 11:33
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r917829041


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContext.java:
##
@@ -51,21 +51,21 @@ public class IOStatisticsContext {
   private static final boolean IS_THREAD_IOSTATS_ENABLED;
 
   private static final WeakReferenceThreadMap
-  ACTIVE_IOSTATS_CONTEXT = new WeakReferenceThreadMap<>(
-  IOStatisticsContext::createNewInstance,
-  IOStatisticsContext::referenceLostContext
+  ACTIVE_IOSTATS_CONTEXT =
+  new WeakReferenceThreadMap<>(IOStatisticsContext::createNewInstance,
+  IOStatisticsContext::referenceLostContext
   );
 
   /**
* Collecting IOStatistics per thread.
*/
   private final WeakReferenceThreadMap

Review Comment:
   Okay, I'll do that. My thinking behind this was to also have a way to get 
EmptyIOstatisticsSource as one of the implementations we could return for a 
thread ID, which would have a no-op for aggregation, but thinking this through, 
it seems like having this thread map solely for "enabled" thread IOStats makes 
more sense and we can return the EmptyIOstatisticsSource, while getting the 
IOstats without setting it in. Makes resetting easier too. 





Issue Time Tracking
---

Worklog Id: (was: 789485)
Time Spent: 3h  (was: 2h 50m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=789476=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-789476
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 11/Jul/22 11:26
Start Date: 11/Jul/22 11:26
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r917816203


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContext.java:
##
@@ -51,21 +51,21 @@ public class IOStatisticsContext {
   private static final boolean IS_THREAD_IOSTATS_ENABLED;
 
   private static final WeakReferenceThreadMap

Review Comment:
   let's add a brief comment here as it is a key part of the design



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContext.java:
##
@@ -164,13 +165,25 @@ public void setThreadIOStatistics(
*
* @return IOStatisticsSnapshot of the current thread.
*/
-  public IOStatisticsSnapshot getThreadIOStatisticsSnapshot() {
+  public IOStatisticsSnapshot snapshotCurrentThreadIOStatistics() {
 if (IS_THREAD_IOSTATS_ENABLED) {
   return (IOStatisticsSnapshot) getThreadIOStatistics();
 }
 return new IOStatisticsSnapshot();
   }
 
+  /**
+   * Reset the thread IOStatistics for current thread.
+   */
+  public void resetThreadIOStatisticsForCurrentThread() {
+if (IS_THREAD_IOSTATS_ENABLED) {
+  IOStatisticsSnapshot currentThreadIOStatsSnapshot =
+  (IOStatisticsSnapshot) getThreadIOStatistics();
+  currentThreadIOStatsSnapshot.clear();
+  setThreadIOStatistics(currentThreadIOStatsSnapshot);

Review Comment:
   don't think this is needed



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContext.java:
##
@@ -51,21 +51,21 @@ public class IOStatisticsContext {
   private static final boolean IS_THREAD_IOSTATS_ENABLED;
 
   private static final WeakReferenceThreadMap
-  ACTIVE_IOSTATS_CONTEXT = new WeakReferenceThreadMap<>(
-  IOStatisticsContext::createNewInstance,
-  IOStatisticsContext::referenceLostContext
+  ACTIVE_IOSTATS_CONTEXT =
+  new WeakReferenceThreadMap<>(IOStatisticsContext::createNewInstance,
+  IOStatisticsContext::referenceLostContext
   );
 
   /**
* Collecting IOStatistics per thread.
*/
   private final WeakReferenceThreadMap

Review Comment:
   now we are casting to a snapshot, why not make this of type 
`IOStatisticsSnapshot`



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContext.java:
##
@@ -164,13 +165,25 @@ public void setThreadIOStatistics(
*
* @return IOStatisticsSnapshot of the current thread.
*/
-  public IOStatisticsSnapshot getThreadIOStatisticsSnapshot() {
+  public IOStatisticsSnapshot snapshotCurrentThreadIOStatistics() {
 if (IS_THREAD_IOSTATS_ENABLED) {
   return (IOStatisticsSnapshot) getThreadIOStatistics();
 }
 return new IOStatisticsSnapshot();
   }
 
+  /**
+   * Reset the thread IOStatistics for current thread.
+   */
+  public void resetThreadIOStatisticsForCurrentThread() {
+if (IS_THREAD_IOSTATS_ENABLED) {
+  IOStatisticsSnapshot currentThreadIOStatsSnapshot =
+  (IOStatisticsSnapshot) getThreadIOStatistics();

Review Comment:
   This is going to demand create a snapshot context even if there wasn't one 
needing resetting. How about you just look in the reference map (via 
lookup(currentThreadId())); and only if a ref is returned do the resetting?





Issue Time Tracking
---

Worklog Id: (was: 789476)
Time Spent: 2h 50m  (was: 2h 40m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-07-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=789464=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-789464
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 11/Jul/22 11:00
Start Date: 11/Jul/22 11:00
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1180254837

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 41s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 4 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 46s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m 30s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 57s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 23s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 44s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 59s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 41s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  0s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  0s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m 32s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 47s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 13s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 26s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 26s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 13s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/6/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 1 new + 75 unchanged - 0 fixed = 76 total (was 
75)  |
   | +1 :green_heart: |  mvnsite  |   3m 44s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 50s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javadoc  |   1m 21s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/6/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  
hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 
with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 2 new + 
37 unchanged - 0 fixed = 39 total (was 37)  |
   | +1 :green_heart: |  spotbugs  |   5m  4s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 47s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m 24s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 10s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 245m 43s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=785213=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-785213
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 27/Jun/22 19:04
Start Date: 27/Jun/22 19:04
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r907678976


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContext.java:
##
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.statistics.impl;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.impl.WeakReferenceThreadMap;
+import org.apache.hadoop.fs.statistics.IOStatistics;
+import org.apache.hadoop.fs.statistics.IOStatisticsAggregator;
+import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED;
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED_DEFAULT;
+
+/**
+ * Implementing the IOStatisticsContext interface.

Review Comment:
   javadocs here don't match with latest code.



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContext.java:
##
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.statistics.impl;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.impl.WeakReferenceThreadMap;
+import org.apache.hadoop.fs.statistics.IOStatistics;
+import org.apache.hadoop.fs.statistics.IOStatisticsAggregator;
+import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED;
+import static 
org.apache.hadoop.fs.CommonConfigurationKeys.THREAD_LEVEL_IOSTATISTICS_ENABLED_DEFAULT;
+
+/**
+ * Implementing the IOStatisticsContext interface.
+ *
+ * A Context defined for IOStatistics collection per thread which captures
+ * each worker thread's work in FS streams and stores it in the form of
+ * IOStatisticsAggregator which could be either IOStatisticsSnapshot or
+ * EmptyIOStatisticsStore if thread level aggregation is enabled or not for
+ * the FS. An active instance of the IOStatisticsContext can be used to
+ * collect the statistics.
+ *
+ * For the current thread the IOStatisticsSnapshot can be used as a way to
+ * move the IOStatistics data between applications using the Serializable
+ * nature of the class.
+ */
+public class IOStatisticsContext {
+  private static final Logger LOG =
+  LoggerFactory.getLogger(IOStatisticsContext.class);
+  private static final boolean IS_THREAD_IOSTATS_ENABLED;
+
+  private static final WeakReferenceThreadMap
+  ACTIVE_IOSTATS_CONTEXT = new WeakReferenceThreadMap<>(
+  IOStatisticsContext::createNewInstance,

Review Comment:
   nit, indent 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=784144=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784144
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 23/Jun/22 11:18
Start Date: 23/Jun/22 11:18
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1164284637

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   2m 18s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 4 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 10s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 24s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 45s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  21m 44s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m  5s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 25s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 31s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  2s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 54s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  23m 22s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 54s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 42s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  24m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 24s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 27s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 22s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javadoc  |   1m  0s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/5/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  
hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 
with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 2 new + 
37 unchanged - 0 fixed = 39 total (was 37)  |
   | +1 :green_heart: |  spotbugs  |   4m 56s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m  6s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m  5s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 56s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 27s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 246m 42s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 4b91d0b6f8a6 4.15.0-112-generic #113-Ubuntu SMP 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=784060=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784060
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 23/Jun/22 07:11
Start Date: 23/Jun/22 07:11
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1164037866

   More merge conflicts due to vectored IO merging. 
   Interesting bit of info on the new test, seems to fail in maven verify but 
works fine in intellij, I remember Steve saying something about how tests run 
differently in mvn than intellij wrt Threads, need to check that.




Issue Time Tracking
---

Worklog Id: (was: 784060)
Time Spent: 2h 10m  (was: 2h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=783786=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783786
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 22/Jun/22 09:25
Start Date: 22/Jun/22 09:25
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1162865131

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 4 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 33s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 35s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  22m 14s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m  8s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  1s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 24s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  3s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 53s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 36s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  25m  5s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 53s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  25m 25s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  25m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  22m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m  6s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/4/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 3 new + 74 unchanged - 0 fixed = 77 total (was 
74)  |
   | +1 :green_heart: |  mvnsite  |   3m  8s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 18s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javadoc  |   1m  2s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/4/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  
hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 
with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 2 new + 
38 unchanged - 0 fixed = 40 total (was 38)  |
   | +1 :green_heart: |  spotbugs  |   4m 48s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 24s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 43s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 54s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 20s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 247m 29s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=783782=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783782
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 22/Jun/22 09:17
Start Date: 22/Jun/22 09:17
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1162856649

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 4 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 42s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  24m 49s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m  1s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 22s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 43s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m  0s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 41s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 59s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  9s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m 42s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 44s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  23m  1s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  23m  1s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 54s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 54s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 15s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/3/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 3 new + 74 unchanged - 0 fixed = 77 total (was 
74)  |
   | +1 :green_heart: |  mvnsite  |   3m 41s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 42s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javadoc  |   1m 16s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/3/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  
hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 
with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 2 new + 
38 unchanged - 0 fixed = 40 total (was 38)  |
   | +1 :green_heart: |  spotbugs  |   5m 16s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 46s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 14s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 12s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 241m 22s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=783703=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783703
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 22/Jun/22 05:20
Start Date: 22/Jun/22 05:20
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1162654911

   Had to force push to resolve conflicts.
   
   Some changes in the latest commit:
   - Moved thread context initialization and config for enabling in 
hadoop-common.
   - Weak ref thread map for iostats context, have a doubt here regarding this 
whether having context and the iostats as weak ref would cause any issues, 
since both are now mapped to thread IDs. Thinking in terms of one context per 
s3afs with iostats per thread.
   - Added new test
   - Disabling iostats returns EmptyIOstatisticsStore instance with no-op 
aggregation.




Issue Time Tracking
---

Worklog Id: (was: 783703)
Time Spent: 1h 40m  (was: 1.5h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=782506=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782506
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 17/Jun/22 18:36
Start Date: 17/Jun/22 18:36
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r900419091


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##
@@ -374,6 +376,16 @@ public class S3AFileSystem extends FileSystem implements 
StreamCapabilities,
*/
   private ArnResource accessPoint;
 
+  /**
+   * Does the fs have thread-level IOStats support enabled?
+   */
+  private boolean isThreadLevelIOStatsEnabled;

Review Comment:
   as discussed, this should go into hadoop common



##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java:
##
@@ -444,6 +450,9 @@ public void close() throws IOException {
*/
   private synchronized void cleanupOnClose() {
 cleanupWithLogger(LOG, getActiveBlock(), blockFactory);
+if(ioStatisticsContext != null) {

Review Comment:
   nit, add a space



##
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/LambdaTestUtils.java:
##
@@ -65,6 +66,15 @@ private LambdaTestUtils() {
*/
   public static final String NULL_RESULT = "(null)";
 
+  /**
+   * Atomic references to be used to re-throw an Exception or an ASE
+   * caught inside a lambda function.
+   */
+  public static final AtomicReference futureExcp =

Review Comment:
   yes



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/IOStatisticsContext.java:
##
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.statistics;
+
+/**
+ * Context used to capture IOStatistics per thread in a task.
+ */
+public interface IOStatisticsContext {
+
+  IOStatisticsSnapshot getThreadIOStatistics();

Review Comment:
   nit: javadocs





Issue Time Tracking
---

Worklog Id: (was: 782506)
Time Spent: 1.5h  (was: 1h 20m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=782504=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782504
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 17/Jun/22 18:31
Start Date: 17/Jun/22 18:31
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1159135030

   we need a single IOStatisticsContext for all FS instances, so that a task 
reading from one fs and writing to another would have the stats updated from 
both actions.
   
   And it needs to be in hadoop-common, just like the common audit context, so 
that code can compile and link against it even without having hadoop-aws or 
hadoop-azure on the classpath.
   
   Making the context a weak ref map ensures that GCs will trigger cleanup. 
This would lose stats, but not while any stream was active, *or if some code 
picked up a reference to the thread stats before executing work. That is what I 
plan to do in spark; we will grab that ref before starting the work, and after 
it is finished, take a snapshot of it. Oh, and reset the values before work 
starts -we need that context there too.
   
   




Issue Time Tracking
---

Worklog Id: (was: 782504)
Time Spent: 1h 20m  (was: 1h 10m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=781924=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781924
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 16/Jun/22 06:00
Start Date: 16/Jun/22 06:00
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1157264920

   Thanks for the review @steveloughran, sorry couldn't address anything until 
now(got little ill)
   > main comment is that the thread's statistic aggregator should be 
fetched/stored in constructor, not in close
   
   Got your point, so just one concern on that, should the IOStatisticsContext 
be a static instance in the S3AFileSystem, and we just pass on the 
iostatisticsAggregator to the streams, since we would still require the context 
in the streams to update the WeakReferenceThreadMap after the aggregation, 
right?




Issue Time Tracking
---

Worklog Id: (was: 781924)
Time Spent: 1h 10m  (was: 1h)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=781923=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781923
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 16/Jun/22 05:59
Start Date: 16/Jun/22 05:59
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r898719852


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AReadOpContext.java:
##
@@ -77,11 +80,13 @@ public S3AReadOpContext(
   Invoker invoker,
   @Nullable FileSystem.Statistics stats,
   S3AStatisticsContext instrumentation,
-  FileStatus dstFileStatus) {
+  FileStatus dstFileStatus,
+  final IOStatisticsContext ioStatisticsContext) {
 
 super(invoker, stats, instrumentation,
 dstFileStatus);
 this.path = requireNonNull(path);
+this.ioStatisticsContext = ioStatisticsContext;

Review Comment:
   Adding EmptyIOStatisticsContext for the disabled scenario with 
EmptyIOStatisticsStore as the aggregator which is a no-op.





Issue Time Tracking
---

Worklog Id: (was: 781923)
Time Spent: 1h  (was: 50m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=781922=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781922
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 16/Jun/22 05:57
Start Date: 16/Jun/22 05:57
Worklog Time Spent: 10m 
  Work Description: mehakmeet commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r898719209


##
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/LambdaTestUtils.java:
##
@@ -65,6 +66,15 @@ private LambdaTestUtils() {
*/
   public static final String NULL_RESULT = "(null)";
 
+  /**
+   * Atomic references to be used to re-throw an Exception or an ASE
+   * caught inside a lambda function.
+   */
+  public static final AtomicReference futureExcp =

Review Comment:
   ah I see, since we wanted the atomic ref to these exceptions for lambda 
functions, I thought this might be the best place for these, you reckon we move 
this out of here?





Issue Time Tracking
---

Worklog Id: (was: 781922)
Time Spent: 50m  (was: 40m)

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-06-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=780090=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780090
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 09/Jun/22 18:37
Start Date: 09/Jun/22 18:37
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#discussion_r893822412


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContextImpl.java:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.statistics.impl;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.fs.impl.WeakReferenceThreadMap;
+import org.apache.hadoop.fs.statistics.IOStatisticsContext;
+import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
+
+/**
+ * Implementing the IOStatisticsContext interface.
+ */
+public class IOStatisticsContextImpl implements IOStatisticsContext {
+  private static final Logger LOG =
+  LoggerFactory.getLogger(IOStatisticsContextImpl.class);
+
+  /**
+   * Collecting IOStatistics per thread.
+   */
+  private final WeakReferenceThreadMap
+  threadIOStatsContext = new WeakReferenceThreadMap<>(
+  this::getIOStatisticsSnapshotFactory,

Review Comment:
   needs indentation



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContextImpl.java:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.statistics.impl;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.fs.impl.WeakReferenceThreadMap;
+import org.apache.hadoop.fs.statistics.IOStatisticsContext;
+import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
+
+/**
+ * Implementing the IOStatisticsContext interface.
+ */
+public class IOStatisticsContextImpl implements IOStatisticsContext {
+  private static final Logger LOG =
+  LoggerFactory.getLogger(IOStatisticsContextImpl.class);
+
+  /**
+   * Collecting IOStatistics per thread.
+   */
+  private final WeakReferenceThreadMap
+  threadIOStatsContext = new WeakReferenceThreadMap<>(
+  this::getIOStatisticsSnapshotFactory,
+  this::referenceLost);
+
+  /**
+   * A Method to act as an IOStatisticsSnapshot factory, in a
+   * WeakReferenceThreadMap.
+   *
+   * @param key ThreadID.
+   * @return an Instance of IOStatisticsSnapshot.
+   */
+  private IOStatisticsSnapshot getIOStatisticsSnapshotFactory(Long key) {
+return new IOStatisticsSnapshot();
+  }
+
+  /**
+   * In case of reference loss.
+   *
+   * @param key ThreadID.
+   */
+  private void referenceLost(Long key) {
+LOG.info("Reference lost for threadID: {}", key);

Review Comment:
   maybe debug



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsContextImpl.java:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=775143=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775143
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 26/May/22 16:54
Start Date: 26/May/22 16:54
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1138786813

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 26s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 38s |  |  trunk passed  |
   | -1 :x: |  compile  |   4m 21s | 
[/branch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/2/artifact/out/branch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  root in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   3m 32s | 
[/branch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/2/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  root in trunk failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  checkstyle  |   3m 50s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 26s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m  4s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  1s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  22m 29s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 40s |  |  the patch passed  |
   | -1 :x: |  compile  |   3m 50s | 
[/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/2/artifact/out/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  root in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  javac  |   3m 50s | 
[/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/2/artifact/out/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  root in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   3m 25s | 
[/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/2/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  root in the patch failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | -1 :x: |  javac  |   3m 25s | 
[/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/2/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  root in the patch failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   3m 22s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/2/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 3 new + 6 unchanged - 0 fixed = 9 total (was 6)  |
   | +1 :green_heart: |  mvnsite  |   2m 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=774151=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774151
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 24/May/22 17:41
Start Date: 24/May/22 17:41
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4352:
URL: https://github.com/apache/hadoop/pull/4352#issuecomment-1136252339

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 44s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 24s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m 11s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m  2s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 28s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 24s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 45s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 59s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 42s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  2s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  5s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 35s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 10s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 38s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 38s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 15s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/1/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 3 new + 6 unchanged - 0 fixed = 9 total (was 6)  |
   | +1 :green_heart: |  mvnsite  |   3m 38s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 53s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javadoc  |   1m 19s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/1/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  
hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 
with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 2 new + 
38 unchanged - 0 fixed = 40 total (was 38)  |
   | +1 :green_heart: |  spotbugs  |   5m 12s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m  9s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 30s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m  2s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 241m 16s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4352/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4352 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 10f5099454c2 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 
10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven 

[jira] [Work logged] (HADOOP-17461) Add thread-level IOStatistics Context

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17461?focusedWorklogId=774010=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774010
 ]

ASF GitHub Bot logged work on HADOOP-17461:
---

Author: ASF GitHub Bot
Created on: 24/May/22 13:39
Start Date: 24/May/22 13:39
Worklog Time Spent: 10m 
  Work Description: mehakmeet opened a new pull request, #4352:
URL: https://github.com/apache/hadoop/pull/4352

   ### Description of PR
   Adding Thread-level IOStatsitics in hadoop-common and implementing it in S3A 
Streams.
   
   ### How was this patch tested?
   Region: ap-south-1
   `mvn clean verify -Dparallel-tests -DtestsThreadCount=4 -Dscale`
   
   All tests ran fine.
   ### For code changes:
   
   - [X] Does the title or this PR starts with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [X] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [X] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [X] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




Issue Time Tracking
---

Worklog Id: (was: 774010)
Remaining Estimate: 0h
Time Spent: 10m

> Add thread-level IOStatistics Context
> -
>
> Key: HADOOP-17461
> URL: https://issues.apache.org/jira/browse/HADOOP-17461
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For effective reporting of the iostatistics of individual worker threads, we 
> need a thread-level context which IO components update.
> * this contact needs to be passed in two background thread forming work on 
> behalf of a task.
> * IO Components (streams, iterators, filesystems) need to update this context 
> statistics as they perform work
> * Without double counting anything.
> I imagine a ThreadLocal IOStatisticContext which will be updated in the 
> FileSystem API Calls. This context MUST be passed into the background threads 
> used by a task, so that IO is correctly aggregated.
> I don't want streams, listIterators  to do the updating as there is more 
> risk of double counting. However, we need to see their statistics if we want 
> to know things like "bytes discarded in backwards seeks". And I don't want to 
> be updating a shared context object on every read() call.
> If all we want is store IO (HEAD, GET, DELETE, list performance etc) then the 
> FS is sufficient. 
> If we do want the stream-specific detail, then I propose
> * caching the context in the constructor
> * updating it only in close() or unbuffer() (as we do from S3AInputStream to 
> S3AInstrumenation)
> * excluding those we know the FS already collects.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org