Re: [PR] TEZ-4451: ThreadLevel IO Stats Support for TEZ. [tez]

2024-02-05 Thread via GitHub


ayushtkn commented on code in PR #331:
URL: https://github.com/apache/tez/pull/331#discussion_r1479160714


##
tez-runtime-internals/src/main/java/org/apache/tez/runtime/task/TaskRunner2Callable.java:
##
@@ -28,6 +30,8 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+import static 
org.apache.hadoop.fs.statistics.IOStatisticsContext.getCurrentIOStatisticsContext;

Review Comment:
   I changed it, somehow my IDE did that automatically...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@tez.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] TEZ-4451: ThreadLevel IO Stats Support for TEZ. [tez]

2024-02-05 Thread via GitHub


steveloughran commented on code in PR #331:
URL: https://github.com/apache/tez/pull/331#discussion_r1478949443


##
tez-runtime-internals/src/main/java/org/apache/tez/runtime/task/TaskRunner2Callable.java:
##
@@ -116,6 +120,11 @@ public TaskRunner2CallableResult run() throws Exception {
   // For a successful task, however, this should be almost no delay since 
close has already happened.
   maybeFixInterruptStatus();
   LOG.info("Cleaning up task {}, stopRequested={}", 
task.getTaskAttemptID(), stopRequested.get());
+  String ioStats = IOStatisticsLogging.ioStatisticsToPrettyString(

Review Comment:
   looks good. you know you can take a snapshot of it and serialize it as java 
serializable or json; been some discussion about making a Writable too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@tez.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] TEZ-4451: ThreadLevel IO Stats Support for TEZ. [tez]

2024-02-05 Thread via GitHub


tez-yetus commented on PR #331:
URL: https://github.com/apache/tez/pull/331#issuecomment-1927464787

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |  22m 54s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  14m  2s |  master passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  master passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04  |
   | +1 :green_heart: |  compile  |   0m 27s |  master passed with JDK Private 
Build-1.8.0_392-8u392-ga-1~22.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m 17s |  master passed  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  master passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04  |
   | +1 :green_heart: |  javadoc  |   0m 21s |  master passed with JDK Private 
Build-1.8.0_392-8u392-ga-1~22.04-b08  |
   | +0 :ok: |  spotbugs  |   1m 16s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 13s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 16s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 16s |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04  |
   | +1 :green_heart: |  javac  |   0m 16s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 15s |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~22.04-b08  |
   | +1 :green_heart: |  javac  |   0m 15s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m  8s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  javadoc  |   0m  9s |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04  |
   | +1 :green_heart: |  javadoc  |   0m  9s |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~22.04-b08  |
   | +1 :green_heart: |  findbugs  |   0m 40s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 46s |  tez-runtime-internals in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 16s |  The patch does not generate 
ASF License warnings.  |
   |  |   |  45m 11s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-331/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/tez/pull/331 |
   | JIRA Issue | TEZ-4451 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs 
checkstyle compile |
   | uname | Linux e6ed98431096 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/tez.sh |
   | git revision | master / 5e1cdee75 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~22.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~22.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-331/3/testReport/ |
   | Max. process+thread count | 103 (vs. ulimit of 5500) |
   | modules | C: tez-runtime-internals U: tez-runtime-internals |
   | Console output | 
https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-331/3/console |
   | versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@tez.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] TEZ-4451: ThreadLevel IO Stats Support for TEZ. [tez]

2024-02-05 Thread via GitHub


abstractdog commented on code in PR #331:
URL: https://github.com/apache/tez/pull/331#discussion_r1478387025


##
tez-runtime-internals/src/main/java/org/apache/tez/runtime/task/TaskRunner2Callable.java:
##
@@ -28,6 +30,8 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+import static 
org.apache.hadoop.fs.statistics.IOStatisticsContext.getCurrentIOStatisticsContext;

Review Comment:
   nit: if we import some classes from org.apache.hadoop.fs.statistics, 
couldn't we import them in the same way? is there a specific reason for static 
importing this method?
   I guess these would look better together, like:
   ```
   import org.apache.hadoop.fs.statistics.IOStatisticsLogging;
   import org.apache.hadoop.fs.statistics.IOStatisticsContext;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@tez.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] TEZ-4451: ThreadLevel IO Stats Support for TEZ. [tez]

2024-02-05 Thread via GitHub


abstractdog commented on code in PR #331:
URL: https://github.com/apache/tez/pull/331#discussion_r1478387025


##
tez-runtime-internals/src/main/java/org/apache/tez/runtime/task/TaskRunner2Callable.java:
##
@@ -28,6 +30,8 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+import static 
org.apache.hadoop.fs.statistics.IOStatisticsContext.getCurrentIOStatisticsContext;

Review Comment:
   nit: if we import some classes from org.apache.hadoop.fs.statistics, 
couldn't we import them in the same way? is there a specific reason for static 
importing this method?
   I guess this would look better to be inline:
   ```
   import org.apache.hadoop.fs.statistics.IOStatisticsLogging;
   import org.apache.hadoop.fs.statistics.IOStatisticsContext;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@tez.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] TEZ-4451: ThreadLevel IO Stats Support for TEZ. [tez]

2024-02-05 Thread via GitHub


abstractdog commented on code in PR #331:
URL: https://github.com/apache/tez/pull/331#discussion_r1478387025


##
tez-runtime-internals/src/main/java/org/apache/tez/runtime/task/TaskRunner2Callable.java:
##
@@ -28,6 +30,8 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+import static 
org.apache.hadoop.fs.statistics.IOStatisticsContext.getCurrentIOStatisticsContext;

Review Comment:
   nit: if we import some classes from org.apache.hadoop.fs.statistics, 
couldn't we import them in the same way? is there a specific reason for static 
importing this method?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@tez.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] TEZ-4540: Reading proto data more than 2GB from multiple splits fails [tez]

2024-02-05 Thread via GitHub


tez-yetus commented on PR #334:
URL: https://github.com/apache/tez/pull/334#issuecomment-1926923425

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   0m 15s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  17m 26s |  master passed  |
   | +1 :green_heart: |  compile  |   0m 29s |  master passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04  |
   | +1 :green_heart: |  compile  |   0m 28s |  master passed with JDK Private 
Build-1.8.0_392-8u392-ga-1~22.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m 17s |  master passed  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  master passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  master passed with JDK Private 
Build-1.8.0_392-8u392-ga-1~22.04-b08  |
   | +0 :ok: |  spotbugs  |   1m 13s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 11s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 17s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 17s |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04  |
   | +1 :green_heart: |  javac  |   0m 17s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 16s |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~22.04-b08  |
   | +1 :green_heart: |  javac  |   0m 16s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m  8s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  javadoc  |   0m  8s |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04  |
   | +1 :green_heart: |  javadoc  |   0m  9s |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~22.04-b08  |
   | +1 :green_heart: |  findbugs  |   0m 38s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 32s |  tez-protobuf-history-plugin in the 
patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 16s |  The patch does not generate 
ASF License warnings.  |
   |  |   |  25m 42s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-334/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/tez/pull/334 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs 
checkstyle compile |
   | uname | Linux 006060d13f5e 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/tez.sh |
   | git revision | master / 5e1cdee75 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~22.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~22.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-334/2/testReport/ |
   | Max. process+thread count | 105 (vs. ulimit of 5500) |
   | modules | C: tez-plugins/tez-protobuf-history-plugin U: 
tez-plugins/tez-protobuf-history-plugin |
   | Console output | 
https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-334/2/console |
   | versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@tez.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] TEZ-4540: Reading proto data more than 2GB from multiple splits fails [tez]

2024-02-05 Thread via GitHub


Aggarwal-Raghav commented on code in PR #334:
URL: https://github.com/apache/tez/pull/334#discussion_r1478118024


##
tez-plugins/tez-protobuf-history-plugin/src/main/java/org/apache/tez/dag/history/logging/proto/ProtoMessageWritable.java:
##
@@ -96,6 +96,9 @@ public void readFields(DataInput in) throws IOException {
   cin = CodedInputStream.newInstance(din);
   cin.setSizeLimit(Integer.MAX_VALUE);
 }
+if (din.in != in) {
+  cin.resetSizeCounter();
+}

Review Comment:
   Thanks for the review @zabetak.
   **I missed this Java doc statement.**  I was suspecting that resetting the 
_totalBytesRetired_  after every message read might have unexpected impact 
therefore, I resetted it after every hdfs split read. But based on the Javadoc, 
I think we can reset the counter after every mesage read. Will modify the patch.
   
   Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@tez.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] TEZ-4540: Reading proto data more than 2GB from multiple splits fails [tez]

2024-02-05 Thread via GitHub


zabetak commented on code in PR #334:
URL: https://github.com/apache/tez/pull/334#discussion_r1477969135


##
tez-plugins/tez-protobuf-history-plugin/src/main/java/org/apache/tez/dag/history/logging/proto/ProtoMessageWritable.java:
##
@@ -96,6 +96,9 @@ public void readFields(DataInput in) throws IOException {
   cin = CodedInputStream.newInstance(din);
   cin.setSizeLimit(Integer.MAX_VALUE);
 }
+if (din.in != in) {
+  cin.resetSizeCounter();
+}

Review Comment:
   The javadoc of `CodedInputStream#setSizeLimit` says the following:
   ```
   If you want to read several messages from a single CodedInputStream, you 
could call resetSizeCounter() after each one to avoid hitting the size limit.
   ```
   Based on that I would be inclined to reset the counter after every single 
message otherwise it still seems feasible to hit the same error if the 
`DataInput` is sufficiently large.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@tez.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org