[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=543019&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543019
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 19:18
Start Date: 27/Jan/21 19:18
Worklog Time Spent: 10m 
  Work Description: HeartSaVioR commented on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-768515963


   Thanks for reviewing and merging!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 543019)
Time Spent: 2.5h  (was: 2h 20m)

> Add latency information in FileOutputCommitter.mergePaths
> -
>
> Key: MAPREDUCE-7317
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7317
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Jungtaek Lim
>Assignee: Jungtaek Lim
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> We have observed several occurrences of long delays in file output 
> committer V1, in cases where file output committer V2 is not an option.
> While we should investigate the root cause on our side, there is a separate 
> issue: there is insufficient information to debug. The delay most likely 
> comes from mergePaths, but the class only provides a "debug" log 
> message that logs the call itself with its parameters, nothing else. mergePaths 
> is called recursively, which makes it hard to trace how much latency a 
> specific directory takes to merge.
> It would be nice, and not intrusive, to add latency info to mergePaths so that 
> we can see how much latency a specific directory takes to merge, only when 
> debug logging is enabled.
> (Ideally we would log a warning when the call takes a very long 
> time to process, but I don't have a proper threshold for "very long", 
> so I'd avoid dealing with it altogether here.)
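
As a rough illustration of the request above, the sketch below times a recursive merge per path and only reads the clock when debug output is enabled. Everything in it is an assumption made for illustration: `MergeTimingSketch`, the in-memory `tree` map, and the `log` list stand in for the real committer, file system, and logger, and none of this is the FileOutputCommitter code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not the FileOutputCommitter code. */
final class MergeTimingSketch {

  /**
   * Walks a hypothetical in-memory tree (directory name -> child names) and,
   * when debug is enabled, records one timing line per visited path.
   */
  static void mergePaths(Map<String, List<String>> tree, String from,
      boolean debugEnabled, List<String> log) {
    // Only read the clock when the timing will actually be reported.
    long startNs = debugEnabled ? System.nanoTime() : -1L;
    for (String child : tree.getOrDefault(from, Collections.emptyList())) {
      mergePaths(tree, child, debugEnabled, log);
    }
    if (debugEnabled) {
      long elapsedMs = (System.nanoTime() - startNs) / 1_000_000L;
      log.add("Merged " + from + " in " + elapsedMs + " ms");
    }
  }

  /** Builds a small sample tree: root contains a and b; a contains f1. */
  static Map<String, List<String>> sampleTree() {
    Map<String, List<String>> tree = new HashMap<>();
    tree.put("root", Arrays.asList("a", "b"));
    tree.put("a", Arrays.asList("f1"));
    return tree;
  }

  public static void main(String[] args) {
    List<String> log = new ArrayList<>();
    mergePaths(sampleTree(), "root", true, log);
    log.forEach(System.out::println);
  }
}
```

Because each call logs after its children return, the lines appear in post-order, so a slow subdirectory shows up next to the parent totals it contributes to.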



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=543012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543012
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 19:08
Start Date: 27/Jan/21 19:08
Worklog Time Spent: 10m 
  Work Description: steveloughran merged pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624


   





Issue Time Tracking
---

Worklog Id: (was: 543012)
Time Spent: 2h 20m  (was: 2h 10m)




[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=542838&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-542838
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 13:46
Start Date: 27/Jan/21 13:46
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-768295570


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 23s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 16s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   0m 35s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 34s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 38s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 17s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 14s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 26s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  12m 56s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  findbugs  |   1m 16s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m  0s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 31s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  80m 43s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2624/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2624 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux dd8707cc185a 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 80c7404b519 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2624/4/testReport/ |
   | Max. process+thread count | 1559 (vs. ulimit of 5500) |
   | modules | C: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
U: 

[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=542824&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-542824
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 13:26
Start Date: 27/Jan/21 13:26
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-768283949


   Yeah, that's good. We don't build the string unless it's being logged (and 
it's only done at the start, not re-evaluated later), so keeping it efficient is 
nice.
   
   LGTM. +1 pending Yetus being happy (and ignoring its complaints about tests).
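
The efficiency point above (build the message only when it will be logged, and build it once) can be sketched as follows. This is a minimal illustration under stated assumptions, not Hadoop's `DurationInfo`: `LazyTimedScope`, its `enabled` flag, the `Supplier`-based message, and the `out` list standing in for a logger are all hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical timer illustrating lazy message construction; not Hadoop's DurationInfo. */
final class LazyTimedScope implements AutoCloseable {
  private final List<String> out;   // stands in for a logger
  private final boolean enabled;    // stands in for LOG.isDebugEnabled()
  private final String text;        // built at most once, in the constructor
  private final long startNs;

  LazyTimedScope(List<String> out, boolean enabled, Supplier<String> message) {
    this.out = out;
    this.enabled = enabled;
    // The supplier is only invoked when the message will actually be emitted.
    this.text = enabled ? message.get() : "";
    this.startNs = System.nanoTime();
    if (enabled) {
      out.add("Starting: " + text);
    }
  }

  @Override
  public void close() {
    if (enabled) {
      long elapsedMs = (System.nanoTime() - startNs) / 1_000_000L;
      // Reuses the text built at the start instead of formatting again.
      out.add(text + ": duration " + elapsedMs + " ms");
    }
  }

  /** Runs one timed scope and returns how many times the message was built. */
  static int timesMessageBuilt(boolean enabled, List<String> out) {
    int[] calls = {0};
    try (LazyTimedScope scope = new LazyTimedScope(out, enabled, () -> {
      calls[0]++;
      return String.format("Merged data from %s to %s", "/tmp/a", "/out/a");
    })) {
      // timed work would go here
    }
    return calls[0];
  }

  public static void main(String[] args) {
    List<String> out = new ArrayList<>();
    System.out.println("built when disabled: " + timesMessageBuilt(false, out));
    System.out.println("built when enabled: " + timesMessageBuilt(true, out));
    out.forEach(System.out::println);
  }
}
```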





Issue Time Tracking
---

Worklog Id: (was: 542824)
Time Spent: 2h  (was: 1h 50m)




[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=542791&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-542791
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 12:19
Start Date: 27/Jan/21 12:19
Worklog Time Spent: 10m 
  Work Description: HeartSaVioR commented on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-768248565


   Ah OK. Good to know. I didn't realize it prints twice, on entry and on 
exit. Will fix. I'll probably have to go back to using `from` instead of 
`from.getPath()` then.





Issue Time Tracking
---

Worklog Id: (was: 542791)
Time Spent: 1h 50m  (was: 1h 40m)




[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=542751&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-542751
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 10:23
Start Date: 27/Jan/21 10:23
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request 
#2624:
URL: https://github.com/apache/hadoop/pull/2624#discussion_r565187998



##
File path: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java
##
@@ -455,53 +455,50 @@ protected void commitJobInternal(JobContext context) throws IOException {
    */
   private void mergePaths(FileSystem fs, final FileStatus from,
       final Path to, JobContext context) throws IOException {
-    long timeStartNs = -1L;
-    if (LOG.isDebugEnabled()) {
-      timeStartNs = System.nanoTime();
-      LOG.debug("Merging data from " + from + " to " + to);
-    }
-    reportProgress(context);
-    FileStatus toStat;
-    try {
-      toStat = fs.getFileStatus(to);
-    } catch (FileNotFoundException fnfe) {
-      toStat = null;
-    }
-
-    if (from.isFile()) {
-      if (toStat != null) {
-        if (!fs.delete(to, true)) {
-          throw new IOException("Failed to delete " + to);
-        }
+    try (DurationInfo d = new DurationInfo(LOG,
+        false,
+        "Merged data from %s to %s", from.getPath(), to)) {

Review comment:
   we actually print this *at start* as well as end; end has timings. So 
you don't need lines 461 & 462

##
File path: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java
##
@@ -455,53 +455,50 @@ protected void commitJobInternal(JobContext context) throws IOException {
    */
   private void mergePaths(FileSystem fs, final FileStatus from,
       final Path to, JobContext context) throws IOException {
-    long timeStartNs = -1L;
-    if (LOG.isDebugEnabled()) {
-      timeStartNs = System.nanoTime();
-      LOG.debug("Merging data from " + from + " to " + to);
-    }
-    reportProgress(context);
-    FileStatus toStat;
-    try {
-      toStat = fs.getFileStatus(to);
-    } catch (FileNotFoundException fnfe) {
-      toStat = null;
-    }
-
-    if (from.isFile()) {
-      if (toStat != null) {
-        if (!fs.delete(to, true)) {
-          throw new IOException("Failed to delete " + to);
-        }
+    try (DurationInfo d = new DurationInfo(LOG,
+        false,
+        "Merged data from %s to %s", from.getPath(), to)) {
+      if (LOG.isDebugEnabled()) {

Review comment:
   cut these; duplicate now







Issue Time Tracking
---

Worklog Id: (was: 542751)
Time Spent: 1h 40m  (was: 1.5h)


[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=542670&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-542670
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 06:17
Start Date: 27/Jan/21 06:17
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-768060327


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  56m 43s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 46s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 46s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 38s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 45s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m  1s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 25s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 20s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 18s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 26s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  13m 45s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 19s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 19s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  findbugs  |   1m 20s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m  2s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 37s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 139m  4s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2624/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2624 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 70605f252ffa 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 80c7404b519 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2624/3/testReport/ |
   | Max. process+thread count | 1676 (vs. ulimit of 5500) |
   | modules | C: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
U: 

[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=542620&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-542620
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 04:00
Start Date: 27/Jan/21 04:00
Worklog Time Spent: 10m 
  Work Description: HeartSaVioR commented on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-768006743


   Thanks for the suggestion. I've addressed the review comment. Most of the 
added/deleted lines are simply indentation changes.





Issue Time Tracking
---

Worklog Id: (was: 542620)
Time Spent: 1h 20m  (was: 1h 10m)




[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=542494&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-542494
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 26/Jan/21 22:58
Start Date: 26/Jan/21 22:58
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-767882952


   Use DurationInfo in a try-with-resources; it will log start and end with 
timings @ debug
   ```java
   try (DurationInfo d = new DurationInfo(LOG,
       "Aborting commit ID %s to path %s", uploadId, destKey)) {
     writeOperations.abortMultipartCommit(destKey, uploadId);
   }
   ```
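
For readers without the Hadoop sources at hand, the try-with-resources timing pattern suggested above can be sketched with a small stand-in class. `TimedScope` and its `Consumer<String>` sink are hypothetical names introduced here; the exact message wording and the real `DurationInfo` constructor overloads are not reproduced and should be checked against the Hadoop source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a DurationInfo-style timer; not the Hadoop class. */
final class TimedScope implements AutoCloseable {
  private final Consumer<String> sink;  // where messages go (a logger in real code)
  private final String text;
  private final long startNs;

  TimedScope(Consumer<String> sink, String format, Object... args) {
    this.sink = sink;
    this.text = String.format(format, args);
    this.startNs = System.nanoTime();
    sink.accept("Starting: " + text);   // logged on entry
  }

  @Override
  public void close() {
    // Runs automatically when the try block exits, normally or via an exception.
    long elapsedMs = (System.nanoTime() - startNs) / 1_000_000L;
    sink.accept(text + ": duration " + elapsedMs + " ms");
  }

  /** Times an empty block and returns the two lines that were emitted. */
  static List<String> demo() {
    List<String> lines = new ArrayList<>();
    try (TimedScope t = new TimedScope(lines::add,
        "Merged data from %s to %s", "/tmp/a", "/out/a")) {
      // the work being timed would go here
    }
    return lines;
  }

  public static void main(String[] args) {
    demo().forEach(System.out::println);
  }
}
```

The appeal of this shape is that the end-of-scope line is emitted even when the timed block throws, without a hand-written `finally`.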





Issue Time Tracking
---

Worklog Id: (was: 542494)
Time Spent: 1h 10m  (was: 1h)




[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=537360&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537360
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 12:28
Start Date: 18/Jan/21 12:28
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-762219366


   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 16s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 44s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 38s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 43s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 12s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 25s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 21s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 19s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 27s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |   0m 27s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  14m 38s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   1m 22s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   7m 16s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 34s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  85m 34s |  |  |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2624/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2624 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 9462477aca03 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 97f843de3a9 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2624/2/testReport/ |
   | Max. process+thread count | 1108 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: 

[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=537337=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537337
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 11:09
Start Date: 18/Jan/21 11:09
Worklog Time Spent: 10m 
  Work Description: HeartSaVioR commented on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-762177025


   -1 from test4tests is expected. Just fixed -1 from checkstyle.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 537337)
Time Spent: 50m  (was: 40m)

> Add latency information in FileOutputCommitter.mergePaths
> -
>
> Key: MAPREDUCE-7317
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7317
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Jungtaek Lim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> We have observed some occurrences of huge delays from file output 
> committer V1, in cases where file output committer V2 is not an option.
> While the root cause should be investigated on our side, there is another 
> issue: there is insufficient information to debug. Most likely the huge 
> delay comes from mergePaths, but the class only provides a "debug" log 
> message that logs the call itself with its parameters, nothing else. 
> mergePaths is called recursively, which makes it harder to trace how much 
> latency a specific directory takes to merge.
> It would be nice and non-intrusive to add latency info in mergePaths, so that 
> we can see how much latency a specific directory takes to merge, only when 
> debug logging is enabled.
> (Ideally it would be nice to log a warning message when the call takes a huge 
> amount of time to process, but I don't have a proper threshold for "huge", 
> so I'd avoid dealing with it altogether here.)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=537296=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537296
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 09:22
Start Date: 18/Jan/21 09:22
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-762110273


   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 32s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 58s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 43s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 36s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 16s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m 14s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 21s | [/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2624/1/artifact/out/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt) |  hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 2 new + 24 unchanged - 0 fixed = 26 total (was 24)  |
   | +1 :green_heart: |  mvnsite  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  15m  0s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 21s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   0m 21s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   1m 16s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   6m 55s |  |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 34s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  84m 38s |  |  |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2624/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2624 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 1618abb73623 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 97f843de3a9 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 

[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=537262=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537262
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 07:59
Start Date: 18/Jan/21 07:59
Worklog Time Spent: 10m 
  Work Description: HeartSaVioR commented on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-762060417


   DISCLOSURE: We encountered such issues with Spark, and we had to dump stack 
traces to confirm the issue. Although I'd prefer logging this at INFO, I agree 
that would be a bit verbose and would affect existing users, so I'm just trying 
to add more information when the DEBUG log level is enabled.





Issue Time Tracking
---

Worklog Id: (was: 537262)
Time Spent: 20m  (was: 10m)

> Add latency information in FileOutputCommitter.mergePaths
> -
>
> Key: MAPREDUCE-7317
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7317
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Jungtaek Lim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=537263=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537263
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 07:59
Start Date: 18/Jan/21 07:59
Worklog Time Spent: 10m 
  Work Description: HeartSaVioR edited a comment on pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624#issuecomment-762060417


   DISCLOSURE: We encountered such issues with Spark, and every time we had to 
dump stack traces to confirm the issue. Although I'd prefer logging this at 
INFO, I agree that would be a bit verbose and would affect existing users, so 
I'm just trying to add more information when the DEBUG log level is enabled.





Issue Time Tracking
---

Worklog Id: (was: 537263)
Time Spent: 0.5h  (was: 20m)

> Add latency information in FileOutputCommitter.mergePaths
> -
>
> Key: MAPREDUCE-7317
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7317
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Jungtaek Lim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (MAPREDUCE-7317) Add latency information in FileOutputCommitter.mergePaths

2021-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7317?focusedWorklogId=537261=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537261
 ]

ASF GitHub Bot logged work on MAPREDUCE-7317:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 07:56
Start Date: 18/Jan/21 07:56
Worklog Time Spent: 10m 
  Work Description: HeartSaVioR opened a new pull request #2624:
URL: https://github.com/apache/hadoop/pull/2624


   This PR proposes adding latency information to 
FileOutputCommitter.mergePaths, so that we can trace how much latency a 
specific directory takes to merge.
   
   This information would be valuable when investigating a commit in 
FileOutputCommitter that takes much longer than expected. The class currently 
logs the call with its from/to parameters at debug level, which is insufficient 
to trace the latency of a specific directory because the call is recursive.
   
   No test is added, as there is nothing to test. Manual testing was done by 
adding the line below to log4j.properties
   
   ```
   log4j.logger.org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter=DEBUG
   ```
   
   and running the tests in TestFileOutputCommitter. Sample output:
   
   ```
   2021-01-18 16:14:03,475 DEBUG [main] output.FileOutputCommitter (FileOutputCommitter.java:mergePaths(461)) - Merging data from DeprecatedRawLocalFileStatus{path=file:/Users/jlim/WorkArea/JavaProjects/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter/_temporary/0/task_200707121733_0001_m_00; isDirectory=true; modification_time=1610954043000; access_time=1610954043000; owner=; group=; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false} to file:/Users/jlim/WorkArea/JavaProjects/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter
   ...
   2021-01-18 16:14:03,476 DEBUG [main] output.FileOutputCommitter (FileOutputCommitter.java:mergePaths(502)) - Merged data from file:/.../hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter/_temporary/0/task_200707121733_0001_m_00 to file:/.../hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter in 1 ms
   ```
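
   For readers unfamiliar with the pattern, a minimal sketch of timing a 
recursive merge and recording the elapsed time only when debug output is 
enabled might look like the following. This is not the code in this PR: it uses 
java.nio.file instead of Hadoop's FileSystem so that it runs standalone, it 
ignores the conflict handling a real committer needs, and the names 
MergeTimingSketch, debugEnabled and debugLog are hypothetical stand-ins for the 
class and its logger.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not the Hadoop FileOutputCommitter.
class MergeTimingSketch {
    // Assumed stand-ins for LOG.isDebugEnabled() and LOG.debug(...).
    static boolean debugEnabled = true;
    static final List<String> debugLog = new ArrayList<>();

    // Recursively moves 'from' into 'to' and records the latency of each call.
    static void mergePaths(Path from, Path to) throws IOException {
        // Only read the clock when the result will actually be reported.
        long startNanos = debugEnabled ? System.nanoTime() : 0L;
        if (Files.isDirectory(from)) {
            Files.createDirectories(to);
            List<Path> children;
            try (Stream<Path> listing = Files.list(from)) {
                children = listing.collect(Collectors.toList());
            }
            for (Path child : children) {
                mergePaths(child, to.resolve(child.getFileName()));
            }
            Files.delete(from); // empty once every child has been moved
        } else {
            Files.move(from, to, StandardCopyOption.REPLACE_EXISTING);
        }
        if (debugEnabled) {
            long elapsedMs = (System.nanoTime() - startNanos) / 1_000_000L;
            debugLog.add("Merged data from " + from + " to " + to
                + " in " + elapsedMs + " ms");
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("merge-sketch");
        Path from = root.resolve("task");
        Files.createDirectories(from.resolve("sub"));
        Files.write(from.resolve("sub").resolve("part-0"), new byte[] {1});
        mergePaths(from, root.resolve("out"));
        debugLog.forEach(System.out::println);
    }
}
```

   Because every recursive call records its own entry, a slow subdirectory 
stands out in the output; note that the time reported for a parent directory 
includes the time spent in its children.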





Issue Time Tracking
---

Worklog Id: (was: 537261)
Remaining Estimate: 0h
Time Spent: 10m

> Add latency information in FileOutputCommitter.mergePaths
> -
>
> Key: MAPREDUCE-7317
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7317
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Jungtaek Lim
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>


