[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-15 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168588#comment-16168588
 ] 

Jason Lowe commented on MAPREDUCE-6958:
---

Thanks for the review, Chris!  I'll update the patch to preserve the existing 
format as best as I can.

bq. The shuffle sizes used to be available in the clienttrace log. Was that 
removed from the ShuffleHandler at some point?

The ShuffleHandler does log the content length in the normal logger, but on a 
separate log line. That is hard to match to a specific request given the 
multithreaded nature of the Netty processing, and the length is not available 
in the audit log itself.


> Shuffle audit logger should log size of shuffle transfer
> 
>
> Key: MAPREDUCE-6958
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6958
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Minor
> Attachments: MAPREDUCE-6958.001.patch, MAPREDUCE-6958.002.patch
>
>
> The shuffle audit logger currently logs the job ID and reducer ID but nothing 
> about the size of the requested transfer.  It calculates this as part of the 
> HTTP response headers, so it would be trivial to log the response size.  This 
> would be very valuable for debugging network traffic storms from the shuffle 
> handler.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-15 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168577#comment-16168577
 ] 

Chris Douglas commented on MAPREDUCE-6958:
--

Sorry to ask for revs on this kind of patch, but this changes the format of the 
audit log in a way that might break downstream consumers. The mapIds are 
printed after the reducer in the revised version. Could this keep the format 
as-is, with the length appended?

The shuffle sizes used to be available in the clienttrace log. Was that removed 
from the ShuffleHandler at some point?
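
As a rough illustration of "keep the format as-is, with the length appended", here is a minimal Java sketch. The thread does not show the real audit line, so the field layout, the method names, and the `length=` token are all assumptions made up for this example.

```java
import java.util.Arrays;
import java.util.List;

public class ShuffleAuditLineSketch {
  // Hypothetical pre-patch layout: job ID, reducer, then the requested map IDs.
  // The actual ShuffleHandler audit format is not quoted in this thread.
  static String legacyLine(String jobId, int reducer, List<String> mapIds) {
    StringBuilder sb = new StringBuilder("shuffle for ");
    sb.append(jobId).append(" reducer ").append(reducer).append(" mappers:");
    for (String mapId : mapIds) {
      sb.append(' ').append(mapId);
    }
    return sb.toString();
  }

  // The new field goes at the very end, so every existing token keeps its
  // position and a consumer that parses the old prefix still works.
  static String lineWithLength(String jobId, int reducer, List<String> mapIds,
      long contentLength) {
    return legacyLine(jobId, reducer, mapIds) + " length=" + contentLength;
  }

  public static void main(String[] args) {
    List<String> mapIds = Arrays.asList("attempt_m_000000_0", "attempt_m_000001_0");
    System.out.println(legacyLine("job_0001", 3, mapIds));
    System.out.println(lineWithLength("job_0001", 3, mapIds, 1048576L));
  }
}
```

The only property that matters here is that the extended line starts with the unchanged legacy line.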







[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168511#comment-16168511
 ] 

Hadoop QA commented on MAPREDUCE-6958:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 13m 35s | trunk passed |
| +1 | compile | 0m 18s | trunk passed |
| +1 | checkstyle | 0m 15s | trunk passed |
| +1 | mvnsite | 0m 21s | trunk passed |
| +1 | findbugs | 0m 25s | trunk passed |
| +1 | javadoc | 0m 14s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 16s | the patch passed |
| +1 | compile | 0m 15s | the patch passed |
| +1 | javac | 0m 15s | the patch passed |
| +1 | checkstyle | 0m 11s | the patch passed |
| +1 | mvnsite | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 28s | the patch passed |
| +1 | javadoc | 0m 10s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 20s | hadoop-mapreduce-client-shuffle in the patch passed. |
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 18m 28s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | MAPREDUCE-6958 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887420/MAPREDUCE-6958.002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux cfd259bb0f16 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / fbe06b5 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7140/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7140/console |
| Powered by | Apache Yetus 0.5.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-15 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168504#comment-16168504
 ] 

Rushabh S Shah commented on MAPREDUCE-6958:
---

+1 lgtm non-binding.







[jira] [Updated] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-15 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-6958:
--
Attachment: MAPREDUCE-6958.002.patch

Thanks for the reviews!  I updated the patch to print the ArrayList directly.







[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2017-09-15 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168439#comment-16168439
 ] 

Jason Lowe commented on MAPREDUCE-4980:
---

bq. it's patching code that isn't (last I checked) in branch-2.

I'm confused by that statement.  The latest 636K patch applies about 591K of 
it as-is on branch-2.  Granted I didn't try to build it, but I'm curious what 
part makes this patch inapplicable to branch-2.  It's interesting to note that 
some of the patch conflicts are context issues because JIRAs like HADOOP-14729 
and HADOOP-10392 were applicable to branch-2 but not applied there.

There are a _lot_ of whitespace changes in this patch, so I did a comparison of 
file sizes both with and without the whitespace changes.  If the patch focuses 
just on fixing the problems with the tests wrt. making them run in parallel, 
the patch is around 293K.  (I can post it if interested.)  That's less than 
half the size of the current patch, making it easier to review, easier to port 
to other branches, and more durable against other checkins to trunk.  If we 
really want to clean up whitespace, that seems better done and discussed in a 
separate JIRA given the extent of the changes rather than negatively impacting 
this one.

Other comments on patch 015:

Commented out code in the following places should be removed:
{code}
@@ -130,8 +135,8 @@ public void testMRTimelineEventHandling() throws Exception {
   + cluster.getApplicationHistoryServer().getPort());
   TimelineStore ts = cluster.getApplicationHistoryServer()
   .getTimelineStore();
-  String localPathRoot = System.getProperty("test.build.data",
-  "build/test/data");
+  String localPathRoot = GenericTestUtils.getRandomizedTempPath();
+//getTempPath(TestMRTimelineEventHandling.class.getSimpleName());
   Path inDir = new Path(localPathRoot, "input");
   Path outDir = new Path(localPathRoot, "output");
   RunningJob job =
@@ -176,9 +181,9 @@ public void testMRTimelineEventHandling() throws Exception {
   public void testMRNewTimelineServiceEventHandling() throws Exception {
 LOG.info("testMRNewTimelineServiceEventHandling start.");
 
-String testDir =
-new File("target", getClass().getSimpleName() +
-"-test_dir").getAbsolutePath();
+String testDir = GenericTestUtils.getRandomizedTestDir().getAbsolutePath();
+//new File("target", getClass().getSimpleName() +
+//"-test_dir").getAbsolutePath();
 String storageDir =
 testDir + File.separator + "timeline_service_data";
{code}
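
For context on why the hunks above swap fixed paths such as "build/test/data" for randomized ones: tests that run in parallel presumably need directories that no other test touches. The sketch below shows that idea with plain JDK calls only. It is not the implementation of {{GenericTestUtils.getRandomizedTempPath()}}, and the helper name {{newTestDir}} is made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RandomizedTestDirSketch {
  // Hypothetical helper: returns a fresh, uniquely named directory per call,
  // so two tests running at the same time never write to the same path.
  static Path newTestDir(String testName) {
    try {
      // The JDK appends a random suffix to the prefix and creates the directory.
      return Files.createTempDirectory(testName + "-");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path first = newTestDir("TestMRTimelineEventHandling");
    Path second = newTestDir("TestMRTimelineEventHandling");
    System.out.println(first);
    System.out.println(second);
    System.out.println("distinct: " + !first.equals(second));
  }
}
```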

There's some TBD stuff that was added in TestMultipleLevelCaching and TestPipes 
that should either be addressed or removed if unnecessary, i.e.:
{code}
-mr.waitUntilIdle();
-mr.shutdown();
+// TBD mr.waitUntilIdle();
+mr.stop();
{code}

and

{code}
-  mr.waitUntilIdle();
+  //TBD mr.waitUntilIdle();
{code}

I really like your idea of unsetting HADOOP_CONF_DIR in the pom if we can get 
away with that for the tests.  Will help keep dev environments from 
contaminating the unit tests in unexpected ways.  Best done as a separate JIRA? 
 Seems like that would apply to more than just the jobclient tests.

Thanks again for the hard work pushing this forward.  It's far from glamorous 
work, but it really is appreciated.

> Parallel test execution of hadoop-mapreduce-client-core
> ---
>
> Key: MAPREDUCE-4980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>  Components: test
>Affects Versions: 3.0.0-alpha1
>Reporter: Tsuyoshi Ozawa
>Assignee: Andrey Klochkov
> Attachments: MAPREDUCE-4980.010.patch, MAPREDUCE-4980.011.patch, 
> MAPREDUCE-4980.012.patch, MAPREDUCE-4980.013.patch, MAPREDUCE-4980.014.patch, 
> MAPREDUCE-4980.015.patch, MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
> MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, 
> MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n8.patch, 
> MAPREDUCE-4980.patch
>
>
> The Maven Surefire plugin supports a parallel testing feature. By using it, 
> the tests can be run faster.






[jira] [Updated] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area

2017-09-15 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated MAPREDUCE-6954:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-beta1
   Status: Resolved  (was: Patch Available)

Thanks [~pbacsko].  Committed to trunk and branch-3.0!

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 3.0.0-alpha4
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Fix For: 3.0.0-beta1
>
> Attachments: MAPREDUCE-6954-001.patch, MAPREDUCE-6954-002.patch, 
> MAPREDUCE-6954-003.patch, MAPREDUCE-6954-004.patch
>
>
> Depending on the encoder/decoder used and the type of MR workload, EC might 
> negatively affect the performance of an MR job if too many files are 
> localized.
> In such a scenario, users might want to disable EC in the staging area to 
> speed up the execution.






[jira] [Commented] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area

2017-09-15 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168352#comment-16168352
 ] 

Robert Kanter commented on MAPREDUCE-6954:
--

+1

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 3.0.0-alpha4
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6954-001.patch, MAPREDUCE-6954-002.patch, 
> MAPREDUCE-6954-003.patch, MAPREDUCE-6954-004.patch
>
>
> Depending on the encoder/decoder used and the type of MR workload, EC might 
> negatively affect the performance of an MR job if too many files are 
> localized.
> In such a scenario, users might want to disable EC in the staging area to 
> speed up the execution.






[jira] [Commented] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter

2017-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168103#comment-16168103
 ] 

Hudson commented on MAPREDUCE-6956:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12883 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/12883/])
MAPREDUCE-6956 FileOutputCommitter to gain abstract superclass (stevel: rev 
11390c2d111910b01d9c4d3e39dee49babae272f)
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl.java
* (add) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/PathOutputCommitter.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java
* (add) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/output/TestPathOutputCommitter.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/JobContextImpl.java


> FileOutputCommitter to gain abstract superclass PathOutputCommitter
> ---
>
> Key: MAPREDUCE-6956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6956
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 3.0.0-beta1
>
> Attachments: MAPREDUCE-6956-001.patch, MAPREDUCE-6956-002.patch
>
>
> This is the initial step of MAPREDUCE-6823, which proposes a factory behind 
> {{FileOutputFormat}} to create different committers for different 
> filesystems, if so configured.
> This patch simply adds the new abstract superclass of 
> {{FileOutputCommitter}}, {{PathOutputCommitter extends OutputCommitter}}. 
> This abstract class adds the {{getWorkPath()}} method as an abstract method, 
> with {{FileOutputCommitter}} being the implementation.
> {{FileOutputFormat}} then relaxes its requirement of any committer returned 
> by {{getOutputCommitter()}}, so that instead of requiring a 
> {{FileOutputCommitter}} or subclass, it only needs a {{PathOutputCommitter}}, 
> using {{PathOutputCommitter.getWorkPath()}} to get the work path.
> What does that do?
> It allows people to implement subclasses of {{FileOutputFormat}} which can 
> provide their own committers *which don't need to inherit the complexity that 
> FileOutputCommitter has acquired over time*.
> Currently anyone implementing a new committer (example: Netflix S3 committer) 
> needs to subclass {{FileOutputCommitter}}, which is too complex to understand 
> except under a debugger: it has co-recursive routines, lots of methods which 
> need to be overridden to guarantee a safe subclass, and, because of its 
> critical role and known subclassing, it isn't ever going to be cleaned up.
> A new, lean parent class which {{FileOutputFormat}} can handle allows people 
> to write new committers which don't have to worry about implementation 
> details of {{FileOutputCommitter}}, but instead about how well they implement 
> the semantics of committing work.
> The full MAPREDUCE-6823 goes beyond this with a change to 
> {{FileOutputFormat}} for a factory for creating FS-specific 
> {{PathOutputCommitter}} instances. This patch doesn't include that, as that 
> is something which needs to be reviewed in the context of HADOOP-13786 and 
> ideally at least one committer for another store, so people can say "this 
> factory model works".
> All I'm proposing here is: tune the committer class hierarchy in MRv2 so that 
> people can more easily implement committers and, when that factory is done, 
> switch to it easily. And I'd like this in branch-3 from the 
> outset, so existing code which calls {{FileOutputFormat.getCommitter()}} to 
> get a {{FileOutputCommitter}} *just to call getWorkPath()* can move to the 
> new interface across all of Hadoop 3.
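
The class hierarchy described in the issue can be sketched roughly as follows. This is a simplified, self-contained model and not the Hadoop source: the nested classes only borrow the real names, {{getWorkPath()}} returns a plain String here rather than a Hadoop path type, the "_temporary" layout is a placeholder, and {{DirectStoreCommitter}} and {{workFileFor}} are hypothetical names invented for the example.

```java
public class CommitterHierarchySketch {
  // Stand-in for the existing base class, reduced to a single method.
  abstract static class OutputCommitter {
    abstract void commitTask(String taskAttemptId);
  }

  // The new lean parent: it only adds the abstract work-path accessor.
  abstract static class PathOutputCommitter extends OutputCommitter {
    abstract String getWorkPath();
  }

  // The existing committer becomes one implementation among several.
  static class FileOutputCommitter extends PathOutputCommitter {
    private final String outputPath;

    FileOutputCommitter(String outputPath) {
      this.outputPath = outputPath;
    }

    @Override
    String getWorkPath() {
      return outputPath + "/_temporary"; // placeholder layout for the sketch
    }

    @Override
    void commitTask(String taskAttemptId) {
      // Promotion of task output is omitted in this sketch.
    }
  }

  // Hypothetical store-specific committer that inherits none of the
  // FileOutputCommitter implementation.
  static class DirectStoreCommitter extends PathOutputCommitter {
    private final String workPath;

    DirectStoreCommitter(String workPath) {
      this.workPath = workPath;
    }

    @Override
    String getWorkPath() {
      return workPath;
    }

    @Override
    void commitTask(String taskAttemptId) {
      // Store-specific commit logic would go here.
    }
  }

  // Models the relaxed requirement: any PathOutputCommitter is enough to
  // resolve where a task should write its files.
  static String workFileFor(PathOutputCommitter committer, String fileName) {
    return committer.getWorkPath() + "/" + fileName;
  }

  public static void main(String[] args) {
    System.out.println(workFileFor(new FileOutputCommitter("/out"), "part-00000"));
    System.out.println(workFileFor(new DirectStoreCommitter("/staging/job1"), "part-00000"));
  }
}
```

The design point is that {{workFileFor}} depends only on the lean parent type, so a new committer never has to subclass {{FileOutputCommitter}}.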






[jira] [Updated] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter

2017-09-15 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-6956:
--
   Resolution: Fixed
Fix Version/s: 3.0.0-beta1
   Status: Resolved  (was: Patch Available)

Committed to 3.0-beta; now I can work on the next detail: what's a good API for 
letting people define different committers for different filesystems?







[jira] [Commented] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter

2017-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168039#comment-16168039
 ] 

Hadoop QA commented on MAPREDUCE-6956:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 12m 41s | trunk passed |
| +1 | compile | 0m 22s | trunk passed |
| +1 | checkstyle | 0m 16s | trunk passed |
| +1 | mvnsite | 0m 26s | trunk passed |
| +1 | findbugs | 0m 43s | trunk passed |
| +1 | javadoc | 0m 18s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 21s | the patch passed |
| +1 | compile | 0m 21s | the patch passed |
| -1 | javac | 0m 21s | hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core generated 3 new + 162 unchanged - 0 fixed = 165 total (was 162) |
| +1 | checkstyle | 0m 15s | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 0 new + 93 unchanged - 12 fixed = 93 total (was 105) |
| +1 | mvnsite | 0m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 51s | the patch passed |
| +1 | javadoc | 0m 16s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 33s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | asflicense | 0m 11s | The patch does not generate ASF License warnings. |
| | | 20m 47s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | MAPREDUCE-6956 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887361/MAPREDUCE-6956-002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux f1b77e584af6 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 78bdf10 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| javac | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7139/artifact/patchprocess/diff-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7139/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7139/console |
| Powered by | Apache Yetus 0.5.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Comment Edited] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-15 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168013#comment-16168013
 ] 

Rushabh S Shah edited comment on MAPREDUCE-6958 at 9/15/17 3:19 PM:


Overall the patch looks good.
Just one very minor nit.
{noformat}
for (String mapId : mapIds) {
  sb.append(" ");
  sb.append(mapId);
}
{noformat}
Instead of iterating over each {{mapId}}, we can just print {{mapIds}} 
directly. That is simply less code.


was (Author: shahrs87):
Overall the patch looks good.
Just one very minor nit.
{quote}
+for (String mapId : mapIds) {
+  sb.append(" ");
+  sb.append(mapId);
+}
{quote}
Instead of iterating over each {{mapId}}, we can just print {{mapIds}} 
directly. That is simply less code.

> Shuffle audit logger should log size of shuffle transfer
> 
>
> Key: MAPREDUCE-6958
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6958
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Minor
> Attachments: MAPREDUCE-6958.001.patch
>
>
> The shuffle audit logger currently logs the job ID and reducer ID but nothing 
> about the size of the requested transfer.  It calculates this as part of the 
> HTTP response headers, so it would be trivial to log the response size.  This 
> would be very valuable for debugging network traffic storms from the shuffle 
> handler.






[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-15 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168013#comment-16168013
 ] 

Rushabh S Shah commented on MAPREDUCE-6958:
---

Overall the patch looks good.
Just one very minor nit.
{quote}
+for (String mapId : mapIds) {
+  sb.append(" ");
+  sb.append(mapId);
+}
{quote}
Instead of looping over each {{mapId}}, we can just print {{mapIds}} directly. 
That would be less code.
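
For illustration, here is a minimal sketch of the trade-off behind this nit. It is not the code from the patch: the class name {{AuditLineSketch}}, the "mappers:" prefix, and the sample attempt IDs are assumptions made up for the example. It compares the explicit loop with appending the list itself, and with {{String.join}} as a third option.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; not part of ShuffleHandler or the patch.
class AuditLineSketch {

  // Explicit loop, in the style quoted above: a space before each ID.
  static String withLoop(List<String> mapIds) {
    StringBuilder sb = new StringBuilder("mappers:");
    for (String mapId : mapIds) {
      sb.append(" ");
      sb.append(mapId);
    }
    return sb.toString();
  }

  // Appending the list itself relies on List.toString(),
  // which adds brackets and comma separators.
  static String withListToString(List<String> mapIds) {
    return new StringBuilder("mappers: ").append(mapIds).toString();
  }

  // String.join keeps the space-separated form without a loop
  // (identical to withLoop for a non-empty list).
  static String withJoin(List<String> mapIds) {
    return "mappers: " + String.join(" ", mapIds);
  }

  public static void main(String[] args) {
    List<String> ids = Arrays.asList("attempt_1_m_000001_0", "attempt_1_m_000002_0");
    System.out.println(withLoop(ids));
    System.out.println(withListToString(ids));
    System.out.println(withJoin(ids));
  }
}
```

Printing the list directly is shorter, but it changes the separators in the log line, which matters if something parses the existing audit format.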

> Shuffle audit logger should log size of shuffle transfer
> 
>
> Key: MAPREDUCE-6958
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6958
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Minor
> Attachments: MAPREDUCE-6958.001.patch
>
>
> The shuffle audit logger currently logs the job ID and reducer ID but nothing 
> about the size of the requested transfer.  It calculates this as part of the 
> HTTP response headers, so it would be trivial to log the response size.  This 
> would be very valuable for debugging network traffic storms from the shuffle 
> handler.






[jira] [Updated] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter

2017-09-15 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-6956:
--
Attachment: MAPREDUCE-6956-002.patch

thanks, this is the patch I'm committing: checkstyle is now happy with the 
patch as far as I can tell

> FileOutputCommitter to gain abstract superclass PathOutputCommitter
> ---
>
> Key: MAPREDUCE-6956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6956
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: MAPREDUCE-6956-001.patch, MAPREDUCE-6956-002.patch
>
>
> This is the initial step of MAPREDUCE-6823, which proposes a factory behind 
> {{FileOutputFormat}} to create different committers for different 
> filesystems, if so configured.
> This patch simply adds the new abstract superclass of 
> {{FileOutputCommitter}}, {{PathOutputCommitter extends OutputCommitter}}. 
> This abstract class adds the {{getWorkPath()}} method as an abstract method, 
> with {{FileOutputCommitter}} being the implementation.
> {{FileOutputFormat}} then relaxes its requirement of any committer returned 
> by {{getOutputCommitter()}}, so that instead of requiring a  
> {{FileOutputCommitter}} or subclass, it only needs a {{PathOutputCommitter}}, 
> using {{PathOutputCommitter.getWorkPath()}} to get the work path.
> What does that do?
> It allows people to implement subclasses of {{FileOutputFormat}} which can 
> provide their own committers *which don't need to inherit the complexity that 
> FileOutputCommitter has acquired over time*
> Currently anyone implementing a new committer (example: Netflix S3 committer) 
> needs to subclass {{FileOutputCommitter}}, which is too complex to understand 
> except under a debugger with co-recursive routines, lots of methods which 
> need to be overridden to guarantee a safe subclass, and, because of its 
> critical role and known subclassing, something which isn't ever going to be 
> cleaned up.
> A new, lean, parent class which {{FileOutputFormat}} can handle allows people 
> to write new committers which don't have to worry about implementation 
> details of {{FileOutputCommitter}}, but instead how well they implement the 
> semantics of committing work.
> The full MAPREDUCE-6823 goes beyond this with a change to 
> {{FileOutputFormat}} for a factory for creating FS-specific 
> {{PathOutputCommitter}} instances. This patch doesn't include that, as that 
> is something which needs to be reviewed in the context of HADOOP-13786 and 
> ideally 1+ committer for another store, so people can say "this factory model 
> works".
> All I'm proposing here is: tune the committer class hierarchy in MRv2 so that 
> people can more easily implement committers, and when that factory is done, 
> for it to be switched to easily. And I'd like this in branch-3 from the 
> outset, so existing code which calls {{FileOutputFormat.getOutputCommitter()}} to 
> get a {{FileOutputCommitter}} *just to call getWorkPath()* can move to the 
> new interface across all of Hadoop 3.
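
The hierarchy described above can be sketched roughly as follows. This is an illustration only, not the patch: every class here is a simplified stand-in with a {{Sketch}} suffix, {{java.nio.file.Path}} replaces Hadoop's own {{Path}}, the work path is reduced to a single {{_temporary}} child directory, and throwing on a committer that is not a {{PathOutputCommitter}} is a choice made for the sketch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Simplified stand-in for Hadoop's OutputCommitter (hypothetical).
abstract class OutputCommitterSketch {
  abstract void commitTask();
}

// The lean parent class: it adds only the work-path contract.
abstract class PathOutputCommitterSketch extends OutputCommitterSketch {
  abstract Path getWorkPath();
}

// The existing committer becomes one implementation among several.
class FileOutputCommitterSketch extends PathOutputCommitterSketch {
  private final Path workPath;

  FileOutputCommitterSketch(Path outputDir) {
    // Simplification: the real work path is derived per task attempt.
    this.workPath = outputDir.resolve("_temporary");
  }

  @Override Path getWorkPath() { return workPath; }
  @Override void commitTask() { /* rename-based commit elided */ }
}

// A new committer that does not inherit FileOutputCommitter's internals.
class ObjectStoreCommitterSketch extends PathOutputCommitterSketch {
  private final Path workPath;

  ObjectStoreCommitterSketch(Path workPath) { this.workPath = workPath; }

  @Override Path getWorkPath() { return workPath; }
  @Override void commitTask() { /* store-specific commit elided */ }
}

// The output format only needs the PathOutputCommitter contract.
class FileOutputFormatSketch {
  static Path workPathOf(OutputCommitterSketch committer) {
    if (committer instanceof PathOutputCommitterSketch) {
      return ((PathOutputCommitterSketch) committer).getWorkPath();
    }
    // Assumption of this sketch: reject committers without a work path.
    throw new IllegalStateException("Committer has no work path: " + committer);
  }

  public static void main(String[] args) {
    System.out.println(workPathOf(new FileOutputCommitterSketch(Paths.get("out"))));
    System.out.println(workPathOf(new ObjectStoreCommitterSketch(Paths.get("staging", "task_0"))));
  }
}
```

The point of the design is visible in {{workPathOf}}: the caller depends on one abstract method rather than on the concrete {{FileOutputCommitter}} class.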






[jira] [Updated] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter

2017-09-15 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-6956:
--
Status: Patch Available  (was: Open)

> FileOutputCommitter to gain abstract superclass PathOutputCommitter
> ---
>
> Key: MAPREDUCE-6956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6956
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: MAPREDUCE-6956-001.patch, MAPREDUCE-6956-002.patch
>
>






[jira] [Updated] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter

2017-09-15 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-6956:
--
Status: Open  (was: Patch Available)

> FileOutputCommitter to gain abstract superclass PathOutputCommitter
> ---
>
> Key: MAPREDUCE-6956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6956
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: MAPREDUCE-6956-001.patch
>
>






[jira] [Commented] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area

2017-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167776#comment-16167776
 ] 

Hadoop QA commented on MAPREDUCE-6954:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
44s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | MAPREDUCE-6954 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887325/MAPREDUCE-6954-004.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  findbugs  checkstyle  |
| uname | Linux 8a7126aa2100 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 50764ef |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7138/testReport/ |
| modules | C: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
U: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7138/console |
| Powered by | Apache Yetus 0.5.0   http://yetus.apache.org |


This message was automatically generated.



> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 3.0.0-alpha4
>  

[jira] [Commented] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area

2017-09-15 Thread Peter Bacsko (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167751#comment-16167751
 ] 

Peter Bacsko commented on MAPREDUCE-6954:
-

I had to upload a new patch because I forgot to modify mapred-default.xml.

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 3.0.0-alpha4
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6954-001.patch, MAPREDUCE-6954-002.patch, 
> MAPREDUCE-6954-003.patch, MAPREDUCE-6954-004.patch
>
>
> Depending on the encoder/decoder used and the type of MR workload, EC might 
> negatively affect the performance of an MR job if too many files are 
> localized.
> In such a scenario, users might want to disable EC in the staging area to 
> speed up the execution.






[jira] [Updated] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area

2017-09-15 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated MAPREDUCE-6954:

Attachment: MAPREDUCE-6954-004.patch

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 3.0.0-alpha4
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6954-001.patch, MAPREDUCE-6954-002.patch, 
> MAPREDUCE-6954-003.patch, MAPREDUCE-6954-004.patch
>
>






[jira] [Commented] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area

2017-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167742#comment-16167742
 ] 

Hadoop QA commented on MAPREDUCE-6954:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
7s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 49s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | MAPREDUCE-6954 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887316/MAPREDUCE-6954-003.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  findbugs  checkstyle  |
| uname | Linux 6d897802dcb4 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 50764ef |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7137/testReport/ |
| modules | C: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
U: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7137/console |
| Powered by | Apache Yetus 0.5.0   http://yetus.apache.org |


This message was automatically generated.



> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 3.0.0-alpha4
>

[jira] [Commented] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area

2017-09-15 Thread Peter Bacsko (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167708#comment-16167708
 ] 

Peter Bacsko commented on MAPREDUCE-6954:
-

I changed the boolean parameter as you requested. I agree it was confusing.
The default value of the property is now "false" instead of "true"; that is, 
it will try to disable EC by default.
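
A minimal sketch of how such a flag might be read, assuming the property is an "EC enabled" switch whose default is "false". The key name {{example.staging.ec.enabled}} and the class {{StagingEcSketch}} are hypothetical and are not taken from the patch; {{java.util.Properties}} stands in for the job configuration.

```java
import java.util.Properties;

// Hypothetical illustration; not the code from MAPREDUCE-6954.
class StagingEcSketch {

  // Made-up key name, used only for this example.
  static final String EC_ENABLED_KEY = "example.staging.ec.enabled";

  // With the default of "false", an unset property means
  // "try to disable EC on the staging area".
  static boolean shouldDisableEc(Properties conf) {
    boolean ecEnabled =
        Boolean.parseBoolean(conf.getProperty(EC_ENABLED_KEY, "false"));
    return !ecEnabled;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(shouldDisableEc(conf)); // property unset
    conf.setProperty(EC_ENABLED_KEY, "true");
    System.out.println(shouldDisableEc(conf)); // explicitly enabled
  }
}
```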

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 3.0.0-alpha4
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6954-001.patch, MAPREDUCE-6954-002.patch, 
> MAPREDUCE-6954-003.patch
>
>






[jira] [Updated] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area

2017-09-15 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated MAPREDUCE-6954:

Attachment: MAPREDUCE-6954-003.patch

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 3.0.0-alpha4
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6954-001.patch, MAPREDUCE-6954-002.patch, 
> MAPREDUCE-6954-003.patch
>
>






[jira] [Commented] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter

2017-09-15 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167582#comment-16167582
 ] 

Steve Loughran commented on MAPREDUCE-6956:
---

Great, thanks for the review. I'll do the cleanup today. All the javac 
complaints are deprecation warnings on subclassed methods, so I can't fix them. 
Checkstyle: yes.

> FileOutputCommitter to gain abstract superclass PathOutputCommitter
> ---
>
> Key: MAPREDUCE-6956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6956
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: MAPREDUCE-6956-001.patch
>
>


