[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer
[ https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168588#comment-16168588 ] Jason Lowe commented on MAPREDUCE-6958: --- Thanks for the review, Chris! I'll update the patch to preserve the existing format as best as I can. bq. The shuffle sizes used to be available in the clienttrace log. Was that removed from the ShuffleHandler at some point? It does log the content length in the normal logger, but it's on a separate log line which isn't very friendly given the multithreaded nature of the netty processing and also not available in the audit log itself. > Shuffle audit logger should log size of shuffle transfer > > > Key: MAPREDUCE-6958 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6958 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Minor > Attachments: MAPREDUCE-6958.001.patch, MAPREDUCE-6958.002.patch > > > The shuffle audit logger currently logs the job ID and reducer ID but nothing > about the size of the requested transfer. It calculates this as part of the > HTTP response headers, so it would be trivial to log the response size. This > would be very valuable for debugging network traffic storms from the shuffle > handler. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer
[ https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168577#comment-16168577 ] Chris Douglas commented on MAPREDUCE-6958: -- Sorry to ask for revs on this kind of patch, but this changes the format of the audit log in a way that might break downstream consumers. The mapIds are printed after the reducer in the revised version. Could this keep the format as-is, with the length appended? The shuffle sizes used to be available in the clienttrace log. Was that removed from the ShuffleHandler at some point?
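The format concern raised in this comment can be sketched in Java. This is only an illustration, not code from either attached patch: the class name `ShuffleAuditLine`, both method names, and the exact wording of the log line (`shuffle for`, `reducer`, `length`) are assumptions made for the example. It shows one way to append the transfer size while leaving the existing prefix untouched, so that a consumer matching on the old prefix keeps working.

```java
// Hypothetical sketch; the real ShuffleHandler audit format may differ.
class ShuffleAuditLine {

  // Assumed existing format: job ID followed by reducer ID.
  static String legacy(String jobId, int reduceId) {
    return "shuffle for " + jobId + " reducer " + reduceId;
  }

  // New format: the legacy line, unchanged, with the transfer size appended.
  static String withLength(String jobId, int reduceId, long contentLength) {
    return legacy(jobId, reduceId) + " length " + contentLength;
  }

  public static void main(String[] args) {
    String oldLine = legacy("job_1_0001", 3);
    String newLine = withLength("job_1_0001", 3, 4096L);
    System.out.println(newLine);
    // A consumer that matches on the old prefix still matches the new line.
    System.out.println(newLine.startsWith(oldLine));
  }
}
```

Appending rather than inserting is the design choice under discussion: any new field placed between the job ID and the reducer ID would shift positions that existing parsers may rely on.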
[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer
[ https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168511#comment-16168511 ] Hadoop QA commented on MAPREDUCE-6958: --

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 13m 35s | trunk passed |
| +1 | compile | 0m 18s | trunk passed |
| +1 | checkstyle | 0m 15s | trunk passed |
| +1 | mvnsite | 0m 21s | trunk passed |
| +1 | findbugs | 0m 25s | trunk passed |
| +1 | javadoc | 0m 14s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 16s | the patch passed |
| +1 | compile | 0m 15s | the patch passed |
| +1 | javac | 0m 15s | the patch passed |
| +1 | checkstyle | 0m 11s | the patch passed |
| +1 | mvnsite | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 28s | the patch passed |
| +1 | javadoc | 0m 10s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 20s | hadoop-mapreduce-client-shuffle in the patch passed. |
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 18m 28s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | MAPREDUCE-6958 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887420/MAPREDUCE-6958.002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux cfd259bb0f16 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / fbe06b5 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7140/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7140/console |
| Powered by | Apache Yetus 0.5.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer
[ https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168504#comment-16168504 ] Rushabh S Shah commented on MAPREDUCE-6958: --- +1 lgtm non-binding.
[jira] [Updated] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer
[ https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-6958: -- Attachment: MAPREDUCE-6958.002.patch Thanks for the reviews! I updated the patch to print the ArrayList directly.
[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168439#comment-16168439 ] Jason Lowe commented on MAPREDUCE-4980: --- bq. it's patching code that isn't (last I checked) in branch-2. I'm confused by that statement. The latest 636K patch applies about 591K of it as-is on branch-2. Granted, I didn't try to build it, but I'm curious what part makes this patch inapplicable to branch-2. It's interesting to note that some of the patch conflicts are context issues because JIRAs like HADOOP-14729 and HADOOP-10392 were applicable to branch-2 but not applied there. There are a _lot_ of whitespace changes in this patch, so I did a comparison of file sizes both with and without the whitespace changes. If the patch focuses just on fixing the problems with the tests with respect to making them run in parallel, the patch is around 293K. (I can post it if interested.) That's less than half the size of the current patch, making it easier to review, easier to port to other branches, and more durable against other checkins to trunk. If we really want to clean up whitespace, that seems better done and discussed in a separate JIRA, given the extent of the changes, rather than negatively impacting this one. 
Other comments on patch 015:

Commented-out code in the following places should be removed:

{code}
@@ -130,8 +135,8 @@ public void testMRTimelineEventHandling() throws Exception {
           + cluster.getApplicationHistoryServer().getPort());
       TimelineStore ts = cluster.getApplicationHistoryServer()
           .getTimelineStore();
-      String localPathRoot = System.getProperty("test.build.data",
-          "build/test/data");
+      String localPathRoot = GenericTestUtils.getRandomizedTempPath();
+//getTempPath(TestMRTimelineEventHandling.class.getSimpleName());
       Path inDir = new Path(localPathRoot, "input");
       Path outDir = new Path(localPathRoot, "output");
       RunningJob job =
@@ -176,9 +181,9 @@ public void testMRTimelineEventHandling() throws Exception {
   public void testMRNewTimelineServiceEventHandling() throws Exception {
     LOG.info("testMRNewTimelineServiceEventHandling start.");
-String testDir =
-new File("target", getClass().getSimpleName() +
-"-test_dir").getAbsolutePath();
+String testDir = GenericTestUtils.getRandomizedTestDir().getAbsolutePath();
+//new File("target", getClass().getSimpleName() +
+//"-test_dir").getAbsolutePath();
     String storageDir = testDir + File.separator + "timeline_service_data";
{code}

There's some TBD stuff added in TestMultipleLevelCaching and TestPipes that should either be addressed or removed if unnecessary, e.g.:

{code}
-mr.waitUntilIdle();
-mr.shutdown();
+// TBD mr.waitUntilIdle();
+mr.stop();
and
-  mr.waitUntilIdle();
+  //TBD mr.waitUntilIdle();
{code}

I really like your idea of unsetting HADOOP_CONF_DIR in the pom if we can get away with that for the tests. It will help keep dev environments from contaminating the unit tests in unexpected ways. Is that best done as a separate JIRA? It seems like that would apply to more than just the jobclient tests.

Thanks again for the hard work pushing this forward. It's far from glamorous work, but it really is appreciated.
> Parallel test execution of hadoop-mapreduce-client-core > --- > > Key: MAPREDUCE-4980 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Affects Versions: 3.0.0-alpha1 >Reporter: Tsuyoshi Ozawa >Assignee: Andrey Klochkov > Attachments: MAPREDUCE-4980.010.patch, MAPREDUCE-4980.011.patch, > MAPREDUCE-4980.012.patch, MAPREDUCE-4980.013.patch, MAPREDUCE-4980.014.patch, > MAPREDUCE-4980.015.patch, MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, > MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, > MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n8.patch, > MAPREDUCE-4980.patch > > > The Maven Surefire plugin supports parallel test execution. By using it, the > tests can be run faster.
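The quoted diff replaces fixed test directories with randomized ones so that tests running in parallel do not write to the same path. A minimal sketch of that idea using only the JDK follows. It is not the implementation of the `GenericTestUtils` methods named in the diff; the class `RandomizedTestDirs` and the method `newTestDir` are hypothetical names invented for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not Hadoop's GenericTestUtils.
class RandomizedTestDirs {

  // Creates a new, uniquely named directory under the JVM's default
  // temp location, prefixed with the test class name for readability.
  static Path newTestDir(Class<?> testClass) {
    try {
      return Files.createTempDirectory(testClass.getSimpleName() + "-");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path first = newTestDir(RandomizedTestDirs.class);
    Path second = newTestDir(RandomizedTestDirs.class);
    // Two calls for the same test class yield two distinct directories.
    System.out.println(!first.equals(second));
    first.toFile().delete();
    second.toFile().delete();
  }
}
```

With a fixed path such as one derived only from the class name, two forked test JVMs running the same class would share a directory; a unique directory per call avoids that collision at the cost of needing cleanup.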
[jira] [Updated] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area
[ https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated MAPREDUCE-6954: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-beta1 Status: Resolved (was: Patch Available) Thanks [~pbacsko]. Committed to trunk and branch-3.0! > Disable erasure coding for files that are uploaded to the MR staging area > - > > Key: MAPREDUCE-6954 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: client >Affects Versions: 3.0.0-alpha4 >Reporter: Peter Bacsko >Assignee: Peter Bacsko > Fix For: 3.0.0-beta1 > > Attachments: MAPREDUCE-6954-001.patch, MAPREDUCE-6954-002.patch, > MAPREDUCE-6954-003.patch, MAPREDUCE-6954-004.patch > > > Depending on the encoder/decoder used and the type of MR workload, EC might > negatively affect the performance of an MR job if too many files are > localized. > In such a scenario, users might want to disable EC in the staging area to > speed up the execution.
[jira] [Commented] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area
[ https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168352#comment-16168352 ] Robert Kanter commented on MAPREDUCE-6954: -- +1
[jira] [Commented] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter
[ https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168103#comment-16168103 ] Hudson commented on MAPREDUCE-6956: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12883 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/12883/]) MAPREDUCE-6956 FileOutputCommitter to gain abstract superclass (stevel: rev 11390c2d111910b01d9c4d3e39dee49babae272f) * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl.java * (add) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/PathOutputCommitter.java * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java * (add) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/output/TestPathOutputCommitter.java * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.java * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/JobContextImpl.java > FileOutputCommitter to gain abstract superclass PathOutputCommitter > --- > > Key: MAPREDUCE-6956 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6956 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mrv2 >Affects Versions: 3.0.0-beta1 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 3.0.0-beta1 > > Attachments: MAPREDUCE-6956-001.patch, MAPREDUCE-6956-002.patch > > > This is the initial step of MAPREDUCE-6823, which proposes a factory behind > {{FileOutputFormat}} to create different committers for different > 
filesystems, if so configured. > This patch simply adds the new abstract superclass of > {{FileOutputCommitter}}, {{PathOutputCommitter extends OutputCommitter}}. > This abstract class adds the {{getWorkPath()}} method as an abstract method, > with {{FileOutputCommitter}} being the implementation. > {{FileOutputFormat}} then relaxes its requirement of any committer returned > by {{getOutputCommitter()}}, so that instead of requiring a > {{FileOutputCommitter}} or subclass, it only needs a {{PathOutputCommitter}}, > using {{PathOutputCommitter.getWorkPath()}} to get the work path. > What does that do? > It allows people to implement subclasses of {{FileOutputFormat}} which can > provide their own committers *which don't need to inherit the complexity that > FileOutputCommitter has acquired over time* > Currently anyone implementing a new committer (example: Netflix S3 committer) > needs to subclass {{FileOutputCommitter}}, which is too complex to understand > except under a debugger with co-recursive routines, lots of methods which > need to be overridden to guarantee a safe subclass, and, because of its > critical role and known subclassing, something which isn't ever going to be > cleaned up. > A new, lean, parent class which {{FileOutputFormat}} can handle allows people > to write new committers which don't have to worry about implementation > details of {{FileOutputCommitter}}, but instead how well they implement the > semantics of committing work. > The full MAPREDUCE-6823 goes beyond this with a change to > {{FileOutputFormat}} for a factory for creating FS-specific > {{PathOutputCommitter}} instances. This patch doesn't include that, as that > is something which needs to be reviewed in the context of HADOOP-13786 and > ideally 1+ committer for another store, so people can say "this factory model > works". 
> All I'm proposing here is: tune the committer class hierarchy in MRv2 so that > people can more easily implement committers, and when that factory is done, > for it to be switched to easily. And I'd like this in branch-3 from the > outset, so existing code which calls {{FileOutputFormat.getCommitter()}} to > get a {{FileOutputCommitter}} *just to call getWorkPath()* can move to the > new interface across all of Hadoop 3.
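The class hierarchy described in this issue can be illustrated with a small Java sketch. Every type below is a simplified stand-in with a `Sketch` prefix, invented for this example; none of them is the real Hadoop class, the real classes use Hadoop's `Path` type and different method signatures, and the two path layouts are made up. The sketch only shows the shape of the change: the output format depends on the lean abstract parent and its `getWorkPath()` method, so a new committer does not have to extend the file-based one.

```java
// Demo entry point, placed first so single-file launch finds main().
class CommitterHierarchyDemo {
  public static void main(String[] args) {
    SketchPathOutputCommitter fileBased = new SketchFileOutputCommitter("/out");
    SketchPathOutputCommitter storeBased = new SketchObjectStoreCommitter("/staging/job1");
    // The format code is identical for both committers.
    System.out.println(SketchFileOutputFormat.workFile(fileBased, "part-0"));
    System.out.println(SketchFileOutputFormat.workFile(storeBased, "part-0"));
  }
}

// Stand-in for the existing root committer type.
abstract class SketchOutputCommitter {
  abstract void commitTask(String taskAttemptId);
}

// Stand-in for the new lean parent: it adds only the work path accessor.
abstract class SketchPathOutputCommitter extends SketchOutputCommitter {
  abstract String getWorkPath();
}

// Stand-in for the existing file-based committer, now one implementation.
class SketchFileOutputCommitter extends SketchPathOutputCommitter {
  private final String outputDir;
  SketchFileOutputCommitter(String outputDir) { this.outputDir = outputDir; }
  @Override String getWorkPath() { return outputDir + "/_temporary"; }
  @Override void commitTask(String taskAttemptId) { /* commit logic elided */ }
}

// Hypothetical second committer that shares none of the file-based logic.
class SketchObjectStoreCommitter extends SketchPathOutputCommitter {
  private final String stagingDir;
  SketchObjectStoreCommitter(String stagingDir) { this.stagingDir = stagingDir; }
  @Override String getWorkPath() { return stagingDir; }
  @Override void commitTask(String taskAttemptId) { /* commit logic elided */ }
}

// Stand-in for the output format: it requires only the abstract parent.
class SketchFileOutputFormat {
  static String workFile(SketchPathOutputCommitter committer, String name) {
    return committer.getWorkPath() + "/" + name;
  }
}
```

Because `workFile` accepts the abstract parent, swapping in a different committer requires no change to the format code, which is the relaxation the description proposes.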
[jira] [Updated] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter
[ https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated MAPREDUCE-6956: -- Resolution: Fixed Fix Version/s: 3.0.0-beta1 Status: Resolved (was: Patch Available) committed to 3.0-beta; now I can work on the next detail: what's a good API for letting people define different committers for different filesystems
[jira] [Commented] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter
[ https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168039#comment-16168039 ] Hadoop QA commented on MAPREDUCE-6956: --

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 12m 41s | trunk passed |
| +1 | compile | 0m 22s | trunk passed |
| +1 | checkstyle | 0m 16s | trunk passed |
| +1 | mvnsite | 0m 26s | trunk passed |
| +1 | findbugs | 0m 43s | trunk passed |
| +1 | javadoc | 0m 18s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 21s | the patch passed |
| +1 | compile | 0m 21s | the patch passed |
| -1 | javac | 0m 21s | hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core generated 3 new + 162 unchanged - 0 fixed = 165 total (was 162) |
| +1 | checkstyle | 0m 15s | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 0 new + 93 unchanged - 12 fixed = 93 total (was 105) |
| +1 | mvnsite | 0m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 51s | the patch passed |
| +1 | javadoc | 0m 16s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 33s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | asflicense | 0m 11s | The patch does not generate ASF License warnings. |
| | | 20m 47s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | MAPREDUCE-6956 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887361/MAPREDUCE-6956-002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux f1b77e584af6 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 78bdf10 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| javac | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7139/artifact/patchprocess/diff-compile-javac-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7139/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7139/console |
| Powered by | Apache Yetus 0.5.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Comment Edited] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer
[ https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168013#comment-16168013 ] Rushabh S Shah edited comment on MAPREDUCE-6958 at 9/15/17 3:19 PM:

Overall the patch looks good. Just one very minor nit.

{noformat}
for (String mapId : mapIds) {
  sb.append(" ");
  sb.append(mapId);
}
{noformat}

Instead of looping over each {{mapId}}, we can just print {{mapIds}} directly. That would be less code.

was (Author: shahrs87):
Overall the patch looks good. Just one very minor nit.

{quote}
+for (String mapId : mapIds) {
+  sb.append(" ");
+  sb.append(mapId);
+}
{quote}

Instead of going through {{mapId}}, we can just print {{mapIds}}. It will be just less code.
[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer
[ https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168013#comment-16168013 ]

Rushabh S Shah commented on MAPREDUCE-6958:
---
Overall the patch looks good. Just one very minor nit.
{quote}
+for (String mapId : mapIds) {
+  sb.append(" ");
+  sb.append(mapId);
+}
{quote}
Instead of going through {{mapId}}, we can just print {{mapIds}}. It will be just less code.

> Shuffle audit logger should log size of shuffle transfer
>
>
> Key: MAPREDUCE-6958
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6958
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Priority: Minor
> Attachments: MAPREDUCE-6958.001.patch
[jira] [Updated] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter
[ https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated MAPREDUCE-6956:
--
Attachment: MAPREDUCE-6956-002.patch

thanks, this is the patch I'm committing: checkstyle is now happy with the patch as far as I can tell

> FileOutputCommitter to gain abstract superclass PathOutputCommitter
> ---
>
> Key: MAPREDUCE-6956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6956
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 3.0.0-beta1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: MAPREDUCE-6956-001.patch, MAPREDUCE-6956-002.patch
>
>
> This is the initial step of MAPREDUCE-6823, which proposes a factory behind
> {{FileOutputFormat}} to create different committers for different
> filesystems, if so configured.
> This patch simply adds the new abstract superclass of
> {{FileOutputCommitter}}, {{PathOutputCommitter extends OutputCommitter}}.
> This abstract class adds the {{getWorkPath()}} method as an abstract method,
> with {{FileOutputCommitter}} being the implementation.
> {{FileOutputFormat}} then relaxes its requirement of any committer returned
> by {{getOutputCommitter()}}, so that instead of requiring a
> {{FileOutputCommitter}} or subclass, it only needs a {{PathOutputCommitter}},
> using {{PathOutputCommitter.getWorkPath()}} to get the work path.
> What does that do?
> It allows people to implement subclasses of {{FileOutputFormat}} which can
> provide their own committers *which don't need to inherit the complexity that
> FileOutputCommitter has acquired over time*
> Currently anyone implementing a new committer (example: Netflix S3 committer)
> needs to subclass {{FileOutputCommitter}}, which is too complex to understand
> except under a debugger with co-recursive routines, lots of methods which
> need to be overridden to guarantee a safe subclass, and, because of its
> critical role and known subclassing, something which isn't ever going to be
> cleaned up.
> A new, lean, parent class which {{FileOutputFormat}} can handle allows people
> to write new committers which don't have to worry about implementation
> details of {{FileOutputCommitter}}, but instead how well they implement the
> semantics of committing work.
> The full MAPREDUCE-6823 goes beyond this with a change to
> {{FileOutputFormat}} for a factory for creating FS-specific
> {{PathOutputCommitter}} instances. This patch doesn't include that, as that
> is something which needs to be reviewed in the context of HADOOP-13786 and
> ideally 1+ committer for another store, so people can say "this factory model
> works".
> All I'm proposing here is: tune the committer class hierarchy in MRv2 so that
> people can more easily implement committers, and when that factory is done,
> for it to be switched to easily. And I'd like this in branch-3 from the
> outset, so existing code which calls {{FileOutputFormat.getCommitter()}} to
> get a {{FileOutputCommitter}} *just to call getWorkPath()* can move to the
> new interface across all of Hadoop 3.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
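Editorial sketch of the class hierarchy proposed in MAPREDUCE-6956. It does not use the Hadoop classes: every {{Sketch*}} type is a hypothetical stand-in, {{java.nio.file.Path}} stands in for Hadoop's {{Path}}, and the {{_temporary}} work directory is a simplification. The point is only that the caller depends on the lean abstract parent and its {{getWorkPath()}}, so a new committer need not extend the file committer.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Stand-in for OutputCommitter, reduced to a single method for illustration.
abstract class SketchOutputCommitter {
  abstract void commitTask();
}

// The new lean parent: it only adds the abstract getWorkPath().
abstract class SketchPathOutputCommitter extends SketchOutputCommitter {
  abstract Path getWorkPath();
}

// The existing file committer becomes one implementation of the parent.
class SketchFileOutputCommitter extends SketchPathOutputCommitter {
  private final Path workPath;

  SketchFileOutputCommitter(Path outputDir) {
    // Simplified work directory; the real layout is more involved.
    this.workPath = outputDir.resolve("_temporary");
  }

  @Override Path getWorkPath() { return workPath; }
  @Override void commitTask() { /* rename-based commit elided */ }
}

// A hypothetical store-specific committer that does not extend the file committer.
class SketchStoreCommitter extends SketchPathOutputCommitter {
  private final Path stagingDir;

  SketchStoreCommitter(Path stagingDir) { this.stagingDir = stagingDir; }

  @Override Path getWorkPath() { return stagingDir; }
  @Override void commitTask() { /* store-specific commit elided */ }
}

// Plays the role of the output format: it needs only the abstract parent.
final class SketchOutputFormat {
  private SketchOutputFormat() {}

  static Path workPathOf(SketchPathOutputCommitter committer) {
    return committer.getWorkPath();
  }
}

class PathCommitterDemo {
  public static void main(String[] args) {
    System.out.println(SketchOutputFormat.workPathOf(
        new SketchFileOutputCommitter(Paths.get("/out"))));
    System.out.println(SketchOutputFormat.workPathOf(
        new SketchStoreCommitter(Paths.get("/tmp/staging"))));
  }
}
```

Because {{workPathOf}} accepts the abstract parent, swapping one committer for another requires no change on the caller's side, which is the relaxation the issue description asks for.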
[jira] [Updated] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter
[ https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated MAPREDUCE-6956:
--
Status: Patch Available (was: Open)

> FileOutputCommitter to gain abstract superclass PathOutputCommitter
> ---
>
> Key: MAPREDUCE-6956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6956
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 3.0.0-beta1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: MAPREDUCE-6956-001.patch, MAPREDUCE-6956-002.patch
[jira] [Updated] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter
[ https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated MAPREDUCE-6956:
--
Status: Open (was: Patch Available)

> FileOutputCommitter to gain abstract superclass PathOutputCommitter
> ---
>
> Key: MAPREDUCE-6956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6956
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 3.0.0-beta1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: MAPREDUCE-6956-001.patch
[jira] [Commented] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area
[ https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167776#comment-16167776 ]

Hadoop QA commented on MAPREDUCE-6954:
--
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 44s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 34s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | MAPREDUCE-6954 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887325/MAPREDUCE-6954-004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml findbugs checkstyle |
| uname | Linux 8a7126aa2100 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 50764ef |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7138/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7138/console |
| Powered by | Apache Yetus 0.5.0 http://yetus.apache.org |

This message was automatically generated.

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 3.0.0-alpha4
>
[jira] [Commented] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area
[ https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167751#comment-16167751 ]

Peter Bacsko commented on MAPREDUCE-6954:
-
I had to upload a new patch, because I forgot to modify mapred-default.xml

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 3.0.0-alpha4
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6954-001.patch, MAPREDUCE-6954-002.patch,
> MAPREDUCE-6954-003.patch, MAPREDUCE-6954-004.patch
>
>
> Depending on the encoder/decoder used and the type of MR workload, EC might
> negatively affect the performance of an MR job if too many files are
> localized.
> In such a scenario, users might want to disable EC in the staging area to
> speed up the execution.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area
[ https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated MAPREDUCE-6954:
Attachment: MAPREDUCE-6954-004.patch

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 3.0.0-alpha4
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6954-001.patch, MAPREDUCE-6954-002.patch,
> MAPREDUCE-6954-003.patch, MAPREDUCE-6954-004.patch
[jira] [Commented] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area
[ https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167742#comment-16167742 ]

Hadoop QA commented on MAPREDUCE-6954:
--
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 7s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 49s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | MAPREDUCE-6954 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887316/MAPREDUCE-6954-003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml findbugs checkstyle |
| uname | Linux 6d897802dcb4 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 50764ef |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7137/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7137/console |
| Powered by | Apache Yetus 0.5.0 http://yetus.apache.org |

This message was automatically generated.

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 3.0.0-alpha4
>
[jira] [Commented] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area
[ https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167708#comment-16167708 ]

Peter Bacsko commented on MAPREDUCE-6954:
-
I changed the boolean parameter as you requested. I agree it was confusing. The default value of the property is now "false" instead of "true", that is it will try to disable EC.

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 3.0.0-alpha4
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6954-001.patch, MAPREDUCE-6954-002.patch,
> MAPREDUCE-6954-003.patch
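Editorial sketch of the flag semantics described in the comment above: a boolean "EC enabled" property whose default is {{false}}, so that leaving it unset leads the client to try to disable erasure coding for the staging area. The thread does not quote the property name, so the key below is a hypothetical placeholder, and {{java.util.Properties}} stands in for the Hadoop configuration object.

```java
import java.util.Properties;

// Hypothetical helper, not part of the Hadoop patch.
final class StagingEcSketch {
  // Placeholder key; the real property name is not given in this thread.
  static final String EC_ENABLED_KEY = "example.staging.erasurecoding.enabled";
  // Default described in the comment: "false", meaning EC is not kept enabled.
  static final boolean EC_ENABLED_DEFAULT = false;

  private StagingEcSketch() {}

  /** Returns true when the client should try to disable EC on the staging dir. */
  static boolean shouldDisableEc(Properties conf) {
    boolean ecEnabled = Boolean.parseBoolean(
        conf.getProperty(EC_ENABLED_KEY, String.valueOf(EC_ENABLED_DEFAULT)));
    return !ecEnabled;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(shouldDisableEc(conf));  // property unset: default applies
    conf.setProperty(EC_ENABLED_KEY, "true");
    System.out.println(shouldDisableEc(conf));  // explicitly enabled by the user
  }
}
```

Naming the flag after what is enabled, with a {{false}} default, avoids the double negative of a "disable" flag that defaults to {{true}}, which appears to be the confusion the comment refers to.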
[jira] [Updated] (MAPREDUCE-6954) Disable erasure coding for files that are uploaded to the MR staging area
[ https://issues.apache.org/jira/browse/MAPREDUCE-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated MAPREDUCE-6954:
Attachment: MAPREDUCE-6954-003.patch

> Disable erasure coding for files that are uploaded to the MR staging area
> -
>
> Key: MAPREDUCE-6954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6954
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 3.0.0-alpha4
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6954-001.patch, MAPREDUCE-6954-002.patch,
> MAPREDUCE-6954-003.patch
[jira] [Commented] (MAPREDUCE-6956) FileOutputCommitter to gain abstract superclass PathOutputCommitter
[ https://issues.apache.org/jira/browse/MAPREDUCE-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167582#comment-16167582 ]

Steve Loughran commented on MAPREDUCE-6956:
---
great! thx for the review. I'll do the cleanup today. All the javac complaints are deprecation of subclassed methods, so I can't fix them. checkstyle: yes

> FileOutputCommitter to gain abstract superclass PathOutputCommitter
> ---
>
> Key: MAPREDUCE-6956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6956
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 3.0.0-beta1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: MAPREDUCE-6956-001.patch