[jira] [Updated] (MAPREDUCE-5583) Ability to limit running map and reduce tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-5583: Release Note: This introduces two new MR2 job configs, mentioned below, which allow users to control the maximum simultaneously-running tasks of the submitted job, across the cluster: * mapreduce.job.running.map.limit (default: 0, for no limit) * mapreduce.job.running.reduce.limit (default: 0, for no limit) This is controllable at a per-job level. was: This introduces two new MR2 job configs, mentioned below, which allow users to control the maximum simultaneously-running tasks of the submitted job, across the cluster: * mapreduce.job.running.map.limit (default: 0, for no limit) * mapreduce.job.running.reduce.limit (default: 0, for no limit) This is controllable at a per-job level. > Ability to limit running map and reduce tasks > - > > Key: MAPREDUCE-5583 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5583 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am, mrv2 >Affects Versions: 0.23.9, 2.1.1-beta >Reporter: Jason Lowe >Assignee: Jason Lowe > Fix For: 2.7.0 > > Attachments: MAPREDUCE-5583-branch2.4.1.patch, > MAPREDUCE-5583v1.patch, MAPREDUCE-5583v2.patch, MAPREDUCE-5583v3.patch, > MAPREDUCE-5583v4.patch > > > It would be nice if users could specify a limit to the number of map or > reduce tasks that are running simultaneously. Occasionally users are > performing operations in tasks that can lead to DDoS scenarios if too many > tasks run simultaneously (e.g., accessing a database, web service, etc.). > Having the ability to throttle the number of tasks simultaneously running > would provide users a way to mitigate issues with too many tasks on a large > cluster attempting to access a service at any one time. > This is similar to the functionality requested by MAPREDUCE-224 and > implemented by HADOOP-3412 but was dropped in mrv2. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
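The limit semantics in the release note above (0 means no limit) can be sketched in plain Java. This is only an illustration of the arithmetic, not the MR AppMaster code from the patch: the class name `RunningTaskLimit`, the method `allowedLaunches`, and the choice to treat negative values like 0 are all assumptions made for the example.

```java
public final class RunningTaskLimit {
    private RunningTaskLimit() {}

    /**
     * Returns how many more tasks may be started right now.
     * A limit of 0 (assumed here: or any negative value) means "no limit",
     * so every pending task may start.
     */
    public static int allowedLaunches(int limit, int running, int pending) {
        if (limit <= 0) {
            return pending;
        }
        // Headroom is what is left under the limit; never negative.
        int headroom = Math.max(0, limit - running);
        return Math.min(headroom, pending);
    }

    public static void main(String[] args) {
        System.out.println(allowedLaunches(0, 40, 100));   // no limit configured
        System.out.println(allowedLaunches(50, 40, 100));  // 10 slots left under the limit
        System.out.println(allowedLaunches(50, 60, 100));  // already over the limit
    }
}
```

In a real job the two properties named in the release note would be set in the job configuration like any other per-job property.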
[jira] [Commented] (MAPREDUCE-6552) Add job search button in JobHistoryServer WebUI
[ https://issues.apache.org/jira/browse/MAPREDUCE-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144099#comment-15144099 ] Hadoop QA commented on MAPREDUCE-6552: --
| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 6m 41s | trunk passed |
| +1 | compile | 0m 14s | trunk passed with JDK v1.8.0_72 |
| +1 | compile | 0m 18s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 13s | trunk passed |
| +1 | mvnsite | 0m 23s | trunk passed |
| +1 | mvneclipse | 0m 13s | trunk passed |
| +1 | findbugs | 0m 33s | trunk passed |
| +1 | javadoc | 0m 13s | trunk passed with JDK v1.8.0_72 |
| +1 | javadoc | 0m 16s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 19s | the patch passed |
| +1 | compile | 0m 12s | the patch passed with JDK v1.8.0_72 |
| +1 | javac | 0m 12s | the patch passed |
| +1 | compile | 0m 15s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 15s | the patch passed |
| +1 | checkstyle | 0m 11s | the patch passed |
| +1 | mvnsite | 0m 22s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 43s | the patch passed |
| -1 | javadoc | 1m 6s | hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs-jdk1.8.0_72 with JDK v1.8.0_72 generated 5 new + 95 unchanged - 5 fixed = 100 total (was 100) |
| +1 | javadoc | 0m 10s | the patch passed with JDK v1.8.0_72 |
| +1 | javadoc | 0m 13s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 5m 33s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_72. |
| +1 | unit | 6m 3s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
| -1 | asflicense | 0m 19s | Patch generated 14 ASF License warnings. |
| | | 24m 41s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12773399/MAPREDUCE-6552.002.patch |
| JIRA Issue | MAPREDUCE-6552 |
| Optional Tests | a
[jira] [Commented] (MAPREDUCE-6552) Add job search button in JobHistoryServer WebUI
[ https://issues.apache.org/jira/browse/MAPREDUCE-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144085#comment-15144085 ] Akira AJISAKA commented on MAPREDUCE-6552: -- Thanks [~linyiqun] for creating the patch. The JHS WebUI now has a filter for searching retired jobs. Isn't that enough for you? > Add job search button in JobHistoryServer WebUI > --- > > Key: MAPREDUCE-6552 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6552 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: webapps >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: MAPREDUCE-6552.001.patch, MAPREDUCE-6552.002.patch, > Screen Shot 2015-11-19.png > > > In the jobhistory webui, it's not convenient to search directly for a specific > retired job when only a few job records are shown per page. And if you > show more jobs per page (so that you can find your target job), the main page > opens slowly, because the rendered page is large and the browser > takes a long time to download it. So we can add a job search > button: by entering a specific job ID and clicking the button, we can > jump to its job page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6579) JobStatus#getFailureInfo should not output diagnostic information when the job is running
[ https://issues.apache.org/jira/browse/MAPREDUCE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144044#comment-15144044 ] Sunil G commented on MAPREDUCE-6579: Yes [~rohithsharma]. If YARN clears the diagnostics once the app moves to a final state, we can avoid such scenarios. By then we would already have collected enough reasons for apps stuck in the ACCEPTED or SCHEDULED state, and in final states I see no need to keep such information since we are already past that checkpoint. > JobStatus#getFailureInfo should not output diagnostic information when the > job is running > - > > Key: MAPREDUCE-6579 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6579 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Akira AJISAKA >Priority: Blocker > Attachments: MAPREDUCE-6579.01.patch, MAPREDUCE-6579.02.patch, > MAPREDUCE-6579.03.patch, MAPREDUCE-6579.04.patch, MAPREDUCE-6579.05.patch > > > From > [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt] > TestNetworkedJob fails intermittently. > {code} > Running org.apache.hadoop.mapred.TestNetworkedJob > Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 81.131 sec > <<< FAILURE! - in org.apache.hadoop.mapred.TestNetworkedJob > testNetworkedJob(org.apache.hadoop.mapred.TestNetworkedJob) Time elapsed: > 30.55 sec <<< FAILURE! > org.junit.ComparisonFailure: expected:<[[Tue Dec 15 14:02:45 + 2015] > Application is Activated, waiting for resources to be assigned for AM. 
> Details : AM Partition = ; Partition Resource = > ; Queue's Absolute capacity = 100.0 % ; Queue's > Absolute used capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; ]> > but was:<[]> > at org.junit.Assert.assertEquals(Assert.java:115) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapred.TestNetworkedJob.testNetworkedJob(TestNetworkedJob.java:174) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6579) JobStatus#getFailureInfo should not output diagnostic information when the job is running
[ https://issues.apache.org/jira/browse/MAPREDUCE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144024#comment-15144024 ] Rohith Sharma K S commented on MAPREDUCE-6579: -- I think the fix should be done in YARN so that it does not break any downstream projects. One approach in YARN I can think of is to reset the RM-added diagnostic messages to an empty string once the app is in the final_saving state. Any thoughts? > JobStatus#getFailureInfo should not output diagnostic information when the > job is running > - > > Key: MAPREDUCE-6579 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6579 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Akira AJISAKA >Priority: Blocker > Attachments: MAPREDUCE-6579.01.patch, MAPREDUCE-6579.02.patch, > MAPREDUCE-6579.03.patch, MAPREDUCE-6579.04.patch, MAPREDUCE-6579.05.patch > > > From > [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt] > TestNetworkedJob fails intermittently. > {code} > Running org.apache.hadoop.mapred.TestNetworkedJob > Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 81.131 sec > <<< FAILURE! - in org.apache.hadoop.mapred.TestNetworkedJob > testNetworkedJob(org.apache.hadoop.mapred.TestNetworkedJob) Time elapsed: > 30.55 sec <<< FAILURE! > org.junit.ComparisonFailure: expected:<[[Tue Dec 15 14:02:45 + 2015] > Application is Activated, waiting for resources to be assigned for AM. 
> Details : AM Partition = ; Partition Resource = > ; Queue's Absolute capacity = 100.0 % ; Queue's > Absolute used capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; ]> > but was:<[]> > at org.junit.Assert.assertEquals(Assert.java:115) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapred.TestNetworkedJob.testNetworkedJob(TestNetworkedJob.java:174) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6579) JobStatus#getFailureInfo should not output diagnostic information when the job is running
[ https://issues.apache.org/jira/browse/MAPREDUCE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143942#comment-15143942 ] Rohith Sharma K S commented on MAPREDUCE-6579: -- Thanks Jason for sharing your opinion. bq. It seems these new messages only make sense to report when the job is active and are mostly noise afterwards. Very true. > JobStatus#getFailureInfo should not output diagnostic information when the > job is running > - > > Key: MAPREDUCE-6579 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6579 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Akira AJISAKA >Priority: Blocker > Attachments: MAPREDUCE-6579.01.patch, MAPREDUCE-6579.02.patch, > MAPREDUCE-6579.03.patch, MAPREDUCE-6579.04.patch, MAPREDUCE-6579.05.patch > > > From > [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt] > TestNetworkedJob fails intermittently. > {code} > Running org.apache.hadoop.mapred.TestNetworkedJob > Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 81.131 sec > <<< FAILURE! - in org.apache.hadoop.mapred.TestNetworkedJob > testNetworkedJob(org.apache.hadoop.mapred.TestNetworkedJob) Time elapsed: > 30.55 sec <<< FAILURE! > org.junit.ComparisonFailure: expected:<[[Tue Dec 15 14:02:45 + 2015] > Application is Activated, waiting for resources to be assigned for AM. > Details : AM Partition = ; Partition Resource = > ; Queue's Absolute capacity = 100.0 % ; Queue's > Absolute used capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; ]> > but was:<[]> > at org.junit.Assert.assertEquals(Assert.java:115) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapred.TestNetworkedJob.testNetworkedJob(TestNetworkedJob.java:174) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6627) Add machine-readable output to mapred job -history command
[ https://issues.apache.org/jira/browse/MAPREDUCE-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143832#comment-15143832 ] Robert Kanter commented on MAPREDUCE-6627: -- I think I figured out the "if x then return idiom" stuff you were referring to. I'll change those. > Add machine-readable output to mapred job -history command > -- > > Key: MAPREDUCE-6627 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6627 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: client >Affects Versions: 2.9.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: MAPREDUCE-6627.001.patch, MAPREDUCE-6627.002.patch, > json.txt, json_all.txt > > > It would be great if we could add a machine-readable output format, say JSON, > to the {{mapred job -history \[all\] }} command so that it's > easier for programs to consume that information and do further processing on > it. At the same time, we should keep the existing API and formatting intact > for backwards compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6608) Work Preserving AM Restart for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143761#comment-15143761 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-6608: bq. I agree that storing state in zookeeper may have scalability issues. I am just thinking that will it be ended up having too many small files in hdfs if we are planning to store AM information in HDFS. A solution for this is already given at YARN-1489 by [~bikassaha]. See this comment: https://issues.apache.org/jira/browse/YARN-1489?focusedCommentId=13862359&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13862359. The solution is essentially a combination of registry with YARN acting as a distributed readers solution: Registry owns the write path and storage, RM/NMs take care of providing scalable reads. > Work Preserving AM Restart for MapReduce > > > Key: MAPREDUCE-6608 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6608 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Srikanth Sampath >Assignee: Srikanth Sampath > Attachments: Patch1.patch, WorkPreservingMRAppMaster-1.pdf, > WorkPreservingMRAppMaster-2.pdf, WorkPreservingMRAppMaster.pdf > > > Providing a framework for work preserving AM is achieved in > [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489]. We would like > to take advantage of this for MapReduce(MR) applications. There are some > challenges which have been described in the attached document and few options > discussed. We solicit feedback from the community. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-1664) Job Acls affect Queue Acls
[ https://issues.apache.org/jira/browse/MAPREDUCE-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-1664: Release Note:
* Removed aclsEnabled flag from queues configuration files.
* Removed the configuration property mapreduce.cluster.job-authorization-enabled.
* Added mapreduce.cluster.acls.enabled as the single configuration property in mapred-default.xml that enables the authorization checks for all job level and queue level operations.
* To enable authorization of users to do job level and queue level operations, mapreduce.cluster.acls.enabled is to be set to true in JobTracker's configuration and in all TaskTrackers' configurations.
* To get access to a job, it is enough for a user to be part of one of the access lists, i.e. either job-acl or queue-admins-acl (unlike before, when one had to be part of both lists).
* Queue administrators (configured via acl-administer-jobs) of a queue can do all view-job and modify-job operations on all jobs submitted to that queue.
* ClusterOwner (who started the mapreduce cluster) and cluster administrators (configured via mapreduce.cluster.permissions.supergroup) can do all job level operations and queue level operations on all jobs on all queues in that cluster irrespective of job-acls and queue-acls configured.
* JobOwner (who submitted job to a queue) can do all view-job and modify-job operations on his/her job irrespective of job-acls and queue-acls.
* Since aclsEnabled flag is removed from queues configuration files, "refresh of queues configuration" will not change mapreduce.cluster.acls.enabled on the fly. mapreduce.cluster.acls.enabled can be modified only when restarting the mapreduce cluster.

was:
* Removed aclsEnabled flag from queues configuration files.
* Removed the configuration property mapreduce.cluster.job-authorization-enabled.
* Added mapreduce.cluster.acls.enabled as the single configuration property in mapred-default.xml that enables the authorization checks for all job level and queue level operations.
* To enable authorization of users to do job level and queue level operations, mapreduce.cluster.acls.enabled is to be set to true in JobTracker's configuration and in all TaskTrackers' configurations.
* To get access to a job, it is enough for a user to be part of one of the access lists, i.e. either job-acl or queue-admins-acl (unlike before, when one had to be part of both lists).
* Queue administrators (configured via acl-administer-jobs) of a queue can do all view-job and modify-job operations on all jobs submitted to that queue.
* ClusterOwner (who started the mapreduce cluster) and cluster administrators (configured via mapreduce.cluster.permissions.supergroup) can do all job level operations and queue level operations on all jobs on all queues in that cluster irrespective of job-acls and queue-acls configured.
* JobOwner (who submitted job to a queue) can do all view-job and modify-job operations on his/her job irrespective of job-acls and queue-acls.
* Since aclsEnabled flag is removed from queues configuration files, "refresh of queues configuration" will not change mapreduce.cluster.acls.enabled on the fly. mapreduce.cluster.acls.enabled can be modified only when restarting the mapreduce cluster.

> Job Acls affect Queue Acls > -- > > Key: MAPREDUCE-1664 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1664 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: security >Affects Versions: 0.22.0 >Reporter: Ravi Gummadi >Assignee: Ravi Gummadi > Fix For: 0.22.0 > > Attachments: 1664.20S.3.4.patch, 1664.patch, > 1664.qAdminsJobView.20S.v1.6.patch, 1664.v1.1.patch, 1664.v1.2.patch, > 1664.v1.patch, M1664y20s-testfix.patch, mr-1664-20-bugfix.patch > > > MAPREDUCE-1307 introduced job ACLs for securing job level operations. So in > current trunk, queue ACLs and job ACLs are checked (with AND for both ACLs) > for allowing job level operations. So for doing operations like killJob, > killTask and setJobPriority, the user should be part of both > mapred.queue.{queuename}.acl-administer-jobs and > mapreduce.job.acl-modify-job. This needs to change so that users who are part > of mapred.queue.{queuename}.acl-administer-jobs will be able to do > killJob, killTask, setJobPriority and users who are part of > mapreduce.job.acl-modify-job will be able to do > killJob, killTask, setJobPriority. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
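The change from AND to OR described in this issue can be illustrated with a small, self-contained sketch. It is not the JobTracker code from the patch: the class `JobAccessCheck` and its method names are hypothetical, and real Hadoop ACLs also support groups and wildcards, which are left out here to keep only the boolean logic.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public final class JobAccessCheck {
    private JobAccessCheck() {}

    /** Old behaviour described in the issue: the user had to be in BOTH lists. */
    public static boolean allowedBefore(String user, Set<String> queueAdmins, Set<String> jobAcl) {
        return queueAdmins.contains(user) && jobAcl.contains(user);
    }

    /**
     * Behaviour described in the release note: the job owner and cluster
     * administrators are always allowed; otherwise membership in EITHER the
     * queue admin list or the job ACL is enough.
     */
    public static boolean allowedAfter(String user, String jobOwner, Set<String> clusterAdmins,
                                       Set<String> queueAdmins, Set<String> jobAcl) {
        return user.equals(jobOwner)
            || clusterAdmins.contains(user)
            || queueAdmins.contains(user)
            || jobAcl.contains(user);
    }

    public static void main(String[] args) {
        Set<String> queueAdmins = new HashSet<>(Arrays.asList("alice"));
        Set<String> jobAcl = new HashSet<>(Arrays.asList("bob"));
        Set<String> clusterAdmins = Collections.emptySet();

        // alice administers the queue but is not in the job ACL.
        System.out.println(allowedBefore("alice", queueAdmins, jobAcl));
        System.out.println(allowedAfter("alice", "carol", clusterAdmins, queueAdmins, jobAcl));
    }
}
```

With these inputs the old check rejects alice and the new one accepts her, which is the case the issue description asks for.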
[jira] [Updated] (MAPREDUCE-323) Improve the way job history files are managed
[ https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-323: --- Release Note: This patch does four things:

* it changes the directory structure of the done directory that holds history logs for jobs that are completed,
* it builds toy databases for completed jobs, so we no longer have to scan 2N files on DFS to find out facts about the N jobs that have completed since the job tracker started [which can be hundreds of thousands of files in practical cases],
* it changes the job history browser to display more information and allow more filtering criteria, and
* it creates a new programmatic interface for finding files matching user-chosen criteria. This allows users to no longer be concerned with our methods of storing them, in turn allowing us to change those at will.

The new API described above, which can be used to programmatically obtain history file PATHs given search criteria, is described below:

package org.apache.hadoop.mapreduce.jobhistory;
...

// this interface is within O.A.H.mapreduce.jobhistory.JobHistory:

// holds information about one job history log in the done
// job history logs
public static class JobHistoryJobRecord {
  public Path getPath() { ... }
  public String getJobIDString() { ... }
  public long getSubmitTime() { ... }
  public String getUserName() { ... }
  public String getJobName() { ... }
}

public class JobHistoryRecordRetriever implements Iterator {
  // usual Interface methods -- remove() throws UnsupportedOperationException
  // returns the number of calls to next() that will succeed
  public int numMatches() { ... }
}

// returns a JobHistoryRecordRetriever that delivers all Path's of job matching job history files,
// in no particular order. Any criterion that is null or the empty string does not constrain.
// All criteria that are specified are applied conjunctively, except that if there's more than
// one date you retrieve all Path's matching ANY date.
// soughtUser and soughtJobid must match exactly.
// soughtJobName can match the entire job name or any substring.
// dates must be in the format exactly MM/DD/ .
// Dates' leading digits must be 2's . We're incubating a Y3K problem.
public JobHistoryRecordRetriever getMatchingJob
    (String soughtUser, String soughtJobName, String[] dateStrings, String soughtJobid)
  throws IOException

was: This patch does four things:

* it changes the directory structure of the done directory that holds history logs for jobs that are completed,
* it builds toy databases for completed jobs, so we no longer have to scan 2N files on DFS to find out facts about the N jobs that have completed since the job tracker started [which can be hundreds of thousands of files in practical cases],
* it changes the job history browser to display more information and allow more filtering criteria, and
* it creates a new programmatic interface for finding files matching user-chosen criteria. This allows users to no longer be concerned with our methods of storing them, in turn allowing us to change those at will.

The new API described above, which can be used to programmatically obtain history file PATHs given search criteria, is described below:

package org.apache.hadoop.mapreduce.jobhistory;
...

// this interface is within O.A.H.mapreduce.jobhistory.JobHistory:

// holds information about one job history log in the done
// job history logs
public static class JobHistoryJobRecord {
  public Path getPath() { ... }
  public String getJobIDString() { ... }
  public long getSubmitTime() { ... }
  public String getUserName() { ... }
  public String getJobName() { ... }
}

public class JobHistoryRecordRetriever implements Iterator {
  // usual Interface methods -- remove() throws UnsupportedOperationException
  // returns the number of calls to next() that will succeed
  public int numMatches() { ... }
}

// returns a JobHistoryRecordRetriever that delivers all Path's of job matching job history files,
// in no particular order. Any criterion that is null or the empty string does not constrain.
// All criteria that are specified are applied conjunctively, except that if there's more than
// one date you retrieve all Path's matching ANY date.
// soughtUser and soughtJobid must match exactly.
// soughtJobName can match the entire job name or any substring.
// dates must be in the format exactly MM/DD/ .
// Dates' leading digits must be 2's . We're incubating a Y3K problem.
public JobHistoryRecordRetriever getMatchingJob
    (String soughtUser, String soughtJobName, String[] dateStrings, String sought
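The matching rules spelled out in the API comments above (null or empty criteria do not constrain, user and job ID match exactly, job name matches by substring, several dates match if ANY one matches) can be sketched as a standalone predicate. This is an illustration only, not the JobHistory implementation: the class `HistoryMatch` and its `matches` method are hypothetical, dates are treated as opaque strings, and treating a null or empty date array as unconstrained is an assumption.

```java
import java.util.Arrays;

public final class HistoryMatch {
    private HistoryMatch() {}

    /** A criterion that is null or the empty string does not constrain. */
    private static boolean unconstrained(String criterion) {
        return criterion == null || criterion.isEmpty();
    }

    /**
     * Applies the criteria conjunctively to one job's fields, except that
     * the job's date only has to equal ANY one of the supplied date strings.
     */
    public static boolean matches(String user, String jobName, String date, String jobId,
                                  String soughtUser, String soughtJobName,
                                  String[] dateStrings, String soughtJobid) {
        boolean userOk = unconstrained(soughtUser) || soughtUser.equals(user);
        boolean nameOk = unconstrained(soughtJobName) || jobName.contains(soughtJobName);
        boolean idOk = unconstrained(soughtJobid) || soughtJobid.equals(jobId);
        // Assumption: a null or empty date array places no constraint on the date.
        boolean dateOk = dateStrings == null || dateStrings.length == 0
            || Arrays.asList(dateStrings).contains(date);
        return userOk && nameOk && idOk && dateOk;
    }
}
```

For example, a call with `soughtUser` null, `soughtJobName` "count", and two date strings accepts a job named "wordcount-daily" whose date equals either of the two strings.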
[jira] [Updated] (MAPREDUCE-4583) Wrong paths for CapacityScheduler/FairScheduler jar in documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-4583: Fix Version/s: (was: 1.0.4) > Wrong paths for CapacityScheduler/FairScheduler jar in documentation > > > Key: MAPREDUCE-4583 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4583 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: documentation >Affects Versions: 1.0.3 >Reporter: Bertrand Dechoux > Labels: documentation > > Both documentations > http://hadoop.apache.org/common/docs/r1.0.3/fair_scheduler.html > http://hadoop.apache.org/common/docs/r1.0.3/capacity_scheduler.html > say that the jar should be copied from the contrib/*scheduler directory. > But that's not the case ; both jars are actually in the lib folder -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-2064) Tutorial should mention SetMapOutputKeyClass
[ https://issues.apache.org/jira/browse/MAPREDUCE-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated MAPREDUCE-2064: Fix Version/s: (was: 1.0.4) > Tutorial should mention SetMapOutputKeyClass > > > Key: MAPREDUCE-2064 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2064 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: documentation >Affects Versions: 0.21.0 >Reporter: Clarence Gardner >Priority: Minor > Labels: newbie > > The official tutorial (mapred_tutorial.html) (and all other tutorials I've > seen on the web) shows a program that has the same datatypes for the key/value > pairs emitted by the mapper and by the reducer, and shows a configuration > call to Job.setOutput{Key,Value}Class but doesn't say that it refers to both > the mapper and the reducer. It sounds like it refers to the reducer output. > This might be mentioned in the "Job Configuration" section. Here is a > possible addition, after the "The Job is used to specify ..." paragraph. > The job also configures the types of its key/value pairs with > setOutputKeyClass(type) and setOutputValueClass(type), which apply to both the > mapper and reducer classes. If the types output by the mapper and reducer are > not the same, that should be followed with setMapOutputKeyClass(type) and > setMapOutputValueClass(type). > (I'm assuming that at least a call to setOutput{Key,Value}Class is required.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
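A short sketch of the configuration calls the proposed tutorial text refers to. It assumes the `org.apache.hadoop.mapreduce` API from Hadoop 2.x and needs the Hadoop client libraries on the classpath; the class name `OutputTypesExample`, the job name, and the chosen Writable types are made up for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class OutputTypesExample {
    public static Job configure() throws IOException {
        Job job = Job.getInstance(new Configuration(), "output-types-example");

        // Types of the job's final (reducer) output; they also apply to the
        // mapper output unless the map output types are set separately.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        // Here the mapper emits (Text, IntWritable), which differs from the
        // reducer output, so the map output types are declared as well.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        return job;
    }
}
```

If the mapper and reducer emit the same types, the two `setMapOutput...` calls can be left out.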
[jira] [Commented] (MAPREDUCE-6627) Add machine-readable output to mapred job -history command
[ https://issues.apache.org/jira/browse/MAPREDUCE-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143609#comment-15143609 ] Robert Kanter commented on MAPREDUCE-6627: -- Thanks for the review. I'll start working on the more straightforward ones, but here are some that I either had questions on or wanted to clarify: - FindBugs actually suggested I catch the {{RuntimeException}} and rethrow it like that. The idea is to handle {{Exception}} without accidentally catching {{RuntimeException}}. See https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6311/artifact/patchprocess/new-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.html#REC_CATCH_EXCEPTION That said, we're just wrapping {{Exception}} and not really handling it anyway, so I'll revisit this. - Where are you referring to exactly on this? {quote}On a philosophical note, I don't like the if x then return idiom. I'd rather have the code wrapped in an if !x.{quote} - You're right that the comparators are a bit confusing. I didn't write them originally, but now that I'm moving them I'll take the opportunity to rewrite them for clarity. - I originally had {{JobHistoryViewerPrinter}} as an abstract class because it had some common methods in it that both Printer implementations used, but I have since refactored that a lot, so there's no more common code and I never changed it. Good catch; I'll make this an interface. 
> Add machine-readable output to mapred job -history command > -- > > Key: MAPREDUCE-6627 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6627 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: client >Affects Versions: 2.9.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: MAPREDUCE-6627.001.patch, MAPREDUCE-6627.002.patch, > json.txt, json_all.txt > > > It would be great if we could add a machine-readable output format, say JSON, > to the {{mapred job -history \[all\] }} command so that it's > easier for programs to consume that information and do further processing on > it. At the same time, we should keep the existing API and formatting intact > for backwards compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6627) Add machine-readable output to mapred job -history command
[ https://issues.apache.org/jira/browse/MAPREDUCE-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143582#comment-15143582 ]

Daniel Templeton commented on MAPREDUCE-6627:
---------------------------------------------

Thanks for the patch, [~rkanter]. A couple of first-pass comments:
* Since you're touching the javadocs, the param and return descriptions shouldn't start with capital letters. It would also be nice to have description fields for the throws tags.
* In the new {{HistoryViewer}} constructor you have:
{code}
    } catch (RuntimeException e) {
      throw e;
    } ...
{code}
* Please add javadoc headers for the {{JobHistoryViewerHumanPrinter}} and {{JobHistoryViewerJsonPrinter}} classes
* On a philosophical note, I don't like the {{if x then return}} idiom. I'd rather have the code wrapped in an {{if !x}}.
* I really don't love the nested ternary operators in the comparators in {{JobHistoryViewerHumanPrinter}}
* In the {{JobHistoryViewerJsonPrinter.print()}} method, you have:
{code}
    } catch (JSONException je) {
      throw new IOException(je);
    } finally {
{code}
I'd rather see the IOException have a message that sets some useful context, e.g. {{throw new IOException("Failure parsing JSON document: " + je, je)}}
* In {{JobHistoryViewerJsonPrinter.print()}}, I'd rather that this:
{code}
  private String fixGroupNameForShuffleErrors(String name) {
    if (name.equals("Shuffle Errors")) {
      return "org.apache.hadoop.mapreduce.task.reduce.Fetcher.ShuffleErrors";
    }
    return name;
  }
{code}
were this:
{code}
  private String fixGroupNameForShuffleErrors(String name) {
    String retName = name;
    if (name.equals("Shuffle Errors")) {
      retName = "org.apache.hadoop.mapreduce.task.reduce.Fetcher.ShuffleErrors";
    }
    return retName;
  }
{code}
But maybe I'm just obsessive.
* I'm not sure the utility you get from {{JobHistoryViewerPrinter}} being an abstract class is worth it. I would think hard about making it an interface.
* In {{CLI}}, I'd rather have space around operators, so {{index + 1}} instead of {{index+1}}.
* The tests seem really light. I'd like to see the tests dig into the JSON object and confirm that the data is as expected. I'd also like to see some testing of failure scenarios, like {{-format biteMe}}.

I haven't applied the patch or played with it yet, though, so there may be more to say later. :)

> Add machine-readable output to mapred job -history command
> ----------------------------------------------------------
>
> Key: MAPREDUCE-6627
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6627
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 2.9.0
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Attachments: MAPREDUCE-6627.001.patch, MAPREDUCE-6627.002.patch,
> json.txt, json_all.txt
>
>
> It would be great if we could add a machine-readable output format, say JSON,
> to the {{mapred job -history \[all\] }} command so that it's
> easier for programs to consume that information and do further processing on
> it. At the same time, we should keep the existing API and formatting intact
> for backwards compatibility.
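The suggestion above about giving the wrapping IOException a contextual message can be sketched as follows. This is only an illustration under stated assumptions: {{DocumentFormatException}} is a hypothetical stand-in for the {{JSONException}} caught in the patch, and {{ContextWrapSketch}}, {{parseLength}} and {{lengthOrIoError}} are made-up names that do not appear in the patch.

```java
import java.io.IOException;

/** Hypothetical stand-in for a parser's checked exception (the patch catches JSONException). */
class DocumentFormatException extends Exception {
  DocumentFormatException(String message) {
    super(message);
  }
}

/** Hypothetical example class; not part of the MAPREDUCE-6627 patch. */
final class ContextWrapSketch {
  private ContextWrapSketch() {
  }

  /** Made-up parse step that rejects empty input. */
  static int parseLength(String doc) throws DocumentFormatException {
    if (doc == null || doc.isEmpty()) {
      throw new DocumentFormatException("empty document");
    }
    return doc.length();
  }

  /** Wraps the parser failure, adding context and keeping the original as the cause. */
  static int lengthOrIoError(String doc) throws IOException {
    try {
      return parseLength(doc);
    } catch (DocumentFormatException e) {
      // The message says what was being attempted; the cause keeps the stack trace.
      throw new IOException("Failure parsing JSON document: " + e, e);
    }
  }
}
```

Passing the original exception as the cause keeps its stack trace available, while the message tells whoever reads the log what operation failed.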
[jira] [Commented] (MAPREDUCE-6579) JobStatus#getFailureInfo should not output diagnostic information when the job is running
[ https://issues.apache.org/jira/browse/MAPREDUCE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143456#comment-15143456 ]

Jason Lowe commented on MAPREDUCE-6579:
---------------------------------------

It's a little unfortunate that YARN-3946 started putting non-fatal messages into what is typically an app-driven diagnostic repository. Now all applications will start getting these (probably mostly annoying) messages for every job completion, assuming that most app frameworks dump the diagnostic strings when the application completes. It seems these new messages only make sense to report when the job is active and are mostly noise afterwards.

Back to the MapReduce side of this, IMHO we need to return diagnostics for any case where we used to return diagnostics before. Since this is specific to MapReduce, we can check the MR AM to see all the places where we could set a diagnostic. Most places I found only set the diagnostic when the job fails, but I did find at least one place where the diagnostic could be set yet the job could succeed. When a task fails, a job diagnostic is added; see JobImpl.TaskCompletedTransition#taskFailed. If the user configured the job to allow some tasks to fail while the job still succeeds, then we could end up with a successful job with some task failure messages in the diagnostics. However, that's a relatively rare config for a typical MapReduce job, and I'm not sure how many downstream software stacks are going to start getting upset when they see getFailureInfo start returning data on a regular basis for successful jobs.

It's rather unfortunate that the method is called getFailureInfo and will now always contain messages unrelated to any failure. The downstream stacks should be checking the overall job status, and not whether the getFailureInfo result is empty, to know whether the job really did fail, so on one hand I'm leaning towards reporting them on success as well. But then part of me thinks it will simply be annoying to have every successful job dump a bunch of messages about waiting to schedule, waiting to register, etc., which leads me to wonder if we really want YARN-3946 to work the way it does.

> JobStatus#getFailureInfo should not output diagnostic information when the
> job is running
> ---------------------------------------------------------------------------
>
> Key: MAPREDUCE-6579
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6579
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: test
> Reporter: Rohith Sharma K S
> Assignee: Akira AJISAKA
> Priority: Blocker
> Attachments: MAPREDUCE-6579.01.patch, MAPREDUCE-6579.02.patch,
> MAPREDUCE-6579.03.patch, MAPREDUCE-6579.04.patch, MAPREDUCE-6579.05.patch
>
>
> From
> [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt]
> TestNetworkedJob fails intermittently.
> {code}
> Running org.apache.hadoop.mapred.TestNetworkedJob
> Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 81.131 sec <<< FAILURE! - in org.apache.hadoop.mapred.TestNetworkedJob
> testNetworkedJob(org.apache.hadoop.mapred.TestNetworkedJob) Time elapsed: 30.55 sec <<< FAILURE!
> org.junit.ComparisonFailure: expected:<[[Tue Dec 15 14:02:45 + 2015] Application is Activated, waiting for resources to be assigned for AM. Details : AM Partition = ; Partition Resource = ; Queue's Absolute capacity = 100.0 % ; Queue's Absolute used capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; ]> but was:<[]>
> at org.junit.Assert.assertEquals(Assert.java:115)
> at org.junit.Assert.assertEquals(Assert.java:144)
> at org.apache.hadoop.mapred.TestNetworkedJob.testNetworkedJob(TestNetworkedJob.java:174)
> {code}
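The point in the comment above, that downstream code should look at the overall job status rather than at whether the failure info is empty, can be illustrated with a small sketch. Everything here is hypothetical: {{JobReport}} is a simplified stand-in and not Hadoop's {{JobStatus}} class, and the method names are invented for the example.

```java
/** Hypothetical, simplified stand-in for a job status; not Hadoop's JobStatus class. */
final class JobReport {
  enum State { RUNNING, SUCCEEDED, FAILED, KILLED }

  private final State state;
  private final String failureInfo;

  JobReport(State state, String failureInfo) {
    this.state = state;
    this.failureInfo = failureInfo;
  }

  State state() {
    return state;
  }

  String failureInfo() {
    return failureInfo;
  }
}

/** Hypothetical example class contrasting the two ways of detecting failure. */
final class FailureCheckSketch {
  private FailureCheckSketch() {
  }

  /** Fragile: treats any non-empty diagnostic text as a failure. */
  static boolean looksFailedByDiagnostics(JobReport report) {
    return report.failureInfo() != null && !report.failureInfo().isEmpty();
  }

  /** Robust: decides from the job state alone. */
  static boolean failedByState(JobReport report) {
    return report.state() == JobReport.State.FAILED;
  }
}
```

With a report whose state is SUCCEEDED but whose diagnostics carry an informational message such as "Application is Activated, waiting for resources to be assigned for AM.", the first check reports a failure and the second does not.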
[jira] [Resolved] (MAPREDUCE-5422) [Umbrella] Fix invalid state transitions in MRAppMaster
[ https://issues.apache.org/jira/browse/MAPREDUCE-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K resolved MAPREDUCE-5422.
----------------------------------
    Resolution: Fixed

I am closing this umbrella jira as all the sub tasks are resolved.

> [Umbrella] Fix invalid state transitions in MRAppMaster
> -------------------------------------------------------
>
> Key: MAPREDUCE-5422
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5422
> Project: Hadoop Map/Reduce
> Issue Type: Task
> Components: mr-am
> Affects Versions: 2.0.5-alpha
> Reporter: Devaraj K
> Assignee: Devaraj K
>
> There are multiple invalid state transitions for the state machines present in
> MRAppMaster. All these can be handled as part of this umbrella JIRA.
[jira] [Updated] (MAPREDUCE-5400) MRAppMaster throws InvalidStateTransitonException: Invalid event: JOB_TASK_COMPLETED at SUCCEEDED for JobImpl
[ https://issues.apache.org/jira/browse/MAPREDUCE-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated MAPREDUCE-5400:
---------------------------------
    Resolution: Cannot Reproduce
        Status: Resolved  (was: Patch Available)

I don't think this is still an issue, since MR has undergone many changes since the issue was created. I am closing it. Please reopen it if you see the issue again.

> MRAppMaster throws InvalidStateTransitonException: Invalid event:
> JOB_TASK_COMPLETED at SUCCEEDED for JobImpl
> ------------------------------------------------------------------
>
> Key: MAPREDUCE-5400
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5400
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: applicationmaster
> Affects Versions: 2.0.5-alpha
> Reporter: J.Andreina
> Assignee: Devaraj K
> Priority: Minor
> Attachments: MAPREDUCE-5400-1.patch, MAPREDUCE-5400.patch
>
>
> Step 1: Install cluster with HDFS, MR
> Step 2: Execute a job
> Step 3: Issue a kill for a task attempt whose task has already completed.
> Rex@HOST-10-18-91-55:~/NodeAgentTmpDir/installations/hadoop-2.0.5.tar/hadoop-2.0.5/bin>
> ./mapred job -kill-task attempt_1373875322959_0032_m_00_0
> No GC_PROFILE is given. Defaults to medium.
> 13/07/15 14:46:32 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
> 13/07/15 14:46:32 INFO proxy.ResourceManagerProxies: HA Proxy Creation with xface : interface org.apache.hadoop.yarn.api.ClientRMProtocol
> 13/07/15 14:46:33 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
> Killed task attempt_1373875322959_0032_m_00_0
> Observation:
> ===
> 1. The task state transitioned from SUCCEEDED to SCHEDULED.
> 2. When the client issues a kill for a succeeded attempt, the client is
> notified that the attempt was killed.
> 3. A second task attempt was launched, which succeeded and was then killed later on
> client request.
> 4. Even after the job state transitioned from SUCCEEDED to ERROR, the UI shows the
> state as succeeded.
> Issue :
> =
> 1. The client has been notified that the attempt is killed, but actually the
> attempt succeeded, and the same is displayed in the JHS UI.
> 2. At the App Master, InvalidStateTransitonException is thrown.
> 3. On the client side and in JHS the job has exited with state Finished/Succeeded, but on the RM
> side the state is Finished/Failed.
> AM Logs:
>
> 2013-07-15 14:46:25,461 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373875322959_0032_m_00_0 TaskAttempt Transitioned from RUNNING to SUCCEEDED
> 2013-07-15 14:46:25,468 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1373875322959_0032_m_00_0
> 2013-07-15 14:46:25,470 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1373875322959_0032_m_00 Task Transitioned from RUNNING to SUCCEEDED
> 2013-07-15 14:46:33,810 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1373875322959_0032_m_00 Task Transitioned from SUCCEEDED to SCHEDULED
> 2013-07-15 14:46:37,344 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1373875322959_0032_m_00_1
> 2013-07-15 14:46:37,344 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1373875322959_0032_m_00 Task Transitioned from RUNNING to SUCCEEDED
> 2013-07-15 14:46:37,345 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: JOB_TASK_COMPLETED at SUCCEEDED
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:866)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:128)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1095)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1091)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
> at
[jira] [Resolved] (MAPREDUCE-4754) Job is marked as FAILED and also throwing the TransitonException instead of KILLED when issues a KILL command
[ https://issues.apache.org/jira/browse/MAPREDUCE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K resolved MAPREDUCE-4754.
----------------------------------
    Resolution: Cannot Reproduce

I don't think this is still an issue, since MR has undergone many changes since the issue was created. I am closing it. Please reopen it if you see the issue again.

> Job is marked as FAILED and also throwing the TransitonException instead of
> KILLED when issues a KILL command
> ----------------------------------------------------------------------------
>
> Key: MAPREDUCE-4754
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4754
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: mrv2
> Affects Versions: 2.0.1-alpha, 2.0.2-alpha
> Reporter: Nishan Shetty
>
> {code}
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: JOB_TASK_COMPLETED at KILLED
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:695)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:119)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:893)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:889)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
> at java.lang.Thread.run(Thread.java:662)
> {code}
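As background for the "Invalid event: JOB_TASK_COMPLETED at KILLED" trace above, here is a toy sketch of how a table-driven state machine ends up rejecting an event for which no transition is registered in the current state. It is an assumption-laden illustration only: {{ToyJobStateMachine}} is hypothetical, has a deliberately tiny set of states and events, and is not YARN's {{StateMachineFactory}} or the MR AM's {{JobImpl}}.

```java
import java.util.EnumMap;
import java.util.Map;

/** Toy, hypothetical state machine; not YARN's StateMachineFactory. */
final class ToyJobStateMachine {
  enum State { RUNNING, SUCCEEDED, KILLED }

  enum Event { JOB_TASK_COMPLETED, JOB_COMPLETED, JOB_KILL }

  // Transition table: current state -> (event -> next state).
  private final Map<State, Map<Event, State>> transitions =
      new EnumMap<>(State.class);
  private State current = State.RUNNING;

  ToyJobStateMachine() {
    addTransition(State.RUNNING, Event.JOB_TASK_COMPLETED, State.RUNNING);
    addTransition(State.RUNNING, Event.JOB_COMPLETED, State.SUCCEEDED);
    addTransition(State.RUNNING, Event.JOB_KILL, State.KILLED);
  }

  /** Registers an arc, e.g. a self-loop that ignores late events in a terminal state. */
  void addTransition(State from, Event on, State to) {
    transitions.computeIfAbsent(from, s -> new EnumMap<>(Event.class)).put(on, to);
  }

  /** Applies the event, or throws if the current state has no arc for it. */
  State handle(Event event) {
    Map<Event, State> arcs = transitions.get(current);
    State next = (arcs == null) ? null : arcs.get(event);
    if (next == null) {
      throw new IllegalStateException("Invalid event: " + event + " at " + current);
    }
    current = next;
    return current;
  }
}
```

In this toy model, a task-completion event that arrives after the job has reached SUCCEEDED or KILLED is rejected until a self-loop for that state is registered with {{addTransition}}. That is one general way such late events can be tolerated; it is not a description of how the sub-tasks under MAPREDUCE-5422 were actually fixed.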