[jira] [Updated] (MAPREDUCE-5485) Allow repeating job commit by extending OutputCommitter API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nemon Lou updated MAPREDUCE-5485:
---------------------------------
    Summary: Allow repeating job commit by extending OutputCommitter API  (was: Allow repeating job commit by extending OutputCommiter API)

Allow repeating job commit by extending OutputCommitter API
-----------------------------------------------------------

                Key: MAPREDUCE-5485
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5485
            Project: Hadoop Map/Reduce
         Issue Type: Improvement
   Affects Versions: 2.1.0-beta
           Reporter: Nemon Lou

There is a chance the MRAppMaster crashes during job commit, or that a NodeManager restart causes the committing AM to exit because its container expires. In these cases the job fails. However, some jobs can redo the commit, so failing the job is unnecessary. Letting clients tell the AM whether a commit retry is allowed is a better choice. This idea comes from Jason Lowe's comments in MAPREDUCE-4819.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5485) Allow repeating job commit by extending OutputCommitter API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nemon Lou updated MAPREDUCE-5485:
---------------------------------
    Assignee: Nemon Lou

Allow repeating job commit by extending OutputCommitter API
-----------------------------------------------------------

                Key: MAPREDUCE-5485
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5485
            Project: Hadoop Map/Reduce
         Issue Type: Improvement
   Affects Versions: 2.1.0-beta
           Reporter: Nemon Lou
           Assignee: Nemon Lou

There is a chance the MRAppMaster crashes during job commit, or that a NodeManager restart causes the committing AM to exit because its container expires. In these cases the job fails. However, some jobs can redo the commit, so failing the job is unnecessary. Letting clients tell the AM whether a commit retry is allowed is a better choice. This idea comes from Jason Lowe's comments in MAPREDUCE-4819.
[jira] [Commented] (MAPREDUCE-5497) '5s sleep' in MRAppMaster.shutDownJob is only needed before stopping ClientService
[ https://issues.apache.org/jira/browse/MAPREDUCE-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764201#comment-13764201 ]

Hudson commented on MAPREDUCE-5497:
-----------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #329 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/329/])
MAPREDUCE-5497. Changed MRAppMaster to sleep only after doing everything else but just before ClientService to avoid race conditions during RM restart. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521699)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/ClientService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/MRClientService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRAppComponentDependencies.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java

'5s sleep' in MRAppMaster.shutDownJob is only needed before stopping ClientService
----------------------------------------------------------------------------------

                Key: MAPREDUCE-5497
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5497
            Project: Hadoop Map/Reduce
         Issue Type: Bug
           Reporter: Jian He
           Assignee: Jian He
            Fix For: 2.1.1-beta
        Attachments: MAPREDUCE-5497.1.patch, MAPREDUCE-5497.1.patch, MAPREDUCE-5497.2.patch, MAPREDUCE-5497.patch

Since the '5s sleep' exists to let clients learn the job's final state, it is enough to place it after the other services are stopped, just before stopping the ClientService. This can reduce race conditions like MAPREDUCE-5471.
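The shutdown ordering the patch describes can be sketched as follows. The class and service names below are simplified stand-ins, not the actual MRAppMaster code: everything else is cleaned up first, then the AM sleeps so clients polling it can observe the final job state, and only then is the client-facing service stopped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class ShutdownOrder {
    static final List<String> stopped = new ArrayList<>();

    static void stop(String service) { stopped.add(service); }

    // Sketch of MRAppMaster.shutDownJob ordering: the sleep sits after all
    // other cleanup and immediately before stopping the client service.
    static void shutDownJob() throws InterruptedException {
        stop("committer");               // finish or abort the job commit
        stop("jobHistoryEventHandler");  // flush history events
        TimeUnit.MILLISECONDS.sleep(5);  // stand-in for the 5-second client-poll window
        stop("clientService");           // after this, clients can no longer reach the AM
    }

    public static void main(String[] args) throws Exception {
        shutDownJob();
        System.out.println(stopped);
    }
}
```

The design point is that nothing the sleep protects (clients fetching the final state) depends on the other services, so delaying only the ClientService stop shrinks the window for races like MAPREDUCE-5471.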
[jira] [Commented] (MAPREDUCE-5020) Compile failure with JDK8
[ https://issues.apache.org/jira/browse/MAPREDUCE-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764200#comment-13764200 ]

Hudson commented on MAPREDUCE-5020:
-----------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #329 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/329/])
MAPREDUCE-5020. Compile failure with JDK8 (Trevor Robinson via tgraves) (tgraves: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521576)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java

Compile failure with JDK8
-------------------------

                Key: MAPREDUCE-5020
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5020
            Project: Hadoop Map/Reduce
         Issue Type: Bug
         Components: client
   Affects Versions: 2.0.3-alpha, 2.1.0-beta
        Environment: java version "1.8.0-ea"
                     Java(TM) SE Runtime Environment (build 1.8.0-ea-b36e)
                     Java HotSpot(TM) Client VM (build 25.0-b04, mixed mode)
           Reporter: Trevor Robinson
           Assignee: Trevor Robinson
             Labels: build-failure, jdk8
            Fix For: 3.0.0, 2.3.0, 2.1.1-beta
        Attachments: MAPREDUCE-5020.patch

Compiling {{org/apache/hadoop/mapreduce/lib/partition/InputSampler.java}} fails with the Java 8 preview compiler due to its stricter enforcement of JLS 15.12.2.6 (for [Java 5|http://docs.oracle.com/javase/specs/jls/se5.0/html/expressions.html#15.12.2.6] or [Java 7|http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.12.2.6]), which demands that methods applicable via unchecked conversion have their return type erased:
{noformat}
[ERROR] hadoop-common/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java:[320,35] error: incompatible types: Object[] cannot be converted to K[]
{noformat}
{code}
@SuppressWarnings("unchecked") // getInputFormat, getOutputKeyComparator
public static <K,V> void writePartitionFile(Job job, Sampler<K,V> sampler)
    throws IOException, ClassNotFoundException, InterruptedException {
  Configuration conf = job.getConfiguration();
  final InputFormat inf =
      ReflectionUtils.newInstance(job.getInputFormatClass(), conf);
  int numPartitions = job.getNumReduceTasks();
  K[] samples = sampler.getSample(inf, job); // returns Object[] according to JLS
{code}
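The JLS rule behind the error can be reproduced in isolation. The sketch below uses hypothetical types, not the Hadoop source: calling a generic method through a raw-typed value erases its return type to Object[], so the caller needs an explicit unchecked cast back to K[] for the assignment to compile under the stricter JDK 8 compiler.

```java
import java.util.Arrays;
import java.util.List;

public class ErasureDemo {
    // Stand-in for InputSampler.Sampler: a generic method returning K[].
    interface Sampler<K> {
        K[] getSample(List<K> input);
    }

    @SuppressWarnings({"unchecked", "rawtypes"})
    static <K> K[] sampleVia(Sampler rawSampler, List<K> input) {
        // rawSampler is a raw type, so getSample is applicable only via
        // unchecked conversion and its return type is erased to Object[]
        // (JLS 15.12.2.6). The cast is what makes this compile on JDK 8.
        return (K[]) rawSampler.getSample(input);
    }

    public static void main(String[] args) {
        Sampler<String> s = in -> in.toArray(new String[0]);
        String[] out = sampleVia(s, Arrays.asList("a", "b", "c"));
        System.out.println(out.length);
    }
}
```

Without the cast, pre-JDK-8 javac silently accepted the Object[]-to-K[] assignment; the JDK 8 compiler rejects it, which is the failure the patch fixes.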
[jira] [Commented] (MAPREDUCE-5485) Allow repeating job commit by extending OutputCommitter API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764207#comment-13764207 ]

Nemon Lou commented on MAPREDUCE-5485:
--------------------------------------

Some initial thoughts:
1. Add a method boolean isCommitJobRepeatable() to the output committers. The abstract class OutputCommitter will return false. FileOutputCommitter returns true, since FileOutputCommitter's commitJob removes any existing files that are about to be committed and then does the rename.
For jobs whose commit is repeatable, we have 2, 3 and 4:
2. When the commitJob method throws an exception, the AM will retry the commit directly, up to a retry limit.
3. When the AM hits an error during committing (an error not from the commitJob method), it will not reach a final job state, but will just exit and leave the work to another AM.
4. A second AM will check which phase the job has reached. If the phase is "commit failed", its state will reach "job committing" after recovery, and it will start the commit again.

Allow repeating job commit by extending OutputCommitter API
-----------------------------------------------------------

                Key: MAPREDUCE-5485
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5485
            Project: Hadoop Map/Reduce
         Issue Type: Improvement
   Affects Versions: 2.1.0-beta
           Reporter: Nemon Lou
           Assignee: Nemon Lou

There is a chance the MRAppMaster crashes during job commit, or that a NodeManager restart causes the committing AM to exit because its container expires. In these cases the job fails. However, some jobs can redo the commit, so failing the job is unnecessary. Letting clients tell the AM whether a commit retry is allowed is a better choice. This idea comes from Jason Lowe's comments in MAPREDUCE-4819.
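Thoughts (1) and (2) above can be sketched together. The class shapes below are simplified stand-ins for the real org.apache.hadoop.mapreduce.OutputCommitter, and the retry limit is a made-up name; only the method isCommitJobRepeatable() comes from the comment.

```java
public class RepeatableCommitSketch {
    // Thought (1): the abstract committer defaults to "not repeatable".
    abstract static class OutputCommitter {
        boolean isCommitJobRepeatable() { return false; }
        abstract void commitJob() throws Exception;
    }

    // FileOutputCommitter can answer true: its commitJob first removes files
    // left by an earlier partial commit, then redoes the rename.
    static class FileOutputCommitter extends OutputCommitter {
        @Override boolean isCommitJobRepeatable() { return true; }
        @Override void commitJob() { /* delete partial output, then rename */ }
    }

    // Thought (2): the AM retries commitJob directly, up to a limit, but only
    // when the committer declares the commit repeatable.
    static void commitWithRetry(OutputCommitter c, int maxAttempts) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                c.commitJob();
                return;
            } catch (Exception e) {
                if (!c.isCommitJobRepeatable() || attempt >= maxAttempts) {
                    throw e;
                }
            }
        }
    }
}
```

Thoughts (3) and (4) then fall out naturally: a non-commitJob error leaves the job short of a final state, and the next AM, seeing a "commit failed" phase on a repeatable committer, recovers into "job committing" and calls commitWithRetry again.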
[jira] [Commented] (MAPREDUCE-5020) Compile failure with JDK8
[ https://issues.apache.org/jira/browse/MAPREDUCE-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764291#comment-13764291 ]

Hudson commented on MAPREDUCE-5020:
-----------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1519 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1519/])
MAPREDUCE-5020. Compile failure with JDK8 (Trevor Robinson via tgraves) (tgraves: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521576)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java

Compile failure with JDK8
-------------------------

                Key: MAPREDUCE-5020
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5020
            Project: Hadoop Map/Reduce
         Issue Type: Bug
         Components: client
   Affects Versions: 2.0.3-alpha, 2.1.0-beta
        Environment: java version "1.8.0-ea"
                     Java(TM) SE Runtime Environment (build 1.8.0-ea-b36e)
                     Java HotSpot(TM) Client VM (build 25.0-b04, mixed mode)
           Reporter: Trevor Robinson
           Assignee: Trevor Robinson
             Labels: build-failure, jdk8
            Fix For: 3.0.0, 2.3.0, 2.1.1-beta
        Attachments: MAPREDUCE-5020.patch

Compiling {{org/apache/hadoop/mapreduce/lib/partition/InputSampler.java}} fails with the Java 8 preview compiler due to its stricter enforcement of JLS 15.12.2.6 (for [Java 5|http://docs.oracle.com/javase/specs/jls/se5.0/html/expressions.html#15.12.2.6] or [Java 7|http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.12.2.6]), which demands that methods applicable via unchecked conversion have their return type erased:
{noformat}
[ERROR] hadoop-common/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java:[320,35] error: incompatible types: Object[] cannot be converted to K[]
{noformat}
{code}
@SuppressWarnings("unchecked") // getInputFormat, getOutputKeyComparator
public static <K,V> void writePartitionFile(Job job, Sampler<K,V> sampler)
    throws IOException, ClassNotFoundException, InterruptedException {
  Configuration conf = job.getConfiguration();
  final InputFormat inf =
      ReflectionUtils.newInstance(job.getInputFormatClass(), conf);
  int numPartitions = job.getNumReduceTasks();
  K[] samples = sampler.getSample(inf, job); // returns Object[] according to JLS
{code}
[jira] [Commented] (MAPREDUCE-5497) '5s sleep' in MRAppMaster.shutDownJob is only needed before stopping ClientService
[ https://issues.apache.org/jira/browse/MAPREDUCE-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764292#comment-13764292 ]

Hudson commented on MAPREDUCE-5497:
-----------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1519 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1519/])
MAPREDUCE-5497. Changed MRAppMaster to sleep only after doing everything else but just before ClientService to avoid race conditions during RM restart. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521699)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/ClientService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/MRClientService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRAppComponentDependencies.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java

'5s sleep' in MRAppMaster.shutDownJob is only needed before stopping ClientService
----------------------------------------------------------------------------------

                Key: MAPREDUCE-5497
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5497
            Project: Hadoop Map/Reduce
         Issue Type: Bug
           Reporter: Jian He
           Assignee: Jian He
            Fix For: 2.1.1-beta
        Attachments: MAPREDUCE-5497.1.patch, MAPREDUCE-5497.1.patch, MAPREDUCE-5497.2.patch, MAPREDUCE-5497.patch

Since the '5s sleep' exists to let clients learn the job's final state, it is enough to place it after the other services are stopped, just before stopping the ClientService. This can reduce race conditions like MAPREDUCE-5471.
[jira] [Commented] (MAPREDUCE-5020) Compile failure with JDK8
[ https://issues.apache.org/jira/browse/MAPREDUCE-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764353#comment-13764353 ]

Hudson commented on MAPREDUCE-5020:
-----------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1545 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1545/])
MAPREDUCE-5020. Compile failure with JDK8 (Trevor Robinson via tgraves) (tgraves: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521576)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java

Compile failure with JDK8
-------------------------

                Key: MAPREDUCE-5020
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5020
            Project: Hadoop Map/Reduce
         Issue Type: Bug
         Components: client
   Affects Versions: 2.0.3-alpha, 2.1.0-beta
        Environment: java version "1.8.0-ea"
                     Java(TM) SE Runtime Environment (build 1.8.0-ea-b36e)
                     Java HotSpot(TM) Client VM (build 25.0-b04, mixed mode)
           Reporter: Trevor Robinson
           Assignee: Trevor Robinson
             Labels: build-failure, jdk8
            Fix For: 3.0.0, 2.3.0, 2.1.1-beta
        Attachments: MAPREDUCE-5020.patch

Compiling {{org/apache/hadoop/mapreduce/lib/partition/InputSampler.java}} fails with the Java 8 preview compiler due to its stricter enforcement of JLS 15.12.2.6 (for [Java 5|http://docs.oracle.com/javase/specs/jls/se5.0/html/expressions.html#15.12.2.6] or [Java 7|http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.12.2.6]), which demands that methods applicable via unchecked conversion have their return type erased:
{noformat}
[ERROR] hadoop-common/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/InputSampler.java:[320,35] error: incompatible types: Object[] cannot be converted to K[]
{noformat}
{code}
@SuppressWarnings("unchecked") // getInputFormat, getOutputKeyComparator
public static <K,V> void writePartitionFile(Job job, Sampler<K,V> sampler)
    throws IOException, ClassNotFoundException, InterruptedException {
  Configuration conf = job.getConfiguration();
  final InputFormat inf =
      ReflectionUtils.newInstance(job.getInputFormatClass(), conf);
  int numPartitions = job.getNumReduceTasks();
  K[] samples = sampler.getSample(inf, job); // returns Object[] according to JLS
{code}
[jira] [Commented] (MAPREDUCE-5497) '5s sleep' in MRAppMaster.shutDownJob is only needed before stopping ClientService
[ https://issues.apache.org/jira/browse/MAPREDUCE-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764354#comment-13764354 ]

Hudson commented on MAPREDUCE-5497:
-----------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1545 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1545/])
MAPREDUCE-5497. Changed MRAppMaster to sleep only after doing everything else but just before ClientService to avoid race conditions during RM restart. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521699)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/ClientService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/MRClientService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRAppComponentDependencies.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java

'5s sleep' in MRAppMaster.shutDownJob is only needed before stopping ClientService
----------------------------------------------------------------------------------

                Key: MAPREDUCE-5497
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5497
            Project: Hadoop Map/Reduce
         Issue Type: Bug
           Reporter: Jian He
           Assignee: Jian He
            Fix For: 2.1.1-beta
        Attachments: MAPREDUCE-5497.1.patch, MAPREDUCE-5497.1.patch, MAPREDUCE-5497.2.patch, MAPREDUCE-5497.patch

Since the '5s sleep' exists to let clients learn the job's final state, it is enough to place it after the other services are stopped, just before stopping the ClientService. This can reduce race conditions like MAPREDUCE-5471.
[jira] [Commented] (MAPREDUCE-5332) Support token-preserving restart of history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764384#comment-13764384 ]

Daryn Sharp commented on MAPREDUCE-5332:
----------------------------------------

+HistoryServerFileSystemStateStore+
# Suggest: It may be clearer to rename the {{TOKEN_FOO_PREFIX}} constants to {{TOKEN_FOO_DIR_PREFIX}} or {{TOKEN_*_FILE_PREFIX}}.
# Suggest: I'd consider not having the hardcoded {{ROOT_STATE_DIR_NAME}} added to the user's path configured by {{MR_HS_FS_STATE_STORE_URI}}. Is there an advantage to not using exactly what the user specified? Up to you.
# Question: In {{startStateStorage}}, why two mkdirs instead of one? A mkdir via {{createDir}} is only going to check the permissions of the leaf dir, which seems dubious. If any of the parent dirs is owned by another user with open permissions, the directory created by the JHS can be deleted and recreated with open permissions. The point is I'm not sure the extra checks add value, but I suppose they don't hurt. Up to you.
# Bug: Unlike {{HistoryServerMemStateStore}}, there appear to be no checks for things being added twice - although arguably those checks all belong in ADTSM. The token check is there, but not a secret check. I think the state stores should behave consistently.
# Bug: In {{getBucketPath}}, I think you want to mod (%) the seq number instead of dividing? Otherwise it creates a janitorial job for someone to clean up empty directories.

+HistoryServerStateStore+
# Suggest: For clarity, perhaps rename to {{HistoryServerStateStore*Service*}}. I kept getting it confused with {{HistoryServerState}}.
# Suggest: I'd consider removing the dtsm's recover method. Perhaps {{loadState}} can take the dtsm as an argument and directly populate it, instead of populating an intermediary {{HistoryServerState}} object before populating the dtsm.

+JobHistoryServer+
# Bug: Is the {{stateStore}} going to be started twice? Once in {{startService}} if {{recoveryEnabled}}, and again by {{super#startService}} when it iterates the composite services?
# Bug: Should the state store service be started after being recovered? Not before?
# Suggest: Perhaps the {{stateStore}} should be conditionally created and registered in {{serviceInit}}, with all the state loading in {{stateStore#start}} invoked by the composite service. Then there's no need for additional logic in {{JHS#serviceStart}}. Just an idea.

+JobHistoryStateStoreFactory+
# Bug? It seems a bit odd that if recovery is enabled but no class is defined, a {{HistoryServerNullStateStore}} is created. It appears {{JHS#serviceStart}} will fail when it calls {{loadState}} and an {{UnsupportedOperationException}} is thrown. The null store seems to have no real value other than deferring an error from {{JHS#serviceInit}} to {{JHS#serviceStart}}?
# Suggest: It feels like {{JobHistoryServer}} should only create and register a state store if required - which ties in with the prior JHS comment. {{serviceInit}} only asks the factory for a state store if recovery is enabled. The factory throws if no class is defined.

Support token-preserving restart of history server
--------------------------------------------------

                Key: MAPREDUCE-5332
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
            Project: Hadoop Map/Reduce
         Issue Type: New Feature
         Components: jobhistoryserver
           Reporter: Jason Lowe
           Assignee: Jason Lowe
        Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, MAPREDUCE-5332.patch

To better support rolling upgrades through a cluster, the history server needs the ability to restart without losing track of delegation tokens.
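The div-vs-mod point about getBucketPath can be seen concretely. The path layout and bucket count below are illustrative, not the patch's actual naming:

```java
public class BucketPath {
    static final int NUM_BUCKETS = 1000; // illustrative bucket count

    // Dividing would put seq 0..999 in bucket 0, seq 1000..1999 in bucket 1,
    // and so on: each new range opens a fresh directory and old buckets empty
    // out as tokens are cancelled, leaving someone to sweep up the husks.
    // Taking the modulus hashes every sequence number into a fixed, bounded
    // set of directories that is reused forever.
    static String getBucketPath(int seqNumber) {
        return "tokens/bucket-" + (seqNumber % NUM_BUCKETS);
    }

    public static void main(String[] args) {
        System.out.println(getBucketPath(12345));
        System.out.println(getBucketPath(1012345)); // same bucket, reused
    }
}
```

Bucketing like this exists in the first place to keep any single directory from accumulating an unbounded number of token files.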
[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated MAPREDUCE-5332:
----------------------------------
    Attachment: MAPREDUCE-5332-4.patch

Updated patch to address Daryn's comments. Summary of changes:
* Prefixes prefixed
* getBucketPath div-to-mod fix
* state stores renamed to state store services

Support token-preserving restart of history server
--------------------------------------------------

                Key: MAPREDUCE-5332
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
            Project: Hadoop Map/Reduce
         Issue Type: New Feature
         Components: jobhistoryserver
           Reporter: Jason Lowe
           Assignee: Jason Lowe
        Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, MAPREDUCE-5332-4.patch, MAPREDUCE-5332.patch

To better support rolling upgrades through a cluster, the history server needs the ability to restart without losing track of delegation tokens.
[jira] [Commented] (MAPREDUCE-5332) Support token-preserving restart of history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764594#comment-13764594 ]

Jason Lowe commented on MAPREDUCE-5332:
---------------------------------------

Thanks for the review, Daryn!

bq. Suggest: It may be clearer to rename the TOKEN_FOO_PREFIX constants to be TOKEN_FOO_DIR_PREFIX or TOKEN_*_FILE_PREFIX.

Will do.

bq. Suggest: I'd consider not having the hardcoded ROOT_STATE_DIR_NAME added to the user's path configured by MR_HS_FS_STATE_STORE_URI.

I followed the precedent set by FileSystemRMStateStore. It seems safer to add a possibly redundant directory level in case the user configured the URI to a public directory intended for other users (e.g. something like /tmp).

bq. Question: In startStateStorage, why 2 mkdirs instead of 1?

This is primarily because mkdirs isn't guaranteed to set the permissions of any created parent directories properly. I believe HDFS does this, but RawLocalFileSystem does not.

bq. Bug: Unlike HistoryServerMemStateStore, there appear to be no checks for things being added twice - although arguably those checks all belong in ADTSM. Token check is there, but not a secret check. I think the state stores should behave consistently.

createFile() should fail if the file is already present, so the file system store should correctly fail if a token or key is added redundantly.

bq. Bug: In getBucketPath, I think you want to mod (%) the seq number instead of dividing? Otherwise it creates a janitorial job for someone to clean up empty directories.

Good catch!

bq. Suggest: For clarity, perhaps rename to HistoryServerStateStore*Service*. I kept getting it confused with HistoryServerState.

Will do.

bq. Suggest: I'd consider removing the dtsm's recover method. Perhaps loadState can take the dtsm as an argument and directly populate it instead of populating an intermediary HistoryServerState object before populating the dtsm.

I'm following RMStateStore precedent here as well. I believe the intent is for the state object to shield the state stores from knowledge of the secret manager internals and vice versa.

{quote}
Bug: Is the stateStore going to be started twice? Once in startService if recoveryEnabled, again by super#startService when it iterates the composite services?
Bug: Should the state store service be started after being recovered? Not before?
{quote}

The stateStore needs to be started before recovery can occur, and starting a service twice is a no-op, so it should work OK as written. However I'll clean up the out-of-band state store start with a simple recovery service that is added to the composite service right after the state store. The recovery service's start method can check whether recovery is enabled and perform the recovery on the (now started) state store before the other services are started.

bq. Bug? Seems a bit odd if recovery is enabled but there's no class defined, a HistoryServerNullStateStore is created. It appears JHS#serviceStart will fail when it calls loadState and an UnsupportedOperationException is thrown. The null store seems to have no real value other than deferring an error from JHS#serviceInit to JHS#serviceStart?

The null store is necessary to avoid is-recovery-enabled checks throughout the secret manager as it updates state. It fails on recovery to catch the scenario where the user enabled recovery but forgot to configure a state store.

Support token-preserving restart of history server
--------------------------------------------------

                Key: MAPREDUCE-5332
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
            Project: Hadoop Map/Reduce
         Issue Type: New Feature
         Components: jobhistoryserver
           Reporter: Jason Lowe
           Assignee: Jason Lowe
        Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, MAPREDUCE-5332.patch

To better support rolling upgrades through a cluster, the history server needs the ability to restart without losing track of delegation tokens.
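The null-store trade-off discussed above is the classic null-object pattern. The names below are hypothetical stand-ins, not the actual HistoryServer classes: state updates become silent no-ops so the secret manager never needs is-recovery-enabled checks, while recovery itself fails loudly to catch a misconfiguration.

```java
public class NullStoreSketch {
    // Minimal stand-in for the history server state store interface.
    interface StateStore {
        void storeToken(String tokenId);
        void loadState();
    }

    // Null-object implementation: update calls are no-ops, so callers can
    // invoke them unconditionally; loadState throws so that enabling recovery
    // without configuring a real store surfaces as an error at startup.
    static class NullStateStore implements StateStore {
        public void storeToken(String tokenId) { /* no-op: recovery disabled */ }
        public void loadState() {
            throw new UnsupportedOperationException(
                "recovery is enabled but no state store class is configured");
        }
    }
}
```

The alternative Daryn suggests, failing at serviceInit when recovery is enabled with no store class, reports the same misconfiguration earlier but then requires a guard at every update site when recovery is off.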
[jira] [Commented] (MAPREDUCE-5332) Support token-preserving restart of history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764709#comment-13764709 ]

Hadoop QA commented on MAPREDUCE-5332:
--------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12602618/MAPREDUCE-5332-4.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:
org.apache.hadoop.mapreduce.TestMRJobClient

The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:
org.apache.hadoop.mapreduce.v2.TestUberAM

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3993//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3993//console

This message is automatically generated.

Support token-preserving restart of history server
--------------------------------------------------

                Key: MAPREDUCE-5332
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
            Project: Hadoop Map/Reduce
         Issue Type: New Feature
         Components: jobhistoryserver
           Reporter: Jason Lowe
           Assignee: Jason Lowe
        Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, MAPREDUCE-5332-4.patch, MAPREDUCE-5332.patch

To better support rolling upgrades through a cluster, the history server needs the ability to restart without losing track of delegation tokens.
[jira] [Created] (MAPREDUCE-5504) mapred queue -info inconsistent with types
Thomas Graves created MAPREDUCE-5504: Summary: mapred queue -info inconsistent with types Key: MAPREDUCE-5504 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5504 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.23.9 Reporter: Thomas Graves $ mapred queue -info default == Queue Name : default Queue State : running Scheduling Info : Capacity: 4.0, MaximumCapacity: 0.67, CurrentCapacity: 0.9309831 The capacity is displayed as a percentage (4.0); however, the maximum capacity is displayed as an absolute fraction (0.67) instead of 67%. We should make these consistent in the type we display.
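The inconsistency comes from mixing two representations of the same quantity in one line of output. One fix is to convert every capacity figure to a percentage before printing. The sketch below is illustrative only: the helper name and the assumption that all values arrive as fractions in [0, 1] are mine, not the actual `mapred queue` client code (which, per the report, currently mixes the two forms).

```java
// Illustrative sketch: render every queue capacity figure as a percentage
// so the "Scheduling Info" line uses one representation throughout.
// Assumes all capacities arrive as fractions in [0, 1].
public class QueueInfoFormat {
    static String asPercent(float fraction) {
        return String.format("%.1f%%", fraction * 100f);
    }

    public static void main(String[] args) {
        // Values from the report, assumed normalized to fractions here.
        System.out.println("Capacity: " + asPercent(0.04f)
            + ", MaximumCapacity: " + asPercent(0.67f)
            + ", CurrentCapacity: " + asPercent(0.9309831f));
    }
}
```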
[jira] [Created] (MAPREDUCE-5503) TestMRJobClient.testJobClient is failing
Jason Lowe created MAPREDUCE-5503: - Summary: TestMRJobClient.testJobClient is failing Key: MAPREDUCE-5503 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5503 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 3.0.0 Reporter: Jason Lowe TestMRJobClient.testJobClient is failing on trunk and causing precommit builds to complain: {noformat} testJobClient(org.apache.hadoop.mapreduce.TestMRJobClient) Time elapsed: 26.361 sec FAILURE! junit.framework.AssertionFailedError: expected:<1> but was:<0> at junit.framework.Assert.fail(Assert.java:50) at junit.framework.Assert.failNotEquals(Assert.java:287) at junit.framework.Assert.assertEquals(Assert.java:67) at junit.framework.Assert.assertEquals(Assert.java:199) at junit.framework.Assert.assertEquals(Assert.java:205) at org.apache.hadoop.mapreduce.TestMRJobClient.testJobList(TestMRJobClient.java:474) at org.apache.hadoop.mapreduce.TestMRJobClient.testJobClient(TestMRJobClient.java:112) {noformat}
[jira] [Commented] (MAPREDUCE-5332) Support token-preserving restart of history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764781#comment-13764781 ] Jason Lowe commented on MAPREDUCE-5332: --- The test failures are unrelated. TestUberAM still hasn't been fixed, see MAPREDUCE-5481. I can reproduce the TestMRJobClient failure on trunk, filed MAPREDUCE-5503. Support token-preserving restart of history server -- Key: MAPREDUCE-5332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332 Project: Hadoop Map/Reduce Issue Type: New Feature Components: jobhistoryserver Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, MAPREDUCE-5332-4.patch, MAPREDUCE-5332.patch To better support rolling upgrades through a cluster, the history server needs the ability to restart without losing track of delegation tokens.
[jira] [Resolved] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov resolved MAPREDUCE-5501. Resolution: Won't Fix This is caused by a bug in MiniYARNCluster. Reported in YARN-1183. RMContainer Allocator does not stop when cluster shutdown is performed in tests --- Key: MAPREDUCE-5501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: trunk Reporter: Andrey Klochkov Attachments: hanging-rmcontainer-allocator.stdout, hanging-rmcontainer-allocator.syslog After running the MR job client tests, many MRAppMaster processes stay alive. The reason seems to be that the RMContainerAllocator thread ignores InterruptedException and keeps retrying: {code} 2013-09-09 18:52:07,505 WARN [RMCommunicator Allocator] org.apache.hadoop.util.ThreadUtil: interrupted while sleeping java.lang.InterruptedException: sleep interrupted at java.lang.Thread.sleep(Native Method) at org.apache.hadoop.util.ThreadUtil.sleepAtLeastIgnoreInterrupts(ThreadUtil.java:43) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:149) at com.sun.proxy.$Proxy29.allocate(Unknown Source) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor.makeRemoteRequest(RMContainerRequestor.java:154) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:553) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219) at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236) at java.lang.Thread.run(Thread.java:680) 2013-09-09 18:52:37,639 INFO [RMCommunicator Allocator] org.apache.hadoop.ipc.Client: Retrying connect to server: dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. 
Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS) 2013-09-09 18:52:38,640 INFO [RMCommunicator Allocator] org.apache.hadoop.ipc.Client: Retrying connect to server: dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS) {code} It takes 6 minutes for the processes to die, and this causes various issues with tests which use the same DFS dir. {code} 2013-09-09 22:26:47,179 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error communicating with RM: Could not contact RM after 36 milliseconds. org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Could not contact RM after 36 milliseconds. at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:563) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219) at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236) at java.lang.Thread.run(Thread.java:680) {code} Will attach a thread dump separately.
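The underlying pattern is a retry loop that must stay responsive to interruption. A minimal sketch of the intended behavior (the names are hypothetical; this is not the actual RMContainerAllocator code) propagates InterruptedException instead of sleeping through it, so the thread stops promptly on shutdown rather than minutes later:

```java
// Minimal sketch of an interrupt-aware retry loop. Unlike a loop that
// catches and ignores InterruptedException, this one lets the exception
// propagate, so the owning thread can stop as soon as it is interrupted.
public class InterruptAwareRetry {
    static boolean retry(Runnable action, int maxRetries, long sleepMillis)
            throws InterruptedException {
        for (int i = 0; i < maxRetries; i++) {
            try {
                action.run();
                return true;          // success, stop retrying
            } catch (RuntimeException e) {
                // Thread.sleep throws InterruptedException on interrupt;
                // we deliberately do NOT swallow it here.
                Thread.sleep(sleepMillis);
            }
        }
        return false;                 // gave up after maxRetries attempts
    }
}
```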
[jira] [Assigned] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov reassigned MAPREDUCE-5501: -- Assignee: Andrey Klochkov RMContainer Allocator does not stop when cluster shutdown is performed in tests --- Key: MAPREDUCE-5501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: trunk Reporter: Andrey Klochkov Assignee: Andrey Klochkov Attachments: hanging-rmcontainer-allocator.stdout, hanging-rmcontainer-allocator.syslog
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Status: Patch Available (was: Open) Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, MAPREDUCE-4980.patch The maven surefire plugin supports a parallel testing feature. By using it, the tests can run faster.
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: MAPREDUCE-4980--n6.patch Rebased the patch. While testing, I made an additional fix in MiniMRYarnCluster, which sometimes produced an incorrectly configured history server address. Also, a fix submitted separately as YARN-1183 makes the builds much more stable. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, MAPREDUCE-4980.patch The maven surefire plugin supports a parallel testing feature. By using it, the tests can run faster.
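For reference, fork-level test parallelism in the surefire plugin is typically configured along these lines. The values and the per-fork data directory shown here are illustrative only, not the actual contents of the MAPREDUCE-4980 patch; `forkCount`, `reuseForks`, and the `${surefire.forkNumber}` placeholder are standard surefire features, while `test.build.data` is the property Hadoop tests commonly use for their working directory.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- Run test classes in several forked JVMs -->
    <forkCount>4</forkCount>
    <reuseForks>false</reuseForks>
    <systemPropertyVariables>
      <!-- Give each fork its own data directory so parallel tests
           do not clash on a shared DFS/test dir -->
      <test.build.data>${project.build.directory}/test/${surefire.forkNumber}</test.build.data>
    </systemPropertyVariables>
  </configuration>
</plugin>
```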
[jira] [Resolved] (MAPREDUCE-5500) Accessing task page for running job throw 500 Error code
[ https://issues.apache.org/jira/browse/MAPREDUCE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Han resolved MAPREDUCE-5500. - Resolution: Duplicate Accessing task page for running job throw 500 Error code Key: MAPREDUCE-5500 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5500 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 2.0.5-alpha Reporter: Paul Han Assignee: Paul Han For running jobs on Hadoop 2.0, trying to access Task counters page throws Server 500 error. Digging a bit I see this exception in MRAppMaster logs {noformat} 2013-08-09 21:54:35,083 ERROR [556661283@qtp-875702288-23] org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /mapreduce/task/task_1376081364308_0002_m_01 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:150) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118) at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:123) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1069) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) 
Caused by: org.apache.hadoop.yarn.webapp.WebAppException: Error rendering block: nestLevel=6 expected 5 at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:66) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:74) at org.apache.hadoop.yarn.webapp.View.render(View.java:233) at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:47) at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117) at
[jira] [Commented] (MAPREDUCE-5500) Accessing task page for running job throw 500 Error code
[ https://issues.apache.org/jira/browse/MAPREDUCE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764988#comment-13764988 ] Paul Han commented on MAPREDUCE-5500: - While creating the patch, I realized that a similar fix is already in, so I marked this as a duplicate. My fix was to add a closing block, ._(), which is similar to Thomas's addition of .tbody(). Accessing task page for running job throw 500 Error code Key: MAPREDUCE-5500 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5500 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 2.0.5-alpha Reporter: Paul Han Assignee: Paul Han For running jobs on Hadoop 2.0, trying to access the Task counters page throws a server 500 error. Digging a bit, I see this exception in the MRAppMaster logs (same stack trace as above, rooted in "Error rendering block: nestLevel=6 expected 5").
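The "nestLevel=6 expected 5" failure comes from the page renderer checking that every element opened inside a block was closed before the block finished. The toy sketch below is my own illustration, not the Hamlet API itself; it models how a missing close call, like the absent ._() Paul mentions, trips that kind of depth check:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the nesting check behind "nestLevel=N expected N-1":
// each open() pushes an element, each close() pops one, and the renderer
// verifies the depth is back to its starting value after a block.
public class NestCheck {
    private final Deque<String> stack = new ArrayDeque<>();

    public NestCheck open(String tag) { stack.push(tag); return this; }
    public NestCheck close() { stack.pop(); return this; }

    // Throws if a block left elements open, mirroring the HtmlBlock error.
    public void assertLevel(int expected) {
        if (stack.size() != expected) {
            throw new IllegalStateException(
                "nestLevel=" + stack.size() + " expected " + expected);
        }
    }
}
```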