[jira] [Moved] (MAPREDUCE-6393) Application Master and Task Tracker timeouts are applied incorrectly
[ https://issues.apache.org/jira/browse/MAPREDUCE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith moved YARN-3788 to MAPREDUCE-6393:

    Affects Version/s: (was: 2.4.1) 2.4.1
    Key: MAPREDUCE-6393 (was: YARN-3788)
    Project: Hadoop Map/Reduce (was: Hadoop YARN)

Application Master and Task Tracker timeouts are applied incorrectly

    Key: MAPREDUCE-6393
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-6393
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Affects Versions: 2.4.1
    Reporter: Dmitry Sivachenko

I am running a streaming job that requires a large (~50GB) data file (the file is attached via hadoop jar ... -file BigFile.dat). This command will most likely fail as follows (note that the error message is not informative):

    2015-05-27 15:55:00,754 WARN [main] streaming.StreamJob (StreamJob.java:parseArgv(291)) - -file option is deprecated, please use generic option -files instead.
    packageJobJar: [/ssd/mt/lm/en_reorder.ylm, mapper.py, /tmp/hadoop-mitya/hadoop-unjar3778165585140840383/] [] /var/tmp/streamjob633547925483233845.jar tmpDir=null
    2015-05-27 19:46:22,942 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(92)) - Connecting to ResourceManager at nezabudka1-00.yandex.ru/5.255.231.129:8032
    2015-05-27 19:46:23,733 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(92)) - Connecting to ResourceManager at nezabudka1-00.yandex.ru/5.255.231.129:8032
    2015-05-27 20:13:37,231 INFO [main] mapred.FileInputFormat (FileInputFormat.java:listStatus(247)) - Total input paths to process : 1
    2015-05-27 20:13:38,110 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(396)) - number of splits:1
    2015-05-27 20:13:38,136 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1009)) - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
    2015-05-27 20:13:38,390 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(479)) - Submitting tokens for job: job_1431704916575_2531
    2015-05-27 20:13:38,689 INFO [main] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(204)) - Submitted application application_1431704916575_2531
    2015-05-27 20:13:38,743 INFO [main] mapreduce.Job (Job.java:submit(1289)) - The url to track the job: http://nezabudka1-00.yandex.ru:8088/proxy/application_1431704916575_2531/
    2015-05-27 20:13:38,746 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1334)) - Running job: job_1431704916575_2531
    2015-05-27 21:04:12,353 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1355)) - Job job_1431704916575_2531 running in uber mode : false
    2015-05-27 21:04:12,356 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1362)) - map 0% reduce 0%
    2015-05-27 21:04:12,374 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1375)) - Job job_1431704916575_2531 failed with state FAILED due to: Application application_1431704916575_2531 failed 2 times due to ApplicationMaster for attempt appattempt_1431704916575_2531_02 timed out. Failing the application.
    2015-05-27 21:04:12,473 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1380)) - Counters: 0
    2015-05-27 21:04:12,474 ERROR [main] streaming.StreamJob (StreamJob.java:submitAndMonitorJob(1019)) - Job not Successful!
    Streaming Command Failed!

This happens because the yarn.am.liveness-monitor.expiry-interval-ms timeout (defaults to 600 sec) expires before the large data file is transferred.

Next, I increased yarn.am.liveness-monitor.expiry-interval-ms. After that, the application initialized successfully and tasks were spawned. But then I hit another error: the default 600-second mapreduce.task.timeout expires before the tasks are initialized, and the tasks fail.

The error message "Task attempt_XXX failed to report status for 600 seconds" is also misleading: this timeout is supposed to kill non-responsive (stuck) tasks, but here it fires because the auxiliary data files are copied slowly. So I had to increase mapreduce.task.timeout too, and only after that did my job succeed.

At the very least, the error messages need to be changed to indicate that the Application (or Task) is failing because the auxiliary files were not copied within that time, rather than reporting a generic timeout. A better solution would be not to count the time spent distributing data files against these timeouts.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
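To make the arithmetic behind the MAPREDUCE-6393 report concrete, here is a minimal sketch that estimates how long a timeout would need to be for a file of a given size. The throughput figure, the safety factor and both helper names are assumptions made for illustration; only the property name mapreduce.task.timeout and the 600-second default come from the report. The AM liveness interval is normally a cluster-side setting, so the sketch only builds a per-job option for the task timeout.

```python
import math

# 600 s, the default the report quotes for both timeouts, expressed in milliseconds.
DEFAULT_TIMEOUT_MS = 600_000


def required_timeout_ms(file_bytes, bytes_per_sec, safety_factor=1.5):
    """Estimate a timeout (ms) long enough to copy file_bytes at bytes_per_sec.

    safety_factor is an assumed margin for illustration, not a Hadoop setting.
    """
    if bytes_per_sec <= 0:
        raise ValueError("bytes_per_sec must be positive")
    transfer_sec = file_bytes / bytes_per_sec
    return math.ceil(transfer_sec * safety_factor * 1000)


def streaming_timeout_args(timeout_ms):
    """Build a generic -D option that raises mapreduce.task.timeout for one job."""
    return ["-D", f"mapreduce.task.timeout={timeout_ms}"]


# Assumed figures: a 50 GiB file copied at 50 MiB/s.
needed = required_timeout_ms(50 * 1024**3, 50 * 1024**2)
print(needed > DEFAULT_TIMEOUT_MS, streaming_timeout_args(needed))
```

With these assumed numbers the copy alone takes longer than the default, which matches the symptom in the report: the timeout fires during file distribution, not because a task is stuck.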
[jira] [Commented] (MAPREDUCE-6332) Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580072#comment-14580072 ] Rohith commented on MAPREDUCE-6332: [~devaraj.k] Kindly review the updated patch. Regarding the -1 for whitespace: the flagged line is not modified by the patch, but QA still reports -1. Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch, 0002-MAPREDUCE-6332.patch, 0003-MAPREDUCE-6332.patch, 0004-MAPREDUCE-6332.patch MR lets the user plug in a custom ShuffleConsumerPlugin via *mapreduce.job.reduce.shuffle.consumer.plugin.class*. A user who relies on this configuration may also want to implement a custom MergeManagerImpl. Currently, however, the user is forced to use the MR-provided MergeManagerImpl even when a custom shuffle.consumer.plugin class is in use. MergeManager should expose well-defined APIs so that a custom implementation can be written without much effort.
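The design request in MAPREDUCE-6332 can be illustrated with a small language-neutral sketch: the shuffle consumer depends only on a merge-manager interface, so a user-supplied implementation can be swapped in. Everything below is hypothetical and heavily simplified; the class and method names are stand-ins chosen for this example and do not reproduce the Hadoop Java API or the attached patches.

```python
from abc import ABC, abstractmethod


class MergeManager(ABC):
    """Hypothetical, simplified stand-in for a merge-manager interface."""

    @abstractmethod
    def reserve(self, map_id, size):
        """Record one map output of the given size."""

    @abstractmethod
    def close(self):
        """Finish merging and return the outputs in merge order."""


class DefaultMergeManager(MergeManager):
    """Stand-in for the framework-provided implementation: orders by map id."""

    def __init__(self):
        self.reserved = []

    def reserve(self, map_id, size):
        self.reserved.append((map_id, size))

    def close(self):
        return sorted(self.reserved)


class LargestFirstMergeManager(DefaultMergeManager):
    """Stand-in for a user-supplied implementation: largest output first."""

    def close(self):
        return sorted(self.reserved, key=lambda item: item[1], reverse=True)


class ShuffleConsumer:
    """Depends only on the MergeManager interface, never on a concrete class."""

    def __init__(self, merge_manager):
        if not isinstance(merge_manager, MergeManager):
            raise TypeError("merge_manager must implement MergeManager")
        self.merge_manager = merge_manager

    def run(self, map_outputs):
        for map_id, size in map_outputs:
            self.merge_manager.reserve(map_id, size)
        return self.merge_manager.close()


outputs = [("m1", 5), ("m2", 9)]
print(ShuffleConsumer(DefaultMergeManager()).run(outputs))
print(ShuffleConsumer(LargestFirstMergeManager()).run(outputs))
```

The point of the sketch is the constructor: because the consumer receives the merge manager rather than creating a fixed one itself, a custom plugin is not tied to the built-in implementation.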
[jira] [Updated] (MAPREDUCE-6332) Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6332: -- Status: Open (was: Patch Available) Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch, 0002-MAPREDUCE-6332.patch, 0003-MAPREDUCE-6332.patch, 0004-MAPREDUCE-6332.patch
[jira] [Updated] (MAPREDUCE-6332) Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6332: -- Attachment: 0004-MAPREDUCE-6332.patch Updated the patch to fix checkstyle whitespace warnings. Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch, 0002-MAPREDUCE-6332.patch, 0003-MAPREDUCE-6332.patch, 0004-MAPREDUCE-6332.patch
[jira] [Updated] (MAPREDUCE-6332) Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6332: -- Status: Patch Available (was: Open) Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch, 0002-MAPREDUCE-6332.patch, 0003-MAPREDUCE-6332.patch, 0004-MAPREDUCE-6332.patch
[jira] [Updated] (MAPREDUCE-6332) Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6332: -- Attachment: 0003-MAPREDUCE-6332.patch Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.5.0, 2.6.0, 2.7.0 Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch, 0002-MAPREDUCE-6332.patch, 0003-MAPREDUCE-6332.patch
[jira] [Commented] (MAPREDUCE-6332) Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576977#comment-14576977 ] Rohith commented on MAPREDUCE-6332: Updated the patch, rebased against trunk. Kindly review. Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.5.0, 2.6.0, 2.7.0 Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch, 0002-MAPREDUCE-6332.patch, 0003-MAPREDUCE-6332.patch
[jira] [Updated] (MAPREDUCE-6332) Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6332: -- Summary: Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used (was: Add more required API's to MergeManager interface ) Provide facility to users for writting custom MergeManager implementation when custom shuffleconsumerPluggin is used Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.5.0, 2.6.0, 2.7.0 Reporter: Rohith Assignee: Rohith Labels: BB2015-05-TBR Attachments: 0001-MAPREDUCE-6332.patch, 0002-MAPREDUCE-6332.patch
[jira] [Updated] (MAPREDUCE-4754) Job is marked as FAILED and also throwing the TransitonException instead of KILLED when issues a KILL command
[ https://issues.apache.org/jira/browse/MAPREDUCE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated MAPREDUCE-4754:

    Issue Type: Sub-task (was: Bug)
    Parent: MAPREDUCE-5422

Job is marked as FAILED and also throwing the TransitonException instead of KILLED when issues a KILL command

    Key: MAPREDUCE-4754
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-4754
    Project: Hadoop Map/Reduce
    Issue Type: Sub-task
    Components: mrv2
    Affects Versions: 2.0.1-alpha, 2.0.2-alpha
    Reporter: Nishan Shetty

{code}
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: JOB_TASK_COMPLETED at KILLED
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:695)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:119)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:893)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:889)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
    at java.lang.Thread.run(Thread.java:662)
{code}
[jira] [Commented] (MAPREDUCE-4288) ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running
[ https://issues.apache.org/jira/browse/MAPREDUCE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539585#comment-14539585 ] Rohith commented on MAPREDUCE-4288: Linking to a similar issue in MR. ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running Key: MAPREDUCE-4288 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4288 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Nishan Shetty When no job is running in the cluster, invoke the ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() APIs. Observed: these APIs return one instead of zero, even though no job is running.
[jira] [Commented] (MAPREDUCE-4289) JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's not giving any values
[ https://issues.apache.org/jira/browse/MAPREDUCE-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539586#comment-14539586 ] Rohith commented on MAPREDUCE-4289: The issue still exists in the latest code base. It requires more discussion on whether to change the hard-coded values, which may break compatibility for the MR client, or to come up with a new design to handle this dilemma. If there is no plan to fix it, I think this can be closed as Won't Fix. JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's not giving any values Key: MAPREDUCE-4289 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4289 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Nishan Shetty 1. Run a simple job. 2. Invoke the JobStatus.getReduceProgress() and JobStatus.getMapProgress() APIs. Observe that these APIs return zeros instead of the map/reduce progress.
[jira] [Commented] (MAPREDUCE-6332) Add more required API's to MergeManager interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539669#comment-14539669 ] Rohith commented on MAPREDUCE-6332: [~vinodkv] Kindly review the patch. Add more required API's to MergeManager interface Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.5.0, 2.6.0, 2.7.0 Reporter: Rohith Assignee: Rohith Labels: BB2015-05-TBR Attachments: 0001-MAPREDUCE-6332.patch, 0002-MAPREDUCE-6332.patch
[jira] [Commented] (MAPREDUCE-4288) ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running
[ https://issues.apache.org/jira/browse/MAPREDUCE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539583#comment-14539583 ] Rohith commented on MAPREDUCE-4288: The issue still exists in the latest code base. It requires more discussion on whether to change the hard-coded values, which may break compatibility for the MR client, or to come up with a new design to handle this dilemma. If there is no plan to fix it, I think this can be closed as Won't Fix. ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running Key: MAPREDUCE-4288 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4288 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Nishan Shetty
[jira] [Resolved] (MAPREDUCE-4461) Resourcemanager UI does not show the queue details in IE
[ https://issues.apache.org/jira/browse/MAPREDUCE-4461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith resolved MAPREDUCE-4461. Resolution: Cannot Reproduce The IE browser rendering problem should have been fixed by YARN-1868. I cannot reproduce the issue. Closing as Cannot Reproduce. Feel free to reopen if the problem still exists. Resourcemanager UI does not show the queue details in IE Key: MAPREDUCE-4461 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4461 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Nishan Shetty Attachments: ASF.LICENSE.NOT.GRANTED--screenshot-1.jpg Resourcemanager UI does not show the queue details in IE
[jira] [Resolved] (MAPREDUCE-4378) hadoop-validate-setup.sh fails to execute kinit command in secure mode
[ https://issues.apache.org/jira/browse/MAPREDUCE-4378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith resolved MAPREDUCE-4378. Resolution: Cannot Reproduce In trunk or branch-2, I don't see the hadoop-validate-setup.sh script file anymore. Closing as 'Cannot Reproduce'. Reopen the issue if any script file fails while executing kinit. hadoop-validate-setup.sh fails to execute kinit command in secure mode Key: MAPREDUCE-4378 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4378 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha, 3.0.0 Environment: SUSE Linux Enterprise Server 11 (x86_64) VERSION = 11 PATCHLEVEL = 1 Reporter: Nishan Shetty hadoop-validate-setup.sh refers to an invalid kinit location.
[jira] [Commented] (MAPREDUCE-4754) Job is marked as FAILED and also throwing the TransitonException instead of KILLED when issues a KILL command
[ https://issues.apache.org/jira/browse/MAPREDUCE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537834#comment-14537834 ] Rohith commented on MAPREDUCE-4754: I see that the current JobImpl also does not handle JOB_TASK_COMPLETED at KILLED. Is this a potential issue that still exists in trunk or branch-2? Job is marked as FAILED and also throwing the TransitonException instead of KILLED when issues a KILL command Key: MAPREDUCE-4754 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4754 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.1-alpha, 2.0.2-alpha Reporter: Nishan Shetty
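The failure mode discussed in MAPREDUCE-4754 (an event arriving in a state that has no transition registered for it) can be sketched with a toy table-driven state machine. This is not the Hadoop StateMachineFactory; the table, the exception class and the RUNNING and JOB_KILL names are assumptions made for illustration, and the "tolerant" table shows only one possible remedy, not the fix adopted by the project.

```python
class InvalidStateTransition(Exception):
    """Raised when no transition is registered for (state, event)."""


# (current state, event) -> next state. A toy subset, not the real job lifecycle.
TRANSITIONS = {
    ("RUNNING", "JOB_TASK_COMPLETED"): "RUNNING",
    ("RUNNING", "JOB_KILL"): "KILLED",
}


def do_transition(table, state, event):
    """Look up the next state, failing loudly on an unregistered pair."""
    try:
        return table[(state, event)]
    except KeyError:
        raise InvalidStateTransition(f"Invalid event: {event} at {state}") from None


# One possible remedy: treat a late task completion as a no-op once the job is KILLED.
TOLERANT = dict(TRANSITIONS)
TOLERANT[("KILLED", "JOB_TASK_COMPLETED")] = "KILLED"

state = do_transition(TRANSITIONS, "RUNNING", "JOB_KILL")
try:
    do_transition(TRANSITIONS, state, "JOB_TASK_COMPLETED")
except InvalidStateTransition as err:
    print(err)
print(do_transition(TOLERANT, state, "JOB_TASK_COMPLETED"))
```

The race is inherent in asynchronous dispatch: a task can finish after the kill has been processed, so a terminal state generally needs an explicit rule for late events.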
[jira] [Commented] (MAPREDUCE-4754) Job is marked as FAILED and also throwing the TransitonException instead of KILLED when issues a KILL command
[ https://issues.apache.org/jira/browse/MAPREDUCE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537835#comment-14537835 ] Rohith commented on MAPREDUCE-4754: Has anyone seen this issue recently? Job is marked as FAILED and also throwing the TransitonException instead of KILLED when issues a KILL command Key: MAPREDUCE-4754 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4754 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.1-alpha, 2.0.2-alpha Reporter: Nishan Shetty
[jira] [Resolved] (MAPREDUCE-4453) Jobs should be executed as same user in hadoop-validate-setup.sh
[ https://issues.apache.org/jira/browse/MAPREDUCE-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith resolved MAPREDUCE-4453. Resolution: Cannot Reproduce In trunk or branch-2, *su -c* doesn't exist anymore in the script files. Closing as 'Cannot Reproduce'. Jobs should be executed as same user in hadoop-validate-setup.sh Key: MAPREDUCE-4453 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4453 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.1-alpha Reporter: Nishan Shetty The 'su -c' command should be removed from hadoop-validate-setup.sh, as the TeraGen, TeraSort and TeraValidate jobs should be executed as the same user.
[jira] [Updated] (MAPREDUCE-6342) Make POM project names consistent
[ https://issues.apache.org/jira/browse/MAPREDUCE-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6342: Attachment: MAPREDUCE-6342-branch-2.patch Updated patch for branch-2. Make POM project names consistent Key: MAPREDUCE-6342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6342 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Reporter: Rohith Assignee: Rohith Priority: Minor Labels: BB2015-05-RFC Attachments: MAPREDUCE-6342-branch-2.patch, MAPREDUCE-6342.patch This is to track the MR changes for the POM project name changes.
[jira] [Updated] (MAPREDUCE-6342) Make POM project names consistent
[ https://issues.apache.org/jira/browse/MAPREDUCE-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6342: -- Labels: BB2015-05-RFC (was: BB2015-05-TBR) Make POM project names consistent - Key: MAPREDUCE-6342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6342 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Reporter: Rohith Assignee: Rohith Priority: Minor Labels: BB2015-05-RFC Attachments: MAPREDUCE-6342.patch
[jira] [Commented] (MAPREDUCE-6338) MR AppMaster does not honor ephemeral port range
[ https://issues.apache.org/jira/browse/MAPREDUCE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520744#comment-14520744 ] Rohith commented on MAPREDUCE-6338: The patch looks good to me. The ApplicationMaster web port is also random; I think it would be good to confine it to the port range as well. Any thoughts? MR AppMaster does not honor ephemeral port range Key: MAPREDUCE-6338 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6338 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, mrv2 Affects Versions: 2.6.0 Reporter: Frank Nguyen Assignee: Frank Nguyen Attachments: MAPREDUCE-6338.002.patch The MR AppMaster should only use port ranges defined in the yarn.app.mapreduce.am.job.client.port-range property. On initial startup of the MRAppMaster, it does use the port range defined in the property. However, it also opens a listener on a random ephemeral port. This is not the Jetty listener. It is another listener opened by the MRAppMaster via another thread, and it is recognized by the RM. Other nodes will try to communicate with it via that random port. With firewall settings on, the MR job will fail because the random port is not open. This problem has forced others to open all OS ephemeral ports in order to run MR jobs. This is related to MAPREDUCE-4079.
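The behaviour requested in MAPREDUCE-6338 (every listener confined to a configured port range instead of a random ephemeral port) can be sketched as follows. The range syntax "low-high,low-high" is an assumption about how such a property value might look, and both function names are hypothetical; this is not the MRAppMaster code or the attached patch.

```python
import socket


def parse_port_ranges(spec):
    """Parse a spec such as "50000-50050,50100" into inclusive (low, high) pairs."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        low, _, high = part.partition("-")
        low, high = int(low), int(high or low)
        if not (0 < low <= high <= 65535):
            raise ValueError(f"bad port range: {part!r}")
        ranges.append((low, high))
    return ranges


def bind_in_range(ranges, host="127.0.0.1"):
    """Return a socket bound to the first free port inside the given ranges."""
    for low, high in ranges:
        for port in range(low, high + 1):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            try:
                sock.bind((host, port))
                return sock
            except OSError:
                sock.close()  # port taken or not permitted, try the next one
    raise OSError("no free port in the configured ranges")


if __name__ == "__main__":
    listener = bind_in_range(parse_port_ranges("50000-50050"))
    print(listener.getsockname()[1])
    listener.close()
```

Binding to port 0 is what produces a random ephemeral port; iterating over the configured range instead keeps the listener inside whatever the firewall allows.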
[jira] [Commented] (MAPREDUCE-6332) Add more required API's to MergeManager interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513831#comment-14513831 ] Rohith commented on MAPREDUCE-6332: I'd appreciate it if any committer/PMC member could comment on the JIRA. Add more required API's to MergeManager interface Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0, 2.6.0, 2.7.0 Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch
[jira] [Updated] (MAPREDUCE-6332) Add more required API's to MergeManager interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6332: -- Attachment: 0002-MAPREDUCE-6332.patch Add more required API's to MergeManager interface -- Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.5.0, 2.6.0, 2.7.0 Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch, 0002-MAPREDUCE-6332.patch
[jira] [Commented] (MAPREDUCE-6332) Add more required API's to MergeManager interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516344#comment-14516344 ] Rohith commented on MAPREDUCE-6332: --- bq. MergeThread.java directly uses the merge-manager impl, that should be fixed too? Agreed, I missed it. bq. Add some javadoc to the new methods? Done. Updated the patch to address the comments. Kindly review the updated patch. Add more required API's to MergeManager interface -- Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.5.0, 2.6.0, 2.7.0 Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch, 0002-MAPREDUCE-6332.patch MR lets the user plug in a custom ShuffleConsumerPlugin using *mapreduce.job.reduce.shuffle.consumer.plugin.class*. A user who plugs in a shuffle consumer this way is also likely to want their own MergeManagerImpl. Currently, however, the user is forced to use the MR-provided MergeManagerImpl instead of a custom one when using the shuffle.consumer.plugin class. MergeManager should expose well-defined APIs that any implementation can use, so that writing a custom implementation takes little effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6332) Add more required API's to MergeManager interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516318#comment-14516318 ] Rohith commented on MAPREDUCE-6332: --- Thanks [~vinodkv] for sharing your thoughts. bq. This almost seems like a new feature to me, at least given that we have to expose more APIs to the outside world. I will mark the JIRA as New Feature. Add more required API's to MergeManager interface -- Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0, 2.6.0, 2.7.0 Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch MR lets the user plug in a custom ShuffleConsumerPlugin using *mapreduce.job.reduce.shuffle.consumer.plugin.class*. A user who plugs in a shuffle consumer this way is also likely to want their own MergeManagerImpl. Currently, however, the user is forced to use the MR-provided MergeManagerImpl instead of a custom one when using the shuffle.consumer.plugin class. MergeManager should expose well-defined APIs that any implementation can use, so that writing a custom implementation takes little effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6332) Add more required API's to MergeManager interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6332: -- Issue Type: New Feature (was: Bug) Add more required API's to MergeManager interface -- Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.5.0, 2.6.0, 2.7.0 Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch MR lets the user plug in a custom ShuffleConsumerPlugin using *mapreduce.job.reduce.shuffle.consumer.plugin.class*. A user who plugs in a shuffle consumer this way is also likely to want their own MergeManagerImpl. Currently, however, the user is forced to use the MR-provided MergeManagerImpl instead of a custom one when using the shuffle.consumer.plugin class. MergeManager should expose well-defined APIs that any implementation can use, so that writing a custom implementation takes little effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6342) Make POM project names consistent
Rohith created MAPREDUCE-6342: - Summary: Make POM project names consistent Key: MAPREDUCE-6342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6342 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Reporter: Rohith Assignee: Rohith Priority: Minor This is to track the MR changes for making the POM project names consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6342) Make POM project names consistent
[ https://issues.apache.org/jira/browse/MAPREDUCE-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6342: -- Status: Patch Available (was: Open) Make POM project names consistent - Key: MAPREDUCE-6342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6342 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Reporter: Rohith Assignee: Rohith Priority: Minor Attachments: MAPREDUCE-6342.patch This is to track the MR changes for making the POM project names consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6342) Make POM project names consistent
[ https://issues.apache.org/jira/browse/MAPREDUCE-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6342: -- Attachment: MAPREDUCE-6342.patch Make POM project names consistent - Key: MAPREDUCE-6342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6342 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Reporter: Rohith Assignee: Rohith Priority: Minor Attachments: MAPREDUCE-6342.patch This is to track the MR changes for making the POM project names consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6329) Failure of start map task on NM cause job hang
[ https://issues.apache.org/jira/browse/MAPREDUCE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508847#comment-14508847 ] Rohith commented on MAPREDUCE-6329: --- There is no need to create a new JIRA in YARN; this issue can simply be moved. Failure of start map task on NM cause job hang -- Key: MAPREDUCE-6329 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6329 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Attachments: syslog.tgz, yarn-app.log During a rolling update of the NM, the AM failed to start a container on the NM, and the job then hung. AM logs are attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6329) Failure of start map task on NM cause job hang
[ https://issues.apache.org/jira/browse/MAPREDUCE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508829#comment-14508829 ] Rohith commented on MAPREDUCE-6329: --- Thanks Peng Zhang for your analysis. As [~jlowe] said, this is a bug in YARN. [~peng.zhang] Would you like to provide a patch for this issue? Failure of start map task on NM cause job hang -- Key: MAPREDUCE-6329 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6329 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Attachments: syslog.tgz, yarn-app.log During a rolling update of the NM, the AM failed to start a container on the NM, and the job then hung. AM logs are attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6329) Failure of start map task on NM cause job hang
[ https://issues.apache.org/jira/browse/MAPREDUCE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507637#comment-14507637 ] Rohith commented on MAPREDUCE-6329: --- bq. is there anything in the RM log indicating why the container transitioned from ALLOCATED to KILLED? This is probably because the NM was down for some time during the rolling upgrade, so a Node_Removed event might have occurred because of either expiry or a reconnected event. The node removed event kills all the running containers, and this happened before the container was pulled by the AM. Failure of start map task on NM cause job hang -- Key: MAPREDUCE-6329 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6329 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Attachments: syslog.tgz, yarn-app.log During a rolling update of the NM, the AM failed to start a container on the NM, and the job then hung. AM logs are attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6329) Failure of start map task on NM cause job hang
[ https://issues.apache.org/jira/browse/MAPREDUCE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507671#comment-14507671 ] Rohith commented on MAPREDUCE-6329: --- But I don't see any node removed event in the attached logs. The question remains unanswered! Failure of start map task on NM cause job hang -- Key: MAPREDUCE-6329 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6329 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Attachments: syslog.tgz, yarn-app.log During a rolling update of the NM, the AM failed to start a container on the NM, and the job then hung. AM logs are attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6329) Failure of start map task on NM cause job hang
[ https://issues.apache.org/jira/browse/MAPREDUCE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507730#comment-14507730 ] Rohith commented on MAPREDUCE-6329: --- bq. Therefore I don't see how the RM could reasonably be expiring the node, nor should the node be unregistering Agreed, practically speaking it should not be possible. bq. Re-registration does not kill containers on the node Without NM work-preserving restart enabled, the RM should kill the running containers on re-registration. IIRC, it is legacy behavior. Failure of start map task on NM cause job hang -- Key: MAPREDUCE-6329 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6329 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Attachments: syslog.tgz, yarn-app.log During a rolling update of the NM, the AM failed to start a container on the NM, and the job then hung. AM logs are attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6332) Add more required API's to MergeManager interface
Rohith created MAPREDUCE-6332: - Summary: Add more required API's to MergeManager interface Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith MR lets the user plug in a custom ShuffleConsumerPlugin using *mapreduce.job.reduce.shuffle.consumer.plugin.class*. A user who plugs in a shuffle consumer this way is also likely to want their own MergeManagerImpl. Currently, however, the user is forced to use the MR-provided MergeManagerImpl instead of a custom one when using the shuffle.consumer.plugin class. MergeManager should expose well-defined APIs that any implementation can use, so that writing a custom implementation takes little effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
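The design point in the description above (the shuffle consumer should depend on a merge-manager interface, not on one concrete implementation) can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: SketchMergeManager, its two methods, and the two implementations are simplified stand-ins invented for illustration and are not the real Hadoop MergeManager API or the contents of the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified merge-manager contract; NOT the real Hadoop MergeManager API. */
interface SketchMergeManager {
    /** Hands one fetched map-output segment to the merger. */
    void accept(List<Integer> segment);

    /** Finishes merging and returns the merged, sorted output. */
    List<Integer> close();
}

/** Stand-in for a framework-provided default implementation. */
class DefaultSketchMergeManager implements SketchMergeManager {
    protected final List<Integer> merged = new ArrayList<>();

    @Override
    public void accept(List<Integer> segment) {
        merged.addAll(segment);
    }

    @Override
    public List<Integer> close() {
        Collections.sort(merged);
        return merged;
    }
}

/** Stand-in for a user-supplied implementation: drops duplicate values while merging. */
class DedupSketchMergeManager extends DefaultSketchMergeManager {
    @Override
    public void accept(List<Integer> segment) {
        for (Integer value : segment) {
            if (!merged.contains(value)) {
                merged.add(value);
            }
        }
    }
}

/** A shuffle consumer that only knows the interface, so any implementation can be plugged in. */
class SketchShuffleConsumer {
    private final SketchMergeManager merger;

    SketchShuffleConsumer(SketchMergeManager merger) {
        this.merger = merger;
    }

    List<Integer> run(List<List<Integer>> mapOutputs) {
        for (List<Integer> segment : mapOutputs) {
            merger.accept(segment);
        }
        return merger.close();
    }
}

class MergeManagerSketch {
    public static void main(String[] args) {
        List<List<Integer>> outputs = Arrays.asList(Arrays.asList(1, 3, 5), Arrays.asList(2, 3, 4));
        System.out.println(new SketchShuffleConsumer(new DefaultSketchMergeManager()).run(outputs));
        System.out.println(new SketchShuffleConsumer(new DedupSketchMergeManager()).run(outputs));
    }
}
```

Because SketchShuffleConsumer never names a concrete class, swapping the merger requires no change to the consumer. That is the property the issue asks for, under the assumptions stated above.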
[jira] [Updated] (MAPREDUCE-6332) Add more required API's to MergeManager interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6332: -- Status: Patch Available (was: Open) Add more required API's to MergeManager interface -- Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0, 2.5.0, 2.7.0 Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch MR lets the user plug in a custom ShuffleConsumerPlugin using *mapreduce.job.reduce.shuffle.consumer.plugin.class*. A user who plugs in a shuffle consumer this way is also likely to want their own MergeManagerImpl. Currently, however, the user is forced to use the MR-provided MergeManagerImpl instead of a custom one when using the shuffle.consumer.plugin class. MergeManager should expose well-defined APIs that any implementation can use, so that writing a custom implementation takes little effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6332) Add more required API's to MergeManager interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6332: -- Attachment: 0001-MAPREDUCE-6332.patch Attaching the patch. Kindly review it. Add more required API's to MergeManager interface -- Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch MR lets the user plug in a custom ShuffleConsumerPlugin using *mapreduce.job.reduce.shuffle.consumer.plugin.class*. A user who plugs in a shuffle consumer this way is also likely to want their own MergeManagerImpl. Currently, however, the user is forced to use the MR-provided MergeManagerImpl instead of a custom one when using the shuffle.consumer.plugin class. MergeManager should expose well-defined APIs that any implementation can use, so that writing a custom implementation takes little effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6332) Add more required API's to MergeManager interface
[ https://issues.apache.org/jira/browse/MAPREDUCE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6332: -- Target Version/s: 2.8.0, 2.7.1 Affects Version/s: 2.7.0 2.5.0 2.6.0 Add more required API's to MergeManager interface -- Key: MAPREDUCE-6332 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6332 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0, 2.6.0, 2.7.0 Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6332.patch MR lets the user plug in a custom ShuffleConsumerPlugin using *mapreduce.job.reduce.shuffle.consumer.plugin.class*. A user who plugs in a shuffle consumer this way is also likely to want their own MergeManagerImpl. Currently, however, the user is forced to use the MR-provided MergeManagerImpl instead of a custom one when using the shuffle.consumer.plugin class. MergeManager should expose well-defined APIs that any implementation can use, so that writing a custom implementation takes little effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485749#comment-14485749 ] Rohith commented on MAPREDUCE-6311: --- I believe it must be present in 2.7 as well. I will check the check-in history and confirm. AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6311.patch, 0001-MAPREDUCE-6311.patch, MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485754#comment-14485754 ] Rohith commented on MAPREDUCE-6311: --- It is there in 2.7. The issue that introduced the breakage is HADOOP-11754, which was fixed in 2.7. AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6311.patch, 0001-MAPREDUCE-6311.patch, MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485773#comment-14485773 ] Rohith commented on MAPREDUCE-6311: --- Linking the issue that broke this. AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6311.patch, 0001-MAPREDUCE-6311.patch, MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6311: -- Affects Version/s: 2.7.0 AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.7.0 Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6311.patch, 0001-MAPREDUCE-6311.patch, MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6311: -- Priority: Blocker (was: Major) AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.7.0 Reporter: Rohith Assignee: Rohith Priority: Blocker Attachments: 0001-MAPREDUCE-6311.patch, 0001-MAPREDUCE-6311.patch, MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485712#comment-14485712 ] Rohith commented on MAPREDUCE-6311: --- I confirm that it happens every time in my environment. I have verified it on both SUSE Linux and OS X. AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6311.patch, 0001-MAPREDUCE-6311.patch, MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6189) TestMRTimelineEventHandling fails in trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-6189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485666#comment-14485666 ] Rohith commented on MAPREDUCE-6189: --- Hi [~zjshen] bq. When an application is finished, the AM container is still alive for minutes. I didn't change my config that was used in 2.6 before. Not sure if it is a related issue. I think this is MAPREDUCE-6311. Whenever an MR job runs, the AM JVM does not shut down, and the AM expired event is triggered every time. TestMRTimelineEventHandling fails in trunk -- Key: MAPREDUCE-6189 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6189 Project: Hadoop Map/Reduce Issue Type: Test Reporter: Ted Yu Assignee: Junping Du From https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1988/:
{code}
REGRESSION: org.apache.hadoop.mapred.TestMRTimelineEventHandling.testMRTimelineEventHandling
Error Message:
Job didn't finish in 30 seconds
Stack Trace:
java.io.IOException: Job didn't finish in 30 seconds
    at org.apache.hadoop.mapred.UtilsForTests.runJobSucceed(UtilsForTests.java:622)
    at org.apache.hadoop.mapred.TestMRTimelineEventHandling.testMRTimelineEventHandling(TestMRTimelineEventHandling.java:105)

REGRESSION: org.apache.hadoop.mapred.TestMRTimelineEventHandling.testTimelineServiceStartInMiniCluster
Error Message:
Job didn't finish in 30 seconds
Stack Trace:
java.io.IOException: Job didn't finish in 30 seconds
    at org.apache.hadoop.mapred.UtilsForTests.runJobSucceed(UtilsForTests.java:622)
    at org.apache.hadoop.mapred.TestMRTimelineEventHandling.testTimelineServiceStartInMiniCluster(TestMRTimelineEventHandling.java:61)

REGRESSION: org.apache.hadoop.mapred.TestMRTimelineEventHandling.testMapreduceJobTimelineServiceEnabled
Error Message:
Job didn't finish in 30 seconds
Stack Trace:
java.io.IOException: Job didn't finish in 30 seconds
    at org.apache.hadoop.mapred.UtilsForTests.runJobSucceed(UtilsForTests.java:622)
    at org.apache.hadoop.mapred.TestMRTimelineEventHandling.testMapreduceJobTimelineServiceEnabled(TestMRTimelineEventHandling.java:198)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485686#comment-14485686 ] Rohith commented on MAPREDUCE-6311: --- I am using the trunk version for testing, and I encountered it there. AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6311.patch, 0001-MAPREDUCE-6311.patch, MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
Rohith created MAPREDUCE-6311: - Summary: AM JVM hungs after job unregister and finished Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482704#comment-14482704 ] Rohith commented on MAPREDUCE-6311: --- From the thread dump, a ScheduledThreadPoolExecutor is started while starting MRAppMaster but is not stopped during shutdown. Its stack trace is shown here:
{noformat}
pool-6-thread-1 prio=10 tid=0x7fe3d0dc7000 nid=0xb4a waiting on condition [0x7fe3d4643000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0xf216dc08 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6311: -- Attachment: MR_TD.out Attaching the thread dump for the AM JVM. AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482928#comment-14482928 ] Rohith commented on MAPREDUCE-6311: --- While starting HttpServer2, a SignerSecretProvider is added to the webAppContext as an attribute. The SignerSecretProvider creates and schedules a ScheduledThreadPoolExecutor, which is never stopped explicitly when HttpServer2 is stopped. This causes the MRAppMaster JVM to hang after unregistering with the RM. AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
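The mechanism described in the comment above can be reproduced outside Hadoop with plain java.util.concurrent. The sketch below is illustrative only: RollingSecretHolder is a hypothetical stand-in for a component that owns a scheduler (it is not the real SignerSecretProvider), and its init/destroy/awaitStopped methods are invented names. It relies on standard JDK behaviour: a ScheduledThreadPoolExecutor created with the default thread factory uses non-daemon worker threads, so a JVM cannot exit while such a worker is parked waiting for its next task.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a component that owns a scheduler; not the real SignerSecretProvider. */
class RollingSecretHolder {
    // Default thread factory => non-daemon worker thread.
    private final ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);

    /** Schedules a periodic no-op task; this starts the worker thread, which then parks in the delay queue. */
    void init(long periodMillis) {
        scheduler.scheduleAtFixedRate(() -> { /* e.g. roll a secret */ },
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the scheduler so that its worker thread can exit. */
    void destroy() {
        scheduler.shutdownNow();
    }

    /** Returns true if the scheduler has fully terminated within the timeout. */
    boolean awaitStopped(long timeoutMillis) throws InterruptedException {
        return scheduler.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}

class ExecutorShutdownDemo {
    public static void main(String[] args) throws InterruptedException {
        RollingSecretHolder holder = new RollingSecretHolder();
        holder.init(60_000L);

        // If destroy() were skipped here, main() would return but the JVM would stay
        // alive, with the worker thread in TIMED_WAITING as in the thread dump.
        holder.destroy();
        System.out.println("scheduler terminated: " + holder.awaitStopped(5_000L));
    }
}
```

Commenting out the holder.destroy() call is a quick way to observe the hang locally: the process keeps running after main() returns.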
[jira] [Updated] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6311: -- Attachment: 0001-MAPREDUCE-6311.patch AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6311.patch, MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6311: -- Status: Patch Available (was: Open) AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6311.patch, MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482936#comment-14482936 ] Rohith commented on MAPREDUCE-6311: --- Attached the patch, which stops the ScheduledThreadPoolExecutor using the {{SignerSecretProvider#destroy}} API. I tested the patch in a cluster and it works fine. Kindly review the patch. AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6311.patch, MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6311) AM JVM hungs after job unregister and finished
[ https://issues.apache.org/jira/browse/MAPREDUCE-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6311: -- Attachment: 0001-MAPREDUCE-6311.patch Updated the patch to fix the test failures. AM JVM hungs after job unregister and finished -- Key: MAPREDUCE-6311 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6311 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: 0001-MAPREDUCE-6311.patch, 0001-MAPREDUCE-6311.patch, MR_TD.out It is observed that the MRAppMaster JVM hangs after unregistering with the ResourceManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6291) Correct mapred queue usage command
[ https://issues.apache.org/jira/browse/MAPREDUCE-6291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383385#comment-14383385 ] Rohith commented on MAPREDUCE-6291: --- I think it is better to stick to [~qwertymaniac]'s suggestion: all Hadoop script help message changes can go together in one JIRA. I believe this will reduce work. Correct mapred queue usage command -- Key: MAPREDUCE-6291 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6291 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 2.6.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Attachments: MAPRED-6291-001.patch, MAPRED-6291.patch, MAPREDUCE-6291-002.patch *Currently it is as follows:* Usage: JobQueueClient command args *It should be:* Usage: queue command args *For more details, see the following:* {noformat} hdfs@host1:/hadoop/bin ./mapred queue Usage: JobQueueClient command args [-list] [-info job-queue-name [-showJobs]] [-showacls] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6291) Correct mapred queue usage command
[ https://issues.apache.org/jira/browse/MAPREDUCE-6291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383266#comment-14383266 ] Rohith commented on MAPREDUCE-6291: --- If no commands need to change, then I think YARN-3398 can be closed as 'Not a Problem', if you don't have any concerns. Correct mapred queue usage command -- Key: MAPREDUCE-6291 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6291 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 2.6.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Attachments: MAPRED-6291-001.patch, MAPRED-6291.patch, MAPREDUCE-6291-002.patch *Currently it is as follows:* Usage: JobQueueClient command args *It should be:* Usage: queue command args *For more details, see the following:* {noformat} hdfs@host1:/hadoop/bin ./mapred queue Usage: JobQueueClient command args [-list] [-info job-queue-name [-showJobs]] [-showacls] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6291) Correct mapred queue usage command
[ https://issues.apache.org/jira/browse/MAPREDUCE-6291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379684#comment-14379684 ] Rohith commented on MAPREDUCE-6291: --- There is another command that needs to change: {{mapred job}}. Similarly, it wold be better to cross-check the other commands. Correct mapred queue usage command -- Key: MAPREDUCE-6291 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6291 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 2.6.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Attachments: MAPRED-6291.patch *Currently it is as follows:* Usage: JobQueueClient command args *It should be:* Usage: queue command args *For more details, see the following:* {noformat} hdfs@host1:/hadoop/bin ./mapred queue Usage: JobQueueClient command args [-list] [-info job-queue-name [-showJobs]] [-showacls] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6291) Correct mapred queue usage command
[ https://issues.apache.org/jira/browse/MAPREDUCE-6291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379687#comment-14379687 ] Rohith commented on MAPREDUCE-6291: --- Typo correction for my previous comment: *wold* -> *would* Correct mapred queue usage command -- Key: MAPREDUCE-6291 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6291 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 2.6.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Attachments: MAPRED-6291.patch
[jira] [Commented] (MAPREDUCE-6291) Correct mapred queue usage command
[ https://issues.apache.org/jira/browse/MAPREDUCE-6291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379693#comment-14379693 ] Rohith commented on MAPREDUCE-6291: --- It would probably be good if the usage messages were consistent, like below: {{Usage: mapred command sub-command args}} Any thoughts? Correct mapred queue usage command -- Key: MAPREDUCE-6291 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6291 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 2.6.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Attachments: MAPRED-6291.patch
[jira] [Commented] (MAPREDUCE-5807) Print usage for TeraSort job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366932#comment-14366932 ] Rohith commented on MAPREDUCE-5807: --- Thanks for the updated patch. The changes look good to me. +1 (non-binding) Print usage for TeraSort job. - Key: MAPREDUCE-5807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5807 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Reporter: Rohith Assignee: Rohith Priority: Trivial Attachments: 0001-MAPREDUCE-5807.patch, 0002-MAPREDUCE-5807.patch, MAPREDUCE-5807.patch
Users who are new to Hadoop often try to get the help message for the example jobs provided with MapReduce. These usage messages help them supply the right arguments. Running the terasort job without arguments does not print a usage message; it throws an exception instead.
./yarn jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar terasort
14/03/24 15:34:55 INFO terasort.TeraSort: starting
java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.hadoop.examples.terasort.TeraSort.run(TeraSort.java:283)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.examples.terasort.TeraSort.main(TeraSort.java:325)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
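The ArrayIndexOutOfBoundsException: 0 in the trace is what one would expect when a tool reads its first argument without checking that any were passed. The following is a minimal sketch of the kind of guard the ticket asks for, not the actual TeraSort patch: the class name, the exit code 2, and the "sorting" message are hypothetical, and the usage string is copied from the ticket as it appears in this archive.

```java
// Hypothetical stand-in for a tool's run() method; it does not use Hadoop classes.
final class TeraSortUsageSketch {

    // Assumed exit code for "bad arguments"; the real job may use another value.
    static final int EXIT_USAGE = 2;

    static String usage() {
        return "Usage: terasort in out";
    }

    // Validates the argument count before touching args[0] or args[1].
    static int run(String[] args) {
        if (args.length != 2) {
            System.err.println(usage());
            return EXIT_USAGE;
        }
        String in = args[0];
        String out = args[1];
        System.out.println("sorting " + in + " into " + out);
        return 0;
    }

    public static void main(String[] args) {
        // No arguments: prints the usage line instead of throwing.
        run(new String[0]);
        // Two arguments: proceeds normally.
        run(new String[] {"in-dir", "out-dir"});
    }
}
```

Checking the count up front means a first-time user sees the usage line on stderr rather than a stack trace.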
[jira] [Updated] (MAPREDUCE-5807) Print usage for TeraSort job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-5807: -- Status: Patch Available (was: Open) Print usage for TeraSort job. - Key: MAPREDUCE-5807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5807 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Reporter: Rohith Assignee: Rohith Priority: Trivial Attachments: 0001-MAPREDUCE-5807.patch, MAPREDUCE-5807.patch
[jira] [Commented] (MAPREDUCE-5807) Print usage for TeraSort job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364883#comment-14364883 ] Rohith commented on MAPREDUCE-5807: --- Updated the patch; kindly review. Print usage for TeraSort job. - Key: MAPREDUCE-5807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5807 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Reporter: Rohith Assignee: Rohith Priority: Trivial Attachments: 0001-MAPREDUCE-5807.patch, MAPREDUCE-5807.patch
[jira] [Commented] (MAPREDUCE-5807) Print usage for TeraSort job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364887#comment-14364887 ] Rohith commented on MAPREDUCE-5807: --- With the patch, the usage is printed as below, along with the configurations:
{noformat}
Usage: terasort in out
TeraSort configurations are:
mapreduce.terasort.num-rows Number of rows to generate
mapreduce.terasort.num.partitions Number of partitions used for sampling
mapreduce.terasort.partitions.sample sampling size
mapreduce.terasort.final.sync Wheather to do final sync before the stream is closed
mapreduce.terasort.use.terascheduler Wheather to use tera scheduler
mapreduce.terasort.simplepartitioner Wheather to use simple partitioner
mapreduce.terasort.output.replication Number of replications to be stored for output data
{noformat}
Print usage for TeraSort job. - Key: MAPREDUCE-5807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5807 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Reporter: Rohith Assignee: Rohith Priority: Trivial Attachments: 0001-MAPREDUCE-5807.patch, MAPREDUCE-5807.patch
[jira] [Updated] (MAPREDUCE-5807) Print usage for TeraSort job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-5807: -- Attachment: 0001-MAPREDUCE-5807.patch Print usage for TeraSort job. - Key: MAPREDUCE-5807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5807 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Reporter: Rohith Assignee: Rohith Priority: Trivial Attachments: 0001-MAPREDUCE-5807.patch, MAPREDUCE-5807.patch
[jira] [Commented] (MAPREDUCE-6276) Error in encrypted shuffle
[ https://issues.apache.org/jira/browse/MAPREDUCE-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365041#comment-14365041 ] Rohith commented on MAPREDUCE-6276: --- I suspect there is a problem with how the certificates were imported into the truststore. The certificates from each host's JKS keystore need to be imported on all other hosts. Please cross-check the certificate imports. Error in encrypted shuffle -- Key: MAPREDUCE-6276 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6276 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Centos 6.5 Multinode hdp 2.2 cluster using mapreduce version 2 hadoop version 2.6.0 namenode - hdprndmaster.dm.com (10.200.100.83) resourcemanager/datanode/nodemanager - hdprndnode.dm.com (10.200.100.85) Reporter: Kuldeep Kulkarni Priority: Blocker Attachments: capacity-scheduler.xml, commons-logging.properties, container-executor.cfg, core-site.xml, dfs.exclude, dfs_data_dir_mount.hist, hadoop-env.sh, hadoop-metrics2.properties, hadoop-policy.xml, hdfs-site.xml, health_check, log.txt, log4j.properties, mapred-env.sh, mapred-site.xml, slaves, ssl-client.xml, ssl-server.xml, task-log4j.properties, taskcontroller.cfg, yarn-env.sh, yarn-site.xml
Hey Guys, After enabling wire encryption my UIs are working fine and I'm able to read/write to HDFS securely; however, encrypted shuffle is not working. I'm getting the error below. Could you please help me? Note - the mappers finish successfully, but the job fails during the shuffle.
{code}
2015-03-17 17:00:54,322 WARN [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to hdprndnode.dm.com:13562 with 8 map outputs
javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
{code}
Please find the full log attached for more details.
Thanks, Kuldeep
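The cross-check suggested in the comment can be sketched with the standard java.security.KeyStore API: load a host's truststore and report which expected certificate aliases are absent. This is a rough illustration only. The class and method names are hypothetical, and it assumes each host's certificate was imported under an alias equal to its hostname, which the ticket does not state.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.security.KeyStoreException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of Hadoop.
final class TruststoreCheckSketch {

    // Opens a JKS truststore. A null path yields an empty in-memory store,
    // which is convenient for trying the check without a real file.
    static KeyStore open(String path, char[] password) {
        try {
            KeyStore store = KeyStore.getInstance("JKS");
            if (path == null) {
                store.load(null, password);
            } else {
                try (InputStream in = Files.newInputStream(Paths.get(path))) {
                    store.load(in, password);
                }
            }
            return store;
        } catch (Exception e) {
            throw new IllegalStateException("cannot open truststore " + path, e);
        }
    }

    // Returns the expected aliases that have no trusted-certificate entry.
    static List<String> missingAliases(KeyStore trustStore, List<String> expected) {
        List<String> missing = new ArrayList<>();
        try {
            for (String alias : expected) {
                if (!trustStore.isCertificateEntry(alias)) {
                    missing.add(alias);
                }
            }
        } catch (KeyStoreException e) {
            throw new IllegalStateException("truststore was not loaded", e);
        }
        return missing;
    }

    public static void main(String[] args) {
        // Usage: java TruststoreCheckSketch [truststore.jks [password]]
        String path = args.length > 0 ? args[0] : null;
        char[] password = args.length > 1 ? args[1].toCharArray() : null;
        // Hostnames taken from the ticket's environment; using them as aliases is an assumption.
        List<String> hosts = Arrays.asList("hdprndmaster.dm.com", "hdprndnode.dm.com");
        System.out.println("Missing aliases: " + missingAliases(open(path, password), hosts));
    }
}
```

Running the same check against the truststore on every node would show which host lacks which certificate, which is one plausible cause of the "PKIX path building failed" handshake error above.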
[jira] [Commented] (MAPREDUCE-6190) MR Job is stuck because of one mapper stuck in STARTING
[ https://issues.apache.org/jira/browse/MAPREDUCE-6190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342806#comment-14342806 ] Rohith commented on MAPREDUCE-6190: --- [~amalhotra159] [~tannin] Would you mind attaching the full MRAppMaster log? Also, did anyone observe the scenario described in YARN-1680? MR Job is stuck because of one mapper stuck in STARTING --- Key: MAPREDUCE-6190 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6190 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Ankit Malhotra
Trying to figure out a weird issue we started seeing on our CDH5.1.0 cluster with map reduce jobs on YARN. We had a job stuck for hours because one of the mappers never started up fully. Basically, the map task had 2 attempts: the first one failed, the AM tried to schedule a second one, and the second attempt was stuck on STATE: STARTING, STATUS: NEW. A node never got assigned, and the task along with the job was stuck indefinitely. The AM logs had this being logged again and again:
{code}
2014-12-09 19:25:12,347 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down 0
2014-12-09 19:25:13,352 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1408745633994_450952_02_003807
2014-12-09 19:25:13,352 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce preemption successful attempt_1408745633994_450952_r_48_1000
2014-12-09 19:25:13,352 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all scheduled reduces:0
2014-12-09 19:25:13,352 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Going to preempt 1
2014-12-09 19:25:13,353 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Preempting attempt_1408745633994_450952_r_50_1000
2014-12-09 19:25:13,353 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=0
2014-12-09 19:25:13,353 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: completedMapPercent 0.99968 totalMemLimit:1722880 finalMapMemLimit:2560 finalReduceMemLimit:1720320 netScheduledMapMem:2560 netScheduledReduceMem:1722880
2014-12-09 19:25:13,353 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down 0
2014-12-09 19:25:13,353 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:77 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 AssignedReds:673 CompletedMaps:3124 CompletedReds:0 ContAlloc:4789 ContRel:798 HostLocal:2944 RackLocal:155
2014-12-09 19:25:14,353 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:78 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 AssignedReds:673 CompletedMaps:3124 CompletedReds:0 ContAlloc:4789 ContRel:798 HostLocal:2944 RackLocal:155
2014-12-09 19:25:14,359 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=0
{code}
On killing the task manually, the AM started the task again, then scheduled and ran it successfully, completing the task and the job with it. Some quick code grepping led us here: http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-mapreduce-client-app/2.3.0/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java#397 But we still don't quite understand why this would happen once in a while, or why the job would suddenly be OK once the stuck task is manually killed. Note: Other jobs succeed on the cluster while this job is stuck.
[jira] [Moved] (MAPREDUCE-6212) Hadoop 2.6.0: Basic error “starting MRAppMaster” after installing
[ https://issues.apache.org/jira/browse/MAPREDUCE-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith moved YARN-3012 to MAPREDUCE-6212: - Component/s: (was: security) security Fix Version/s: (was: 2.6.0) 2.6.0 Target Version/s: (was: 2.6.0) Affects Version/s: (was: 2.6.0) 2.6.0 Key: MAPREDUCE-6212 (was: YARN-3012) Project: Hadoop Map/Reduce (was: Hadoop YARN) Hadoop 2.6.0: Basic error “starting MRAppMaster” after installing - Key: MAPREDUCE-6212 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6212 Project: Hadoop Map/Reduce Issue Type: Bug Components: security Affects Versions: 2.6.0 Environment: Ubuntu 64bit Reporter: Dinh Hoang Mai Priority: Critical Fix For: 2.6.0
I have just started to work with Hadoop 2. After installing with basic configs, every example I try to run fails. Has anyone seen this problem? Please help. This is the log:
2015-01-08 01:52:01,599 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1420648881673_0004_01
2015-01-08 01:52:01,764 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
at org.apache.hadoop.security.Groups.<init>(Groups.java:70)
at org.apache.hadoop.security.Groups.<init>(Groups.java:66)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)
at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:299)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1473)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1429)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:129)
... 7 more
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative()V
at org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative(Native Method)
at org.apache.hadoop.security.JniBasedUnixGroupsMapping.<clinit>(JniBasedUnixGroupsMapping.java:49)
at org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.<init>(JniBasedUnixGroupsMappingWithFallback.java:39)
... 12 more
2015-01-08 01:52:01,767 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1
[jira] [Commented] (MAPREDUCE-6212) Hadoop 2.6.0: Basic error “starting MRAppMaster” after installing
[ https://issues.apache.org/jira/browse/MAPREDUCE-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268142#comment-14268142 ] Rohith commented on MAPREDUCE-6212: --- Since the issue is in MRAppMaster, moving it to the MapReduce project. Hadoop 2.6.0: Basic error “starting MRAppMaster” after installing - Key: MAPREDUCE-6212 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6212 Project: Hadoop Map/Reduce Issue Type: Bug Components: security Affects Versions: 2.6.0 Environment: Ubuntu 64bit Reporter: Dinh Hoang Mai Priority: Critical Fix For: 2.6.0
[jira] [Assigned] (MAPREDUCE-6189) TestMRTimelineEventHandling fails in trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-6189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith reassigned MAPREDUCE-6189: - Assignee: Rohith TestMRTimelineEventHandling fails in trunk -- Key: MAPREDUCE-6189 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6189 Project: Hadoop Map/Reduce Issue Type: Test Reporter: Ted Yu Assignee: Rohith Priority: Minor
From https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1988/:
{code}
REGRESSION: org.apache.hadoop.mapred.TestMRTimelineEventHandling.testMRTimelineEventHandling
Error Message: Job didn't finish in 30 seconds
Stack Trace:
java.io.IOException: Job didn't finish in 30 seconds
at org.apache.hadoop.mapred.UtilsForTests.runJobSucceed(UtilsForTests.java:622)
at org.apache.hadoop.mapred.TestMRTimelineEventHandling.testMRTimelineEventHandling(TestMRTimelineEventHandling.java:105)
REGRESSION: org.apache.hadoop.mapred.TestMRTimelineEventHandling.testTimelineServiceStartInMiniCluster
Error Message: Job didn't finish in 30 seconds
Stack Trace:
java.io.IOException: Job didn't finish in 30 seconds
at org.apache.hadoop.mapred.UtilsForTests.runJobSucceed(UtilsForTests.java:622)
at org.apache.hadoop.mapred.TestMRTimelineEventHandling.testTimelineServiceStartInMiniCluster(TestMRTimelineEventHandling.java:61)
REGRESSION: org.apache.hadoop.mapred.TestMRTimelineEventHandling.testMapreduceJobTimelineServiceEnabled
Error Message: Job didn't finish in 30 seconds
Stack Trace:
java.io.IOException: Job didn't finish in 30 seconds
at org.apache.hadoop.mapred.UtilsForTests.runJobSucceed(UtilsForTests.java:622)
at org.apache.hadoop.mapred.TestMRTimelineEventHandling.testMapreduceJobTimelineServiceEnabled(TestMRTimelineEventHandling.java:198)
{code}
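All three failures share the same "Job didn't finish in 30 seconds" message, which suggests a fixed-deadline wait around the job. The sketch below shows the general shape of such a poll-with-timeout loop so the failure mode is easier to picture. It is not the code of UtilsForTests.runJobSucceed; the class name, method name, and poll interval are hypothetical.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration of a deadline-bounded wait; not Hadoop test code.
final class JobWaitSketch {

    // Polls 'finished' until it returns true or the timeout elapses.
    // Returns true if the condition was met in time, false otherwise.
    static boolean waitUntil(BooleanSupplier finished, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (finished.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt status and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Simulated job that completes after roughly 200 ms.
        boolean ok = waitUntil(() -> System.currentTimeMillis() - start > 200, 1000, 20);
        System.out.println(ok ? "job finished in time" : "job didn't finish in time");
    }
}
```

With a loop like this, a slow build machine or a job that never starts produces exactly the same timeout message, so the message alone does not distinguish between the two.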
[jira] [Assigned] (MAPREDUCE-6183) Fix findbugs warnings in mapreduce-examples
[ https://issues.apache.org/jira/browse/MAPREDUCE-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith reassigned MAPREDUCE-6183: - Assignee: Rohith Fix findbugs warnings in mapreduce-examples --- Key: MAPREDUCE-6183 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6183 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Reporter: Akira AJISAKA Assignee: Rohith Labels: findbugs, newbie Work on MAPREDUCE-5800 has exposed some findbugs warnings. https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5063//artifact/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-examples.html
[jira] [Commented] (MAPREDUCE-6183) Fix findbugs warnings in mapreduce-examples
[ https://issues.apache.org/jira/browse/MAPREDUCE-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239046#comment-14239046 ] Rohith commented on MAPREDUCE-6183: --- I believe these findbugs errors are due to a change in either the Java version or the findbugs version. Previously these checks did not fail. Fix findbugs warnings in mapreduce-examples --- Key: MAPREDUCE-6183 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6183 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Reporter: Akira AJISAKA Assignee: Rohith Labels: findbugs, newbie
[jira] [Commented] (MAPREDUCE-6179) Running any MapReduce jobs is throwing Heap Error
[ https://issues.apache.org/jira/browse/MAPREDUCE-6179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234322#comment-14234322 ] Rohith commented on MAPREDUCE-6179: --- What heap size is configured for the ApplicationMaster? Running any MapReduce jobs is throwing Heap Error - Key: MAPREDUCE-6179 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6179 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.2.0 Environment: Linux 2.6.32-358.el6.x86_64 RAM : 12 GB Reporter: Ketan Deshmukh I have a Hadoop distribution installed on a cluster with 1 namenode and 2 data nodes. I have been trying to run the default MapReduce example included in the Hadoop package, and it has been throwing the following error: Error occurred during initialization of VM Could not reserve enough space for object heap The complete message text is as follows: [192769@hawq /]$ hadoop jar /usr/lib/gphd/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0-gphd-3.0.1.0.jar pi 10 100 Number of Maps = 10 Samples per Map = 100 Wrote input for Map #0 Wrote input for Map #1 Wrote input for Map #2 Wrote input for Map #3 Wrote input for Map #4 Wrote input for Map #5 Wrote input for Map #6 Wrote input for Map #7 Wrote input for Map #8 Wrote input for Map #9 Starting Job 14/12/04 17:28:44 INFO client.RMProxy: Connecting to ResourceManager at dn2.tcsgegdc.com/3.209.124.208:8032 14/12/04 17:28:44 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 50 for 192769 on 3.209.124.204:8020 14/12/04 17:28:44 INFO security.TokenCache: Got dt for hdfs://pcc.tcsgegdc.com:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 3.209.124.204:8020, Ident: (HDFS_DELEGATION_TOKEN token 50 for 192769) 14/12/04 17:28:44 INFO input.FileInputFormat: Total input paths to process : 10 14/12/04 17:28:44 INFO mapreduce.JobSubmitter: number of splits:10 14/12/04 17:28:44 INFO Configuration.deprecation: user.name is deprecated.
Instead, use mapreduce.job.user.name 14/12/04 17:28:44 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 14/12/04 17:28:44 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative 14/12/04 17:28:44 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 14/12/04 17:28:44 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class 14/12/04 17:28:44 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative 14/12/04 17:28:44 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 14/12/04 17:28:44 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name 14/12/04 17:28:44 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class 14/12/04 17:28:44 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 14/12/04 17:28:44 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 14/12/04 17:28:44 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir 14/12/04 17:28:44 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 14/12/04 17:28:44 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 14/12/04 17:28:44 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class 14/12/04 17:28:44 INFO Configuration.deprecation: mapred.working.dir is deprecated. 
Instead, use mapreduce.job.working.dir 14/12/04 17:28:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1413806590994_0039 14/12/04 17:28:44 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 3.209.124.204:8020, Ident: (HDFS_DELEGATION_TOKEN token 50 for 192769) 14/12/04 17:28:44 INFO impl.YarnClientImpl: Submitted application application_1413806590994_0039 to ResourceManager at dn2.tcsgegdc.com/3.209.124.208:8032 14/12/04 17:28:44 INFO mapreduce.Job: The url to track the job: http://dn2.tcsgegdc.com:8088/proxy/application_1413806590994_0039/ 14/12/04 17:28:44 INFO mapreduce.Job: Running job: job_1413806590994_0039 14/12/04 17:28:48 INFO mapreduce.Job: Job job_1413806590994_0039 running in uber mode : false 14/12/04 17:28:48 INFO mapreduce.Job: map 0% reduce 0%
[jira] [Commented] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14233932#comment-14233932 ] Rohith commented on MAPREDUCE-3902: --- I will update earliest MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc. -- Key: MAPREDUCE-3902 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster, mrv2 Reporter: Arun C Murthy Assignee: Kannan Rajah Attachments: AMContainerRefactorNotes.pdf, AM_ContainerRefactor.pdf, MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch The MR AM is now in a great position to reuse containers across (map) tasks. This is something similar to JVM re-use we had in 0.20.x, but in a significantly better manner: # Consider data-locality when re-using containers # Consider the new shuffle - ensure that reduces fetch output of the whole container at once (i.e. all maps) : MAPREDUCE-4525 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232635#comment-14232635 ] Rohith commented on MAPREDUCE-3902: --- This JIRA has been stale for a long time, and I would like to know the reason. I personally think this feature would help reduce container allocation latency. We have done some analysis and implemented support for JVM reuse on branch-2 without breaking existing AM functionality. We are ready to share a prototype patch along with a design doc. MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc. -- Key: MAPREDUCE-3902 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster, mrv2 Reporter: Arun C Murthy Assignee: Siddharth Seth Attachments: AMContainerRefactorNotes.pdf, AM_ContainerRefactor.pdf, MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch The MR AM is now in a great position to reuse containers across (map) tasks. This is something similar to JVM re-use we had in 0.20.x, but in a significantly better manner: # Consider data-locality when re-using containers # Consider the new shuffle - ensure that reduces fetch output of the whole container at once (i.e. all maps) : MAPREDUCE-4525 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6160: -- Attachment: MAPREDUCE-6160.3.patch Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.1.patch, MAPREDUCE-6160.2.patch, MAPREDUCE-6160.3.patch, MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
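The failure mode described in this issue can be sketched in a few lines of plain Java. This is a minimal illustration under stated assumptions, not code from Hadoop or from any attached patch: the class NullJobGuardSketch, its Job type, and the in-memory job map are hypothetical stand-ins. Only the method name verifyAndGetJob() and the idea of replacing the NullPointerException with an IOException come from the description above.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Sketch only: every type here is a hypothetical stand-in, not a Hadoop class. */
public class NullJobGuardSketch {

    /** Hypothetical job record that only carries a diagnostics string. */
    public static final class Job {
        final String diagnostics;

        Job(String diagnostics) {
            this.diagnostics = diagnostics;
        }
    }

    private final Map<String, Job> jobs = new HashMap<>();

    public void addJob(String jobId, String diagnostics) {
        jobs.put(jobId, new Job(diagnostics));
    }

    /** Mirrors the behaviour described in the issue: null for an unknown job ID. */
    public Job verifyAndGetJob(String jobId) {
        return jobs.get(jobId);
    }

    /** Unguarded variant: dereferences the lookup result, so an unknown ID raises an NPE. */
    public String getDiagnosticsUnguarded(String jobId) {
        return verifyAndGetJob(jobId).diagnostics;
    }

    /** Guarded variant: reports an unknown ID as an IOException with a message. */
    public String getDiagnostics(String jobId) throws IOException {
        Job job = verifyAndGetJob(jobId);
        if (job == null) {
            throw new IOException("Unknown Job " + jobId);
        }
        return job.diagnostics;
    }

    public static void main(String[] args) {
        NullJobGuardSketch service = new NullJobGuardSketch();
        service.addJob("job_1", "ok");
        try {
            service.getDiagnostics("job_2");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The guarded variant keeps the declared "throws IOException" contract of the listed methods, so a caller that already handles IOException needs no new catch clause.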
[jira] [Commented] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14224764#comment-14224764 ] Rohith commented on MAPREDUCE-6160: --- bq. just one nit. Should we just say Unknown Job + jobId for the error message? I changed log as above. I updated the patch with changing comment message for consistency. Please review. Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.1.patch, MAPREDUCE-6160.2.patch, MAPREDUCE-6160.3.patch, MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6160: -- Status: Patch Available (was: Open) Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.1.patch, MAPREDUCE-6160.2.patch, MAPREDUCE-6160.3.patch, MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223245#comment-14223245 ] Rohith commented on MAPREDUCE-6160: --- Thanks [~jlowe] for your input. Given that the client APIs already throw IOExceptions, all APIs other than getJobReport() can throw an IOException with a valid message. Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.1.patch, MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6160: -- Attachment: MAPREDUCE-6160.2.patch I updated the patch as per discussion.Please review Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.1.patch, MAPREDUCE-6160.2.patch, MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6160: -- Attachment: MAPREDUCE-6160.1.patch Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.1.patch, MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222391#comment-14222391 ] Rohith commented on MAPREDUCE-6160: --- Attached a patch with the following fixes: 1. If an invalid JobId is detected at the AM/HistoryServer, the APIs return null, i.e. consistent with getJobReport(). No change is made to getJobReport(). 2. ClientServiceDelegate detects a null response from the corresponding APIs and throws an IOException with the message "Unknown job " + jobId. Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.1.patch, MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
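The client-side half of the approach in this comment (the server returns null, the client turns that into an IOException) might look roughly like the sketch below. It is an illustration under assumptions, not the attached patch: DelegateNullCheckSketch is a hypothetical class, the remote call is modelled as a plain Function from job ID to a response string, and the real ClientServiceDelegate works with protocol request and response objects that are not reproduced here.

```java
import java.io.IOException;
import java.util.function.Function;

/** Sketch only: a hypothetical stand-in, not the real ClientServiceDelegate. */
public class DelegateNullCheckSketch {

    /** Hypothetical remote call: returns a response, or null if the job ID is unknown. */
    private final Function<String, String> remoteCall;

    public DelegateNullCheckSketch(Function<String, String> remoteCall) {
        this.remoteCall = remoteCall;
    }

    /** Converts a null response into an IOException that names the job ID. */
    public String getDiagnostics(String jobId) throws IOException {
        String response = remoteCall.apply(jobId);
        if (response == null) {
            throw new IOException("Unknown job " + jobId);
        }
        return response;
    }

    public static void main(String[] args) throws IOException {
        DelegateNullCheckSketch delegate = new DelegateNullCheckSketch(
                id -> "job_1".equals(id) ? "diagnostics for job_1" : null);
        System.out.println(delegate.getDiagnostics("job_1"));
        try {
            delegate.getDiagnostics("job_2");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Putting the check on the client keeps the server responses uniform (null for an unknown job, as getJobReport() already does) while still giving the caller a readable error.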
[jira] [Commented] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222394#comment-14222394 ] Rohith commented on MAPREDUCE-6160: --- Please review the patch. Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.1.patch, MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222456#comment-14222456 ] Rohith commented on MAPREDUCE-6160: --- Oops, I did not expect this many test cases to fail! :-( I need to look closely at the individual test failures to see whether they are really caused by the patch or by some other problem. Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.1.patch, MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (MAPREDUCE-5429) App Master throw OutOfMemoryErrors.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith resolved MAPREDUCE-5429. --- Resolution: Not a Problem Closing the issue since problem was with am memory configuration. After increasing memory to 5GB, it worked well. Thanks all who looked into this issue and for their suggestions. App Master throw OutOfMemoryErrors. --- Key: MAPREDUCE-5429 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5429 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.5-alpha Reporter: Rohith While running job , got OOM in app master and exitted the app master jvm. {noformat} 2013-07-28 13:45:21,937 ERROR [IPC Server handler 14 on 59522] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:job_1374920247399_0422 (auth:TOKEN) cause:java.io.IOException: java.lang.OutOfMemoryError: Java heap space 2013-07-28 13:45:21,937 INFO [IPC Server handler 22 on 59522] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1374920247399_0422_r_000384_0 2013-07-28 13:45:46,100 INFO [IPC Server handler 22 on 59522] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1374920247399_0422_r_000384_0 is : 0.22976667 2013-07-28 13:45:21,937 ERROR [IPC Server handler 15 on 59522] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:job_1374920247399_0422 (auth:TOKEN) cause:java.io.IOException: java.lang.OutOfMemoryError: Java heap space 2013-07-28 13:45:21,937 ERROR [IPC Server handler 13 on 59522] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:job_1374920247399_0422 (auth:TOKEN) cause:java.io.IOException: java.lang.OutOfMemoryError: Java heap space 2013-07-28 13:45:54,522 INFO [IPC Server handler 15 on 59522] org.apache.hadoop.ipc.Server: IPC Server handler 15 on 59522, call statusUpdate(attempt_1374920247399_0422_r_000225_0, org.apache.hadoop.mapred.ReduceTaskStatus@dd89c26), rpc 
version=2, client version=19, methodsFingerPrint=937413979 from 10.71.115.238:59691: error: java.io.IOException: java.lang.OutOfMemoryError: Java heap space java.io.IOException: java.lang.OutOfMemoryError: Java heap space 2013-07-28 13:45:21,937 INFO [IPC Server handler 19 on 59522] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1374920247399_0422_r_000307_0 2013-07-28 13:45:21,937 INFO [IPC Server handler 16 on 59522] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1374920247399_0422_r_000552_0 2013-07-28 13:46:09,900 INFO [IPC Server handler 16 on 59522] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1374920247399_0422_r_000552_0 is : 0.17983334 2013-07-28 13:45:14,870 ERROR [IPC Server handler 6 on 59522] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:job_1374920247399_0422 (auth:TOKEN) cause:java.io.IOException: java.lang.OutOfMemoryError: Java heap space 2013-07-28 13:45:14,870 FATAL [ResponseProcessor for block BP-myhacluster-25656:blk_-2026966945468195799_12352] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[ResponseProcessor for block BP-myhacluster-25656:blk_-2026966945468195799_12352,5,main] threw an Error. Shutting down now... java.lang.OutOfMemoryError: Java heap space {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
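The resolution above attributes the OutOfMemoryError to the AM memory configuration but does not say which settings were changed. As a rough illustration only, the sketch below derives a pair of AM memory settings from a container size. The property names yarn.app.mapreduce.am.resource.mb and yarn.app.mapreduce.am.command-opts are the MRv2 settings usually used for this, to the best of my knowledge; the class AmMemorySettingsSketch, the helper amMemorySettings, and the 80 percent heap ratio are assumptions made for the example, not values from the issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: derives AM memory settings; the heap ratio is an assumed rule of thumb. */
public class AmMemorySettingsSketch {

    /**
     * Builds the two settings for a given AM container size.
     * heapPercent is the share of the container given to the JVM heap (assumption).
     */
    public static Map<String, String> amMemorySettings(int containerMb, int heapPercent) {
        int heapMb = containerMb * heapPercent / 100;
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("yarn.app.mapreduce.am.resource.mb", Integer.toString(containerMb));
        settings.put("yarn.app.mapreduce.am.command-opts", "-Xmx" + heapMb + "m");
        return settings;
    }

    public static void main(String[] args) {
        // 5 GB container, matching the size mentioned in the resolution comment.
        amMemorySettings(5120, 80).forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Keeping the heap below the container size leaves room for non-heap JVM memory, so the container is less likely to be killed for exceeding its allocation.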
[jira] [Commented] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219582#comment-14219582 ] Rohith commented on MAPREDUCE-6160: --- The only API in MRClientProtocol that expects null is getJobReport. The null is checked later in the call hierarchy of getJobReport. I agree that this breaks compatibility. But the implementations of the other APIs are prone to throwing an NPE if queried with an invalid job ID. It would be better to throw either a RuntimeException or the API-defined IOException with a proper message to the client. Any thoughts? Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219625#comment-14219625 ] Rohith commented on MAPREDUCE-6160: --- bq. However it may be better to return null instead if that would make the interface more self-consistent. As I understand it, ClientServiceDelegate should check for a null response and handle it by throwing an IOException. Is that right? Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220402#comment-14220402 ] Rohith commented on MAPREDUCE-6160: --- I mean for the other APIs, leaving getJobReport as it is. Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217759#comment-14217759 ] Rohith commented on MAPREDUCE-6160: --- NPE trace in {{MRClientProtocol#getTaskAttemptCompletionEvents()}} {code} 14/11/07 15:09:38 INFO mapreduce.Job: map 92% reduce 25% 14/11/07 15:09:40 INFO mapreduce.Job: map 96% reduce 25% 14/11/07 15:09:42 INFO mapreduce.Job: map 100% reduce 25% 14/11/07 15:09:43 INFO mapreduce.Job: map 100% reduce 100% 14/11/07 15:09:43 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 14/11/07 15:09:43 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 14/11/07 15:09:43 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server java.io.IOException: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getTaskAttemptCompletionEvents(HistoryClientService.java:277) at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBServiceImpl.java:173) at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:283) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1612) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) {code} Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated MAPREDUCE-6160: -- Attachment: MAPREDUCE-6160.patch Update the patch for correcting tests. Please review the patch Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218912#comment-14218912 ] Rohith commented on MAPREDUCE-6160: --- [~jlowe] Could you please look at this jira whenever you have free time? Potential NullPointerException in MRClientProtocol interface implementation. Key: MAPREDUCE-6160 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-6160.patch, MAPREDUCE-6160.patch In the implementation of MRClientProtocol, many methods can throw NullPointerExceptions. Instead of NullPointerExceptions, better to throw IOException with proper message. In the HistoryClientService class and MRClientService class has #verifyAndGetJob() method that return job object as null. {code} getTaskReport(GetTaskReportRequest request) throws IOException; getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException; getCounters(GetCountersRequest request) throws IOException; getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException; getTaskReports(GetTaskReportsRequest request) throws IOException; getDiagnostics(GetDiagnosticsRequest request) throws IOException; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219049#comment-14219049 ]

Rohith commented on MAPREDUCE-6160:
-----------------------------------
bq. Is there any reason why the job is not available in History? I see in your above comment the Job got succeeded, Do you see any exception in MR AM/JHS log for the same job?
MRAppMaster failed to write to HDFS because of safemode, and HDFS recovered from safemode later.

Potential NullPointerException in MRClientProtocol interface implementation.

Key: MAPREDUCE-6160
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Rohith
Assignee: Rohith
Attachments: MAPREDUCE-6160.patch, MAPREDUCE-6160.patch
[jira] [Commented] (MAPREDUCE-6049) AM JVM does not exit if MRClientService gracefull shutdown fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216418#comment-14216418 ]

Rohith commented on MAPREDUCE-6049:
-----------------------------------
Thanks [~devaraj.k] for reviewing and committing the patch :-)

AM JVM does not exit if MRClientService gracefull shutdown fails

Key: MAPREDUCE-6049
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6049
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: applicationmaster, resourcemanager
Affects Versions: 2.5.0
Reporter: Nishan Shetty
Assignee: Rohith
Fix For: 2.7.0
Attachments: AM_ThreadDump.td, MAPREDUCE-6049.1.patch, MAPREDUCE-6049.patch, MAPREDUCE-6049.patch

Even though the job FAILED, the AM process does not exit. The thread dump of the AM process is below.
{noformat}
Job Fail Wait Timeout Monitor #0 daemon prio=10 tid=0x00aa9000 nid=0x41fa waiting on condition [0x7f0e0d1d]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for 0xc104c688 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1079)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
{noformat}
[jira] [Updated] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated MAPREDUCE-6160:
------------------------------
Status: Patch Available (was: Open)

Potential NullPointerException in MRClientProtocol interface implementation.

Key: MAPREDUCE-6160
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Rohith
Assignee: Rohith
Attachments: MAPREDUCE-6160.patch
[jira] [Updated] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated MAPREDUCE-6160:
------------------------------
Attachment: MAPREDUCE-6160.patch

Attached a patch that handles the NPE for invalid job IDs. Please review.

Potential NullPointerException in MRClientProtocol interface implementation.

Key: MAPREDUCE-6160
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Rohith
Assignee: Rohith
Attachments: MAPREDUCE-6160.patch
[jira] [Created] (MAPREDUCE-6160) Potential NullPointerException in MRClientProtocol interface implementation.
Rohith created MAPREDUCE-6160:
------------------------------
Summary: Potential NullPointerException in MRClientProtocol interface implementation.
Key: MAPREDUCE-6160
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6160
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Rohith
Assignee: Rohith

In the implementation of MRClientProtocol, many methods can throw a NullPointerException. Instead of a NullPointerException, it is better to throw an IOException with a proper message. In both the HistoryClientService class and the MRClientService class, the #verifyAndGetJob() method can return a null job object.
{code}
getTaskReport(GetTaskReportRequest request) throws IOException;
getTaskAttemptReport(GetTaskAttemptReportRequest request) throws IOException;
getCounters(GetCountersRequest request) throws IOException;
getTaskAttemptCompletionEvents(GetTaskAttemptCompletionEventsRequest request) throws IOException;
getTaskReports(GetTaskReportsRequest request) throws IOException;
getDiagnostics(GetDiagnosticsRequest request) throws IOException;
{code}
[jira] [Commented] (MAPREDUCE-6049) AM JVM does not exit if MRClientService gracefull shutdown fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14198002#comment-14198002 ]

Rohith commented on MAPREDUCE-6049:
-----------------------------------
Hi [~vinodkv], [~jianhe], can the fix for this issue go into the 2.6 release? Here too, the ApplicationMaster JVM never exits.

AM JVM does not exit if MRClientService gracefull shutdown fails

Key: MAPREDUCE-6049
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6049
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: applicationmaster, resourcemanager
Affects Versions: 2.5.0
Reporter: Nishan Shetty
Assignee: Rohith
Attachments: AM_ThreadDump.td, MAPREDUCE-6049.1.patch, MAPREDUCE-6049.patch, MAPREDUCE-6049.patch
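The MAPREDUCE-6049 title points at a general hazard: if one component's stop step throws, any shutdown work placed after it may never run. A minimal sketch of a defensive pattern follows, assuming nothing about the actual MAPREDUCE-6049 patch. `ShutdownSketch`, `Service`, and `stopAll` are hypothetical names and are not Hadoop APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; Service and stopAll are illustrative names, not Hadoop APIs.
final class ShutdownSketch {
    /** Stand-in for any component whose stop step may fail. */
    interface Service {
        void stop() throws Exception;
    }

    /**
     * Stops every service in order, continuing past failures, and always runs
     * finalStep afterwards. Returns the number of services whose stop failed.
     */
    static int stopAll(List<Service> services, Runnable finalStep) {
        int failures = 0;
        try {
            for (Service service : services) {
                try {
                    service.stop();
                } catch (Exception e) {
                    failures++; // record the failure but keep shutting down
                }
            }
        } finally {
            finalStep.run(); // e.g. shut down executors or exit the process
        }
        return failures;
    }

    public static void main(String[] args) {
        List<Service> services = new ArrayList<>();
        services.add(() -> { throw new IllegalStateException("client service stop failed"); });
        services.add(() -> System.out.println("second service stopped"));
        int failures = stopAll(services, () -> System.out.println("final step ran"));
        System.out.println("failures: " + failures);
    }
}
```

The `finally` block is the design choice that matters here: the last step runs whether or not an earlier stop threw, so a failed graceful shutdown cannot prevent the process from reaching its exit path.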
[jira] [Commented] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client
[ https://issues.apache.org/jira/browse/MAPREDUCE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174755#comment-14174755 ]

Rohith commented on MAPREDUCE-5542:
-----------------------------------
bq. Also the name doesn't seem to match how it's being used
Agreed.

bq. The logic in the new isJobFinished does not look correct. isJobFinished will always return true as currently written
I think the method name caused the confusion. IIUC, it returns true if the job is not in a terminal state (KILLED/FAILED/SUCCEEDED). Once the job reaches a terminal state, it returns false.

bq. but we want to loop while the job is still active.
Is the code below OK?
{code}
private boolean isJobInTerminalState(JobStatus status) {
  return status.getState() == JobStatus.State.KILLED
      || status.getState() == JobStatus.State.FAILED
      || status.getState() == JobStatus.State.SUCCEEDED;
}

public void killJob(JobID arg0) throws IOException, InterruptedException {
  //
  while ((currentTimeMillis < timeKillIssued + 1L)
      && (!isJobInTerminalState(status))) {
    // inside while loop
  }
  if (status != null && !isJobInTerminalState(status)) {
    killApplication(appId);
  }
}
{code}

Killing a job just as it finishes can generate an NPE in client
---------------------------------------------------------------
Key: MAPREDUCE-5542
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client, mrv2
Affects Versions: 2.1.0-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Rohith
Attachments: MAPREDUCE-5542.1.patch, MAPREDUCE-5542.2.patch, MAPREDUCE-5542.3.patch, MAPREDUCE-5542.4.patch, MAPREDUCE-5542.5.patch

If a client tries to kill a job just as the job is finishing then the client can crash with an NPE.
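The loop proposed in the comment above (wait while the job is still active, then fall back to a hard kill only if it never reached a terminal state) can be sketched in a self-contained form. This is an illustration under stated assumptions and not the committed patch: `KillWaitSketch`, the `State` enum, `needsHardKill`, and the timeout and polling parameters are hypothetical. A null status is treated as "job already gone", which is an assumption about how a finished job appears to the client.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

// Hypothetical sketch; State and needsHardKill are illustrative, not Hadoop APIs.
final class KillWaitSketch {
    enum State { RUNNING, KILLED, FAILED, SUCCEEDED }

    static boolean isInTerminalState(State state) {
        return state == State.KILLED || state == State.FAILED || state == State.SUCCEEDED;
    }

    /**
     * Polls statusSource until the job reaches a terminal state, the status
     * becomes null, or the timeout elapses. Returns true only if the job is
     * still known and still active, i.e. a hard kill would be needed.
     */
    static boolean needsHardKill(Supplier<State> statusSource, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        State state = statusSource.get();
        // The null check guards the loop as well as the final decision.
        while (System.currentTimeMillis() < deadline && state != null && !isInTerminalState(state)) {
            Thread.sleep(pollMillis);
            state = statusSource.get();
        }
        return state != null && !isInTerminalState(state);
    }

    public static void main(String[] args) throws InterruptedException {
        Iterator<State> states = Arrays.asList(State.RUNNING, State.RUNNING, State.KILLED).iterator();
        boolean hardKill = needsHardKill(states::next, 5000L, 1L);
        System.out.println("hard kill needed: " + hardKill);
    }
}
```

Naming the predicate after what it tests (a terminal state) and negating it at the call site keeps the loop condition readable, which is the naming concern raised in the review comments.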
[jira] [Updated] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client
[ https://issues.apache.org/jira/browse/MAPREDUCE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated MAPREDUCE-5542:
------------------------------
Attachment: MAPREDUCE-5542.6.patch

Killing a job just as it finishes can generate an NPE in client
---------------------------------------------------------------
Key: MAPREDUCE-5542
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client, mrv2
Affects Versions: 2.1.0-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Rohith
Attachments: MAPREDUCE-5542.1.patch, MAPREDUCE-5542.2.patch, MAPREDUCE-5542.3.patch, MAPREDUCE-5542.4.patch, MAPREDUCE-5542.5.patch, MAPREDUCE-5542.6.patch

If a client tries to kill a job just as the job is finishing then the client can crash with an NPE.
[jira] [Commented] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client
[ https://issues.apache.org/jira/browse/MAPREDUCE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175184#comment-14175184 ]

Rohith commented on MAPREDUCE-5542:
-----------------------------------
I updated the patch to address the above comments. Please review.

Killing a job just as it finishes can generate an NPE in client
---------------------------------------------------------------
Key: MAPREDUCE-5542
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client, mrv2
Affects Versions: 2.1.0-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Rohith
Attachments: MAPREDUCE-5542.1.patch, MAPREDUCE-5542.2.patch, MAPREDUCE-5542.3.patch, MAPREDUCE-5542.4.patch, MAPREDUCE-5542.5.patch, MAPREDUCE-5542.6.patch

If a client tries to kill a job just as the job is finishing then the client can crash with an NPE.
[jira] [Commented] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client
[ https://issues.apache.org/jira/browse/MAPREDUCE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175815#comment-14175815 ]

Rohith commented on MAPREDUCE-5542:
-----------------------------------
Thanks, Jason Lowe, for reviewing and committing the patch :-)

Killing a job just as it finishes can generate an NPE in client
---------------------------------------------------------------
Key: MAPREDUCE-5542
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client, mrv2
Affects Versions: 2.1.0-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Rohith
Fix For: 2.6.0
Attachments: MAPREDUCE-5542.1.patch, MAPREDUCE-5542.2.patch, MAPREDUCE-5542.3.patch, MAPREDUCE-5542.4.patch, MAPREDUCE-5542.5.patch, MAPREDUCE-5542.6.patch

If a client tries to kill a job just as the job is finishing then the client can crash with an NPE.