[jira] [Updated] (YARN-434) fix coverage org.apache.hadoop.fs.ftp
[ https://issues.apache.org/jira/browse/YARN-434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Gorshkov updated YARN-434: -- Description: fix coverage org.apache.hadoop.fs.ftp patch YARN-434-trunk.patch for trunk, branch-2, branch-0.23 was:fix coverage org.apache.hadoop.fs.ftp fix coverage org.apache.hadoop.fs.ftp -- Key: YARN-434 URL: https://issues.apache.org/jira/browse/YARN-434 Project: Hadoop YARN Issue Type: Test Affects Versions: 3.0.0, 0.23.7, 2.0.4-beta Environment: fix coverage org.apache.hadoop.fs.ftp Reporter: Aleksey Gorshkov Attachments: YARN-434-trunk.patch fix coverage org.apache.hadoop.fs.ftp patch YARN-434-trunk.patch for trunk, branch-2, branch-0.23 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-434) fix coverage org.apache.hadoop.fs.ftp
[ https://issues.apache.org/jira/browse/YARN-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589355#comment-13589355 ] Hadoop QA commented on YARN-434: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571375/YARN-434-trunk.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/446//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/446//console This message is automatically generated. fix coverage org.apache.hadoop.fs.ftp -- Key: YARN-434 URL: https://issues.apache.org/jira/browse/YARN-434 Project: Hadoop YARN Issue Type: Test Affects Versions: 3.0.0, 0.23.7, 2.0.4-beta Environment: fix coverage org.apache.hadoop.fs.ftp Reporter: Aleksey Gorshkov Attachments: YARN-434-trunk.patch fix coverage org.apache.hadoop.fs.ftp patch YARN-434-trunk.patch for trunk, branch-2, branch-0.23 -- This message is automatically generated by JIRA. 
[jira] [Commented] (YARN-426) Failure to download a public resource on a node prevents further downloads of the resource from that node
[ https://issues.apache.org/jira/browse/YARN-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589425#comment-13589425 ] Hudson commented on YARN-426: - Integrated in Hadoop-Yarn-trunk #141 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/141/]) YARN-426. Failure to download a public resource prevents further downloads (Jason Lowe via bobby) (Revision 1450807) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1450807 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java Failure to download a public resource on a node prevents further downloads of the resource from that node - Key: YARN-426 URL: https://issues.apache.org/jira/browse/YARN-426 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.0.3-alpha, 0.23.6 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Fix For: 3.0.0, 0.23.7, 2.0.4-beta Attachments: YARN-426.patch If the NM encounters an error while downloading a public resource, it fails to empty the list of request events corresponding to the resource request in {{attempts}}. If the same public resource is subsequently requested on that node, {{PublicLocalizer.addResource}} will skip the download since it will mistakenly believe a download of that resource is already in progress. At that point any container that requests the public resource will just hang in the {{LOCALIZING}} state. -- This message is automatically generated by JIRA. 
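The failure mode described in YARN-426 can be modeled independently of the NodeManager code: if the pending-request list keyed by a resource is not cleared when its download fails, every later request sees a (stale) in-flight entry and skips the download. The following is a minimal, hypothetical sketch of that bookkeeping; `PendingTracker`, `addResource`, and `downloadFailed` are illustrative names, not the actual `ResourceLocalizationService`/`PublicLocalizer` API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the per-resource "attempts" bookkeeping described in YARN-426.
public class PendingTracker {
    // Maps a resource key to the requests waiting on its download.
    private final Map<String, List<String>> attempts = new HashMap<>();

    /** Returns true if a new download should start; false if one is believed in flight. */
    public boolean addResource(String resource, String requester) {
        List<String> pending = attempts.get(resource);
        if (pending != null) {
            pending.add(requester);   // piggyback on the (supposed) in-flight download
            return false;
        }
        List<String> list = new ArrayList<>();
        list.add(requester);
        attempts.put(resource, list);
        return true;                  // caller starts the download
    }

    /** The fix: on failure the entry must be removed, or every subsequent
     *  addResource() call wrongly concludes a download is already in progress
     *  and the requesting container hangs in LOCALIZING. */
    public void downloadFailed(String resource) {
        attempts.remove(resource);
    }
}
```

With `downloadFailed` clearing the entry, a retry after a transient fetch error proceeds normally instead of being skipped forever.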
[jira] [Commented] (YARN-426) Failure to download a public resource on a node prevents further downloads of the resource from that node
[ https://issues.apache.org/jira/browse/YARN-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589489#comment-13589489 ] Hudson commented on YARN-426: - Integrated in Hadoop-Hdfs-0.23-Build #539 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/539/]) svn merge -c 1450807 FIXES: YARN-426. Failure to download a public resource prevents further downloads (Jason Lowe via bobby) (Revision 1450813) Result = UNSTABLE bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1450813 Files : * /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java Failure to download a public resource on a node prevents further downloads of the resource from that node - Key: YARN-426 URL: https://issues.apache.org/jira/browse/YARN-426 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.0.3-alpha, 0.23.6 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Fix For: 3.0.0, 0.23.7, 2.0.4-beta Attachments: YARN-426.patch If the NM encounters an error while downloading a public resource, it fails to empty the list of request events corresponding to the resource request in {{attempts}}. If the same public resource is subsequently requested on that node, {{PublicLocalizer.addResource}} will skip the download since it will mistakenly believe a download of that resource is already in progress. At that point any container that requests the public resource will just hang in the {{LOCALIZING}} state. 
[jira] [Commented] (YARN-426) Failure to download a public resource on a node prevents further downloads of the resource from that node
[ https://issues.apache.org/jira/browse/YARN-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589501#comment-13589501 ] Hudson commented on YARN-426: - Integrated in Hadoop-Hdfs-trunk #1330 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1330/]) YARN-426. Failure to download a public resource prevents further downloads (Jason Lowe via bobby) (Revision 1450807) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1450807 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java Failure to download a public resource on a node prevents further downloads of the resource from that node - Key: YARN-426 URL: https://issues.apache.org/jira/browse/YARN-426 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.0.3-alpha, 0.23.6 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Fix For: 3.0.0, 0.23.7, 2.0.4-beta Attachments: YARN-426.patch If the NM encounters an error while downloading a public resource, it fails to empty the list of request events corresponding to the resource request in {{attempts}}. If the same public resource is subsequently requested on that node, {{PublicLocalizer.addResource}} will skip the download since it will mistakenly believe a download of that resource is already in progress. At that point any container that requests the public resource will just hang in the {{LOCALIZING}} state. -- This message is automatically generated by JIRA. 
[jira] [Commented] (YARN-426) Failure to download a public resource on a node prevents further downloads of the resource from that node
[ https://issues.apache.org/jira/browse/YARN-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589525#comment-13589525 ] Hudson commented on YARN-426: - Integrated in Hadoop-Mapreduce-trunk #1358 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1358/]) YARN-426. Failure to download a public resource prevents further downloads (Jason Lowe via bobby) (Revision 1450807) Result = FAILURE bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1450807 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java Failure to download a public resource on a node prevents further downloads of the resource from that node - Key: YARN-426 URL: https://issues.apache.org/jira/browse/YARN-426 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.0.3-alpha, 0.23.6 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Fix For: 3.0.0, 0.23.7, 2.0.4-beta Attachments: YARN-426.patch If the NM encounters an error while downloading a public resource, it fails to empty the list of request events corresponding to the resource request in {{attempts}}. If the same public resource is subsequently requested on that node, {{PublicLocalizer.addResource}} will skip the download since it will mistakenly believe a download of that resource is already in progress. At that point any container that requests the public resource will just hang in the {{LOCALIZING}} state. -- This message is automatically generated by JIRA. 
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-149: Summary: ResourceManager (RM) High-Availability (HA) (was: ZK-based High Availability (HA) for ResourceManager (RM)) ResourceManager (RM) High-Availability (HA) --- Key: YARN-149 URL: https://issues.apache.org/jira/browse/YARN-149 Project: Hadoop YARN Issue Type: New Feature Reporter: Harsh J Assignee: Bikas Saha One of the goals presented on MAPREDUCE-279 was to have high availability. One way that was discussed, per Mahadev/others on https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK: {quote} Am not sure, if you already know about the MR-279 branch (the next version of MR framework). We've been trying to integrate ZK into the framework from the beginning. As for now, we are just doing restart with ZK but soon we should have a HA soln with ZK. {quote} There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is meant to track HA via ZK. Currently there isn't a HA solution for RM, via ZK or otherwise. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers
[ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589785#comment-13589785 ] Sandy Ryza commented on YARN-417: - bq. I think if ContainerExitCodes needs to be added then it should be its own jira Will move the container exit codes in a separate JIRA. bq. The helper function would have helped because containers contain information set by 2 entities... The issue is that there is not a ton of information for a helper function to interpret. From what I can tell, The framework only defines two special exit codes, and does not distinguish between OOMs and other kinds of container failures, or between killing a container because it was preempted or because the RM lost track of it. These exit codes are platform independent, and any other exit codes can be both application and platform dependent, so the AMRMClientAsync wouldn't know how to interpret them. As ContainerStatuses coming from the RM are only in the context of container completions, ContainerState provides no extra information. Additional information can sometimes be found in the diagnostics strings, but if the reasons that containers die are to be codified, I don't think it should be done by interpreting strings at the API level. bq. Why is client.start() being called in init? client.stop() is being called in stop(). registerApplicationMaster needs to be called after setting up the RM proxy, which occurs in AMRMClient#start, but before starting the heartbeater, which occurs in AMRMClientAsync#start. Another way to accomplish this would be to move the code in AMRMClientImpl#start to AMRMClientImpl#init, which also seems reasonable to me. A third way would be to call registerApplicationMaster from AMRMClientAsync#start. bq. I am wary of calling back on the heartbeat thread itself. Will add a handling thread. bq. Not waiting for the thread to join()? Why interrupt()? 
Thread needs to be stopped first so that it stops calling into the client. or else it can call into a client that has already stopped. Good point. My reason was that I've seen this as convention other places in YARN (see NodeStatusUpdaterImpl, for example), and that it would allow stop to be called from onContainerCompleted without deadlock, but with the handling thread, the latter shouldn't be a problem, so I'll change it. Add a poller that allows the AM to receive notifications when it is assigned containers --- Key: YARN-417 URL: https://issues.apache.org/jira/browse/YARN-417 Project: Hadoop YARN Issue Type: Sub-task Components: api, applications Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, YARN-417-1.patch, YARN-417.patch, YarnAppMaster.java, YarnAppMasterListener.java Writing AMs would be easier for some if they did not have to handle heartbeating to the RM on their own. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
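The "handling thread" agreed on in the exchange above decouples RM heartbeats from user callbacks: the heartbeat thread only talks to the RM and enqueues responses, while a separate thread is the only one that invokes callback-handler code, so a blocking callback (or one that calls stop()) cannot stall heartbeating. A rough sketch of that pattern follows; all names are hypothetical and this is not the actual AMRMClientAsync implementation.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the heartbeat/callback split discussed for AMRMClientAsync.
public class AsyncCallbackSketch {
    final BlockingQueue<String> events = new LinkedBlockingQueue<>();
    private volatile boolean stopped = false;

    // Heartbeat thread: stands in for the allocate() loop; never runs user code.
    final Thread heartbeater = new Thread(() -> {
        while (!stopped) {
            events.offer("allocate-response"); // stand-in for client.allocate(...)
            try { Thread.sleep(50); } catch (InterruptedException e) { return; }
        }
    });

    // Handler thread: the only thread that would invoke user callbacks,
    // so user code never blocks the heartbeat thread.
    final Thread handler = new Thread(() -> {
        while (!stopped) {
            try {
                events.take(); // callbackHandler.onContainersAllocated(...) would run here
            } catch (InterruptedException ie) { return; }
        }
    });

    public void start() {
        heartbeater.start();
        handler.start();
    }

    // Per the review comment: interrupt to unblock take()/sleep(), then join()
    // so no thread can call into an already-stopped client afterwards.
    public void stop() {
        stopped = true;
        handler.interrupt();
        heartbeater.interrupt();
        try {
            handler.join();
            heartbeater.join();
        } catch (InterruptedException ignored) {
            Thread.currentThread().interrupt();
        }
    }
}
```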
[jira] [Assigned] (YARN-173) Page navigation support for container logs page
[ https://issues.apache.org/jira/browse/YARN-173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] omkar vinit joshi reassigned YARN-173: -- Assignee: omkar vinit joshi Page navigation support for container logs page --- Key: YARN-173 URL: https://issues.apache.org/jira/browse/YARN-173 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.2-alpha, 0.23.3 Reporter: Jason Lowe Assignee: omkar vinit joshi Labels: usability ContainerLogsPage and AggregatedLogsBlock both support {{start}} and {{end}} parameters which are a big help when trying to sift through a huge log. However it's annoying to have to manually edit the URL to go through a giant log page-by-page. It would be very handy if the web page also provided page navigation links so flipping to the next/previous/first/last chunk of log is a simple click away. Bonus points for providing a way to easily change the size of the log chunk shown per page. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-269) Resource Manager not logging the health_check_script result when taking it out
[ https://issues.apache.org/jira/browse/YARN-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589875#comment-13589875 ] Kihwal Lee commented on YARN-269: - please rebase. Resource Manager not logging the health_check_script result when taking it out -- Key: YARN-269 URL: https://issues.apache.org/jira/browse/YARN-269 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 0.23.5 Reporter: Thomas Graves Assignee: Jason Lowe Attachments: YARN-269.patch The Resource Manager not logging the health_check_script result when taking it out. This was added to jobtracker in 1.x with MAPREDUCE-2451, we should do the same thing for RM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-269) Resource Manager not logging the health_check_script result when taking it out
[ https://issues.apache.org/jira/browse/YARN-269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-269: Attachment: YARN-269.patch Thanks for the review, Kihwal. Here's an updated patch. Resource Manager not logging the health_check_script result when taking it out -- Key: YARN-269 URL: https://issues.apache.org/jira/browse/YARN-269 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 0.23.5 Reporter: Thomas Graves Assignee: Jason Lowe Attachments: YARN-269.patch, YARN-269.patch The Resource Manager not logging the health_check_script result when taking it out. This was added to jobtracker in 1.x with MAPREDUCE-2451, we should do the same thing for RM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-269) Resource Manager not logging the health_check_script result when taking it out
[ https://issues.apache.org/jira/browse/YARN-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589905#comment-13589905 ] Hadoop QA commented on YARN-269: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571462/YARN-269.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/447//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/447//console This message is automatically generated. 
Resource Manager not logging the health_check_script result when taking it out -- Key: YARN-269 URL: https://issues.apache.org/jira/browse/YARN-269 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 0.23.5 Reporter: Thomas Graves Assignee: Jason Lowe Attachments: YARN-269.patch, YARN-269.patch The Resource Manager not logging the health_check_script result when taking it out. This was added to jobtracker in 1.x with MAPREDUCE-2451, we should do the same thing for RM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-237) Refreshing the RM page forgets how many rows I had in my Datatables
[ https://issues.apache.org/jira/browse/YARN-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589976#comment-13589976 ] Robert Joseph Evans commented on YARN-237: -- The change looks more or less OK to me. I am not thrilled about how we modify the data table's init string by looking for the first '{', but I think it is OK. I just have a few concerns, and most if it deals with my lack of knowledge about jQuery and localStorage. I know that localStorage is not supported on all browsers. I also know that localStorage can throw a QUOTA_EXCEEDED exception. What happens when we run into these situations? Will the page stop working or will jQuery degrade gracefully and simply not allow us to save the data. What about if the data stored in the key is not what we expect. Will jQuery make the page unusable. We currently have tables with the same name on different pages. If they are not kept in sync there could be some issues with the data that is saved. Which brings up another point I am also a bit concerned about the key we are using as part of the localStorage. The key is the id of the data table. I would prefer it if we could some how make it obvious that these values are for a data table, and not some other apps storage. Refreshing the RM page forgets how many rows I had in my Datatables --- Key: YARN-237 URL: https://issues.apache.org/jira/browse/YARN-237 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.0.2-alpha, 0.23.4, 3.0.0 Reporter: Ravi Prakash Assignee: jian he Labels: usability Attachments: YARN-237.patch If I choose a 100 rows, and then refresh the page, DataTables goes back to showing me 20 rows. This user preference should be stored in a cookie. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-406) TestRackResolver fails when local network resolves host1 to a valid host
[ https://issues.apache.org/jira/browse/YARN-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589985#comment-13589985 ] Siddharth Seth commented on YARN-406: - +1. Committing this. TestRackResolver fails when local network resolves host1 to a valid host -- Key: YARN-406 URL: https://issues.apache.org/jira/browse/YARN-406 Project: Hadoop YARN Issue Type: Improvement Reporter: Hitesh Shah Assignee: Hitesh Shah Priority: Minor Attachments: YARN-406.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-406) TestRackResolver fails when local network resolves host1 to a valid host
[ https://issues.apache.org/jira/browse/YARN-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1359#comment-1359 ] Hudson commented on YARN-406: - Integrated in Hadoop-trunk-Commit #3401 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3401/]) YARN-406. Fix TestRackResolver to function in networks where host1 resolves to a valid host. Contributed by Hitesh Shah. (Revision 1451391) Result = SUCCESS sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1451391 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestRackResolver.java TestRackResolver fails when local network resolves host1 to a valid host -- Key: YARN-406 URL: https://issues.apache.org/jira/browse/YARN-406 Project: Hadoop YARN Issue Type: Improvement Reporter: Hitesh Shah Assignee: Hitesh Shah Priority: Minor Fix For: 2.0.4-beta Attachments: YARN-406.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-196) Nodemanager if started before starting Resource manager is getting shutdown. But if both RM and NM are started and then after if RM is going down, NM is retrying for the RM.
[ https://issues.apache.org/jira/browse/YARN-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-196: --- Attachment: YARN-196.6.patch Nodemanager if started before starting Resource manager is getting shutdown.But if both RM and NM are started and then after if RM is going down,NM is retrying for the RM. --- Key: YARN-196 URL: https://issues.apache.org/jira/browse/YARN-196 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.0.0-alpha Reporter: Ramgopal N Assignee: Xuan Gong Attachments: MAPREDUCE-3676.patch, YARN-196.1.patch, YARN-196.2.patch, YARN-196.3.patch, YARN-196.4.patch, YARN-196.5.patch, YARN-196.6.patch If NM is started before starting the RM ,NM is shutting down with the following error {code} ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting services org.apache.hadoop.yarn.server.nodemanager.NodeManager org.apache.avro.AvroRuntimeException: java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149) at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242) Caused by: java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66) at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182) at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145) ... 
3 more Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131) at $Proxy23.registerNodeManager(Unknown Source) at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59) ... 5 more Caused by: java.net.ConnectException: Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857) at org.apache.hadoop.ipc.Client.call(Client.java:1141) at org.apache.hadoop.ipc.Client.call(Client.java:1100) at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:128) ... 7 more Caused by: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:659) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:469) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:563) at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:211) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1247) at org.apache.hadoop.ipc.Client.call(Client.java:1117) ... 
9 more 2012-01-16 15:04:13,336 WARN org.apache.hadoop.yarn.event.AsyncDispatcher: AsyncDispatcher thread interrupted java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1899) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1934) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:76) at java.lang.Thread.run(Thread.java:619) 2012-01-16 15:04:13,337 INFO org.apache.hadoop.yarn.service.AbstractService: Service:Dispatcher is stopped. 2012-01-16 15:04:13,392 INFO org.mortbay.log:
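The behavior the YARN-196 patch series aims for, retrying RM registration on ConnectException instead of shutting the NM down, is essentially bounded retry with backoff around the register call. The following is a hypothetical sketch of that idea, not the actual NodeStatusUpdaterImpl logic; the `Registrar` interface and all parameters are illustrative.

```java
// Hypothetical bounded-retry sketch for NM -> RM registration (YARN-196 behavior).
public class RegisterRetry {

    /** Stand-in for the registerNodeManager RPC, which may throw ConnectException. */
    interface Registrar {
        void register() throws Exception;
    }

    /**
     * Retries registration until it succeeds or maxAttempts is exhausted,
     * sleeping backoffMs between attempts. Returns the attempt count used;
     * rethrows the last failure once attempts are exhausted.
     */
    static int registerWithRetry(Registrar r, int maxAttempts, long backoffMs)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                r.register();
                return attempt;
            } catch (Exception e) {          // e.g. java.net.ConnectException
                if (attempt >= maxAttempts) {
                    throw e;                 // give up: RM never came up
                }
                Thread.sleep(backoffMs);     // fixed backoff for simplicity
            }
        }
    }
}
```

A real implementation would bound by elapsed time rather than attempt count and use Hadoop's RetryPolicy machinery, but the control flow is the same: the ConnectException in the stack trace above becomes a retry trigger rather than a fatal startup error.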
[jira] [Commented] (YARN-237) Refreshing the RM page forgets how many rows I had in my Datatables
[ https://issues.apache.org/jira/browse/YARN-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590009#comment-13590009 ] jian he commented on YARN-237: -- Tried with the latest versions of Firefox, Safari, and Chrome. It works. I will wrap the code in a check so that if the browser doesn't support localStorage, it falls back to the previous behavior, and I'll catch the exception you mentioned. The key is already the id of the data table. This page has two data tables, one for listing all nodes and the other for listing all applications; that's why the state is saved separately for 'nodes' and 'applications'. Lacking knowledge about jQuery and localStorage, I'm not sure about other potential problems. Refreshing the RM page forgets how many rows I had in my Datatables --- Key: YARN-237 URL: https://issues.apache.org/jira/browse/YARN-237 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.0.2-alpha, 0.23.4, 3.0.0 Reporter: Ravi Prakash Assignee: jian he Labels: usability Attachments: YARN-237.patch If I choose 100 rows and then refresh the page, DataTables goes back to showing me 20 rows. This user preference should be stored in a cookie.
[jira] [Commented] (YARN-24) Nodemanager fails to start if log aggregation enabled and namenode unavailable
[ https://issues.apache.org/jira/browse/YARN-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590015#comment-13590015 ] Alejandro Abdelnur commented on YARN-24: If we take the NM to unhealthy, then what will reset it back to healthy? I think the simplest way to handle this is to remove the creation of the dir on init and do it at app start time. Nodemanager fails to start if log aggregation enabled and namenode unavailable -- Key: YARN-24 URL: https://issues.apache.org/jira/browse/YARN-24 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Jason Lowe Assignee: Sandy Ryza Attachments: YARN-24.patch If log aggregation is enabled and the namenode is currently unavailable, the nodemanager fails to start up.
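The direction Alejandro suggests can be sketched as follows. This is a hypothetical stand-in (the class and method names are illustrative, not the actual NodeManager code): the remote log dir is created lazily at app start rather than at service init, so an unreachable namenode at NM startup no longer aborts the whole service.

```java
import java.util.function.Supplier;

// Illustrative sketch, not the real NM log-aggregation code. The mkdir
// call is injected as a Supplier so a namenode outage can be simulated.
class RemoteLogDir {
    private final Supplier<Boolean> mkdir; // returns true once the dir exists
    private boolean created = false;

    RemoteLogDir(Supplier<Boolean> mkdir) {
        this.mkdir = mkdir;
    }

    // Nothing touches HDFS at service init, so NM startup cannot fail here.
    public void init() { /* intentionally empty */ }

    // First app start triggers creation; a failure affects only that app,
    // and the next app start retries.
    public synchronized void onAppStart() {
        if (!created) {
            created = mkdir.get();
            if (!created) {
                throw new IllegalStateException("remote log dir unavailable");
            }
        }
    }

    public synchronized boolean isCreated() { return created; }
}
```

With this shape, an NM started while the namenode is down comes up healthy, and only app starts observe the failure until the namenode returns.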
[jira] [Commented] (YARN-196) Nodemanager if started before starting Resource manager is getting shutdown.But if both RM and NM are started and then after if RM is going down,NM is retrying for the RM
[ https://issues.apache.org/jira/browse/YARN-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590025#comment-13590025 ] Hadoop QA commented on YARN-196: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571485/YARN-196.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/448//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/448//console This message is automatically generated. Nodemanager if started before starting Resource manager is getting shutdown.But if both RM and NM are started and then after if RM is going down,NM is retrying for the RM. 
--- Key: YARN-196 URL: https://issues.apache.org/jira/browse/YARN-196 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.0.0-alpha Reporter: Ramgopal N Assignee: Xuan Gong Attachments: MAPREDUCE-3676.patch, YARN-196.1.patch, YARN-196.2.patch, YARN-196.3.patch, YARN-196.4.patch, YARN-196.5.patch, YARN-196.6.patch If NM is started before starting the RM ,NM is shutting down with the following error {code} ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting services org.apache.hadoop.yarn.server.nodemanager.NodeManager org.apache.avro.AvroRuntimeException: java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149) at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242) Caused by: java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66) at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182) at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145) ... 
3 more Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131) at $Proxy23.registerNodeManager(Unknown Source) at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59) ... 5 more Caused by: java.net.ConnectException: Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857) at org.apache.hadoop.ipc.Client.call(Client.java:1141) at org.apache.hadoop.ipc.Client.call(Client.java:1100) at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:128) ... 7 more Caused by: java.net.ConnectException: Connection
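The retry-for-RM behavior discussed above can be sketched with a generic retry loop (hedged: this is not the actual NodeStatusUpdaterImpl code, just the shape of the idea): keep attempting registration with a delay instead of letting a single ConnectException shut the NM down.

```java
import java.util.function.Supplier;

// Illustrative retry helper; attempt count and delay are arbitrary here.
class RmRegistration {
    public static <T> T retry(Supplier<T> register, int maxAttempts, long delayMillis) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return register.get();
            } catch (RuntimeException e) {
                last = e; // e.g. a wrapped ConnectException: RM not up yet
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted during retry", ie);
                }
            }
        }
        throw last; // all attempts exhausted: surface the last failure
    }
}
```

In the real fix the retry policy (intervals, total wait) would be configurable rather than hard-coded.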
[jira] [Created] (YARN-435) Make it easier to access cluster topology information in an AM
Hitesh Shah created YARN-435: Summary: Make it easier to access cluster topology information in an AM Key: YARN-435 URL: https://issues.apache.org/jira/browse/YARN-435 Project: Hadoop YARN Issue Type: Bug Reporter: Hitesh Shah ClientRMProtocol exposes a getClusterNodes api that provides a report on all nodes in the cluster including their rack information. However, this requires the AM to open and establish a separate connection to the RM in addition to one for the AMRMProtocol.
[jira] [Updated] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated YARN-376: Attachment: YARN-376.patch Re-uploading the second patch for jenkins. Apps that have completed can appear as RUNNING on the NM UI --- Key: YARN-376 URL: https://issues.apache.org/jira/browse/YARN-376 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.0.3-alpha, 0.23.6 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Blocker Attachments: YARN-376.patch, YARN-376.patch, YARN-376.patch, YARN-376.patch On a busy cluster we've noticed a growing number of applications that appear as RUNNING on a nodemanager's web pages even though the applications have long since finished. Looking at the NM logs, it appears the RM never told the nodemanager that the application had finished. This is also reflected in a jstack of the NM process, since many more log aggregation threads are running than one would expect from the number of actively running applications.
[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590126#comment-13590126 ] Siddharth Seth commented on YARN-376: - bq. Thanks for the review, Sidd. I originally had it update the heartbeat since the RMNode interface already knew about the heartbeat type and it's more efficient (don't need to create an extra copy of the app list and grab the write lock only once instead of twice). Good point. Re-uploading the old patch again for Jenkins.
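The efficiency point in the comment above, updating everything under one write-lock acquisition instead of two, can be sketched with a toy node-state holder (names hypothetical; the real RMNodeImpl is considerably more involved):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy sketch: record the heartbeat id and the finished-application list
// inside a single write-lock critical section, rather than locking twice
// and copying the app list between the two sections.
class NodeState {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> finishedApps = new ArrayList<>();
    private int lastHeartbeatId;

    public void heartbeat(int heartbeatId, List<String> newlyFinishedApps) {
        lock.writeLock().lock();
        try {
            lastHeartbeatId = heartbeatId;          // update 1
            finishedApps.addAll(newlyFinishedApps); // update 2, same acquisition
        } finally {
            lock.writeLock().unlock();
        }
    }

    public List<String> snapshotFinishedApps() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(finishedApps);
        } finally {
            lock.readLock().unlock();
        }
    }

    public int lastHeartbeatId() {
        lock.readLock().lock();
        try {
            return lastHeartbeatId;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```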
[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590150#comment-13590150 ] Hadoop QA commented on YARN-376: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571514/YARN-376.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/449//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/449//console This message is automatically generated. 
[jira] [Updated] (YARN-345) Many InvalidStateTransitonException errors for ApplicationImpl in Node Manager
[ https://issues.apache.org/jira/browse/YARN-345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated YARN-345: --- Attachment: YARN-354v2.patch Many InvalidStateTransitonException errors for ApplicationImpl in Node Manager -- Key: YARN-345 URL: https://issues.apache.org/jira/browse/YARN-345 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.0.2-alpha, 2.0.1-alpha, 0.23.5 Reporter: Devaraj K Assignee: Robert Parker Priority: Critical Attachments: YARN-345.patch, YARN-354v2.patch {code:xml} org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: FINISH_APPLICATION at FINISHED at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398) at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) at java.lang.Thread.run(Thread.java:662) {code} {code:xml} 2013-01-17 04:03:46,726 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Can't handle this event at current state org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: FINISH_APPLICATION at 
APPLICATION_RESOURCES_CLEANINGUP at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398) at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) at java.lang.Thread.run(Thread.java:662) {code} {code:xml} 2013-01-17 00:01:11,006 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Can't handle this event at current state org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: FINISH_APPLICATION at FINISHING_CONTAINERS_WAIT at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398) at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58) at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) at java.lang.Thread.run(Thread.java:662) {code} {code:xml} 2013-01-17 10:56:36,975 INFO
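One plausible shape of a fix for the traces above (hedged: the actual YARN-345 patch may take a different approach, and this is not the real ApplicationImpl state table) is to treat a late FINISH_APPLICATION in the finishing/finished states as a no-op instead of letting the state machine throw InvalidStateTransitonException:

```java
import java.util.EnumSet;

// Minimal state machine illustrating the failure mode: a second
// FINISH_APPLICATION arriving after shutdown has begun is swallowed
// rather than rejected as an invalid transition.
class AppLifecycle {
    public enum State { RUNNING, FINISHING_CONTAINERS_WAIT, APPLICATION_RESOURCES_CLEANINGUP, FINISHED }
    public enum Event { FINISH_APPLICATION, CONTAINERS_FINISHED, RESOURCES_CLEANED }

    private static final EnumSet<State> ALREADY_FINISHING = EnumSet.of(
            State.FINISHING_CONTAINERS_WAIT,
            State.APPLICATION_RESOURCES_CLEANINGUP,
            State.FINISHED);

    private State state = State.RUNNING;

    public State handle(Event event) {
        if (event == Event.FINISH_APPLICATION && ALREADY_FINISHING.contains(state)) {
            return state; // duplicate finish request: ignore, don't throw
        }
        switch (event) {
            case FINISH_APPLICATION:  state = State.FINISHING_CONTAINERS_WAIT; break;
            case CONTAINERS_FINISHED: state = State.APPLICATION_RESOURCES_CLEANINGUP; break;
            case RESOURCES_CLEANED:   state = State.FINISHED; break;
        }
        return state;
    }

    public State state() { return state; }
}
```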
[jira] [Updated] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated YARN-376: Attachment: YARN-376_branch-0.23.txt
[jira] [Updated] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated YARN-376: Attachment: YARN-376-trunk.txt
[jira] [Created] (YARN-436) Document how to use DistributedShell yarn application
Hitesh Shah created YARN-436: Summary: Document how to use DistributedShell yarn application Key: YARN-436 URL: https://issues.apache.org/jira/browse/YARN-436 Project: Hadoop YARN Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah
[jira] [Created] (YARN-437) Update documentation of Writing Yarn applications to match current best practices
Hitesh Shah created YARN-437: Summary: Update documentation of Writing Yarn applications to match current best practices Key: YARN-437 URL: https://issues.apache.org/jira/browse/YARN-437 Project: Hadoop YARN Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah Should fix docs to point to usage of YarnClient and AMRMClient helper libs.
[jira] [Updated] (YARN-437) Update documentation of Writing Yarn applications to match current best practices
[ https://issues.apache.org/jira/browse/YARN-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated YARN-437: - Labels: usability (was: )
[jira] [Updated] (YARN-436) Document how to use DistributedShell yarn application
[ https://issues.apache.org/jira/browse/YARN-436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated YARN-436: - Labels: usability (was: )
[jira] [Commented] (YARN-345) Many InvalidStateTransitonException errors for ApplicationImpl in Node Manager
[ https://issues.apache.org/jira/browse/YARN-345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590171#comment-13590171 ] Hadoop QA commented on YARN-345: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571519/YARN-354v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/450//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/450//console This message is automatically generated. 
[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590175#comment-13590175 ] Hadoop QA commented on YARN-376: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571522/YARN-376-trunk.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/451//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/451//console This message is automatically generated. 
[jira] [Commented] (YARN-437) Update documentation of Writing Yarn applications to match current best practices
[ https://issues.apache.org/jira/browse/YARN-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590200#comment-13590200 ] Sandy Ryza commented on YARN-437: - YARN-417 is adding an async AMRMClient to simplify writing apps, so it might be good to incorporate that when it's finished.
[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590260#comment-13590260 ] Siddharth Seth commented on YARN-376: - The eclipse failure is not related. Committing this.
[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590269#comment-13590269 ] Hudson commented on YARN-376: - Integrated in Hadoop-trunk-Commit #3403 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3403/]) YARN-376. Fixes a bug which would prevent the NM knowing about completed containers and applications. Contributed by Jason Lowe. (Revision 1451473) Result = SUCCESS sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1451473 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java Fix For: 0.23.7, 2.0.4-beta