[jira] [Updated] (YARN-434) fix coverage org.apache.hadoop.fs.ftp

2013-02-28 Thread Aleksey Gorshkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Gorshkov updated YARN-434:
--

Description: 
fix coverage  org.apache.hadoop.fs.ftp
patch YARN-434-trunk.patch for trunk, branch-2, branch-0.23

  was:fix coverage  org.apache.hadoop.fs.ftp


 fix coverage  org.apache.hadoop.fs.ftp
 --

 Key: YARN-434
 URL: https://issues.apache.org/jira/browse/YARN-434
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-beta
 Environment: fix coverage  org.apache.hadoop.fs.ftp
Reporter: Aleksey Gorshkov
 Attachments: YARN-434-trunk.patch


 fix coverage  org.apache.hadoop.fs.ftp
 patch YARN-434-trunk.patch for trunk, branch-2, branch-0.23

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-434) fix coverage org.apache.hadoop.fs.ftp

2013-02-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589355#comment-13589355
 ] 

Hadoop QA commented on YARN-434:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12571375/YARN-434-trunk.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/446//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/446//console

This message is automatically generated.



[jira] [Commented] (YARN-426) Failure to download a public resource on a node prevents further downloads of the resource from that node

2013-02-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589425#comment-13589425
 ] 

Hudson commented on YARN-426:
-

Integrated in Hadoop-Yarn-trunk #141 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/141/])
YARN-426. Failure to download a public resource prevents further downloads 
(Jason Lowe via bobby) (Revision 1450807)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450807
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java


 Failure to download a public resource on a node prevents further downloads of 
 the resource from that node
 -

 Key: YARN-426
 URL: https://issues.apache.org/jira/browse/YARN-426
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.3-alpha, 0.23.6
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Fix For: 3.0.0, 0.23.7, 2.0.4-beta

 Attachments: YARN-426.patch


 If the NM encounters an error while downloading a public resource, it fails 
 to empty the list of request events corresponding to the resource request in 
 {{attempts}}.  If the same public resource is subsequently requested on that 
 node, {{PublicLocalizer.addResource}} will skip the download since it will 
 mistakenly believe a download of that resource is already in progress.  At 
 that point any container that requests the public resource will just hang in 
 the {{LOCALIZING}} state.
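The quoted description can be illustrated with a small sketch. This is a hypothetical simplification in plain Java, not the actual ResourceLocalizationService code; the class, method, and field names are illustrative only:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simplification of the bug pattern described above: a map
// of in-flight public resource downloads, keyed by resource. If the
// failure path forgets to clear the entry, every later request for the
// same resource is skipped as "already in progress" and its container
// hangs waiting in LOCALIZING.
public class PendingDownloads {
    private final Map<String, List<String>> attempts = new HashMap<>();

    // Returns true if the caller should start a new download.
    public boolean addResource(String resource, String requester) {
        List<String> waiters = attempts.get(resource);
        if (waiters != null) {
            waiters.add(requester); // a download is assumed in progress
            return false;
        }
        List<String> list = new ArrayList<>();
        list.add(requester);
        attempts.put(resource, list);
        return true;
    }

    // The fix: on failure the entry must be removed so that a later
    // request for the same resource can retry instead of hanging.
    public void downloadFailed(String resource) {
        attempts.remove(resource);
    }
}
```

Without the `downloadFailed` cleanup, `addResource` would return false forever for a resource whose only download attempt already failed.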



[jira] [Commented] (YARN-426) Failure to download a public resource on a node prevents further downloads of the resource from that node

2013-02-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589489#comment-13589489
 ] 

Hudson commented on YARN-426:
-

Integrated in Hadoop-Hdfs-0.23-Build #539 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/539/])
svn merge -c 1450807 FIXES: YARN-426. Failure to download a public resource 
prevents further downloads (Jason Lowe via bobby) (Revision 1450813)

 Result = UNSTABLE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450813
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
/hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java




[jira] [Commented] (YARN-426) Failure to download a public resource on a node prevents further downloads of the resource from that node

2013-02-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589501#comment-13589501
 ] 

Hudson commented on YARN-426:
-

Integrated in Hadoop-Hdfs-trunk #1330 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1330/])
YARN-426. Failure to download a public resource prevents further downloads 
(Jason Lowe via bobby) (Revision 1450807)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450807
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java




[jira] [Commented] (YARN-426) Failure to download a public resource on a node prevents further downloads of the resource from that node

2013-02-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589525#comment-13589525
 ] 

Hudson commented on YARN-426:
-

Integrated in Hadoop-Mapreduce-trunk #1358 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1358/])
YARN-426. Failure to download a public resource prevents further downloads 
(Jason Lowe via bobby) (Revision 1450807)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450807
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java




[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)

2013-02-28 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-149:


Summary: ResourceManager (RM) High-Availability (HA)  (was: ZK-based High 
Availability (HA) for ResourceManager (RM))

 ResourceManager (RM) High-Availability (HA)
 ---

 Key: YARN-149
 URL: https://issues.apache.org/jira/browse/YARN-149
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Harsh J
Assignee: Bikas Saha

 One of the goals presented on MAPREDUCE-279 was to have high availability. 
 One way that was discussed, per Mahadev/others on 
 https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:
 {quote}
 Am not sure, if you already know about the MR-279 branch (the next version of 
 MR framework). We've been trying to integrate ZK into the framework from the 
 beginning. As for now, we are just doing restart with ZK but soon we should 
 have a HA soln with ZK.
 {quote}
 There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is 
 meant to track HA via ZK.
 Currently there isn't a HA solution for RM, via ZK or otherwise.



[jira] [Commented] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers

2013-02-28 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589785#comment-13589785
 ] 

Sandy Ryza commented on YARN-417:
-

bq. I think if ContainerExitCodes needs to be added then it should be its own 
jira 
I will move the container exit codes to a separate JIRA.

bq. The helper function would have helped because containers contain 
information set by 2 entities...
The issue is that there is not a ton of information for a helper function to 
interpret.  From what I can tell, the framework only defines two special exit 
codes, and does not distinguish between OOMs and other kinds of container 
failures, or between killing a container because it was preempted or because 
the RM lost track of it.  These exit codes are platform independent, and any 
other exit codes can be both application and platform dependent, so the 
AMRMClientAsync wouldn't know how to interpret them.  As ContainerStatuses 
coming from the RM are only in the context of container completions, 
ContainerState provides no extra information. Additional information can 
sometimes be found in the diagnostics strings, but if the reasons that 
containers die are to be codified, I don't think it should be done by 
interpreting strings at the API level.

bq. Why is client.start() being called in init? client.stop() is being called 
in stop().
registerApplicationMaster needs to be called after setting up the RM proxy, 
which occurs in AMRMClient#start, but before starting the heartbeater, which 
occurs in AMRMClientAsync#start.  Another way to accomplish this would be to 
move the code in AMRMClientImpl#start to AMRMClientImpl#init, which also seems 
reasonable to me.  A third way would be to call registerApplicationMaster from 
AMRMClientAsync#start.

bq. I am wary of calling back on the heartbeat thread itself.
Will add a handling thread.

bq. Not waiting for the thread to join()? Why interrupt()? Thread needs to be 
stopped first so that it stops calling into the client. or else it can call 
into a client that has already stopped.
Good point. My reason was that I've seen this as convention other places in 
YARN (see NodeStatusUpdaterImpl, for example), and that it would allow stop to 
be called from onContainerCompleted without deadlock, but with the handling 
thread, the latter shouldn't be a problem, so I'll change it.
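The lifecycle and threading points above can be sketched abstractly. The following is an illustrative toy, not the real AMRMClientAsync: the heartbeat thread only enqueues responses and a separate handler thread invokes user callbacks, so a slow or re-entrant callback cannot stall heartbeating, and stop() halts the heartbeater before joining so nothing calls into a stopped client.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Illustrative sketch only (names and structure are assumptions, not
// the actual AMRMClientAsync implementation).
public class AsyncClientSketch {
    private final BlockingQueue<String> responses = new LinkedBlockingQueue<>();
    private final Thread heartbeater;
    private final Thread handler;
    private volatile boolean running = true;

    public AsyncClientSketch(Consumer<String> callback) {
        // Heartbeat thread: only produces responses, never runs callbacks.
        heartbeater = new Thread(() -> {
            while (running) {
                responses.add("allocate-response"); // stand-in for an RM allocate() call
                try { Thread.sleep(10); } catch (InterruptedException e) { return; }
            }
        });
        // Handler thread: drains the queue and invokes the user callback.
        handler = new Thread(() -> {
            while (running || !responses.isEmpty()) {
                try {
                    String r = responses.poll(10, TimeUnit.MILLISECONDS);
                    if (r != null) callback.accept(r);
                } catch (InterruptedException e) { return; }
            }
        });
    }

    public void start() { heartbeater.start(); handler.start(); }

    // Stop the heartbeater first so it stops calling into the client,
    // then join both threads (the ordering concern raised in the review).
    public void stop() throws InterruptedException {
        running = false;
        heartbeater.interrupt();
        heartbeater.join();
        handler.join();
    }
}
```

With this split, a callback that itself calls stop() blocks only the handler thread's queue draining, not the heartbeat loop.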


 Add a poller that allows the AM to receive notifications when it is assigned 
 containers
 ---

 Key: YARN-417
 URL: https://issues.apache.org/jira/browse/YARN-417
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, applications
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, 
 YARN-417-1.patch, YARN-417.patch, YarnAppMaster.java, 
 YarnAppMasterListener.java


 Writing AMs would be easier for some if they did not have to handle 
 heartbeating to the RM on their own.



[jira] [Assigned] (YARN-173) Page navigation support for container logs page

2013-02-28 Thread omkar vinit joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

omkar vinit joshi reassigned YARN-173:
--

Assignee: omkar vinit joshi

 Page navigation support for container logs page
 ---

 Key: YARN-173
 URL: https://issues.apache.org/jira/browse/YARN-173
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.0.2-alpha, 0.23.3
Reporter: Jason Lowe
Assignee: omkar vinit joshi
  Labels: usability

 ContainerLogsPage and AggregatedLogsBlock both support {{start}} and {{end}} 
 parameters which are a big help when trying to sift through a huge log.  
 However it's annoying to have to manually edit the URL to go through a giant 
 log page-by-page.  It would be very handy if the web page also provided page 
 navigation links so flipping to the next/previous/first/last chunk of log is 
 a simple click away.  Bonus points for providing a way to easily change the 
 size of the log chunk shown per page.
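The requested first/prev/next/last links reduce to a little offset arithmetic over the existing start/end parameters. A hypothetical helper (names are assumptions, not actual ContainerLogsPage code):

```java
// Given the total log length, the chunk size shown per page, and the
// current start offset, compute the start offsets that the navigation
// links would point at. Offsets are clamped so "prev" never goes below
// zero and "next"/"last" never run past the end of the log.
public class LogPager {
    public static long[] pageStarts(long logLength, long chunk, long start) {
        long first = 0;
        long prev = Math.max(0, start - chunk);
        long last = Math.max(0, logLength - chunk);
        long next = Math.min(start + chunk, last);
        return new long[] { first, prev, next, last };
    }
}
```

Each link would then just be the page URL with `start` set to the computed offset and `end` set to `start + chunk`.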



[jira] [Commented] (YARN-269) Resource Manager not logging the health_check_script result when taking it out

2013-02-28 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589875#comment-13589875
 ] 

Kihwal Lee commented on YARN-269:
-

Please rebase.

 Resource Manager not logging the health_check_script result when taking it out
 --

 Key: YARN-269
 URL: https://issues.apache.org/jira/browse/YARN-269
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 0.23.5
Reporter: Thomas Graves
Assignee: Jason Lowe
 Attachments: YARN-269.patch


 The Resource Manager not logging the health_check_script result when taking 
 it out. This was added to jobtracker in 1.x with MAPREDUCE-2451, we should do 
 the same thing for RM.



[jira] [Updated] (YARN-269) Resource Manager not logging the health_check_script result when taking it out

2013-02-28 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-269:


Attachment: YARN-269.patch

Thanks for the review, Kihwal.  Here's an updated patch.



[jira] [Commented] (YARN-269) Resource Manager not logging the health_check_script result when taking it out

2013-02-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589905#comment-13589905
 ] 

Hadoop QA commented on YARN-269:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12571462/YARN-269.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/447//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/447//console

This message is automatically generated.



[jira] [Commented] (YARN-237) Refreshing the RM page forgets how many rows I had in my Datatables

2013-02-28 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589976#comment-13589976
 ] 

Robert Joseph Evans commented on YARN-237:
--

The change looks more or less OK to me.  I am not thrilled about how we modify 
the data table's init string by looking for the first '{', but I think it is 
OK.  I just have a few concerns, and most of it deals with my lack of knowledge 
about jQuery and localStorage.  I know that localStorage is not supported on 
all browsers.  I also know that localStorage can throw a QUOTA_EXCEEDED 
exception.  What happens when we run into these situations?  Will the page stop 
working, or will jQuery degrade gracefully and simply not allow us to save the 
data?  What if the data stored in the key is not what we expect?  Will 
jQuery make the page unusable?  We currently have tables with the same name on 
different pages.  If they are not kept in sync there could be some issues with 
the data that is saved.

Which brings up another point: I am also a bit concerned about the key we are 
using as part of the localStorage.  The key is the id of the data table.  I 
would prefer it if we could somehow make it obvious that these values are for 
a data table, and not some other app's storage.
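Both concerns above (graceful degradation when storage fails and a clearly namespaced key) can be sketched abstractly. The real change would be jQuery/JavaScript against window.localStorage; the Java below is only an illustration, and the key prefix and quota model are assumptions:

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of the review's two points: namespace the storage key so
// a DataTables entry cannot collide with some other app's storage, and
// treat a failed save (quota exceeded, unsupported storage) as a no-op
// rather than letting it break the page.
public class TableStateStore {
    private static final String PREFIX = "yarn-datatable-"; // assumed namespace
    private final Map<String, String> storage = new HashMap<>();
    private final int quotaBytes; // toy stand-in for the browser quota

    public TableStateStore(int quotaBytes) { this.quotaBytes = quotaBytes; }

    // Degrade gracefully: return false on failure instead of propagating.
    public boolean save(String tableId, String state) {
        try {
            if (state.length() > quotaBytes) {
                throw new IllegalStateException("QUOTA_EXCEEDED");
            }
            storage.put(PREFIX + tableId, state);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public String load(String tableId) {
        return storage.get(PREFIX + tableId);
    }
}
```

A failed save simply means the table falls back to its default page size on the next load, which matches the "degrade gracefully" behavior asked about above.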

 Refreshing the RM page forgets how many rows I had in my Datatables
 ---

 Key: YARN-237
 URL: https://issues.apache.org/jira/browse/YARN-237
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 0.23.4, 3.0.0
Reporter: Ravi Prakash
Assignee: jian he
  Labels: usability
 Attachments: YARN-237.patch


 If I choose a 100 rows, and then refresh the page, DataTables goes back to 
 showing me 20 rows.
 This user preference should be stored in a cookie.



[jira] [Commented] (YARN-406) TestRackResolver fails when local network resolves host1 to a valid host

2013-02-28 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589985#comment-13589985
 ] 

Siddharth Seth commented on YARN-406:
-

+1. Committing this.

 TestRackResolver fails when local network resolves host1 to a valid host
 --

 Key: YARN-406
 URL: https://issues.apache.org/jira/browse/YARN-406
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Hitesh Shah
Assignee: Hitesh Shah
Priority: Minor
 Attachments: YARN-406.1.patch






[jira] [Commented] (YARN-406) TestRackResolver fails when local network resolves host1 to a valid host

2013-02-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1359#comment-1359
 ] 

Hudson commented on YARN-406:
-

Integrated in Hadoop-trunk-Commit #3401 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3401/])
YARN-406. Fix TestRackResolver to function in networks where host1 
resolves to a valid host. Contributed by Hitesh Shah. (Revision 1451391)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1451391
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestRackResolver.java


 TestRackResolver fails when local network resolves host1 to a valid host
 --

 Key: YARN-406
 URL: https://issues.apache.org/jira/browse/YARN-406
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Hitesh Shah
Assignee: Hitesh Shah
Priority: Minor
 Fix For: 2.0.4-beta

 Attachments: YARN-406.1.patch






[jira] [Updated] (YARN-196) Nodemanager if started before starting Resource manager is getting shutdown.But if both RM and NM are started and then after if RM is going down,NM is retrying for the RM.

2013-02-28 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-196:
---

Attachment: YARN-196.6.patch

 Nodemanager if started before starting Resource manager is getting 
 shutdown.But if both RM and NM are started and then after if RM is going 
 down,NM is retrying for the RM.
 ---

 Key: YARN-196
 URL: https://issues.apache.org/jira/browse/YARN-196
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.0-alpha
Reporter: Ramgopal N
Assignee: Xuan Gong
 Attachments: MAPREDUCE-3676.patch, YARN-196.1.patch, 
 YARN-196.2.patch, YARN-196.3.patch, YARN-196.4.patch, YARN-196.5.patch, 
 YARN-196.6.patch


 If NM is started before starting the RM ,NM is shutting down with the 
 following error
 {code}
 ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting 
 services org.apache.hadoop.yarn.server.nodemanager.NodeManager
 org.apache.avro.AvroRuntimeException: 
 java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149)
   at 
 org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242)
 Caused by: java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145)
   ... 3 more
 Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: 
 Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on 
 connection exception: java.net.ConnectException: Connection refused; For more 
 details see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131)
   at $Proxy23.registerNodeManager(Unknown Source)
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
   ... 5 more
 Caused by: java.net.ConnectException: Call From 
 HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection 
 exception: java.net.ConnectException: Connection refused; For more details 
 see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857)
   at org.apache.hadoop.ipc.Client.call(Client.java:1141)
   at org.apache.hadoop.ipc.Client.call(Client.java:1100)
   at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:128)
   ... 7 more
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:659)
   at 
 org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:469)
   at 
 org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:563)
   at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:211)
   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1247)
   at org.apache.hadoop.ipc.Client.call(Client.java:1117)
   ... 9 more
 2012-01-16 15:04:13,336 WARN org.apache.hadoop.yarn.event.AsyncDispatcher: 
 AsyncDispatcher thread interrupted
 java.lang.InterruptedException
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1899)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1934)
   at 
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:76)
   at java.lang.Thread.run(Thread.java:619)
 2012-01-16 15:04:13,337 INFO org.apache.hadoop.yarn.service.AbstractService: 
 Service:Dispatcher is stopped.
 2012-01-16 15:04:13,392 INFO org.mortbay.log: 

[jira] [Commented] (YARN-237) Refreshing the RM page forgets how many rows I had in my Datatables

2013-02-28 Thread jian he (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590009#comment-13590009
 ] 

jian he commented on YARN-237:
--

Tried with the latest versions of Firefox, Safari, and Chrome; it works. I will 
wrap the code in a check so that if the browser doesn't support localStorage it 
falls back to the previous behavior, and I'll catch the exception you mentioned. 
The key is already the id of the data table. This page has two data tables, one 
listing all nodes and the other listing all applications; that's why the state 
is saved separately for 'nodes' and 'applications'. Lacking knowledge about 
jQuery and localStorage, I'm not sure about other potential problems.
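The check described above could be sketched as plain browser-side JavaScript (a hypothetical sketch, not code from the actual patch; all function and key names are made up): probe localStorage with a real write, since some browsers expose the object but throw on access, and fall back to the previous cookie-based persistence otherwise.

```javascript
// Hypothetical sketch of the feature detection described above: use
// localStorage for DataTables state when available, otherwise fall
// back to the old cookie-based persistence.
function localStorageAvailable() {
  try {
    // Probe with a real write: some browsers (e.g. in private
    // browsing) expose localStorage but throw on setItem.
    var key = '__datatables_probe__';
    window.localStorage.setItem(key, '1');
    window.localStorage.removeItem(key);
    return true;
  } catch (e) {
    return false;
  }
}

function saveTableState(tableId, state) {
  if (localStorageAvailable()) {
    window.localStorage.setItem('dt_state_' + tableId,
        JSON.stringify(state));
  } else {
    // Fallback: persist the state in a cookie, as before.
    document.cookie = 'dt_state_' + tableId + '=' +
        encodeURIComponent(JSON.stringify(state));
  }
}
```

Keying the saved state by table id is what keeps the 'nodes' and 'applications' tables from clobbering each other's row-count preference.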

 Refreshing the RM page forgets how many rows I had in my Datatables
 ---

 Key: YARN-237
 URL: https://issues.apache.org/jira/browse/YARN-237
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 0.23.4, 3.0.0
Reporter: Ravi Prakash
Assignee: jian he
  Labels: usability
 Attachments: YARN-237.patch


 If I choose a 100 rows, and then refresh the page, DataTables goes back to 
 showing me 20 rows.
 This user preference should be stored in a cookie.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-24) Nodemanager fails to start if log aggregation enabled and namenode unavailable

2013-02-28 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590015#comment-13590015
 ] 

Alejandro Abdelnur commented on YARN-24:


If we take the NM to unhealthy, then what will reset it back to healthy? 
I think the simplest way to handle this is to remove the creation of the 
directory on init and do it at app start time.
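The deferral suggested above can be illustrated with a small language-neutral sketch (JavaScript here for brevity; the real change would live in the NM's Java log-aggregation service, and every name below is hypothetical): move the remote-directory creation out of service init, where an unreachable NameNode is fatal to NM startup, and into the first application start, where a failure affects only that application.

```javascript
// Hypothetical sketch of the lazy-init pattern: no remote filesystem
// access at service init, so the NM can start while the NameNode is
// down; the remote log dir is created on the first app start instead.
function makeLogAggregationService(remoteFs, remoteRootDir) {
  var remoteDirReady = false;
  return {
    init: function () {
      // Intentionally empty: nothing here depends on the NameNode.
    },
    onAppStart: function (appId) {
      if (!remoteDirReady) {
        // May throw if the NameNode is still unavailable; the caller
        // can then fail log aggregation for this app only.
        remoteFs.mkdirs(remoteRootDir);
        remoteDirReady = true;
      }
      return remoteRootDir + '/' + appId;
    }
  };
}
```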

 Nodemanager fails to start if log aggregation enabled and namenode unavailable
 --

 Key: YARN-24
 URL: https://issues.apache.org/jira/browse/YARN-24
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Jason Lowe
Assignee: Sandy Ryza
 Attachments: YARN-24.patch


 If log aggregation is enabled and the namenode is currently unavailable, the 
 nodemanager fails to startup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-196) Nodemanager if started before starting Resource manager is getting shutdown.But if both RM and NM are started and then after if RM is going down,NM is retrying for the RM

2013-02-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590025#comment-13590025
 ] 

Hadoop QA commented on YARN-196:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12571485/YARN-196.6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/448//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/448//console

This message is automatically generated.

 Nodemanager if started before starting Resource manager is getting 
 shutdown.But if both RM and NM are started and then after if RM is going 
 down,NM is retrying for the RM.
 ---

 Key: YARN-196
 URL: https://issues.apache.org/jira/browse/YARN-196
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.0-alpha
Reporter: Ramgopal N
Assignee: Xuan Gong
 Attachments: MAPREDUCE-3676.patch, YARN-196.1.patch, 
 YARN-196.2.patch, YARN-196.3.patch, YARN-196.4.patch, YARN-196.5.patch, 
 YARN-196.6.patch


 If the NM is started before the RM, the NM shuts down with the 
 following error
 {code}
 ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting 
 services org.apache.hadoop.yarn.server.nodemanager.NodeManager
 org.apache.avro.AvroRuntimeException: 
 java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149)
   at 
 org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242)
 Caused by: java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145)
   ... 3 more
 Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: 
 Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on 
 connection exception: java.net.ConnectException: Connection refused; For more 
 details see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131)
   at $Proxy23.registerNodeManager(Unknown Source)
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
   ... 5 more
 Caused by: java.net.ConnectException: Call From 
 HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection 
 exception: java.net.ConnectException: Connection refused; For more details 
 see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857)
   at org.apache.hadoop.ipc.Client.call(Client.java:1141)
   at org.apache.hadoop.ipc.Client.call(Client.java:1100)
   at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:128)
   ... 7 more
 Caused by: java.net.ConnectException: Connection 

[jira] [Created] (YARN-435) Make it easier to access cluster topology information in an AM

2013-02-28 Thread Hitesh Shah (JIRA)
Hitesh Shah created YARN-435:


 Summary: Make it easier to access cluster topology information in 
an AM
 Key: YARN-435
 URL: https://issues.apache.org/jira/browse/YARN-435
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah


ClientRMProtocol exposes a getClusterNodes API that provides a report on all 
nodes in the cluster, including their rack information. 

However, this requires the AM to establish a separate connection to 
the RM in addition to the one it already has for the AMRMProtocol. 



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI

2013-02-28 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated YARN-376:


Attachment: YARN-376.patch

Re-uploading the second patch for Jenkins.

 Apps that have completed can appear as RUNNING on the NM UI
 ---

 Key: YARN-376
 URL: https://issues.apache.org/jira/browse/YARN-376
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.3-alpha, 0.23.6
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: YARN-376.patch, YARN-376.patch, YARN-376.patch, 
 YARN-376.patch


 On a busy cluster we've noticed a growing number of applications appearing as 
 RUNNING on nodemanager web pages even though the applications have long since 
 finished.  Looking at the NM logs, it appears the RM never told the 
 nodemanager that the application had finished.  This is also reflected in a 
 jstack of the NM process, since many more log aggregation threads are running 
 than one would expect from the number of actively running applications.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI

2013-02-28 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590126#comment-13590126
 ] 

Siddharth Seth commented on YARN-376:
-

bq. Thanks for the review, Sidd. I originally had it update the heartbeat since 
the RMNode interface already knew about the heartbeat type and it's more 
efficient (no need to create an extra copy of the app list, and the write lock 
is grabbed only once instead of twice).
Good point. Re-uploading the old patch again for Jenkins.

 Apps that have completed can appear as RUNNING on the NM UI
 ---

 Key: YARN-376
 URL: https://issues.apache.org/jira/browse/YARN-376
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.3-alpha, 0.23.6
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: YARN-376.patch, YARN-376.patch, YARN-376.patch, 
 YARN-376.patch


 On a busy cluster we've noticed a growing number of applications appearing as 
 RUNNING on nodemanager web pages even though the applications have long since 
 finished.  Looking at the NM logs, it appears the RM never told the 
 nodemanager that the application had finished.  This is also reflected in a 
 jstack of the NM process, since many more log aggregation threads are running 
 than one would expect from the number of actively running applications.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI

2013-02-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590150#comment-13590150
 ] 

Hadoop QA commented on YARN-376:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12571514/YARN-376.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/449//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/449//console

This message is automatically generated.

 Apps that have completed can appear as RUNNING on the NM UI
 ---

 Key: YARN-376
 URL: https://issues.apache.org/jira/browse/YARN-376
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.3-alpha, 0.23.6
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: YARN-376.patch, YARN-376.patch, YARN-376.patch, 
 YARN-376.patch


 On a busy cluster we've noticed a growing number of applications appearing as 
 RUNNING on nodemanager web pages even though the applications have long since 
 finished.  Looking at the NM logs, it appears the RM never told the 
 nodemanager that the application had finished.  This is also reflected in a 
 jstack of the NM process, since many more log aggregation threads are running 
 than one would expect from the number of actively running applications.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-345) Many InvalidStateTransitonException errors for ApplicationImpl in Node Manager

2013-02-28 Thread Robert Parker (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Parker updated YARN-345:
---

Attachment: YARN-354v2.patch

 Many InvalidStateTransitonException errors for ApplicationImpl in Node Manager
 --

 Key: YARN-345
 URL: https://issues.apache.org/jira/browse/YARN-345
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.2-alpha, 2.0.1-alpha, 0.23.5
Reporter: Devaraj K
Assignee: Robert Parker
Priority: Critical
 Attachments: YARN-345.patch, YARN-354v2.patch


 {code:xml}
 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
 FINISH_APPLICATION at FINISHED
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:662)
 {code}
 {code:xml}
 2013-01-17 04:03:46,726 WARN 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
  Can't handle this event at current state
 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
 FINISH_APPLICATION at APPLICATION_RESOURCES_CLEANINGUP
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:662)
 {code}
 {code:xml}
 2013-01-17 00:01:11,006 WARN 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
  Can't handle this event at current state
 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
 FINISH_APPLICATION at FINISHING_CONTAINERS_WAIT
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:662)
 {code}
 {code:xml}
 2013-01-17 10:56:36,975 INFO 
 

[jira] [Updated] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI

2013-02-28 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated YARN-376:


Attachment: YARN-376_branch-0.23.txt

 Apps that have completed can appear as RUNNING on the NM UI
 ---

 Key: YARN-376
 URL: https://issues.apache.org/jira/browse/YARN-376
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.3-alpha, 0.23.6
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: YARN-376_branch-0.23.txt, YARN-376.patch, 
 YARN-376.patch, YARN-376.patch, YARN-376.patch, YARN-376-trunk.txt


 On a busy cluster we've noticed a growing number of applications appearing as 
 RUNNING on nodemanager web pages even though the applications have long since 
 finished.  Looking at the NM logs, it appears the RM never told the 
 nodemanager that the application had finished.  This is also reflected in a 
 jstack of the NM process, since many more log aggregation threads are running 
 than one would expect from the number of actively running applications.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI

2013-02-28 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated YARN-376:


Attachment: YARN-376-trunk.txt

 Apps that have completed can appear as RUNNING on the NM UI
 ---

 Key: YARN-376
 URL: https://issues.apache.org/jira/browse/YARN-376
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.3-alpha, 0.23.6
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: YARN-376_branch-0.23.txt, YARN-376.patch, 
 YARN-376.patch, YARN-376.patch, YARN-376.patch, YARN-376-trunk.txt


 On a busy cluster we've noticed a growing number of applications appearing as 
 RUNNING on nodemanager web pages even though the applications have long since 
 finished.  Looking at the NM logs, it appears the RM never told the 
 nodemanager that the application had finished.  This is also reflected in a 
 jstack of the NM process, since many more log aggregation threads are running 
 than one would expect from the number of actively running applications.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-436) Document how to use DistributedShell yarn application

2013-02-28 Thread Hitesh Shah (JIRA)
Hitesh Shah created YARN-436:


 Summary: Document how to use DistributedShell yarn application
 Key: YARN-436
 URL: https://issues.apache.org/jira/browse/YARN-436
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-437) Update documentation of Writing Yarn applications to match current best practices

2013-02-28 Thread Hitesh Shah (JIRA)
Hitesh Shah created YARN-437:


 Summary: Update documentation of Writing Yarn applications to 
match current best practices
 Key: YARN-437
 URL: https://issues.apache.org/jira/browse/YARN-437
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah


Should fix docs to point to usage of YarnClient and AMRMClient helper libs. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-437) Update documentation of Writing Yarn applications to match current best practices

2013-02-28 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-437:
-

Labels: usability  (was: )

 Update documentation of Writing Yarn applications to match current best 
 practices
 -

 Key: YARN-437
 URL: https://issues.apache.org/jira/browse/YARN-437
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
  Labels: usability

 Should fix docs to point to usage of YarnClient and AMRMClient helper libs. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-436) Document how to use DistributedShell yarn application

2013-02-28 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-436:
-

Labels: usability  (was: )

 Document how to use DistributedShell yarn application
 -

 Key: YARN-436
 URL: https://issues.apache.org/jira/browse/YARN-436
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
  Labels: usability



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-345) Many InvalidStateTransitonException errors for ApplicationImpl in Node Manager

2013-02-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590171#comment-13590171
 ] 

Hadoop QA commented on YARN-345:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12571519/YARN-354v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/450//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/450//console

This message is automatically generated.

 Many InvalidStateTransitonException errors for ApplicationImpl in Node Manager
 --

 Key: YARN-345
 URL: https://issues.apache.org/jira/browse/YARN-345
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.2-alpha, 2.0.1-alpha, 0.23.5
Reporter: Devaraj K
Assignee: Robert Parker
Priority: Critical
 Attachments: YARN-345.patch, YARN-354v2.patch


 {code:xml}
 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
 FINISH_APPLICATION at FINISHED
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:662)
 {code}
 {code:xml}
 2013-01-17 04:03:46,726 WARN 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
  Can't handle this event at current state
 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
 FINISH_APPLICATION at APPLICATION_RESOURCES_CLEANINGUP
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:662)
 {code}
 {code:xml}
 2013-01-17 

[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI

2013-02-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590175#comment-13590175
 ] 

Hadoop QA commented on YARN-376:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12571522/YARN-376-trunk.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/451//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/451//console

This message is automatically generated.

 Apps that have completed can appear as RUNNING on the NM UI
 ---

 Key: YARN-376
 URL: https://issues.apache.org/jira/browse/YARN-376
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.3-alpha, 0.23.6
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Attachments: YARN-376_branch-0.23.txt, YARN-376.patch, 
 YARN-376.patch, YARN-376.patch, YARN-376.patch, YARN-376-trunk.txt


 On a busy cluster we've noticed a growing number of applications appear as 
 RUNNING on nodemanager web pages, but the applications have long since 
 finished.  Looking at the NM logs, it appears the RM never told the 
 nodemanager that the application had finished.  This is also reflected in a 
 jstack of the NM process, since many more log aggregation threads are running 
 than one would expect from the number of actively running applications.
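The jstack-based diagnosis described above — comparing the number of log-aggregation threads against the number of actively running applications — can be scripted. The sketch below counts thread names in a saved thread dump; the thread-name pattern `LogAggregationService` is an assumption about the NM's thread naming and may differ across Hadoop versions.

```python
import re
from collections import Counter

def count_threads(dump_text):
    """Count thread names in a jstack-style dump, grouped by the base
    name with any trailing pool number stripped (e.g. 'Foo #3' -> 'Foo')."""
    names = re.findall(r'"([^"]+)"', dump_text)
    stripped = [re.sub(r'\s*#\d+$', '', n) for n in names]
    return Counter(stripped)

# Toy dump for illustration; a real one comes from `jstack <NM pid>`.
sample = '''
"LogAggregationService #1" daemon prio=10 tid=0x1 runnable
"LogAggregationService #2" daemon prio=10 tid=0x2 runnable
"AsyncDispatcher event handler" prio=10 tid=0x3 waiting
'''
# Far more log-aggregation threads than running apps suggests the leak.
print(count_threads(sample)['LogAggregationService'])  # prints 2
```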

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-437) Update documentation of Writing Yarn applications to match current best practices

2013-02-28 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590200#comment-13590200
 ] 

Sandy Ryza commented on YARN-437:
-

YARN-417 is adding an async AMRMClient to simplify writing apps, so it might be 
good to incorporate that when it's finished. 

 Update documentation of Writing Yarn applications to match current best 
 practices
 -

 Key: YARN-437
 URL: https://issues.apache.org/jira/browse/YARN-437
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
  Labels: usability

 Should fix docs to point to usage of YarnClient and AMRMClient helper libs. 



[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI

2013-02-28 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590260#comment-13590260
 ] 

Siddharth Seth commented on YARN-376:
-

The eclipse failure is not related. Committing this.



[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI

2013-02-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590269#comment-13590269
 ] 

Hudson commented on YARN-376:
-

Integrated in Hadoop-trunk-Commit #3403 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3403/])
YARN-376. Fixes a bug which would prevent the NM knowing about completed 
containers and applications. Contributed by Jason Lowe. (Revision 1451473)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1451473
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java


 Apps that have completed can appear as RUNNING on the NM UI
 ---

 Key: YARN-376
 URL: https://issues.apache.org/jira/browse/YARN-376
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.3-alpha, 0.23.6
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Fix For: 0.23.7, 2.0.4-beta

 Attachments: YARN-376_branch-0.23.txt, YARN-376.patch, 
 YARN-376.patch, YARN-376.patch, YARN-376.patch, YARN-376-trunk.txt


 On a busy cluster we've noticed a growing number of applications appear as 
 RUNNING on nodemanager web pages, but the applications have long since 
 finished.  Looking at the NM logs, it appears the RM never told the 
 nodemanager that the application had finished.  This is also reflected in a 
 jstack of the NM process, since many more log aggregation threads are running 
 than one would expect from the number of actively running applications.
