[jira] [Created] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy

2013-03-11 Thread Aleksey Gorshkov (JIRA)
Aleksey Gorshkov created YARN-465:
-

 Summary: fix coverage  org.apache.hadoop.yarn.server.webproxy
 Key: YARN-465
 URL: https://issues.apache.org/jira/browse/YARN-465
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
 Attachments: YARN-465-branch-0.23.patch, YARN-465-branch-2.patch, 
YARN-465-trunk.patch

fix coverage  org.apache.hadoop.yarn.server.webproxy

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy

2013-03-11 Thread Aleksey Gorshkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Gorshkov updated YARN-465:
--

Attachment: YARN-465-trunk.patch
YARN-465-branch-2.patch
YARN-465-branch-0.23.patch

 fix coverage  org.apache.hadoop.yarn.server.webproxy
 

 Key: YARN-465
 URL: https://issues.apache.org/jira/browse/YARN-465
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
 Attachments: YARN-465-branch-0.23.patch, YARN-465-branch-2.patch, 
 YARN-465-trunk.patch


 fix coverage  org.apache.hadoop.yarn.server.webproxy

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy

2013-03-11 Thread Aleksey Gorshkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Gorshkov updated YARN-465:
--

Description: 
fix coverage  org.apache.hadoop.yarn.server.webproxy


  was:fix coverage  org.apache.hadoop.yarn.server.webproxy


 fix coverage  org.apache.hadoop.yarn.server.webproxy
 

 Key: YARN-465
 URL: https://issues.apache.org/jira/browse/YARN-465
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
 Attachments: YARN-465-branch-0.23.patch, YARN-465-branch-2.patch, 
 YARN-465-trunk.patch


 fix coverage  org.apache.hadoop.yarn.server.webproxy

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy

2013-03-11 Thread Aleksey Gorshkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Gorshkov updated YARN-465:
--

Description: 
fix coverage  org.apache.hadoop.yarn.server.webproxy
patch YARN-465-trunk.patch for trunk
patch YARN-465-branch-2.patch for branch-2
patch YARN-465-branch-0.23.patch for branch-0.23

There is an issue in branch-0.23: the patch does not create the .keep file.
To fix it, run these commands:

mkdir yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy
touch yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep
 


  was:
fix coverage  org.apache.hadoop.yarn.server.webproxy



 fix coverage  org.apache.hadoop.yarn.server.webproxy
 

 Key: YARN-465
 URL: https://issues.apache.org/jira/browse/YARN-465
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
 Attachments: YARN-465-branch-0.23.patch, YARN-465-branch-2.patch, 
 YARN-465-trunk.patch


 fix coverage  org.apache.hadoop.yarn.server.webproxy
 patch YARN-465-trunk.patch for trunk
 patch YARN-465-branch-2.patch for branch-2
 patch YARN-465-branch-0.23.patch for branch-0.23
 There is an issue in branch-0.23: the patch does not create the .keep file.
 To fix it, run these commands:
 mkdir yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy
 touch yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy

2013-03-11 Thread Dennis Y (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598777#comment-13598777
 ] 

Dennis Y commented on YARN-465:
---

The patch contains the new file 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep,
which is ignored by the patch tool. This is a known issue/feature of the patch tool.
What should we do in such cases to pass the ASF verification process?
Is there a common practice for such cases? 

 fix coverage  org.apache.hadoop.yarn.server.webproxy
 

 Key: YARN-465
 URL: https://issues.apache.org/jira/browse/YARN-465
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
 Attachments: YARN-465-branch-0.23.patch, YARN-465-branch-2.patch, 
 YARN-465-trunk.patch


 fix coverage  org.apache.hadoop.yarn.server.webproxy
 patch YARN-465-trunk.patch for trunk
 patch YARN-465-branch-2.patch for branch-2
 patch YARN-465-branch-0.23.patch for branch-0.23
 There is an issue in branch-0.23: the patch does not create the .keep file.
 To fix it, run these commands:
 mkdir yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy
 touch yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-465) fix coverage org.apache.hadoop.yarn.server.webproxy

2013-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598784#comment-13598784
 ] 

Hadoop QA commented on YARN-465:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573083/YARN-465-trunk.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/494//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/494//console

This message is automatically generated.

 fix coverage  org.apache.hadoop.yarn.server.webproxy
 

 Key: YARN-465
 URL: https://issues.apache.org/jira/browse/YARN-465
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
 Attachments: YARN-465-branch-0.23.patch, YARN-465-branch-2.patch, 
 YARN-465-trunk.patch


 fix coverage  org.apache.hadoop.yarn.server.webproxy
 patch YARN-465-trunk.patch for trunk
 patch YARN-465-branch-2.patch for branch-2
 patch YARN-465-branch-0.23.patch for branch-0.23
 There is an issue in branch-0.23: the patch does not create the .keep file.
 To fix it, run these commands:
 mkdir yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy
 touch yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-460) CS user left in list of active users for the queue even when application finished

2013-03-11 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598814#comment-13598814
 ] 

Thomas Graves commented on YARN-460:


Yeah, I think that is a good idea.  Perhaps something in the AMResponse. It 
might be nice if we could include a description of the problem as well; I'll 
look at it more to see what I can come up with.

 CS user left in list of active users for the queue even when application 
 finished
 -

 Key: YARN-460
 URL: https://issues.apache.org/jira/browse/YARN-460
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Thomas Graves
Assignee: Thomas Graves
Priority: Critical
 Attachments: YARN-460-branch-0.23.patch, YARN-460.patch, 
 YARN-460.patch


 We have seen a user get left in the queue's list of active users even though 
 the application was removed. This can cause everyone else in the queue to get 
 fewer resources when using the minimum user limit percent config.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-196) Nodemanager if started before starting Resource manager is getting shutdown. But if both RM and NM are started and then after if RM is going down, NM is retrying for the RM.

2013-03-11 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-196:
---

Attachment: YARN-196.11.patch

 Nodemanager if started before starting Resource manager is getting 
 shutdown. But if both RM and NM are started and then after if RM is going 
 down, NM is retrying for the RM.
 ---

 Key: YARN-196
 URL: https://issues.apache.org/jira/browse/YARN-196
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.0-alpha
Reporter: Ramgopal N
Assignee: Xuan Gong
 Attachments: MAPREDUCE-3676.patch, YARN-196.10.patch, 
 YARN-196.11.patch, YARN-196.1.patch, YARN-196.2.patch, YARN-196.3.patch, 
 YARN-196.4.patch, YARN-196.5.patch, YARN-196.6.patch, YARN-196.7.patch, 
 YARN-196.8.patch, YARN-196.9.patch


 If NM is started before starting the RM ,NM is shutting down with the 
 following error
 {code}
 ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting 
 services org.apache.hadoop.yarn.server.nodemanager.NodeManager
 org.apache.avro.AvroRuntimeException: 
 java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149)
   at 
 org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242)
 Caused by: java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145)
   ... 3 more
 Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: 
 Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on 
 connection exception: java.net.ConnectException: Connection refused; For more 
 details see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131)
   at $Proxy23.registerNodeManager(Unknown Source)
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
   ... 5 more
 Caused by: java.net.ConnectException: Call From 
 HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection 
 exception: java.net.ConnectException: Connection refused; For more details 
 see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857)
   at org.apache.hadoop.ipc.Client.call(Client.java:1141)
   at org.apache.hadoop.ipc.Client.call(Client.java:1100)
   at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:128)
   ... 7 more
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:659)
   at 
 org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:469)
   at 
 org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:563)
   at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:211)
   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1247)
   at org.apache.hadoop.ipc.Client.call(Client.java:1117)
   ... 9 more
 2012-01-16 15:04:13,336 WARN org.apache.hadoop.yarn.event.AsyncDispatcher: 
 AsyncDispatcher thread interrupted
 java.lang.InterruptedException
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1899)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1934)
   at 
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:76)
   at java.lang.Thread.run(Thread.java:619)
 2012-01-16 15:04:13,337 INFO 

[jira] [Commented] (YARN-196) Nodemanager if started before starting Resource manager is getting shutdown. But if both RM and NM are started and then after if RM is going down, NM is retrying for the RM

2013-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599024#comment-13599024
 ] 

Hadoop QA commented on YARN-196:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573120/YARN-196.11.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/495//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/495//console

This message is automatically generated.

 Nodemanager if started before starting Resource manager is getting 
 shutdown. But if both RM and NM are started and then after if RM is going 
 down, NM is retrying for the RM.
 ---

 Key: YARN-196
 URL: https://issues.apache.org/jira/browse/YARN-196
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.0-alpha
Reporter: Ramgopal N
Assignee: Xuan Gong
 Attachments: MAPREDUCE-3676.patch, YARN-196.10.patch, 
 YARN-196.11.patch, YARN-196.1.patch, YARN-196.2.patch, YARN-196.3.patch, 
 YARN-196.4.patch, YARN-196.5.patch, YARN-196.6.patch, YARN-196.7.patch, 
 YARN-196.8.patch, YARN-196.9.patch


 If NM is started before starting the RM ,NM is shutting down with the 
 following error
 {code}
 ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting 
 services org.apache.hadoop.yarn.server.nodemanager.NodeManager
 org.apache.avro.AvroRuntimeException: 
 java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149)
   at 
 org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242)
 Caused by: java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145)
   ... 3 more
 Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: 
 Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on 
 connection exception: java.net.ConnectException: Connection refused; For more 
 details see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131)
   at $Proxy23.registerNodeManager(Unknown Source)
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
   ... 5 more
 Caused by: java.net.ConnectException: Call From 
 HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection 
 exception: java.net.ConnectException: Connection refused; For more details 
 see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857)
   at org.apache.hadoop.ipc.Client.call(Client.java:1141)
   at org.apache.hadoop.ipc.Client.call(Client.java:1100)
   at 
 

[jira] [Updated] (YARN-464) RM should inform AM how many AM retries it has

2013-03-11 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-464:


Summary: RM should inform AM how many AM retries it has  (was: RM should 
inform AM how many AM retires it has)

 RM should inform AM how many AM retries it has
 --

 Key: YARN-464
 URL: https://issues.apache.org/jira/browse/YARN-464
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 Currently, MRAppMaster compares the ApplicationAttemptId with the max retries 
 read from the configuration file to determine isLastAMRetry. However, RM 
 should inform AM how many AM retries it has.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-11 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599091#comment-13599091
 ] 

Xuan Gong commented on YARN-71:
---

The approach I am using:
1. Rename the folder (local files do not always belong to the NM user, so rename 
the parent folder instead of the individual files).
2. Find the files' or dirs' ownership and schedule the deletion, then delete 
them in the background (see the sketch below).
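
A minimal sketch of that rename-then-delete pattern, assuming plain java.io.File 
operations (class and method names here are illustrative, not the YARN-71 patch):

{code}
// Illustrative sketch only -- not the actual YARN-71 patch.
// Step 1: rename the parent local dir so it cannot be reused;
// step 2: hand the renamed dir to a background deleter.
import java.io.File;
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LocalDirCleanup {
  private final ExecutorService deleter = Executors.newSingleThreadExecutor();

  public void cleanupOnRestart(File localDir) throws IOException {
    // The individual files may not belong to the NM user, but the
    // parent dir does, so renaming the parent succeeds.
    final File toDelete = new File(localDir.getParent(),
        localDir.getName() + ".to-be-deleted");
    if (!localDir.renameTo(toDelete)) {
      throw new IOException("could not rename " + localDir);
    }
    // Schedule the actual deletion to happen in the background.
    deleter.submit(new Runnable() {
      public void run() {
        deleteRecursively(toDelete);
      }
    });
  }

  private static void deleteRecursively(File f) {
    File[] children = f.listFiles();
    if (children != null) {
      for (File c : children) {
        deleteRecursively(c);
      }
    }
    f.delete();
  }
}
{code}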

 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, 
 YARN.71.4.patch


 We have to make sure that NodeManagers clean up their local files on restart.
 It may already be working like that, in which case we should have tests 
 validating this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-237) Refreshing the RM page forgets how many rows I had in my Datatables

2013-03-11 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599236#comment-13599236
 ] 

Robert Joseph Evans commented on YARN-237:
--

Sorry to keep adding more things in here, but JQuery.java is a generic part of 
YARN.  In theory, it can be used by others, not just Map/Reduce and YARN.  
Encoding in a special case for the tasks table is not acceptable.  You should 
be able to get the same functionality by switching to the DATATABLES_SELECTOR 
for those tables.

We also need to address the findbugs issues. You are dereferencing type to 
create the ID of the tasks, and type could be null, although in practice it 
should never be.  Also there is no need to call toString() on type when using + 
with another string.  This may fix the findbugs issues too, although it would 
not be super clean.
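
As a hedged illustration of the two findbugs points above (the variable name 
type and the id format are assumptions, not the actual YARN-237 code):

{code}
// Sketch only: "type" and the id format are placeholders,
// not the actual YARN-237 code.
public class TaskTableId {
  static String taskTableId(String type) {
    // findbugs point 1: type could be null, so guard the dereference.
    // findbugs point 2: concatenating with + already applies
    // String.valueOf(type), so an explicit toString() is unnecessary.
    return (type == null) ? "tasks" : type + ".tasks";
  }

  public static void main(String[] args) {
    System.out.println(taskTableId(null));  // prints: tasks
    System.out.println(taskTableId("map")); // prints: map.tasks
  }
}
{code}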

 Refreshing the RM page forgets how many rows I had in my Datatables
 ---

 Key: YARN-237
 URL: https://issues.apache.org/jira/browse/YARN-237
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 0.23.4, 3.0.0
Reporter: Ravi Prakash
Assignee: jian he
  Labels: usability
 Attachments: YARN-237.patch, YARN-237.v2.patch, YARN-237.v3.patch


 If I choose a 100 rows, and then refresh the page, DataTables goes back to 
 showing me 20 rows.
 This user preference should be stored in a cookie.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-378) ApplicationMaster retry times should be set by Client

2013-03-11 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599240#comment-13599240
 ] 

Robert Joseph Evans commented on YARN-378:
--

I am perfectly fine with that.  It seems like more overhead, but I am fine 
either way.

 ApplicationMaster retry times should be set by Client
 -

 Key: YARN-378
 URL: https://issues.apache.org/jira/browse/YARN-378
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
 Environment: suse
Reporter: xieguiming
Assignee: Zhijie Shen
  Labels: usability
 Attachments: YARN-378_1.patch, YARN-378_2.patch, YARN-378_3.patch, 
 YARN-378_4.patch


 We should support different ApplicationMaster retry times for different 
 clients or users. That is to say, yarn.resourcemanager.am.max-retries should 
 be settable by the client. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-378) ApplicationMaster retry times should be set by Client

2013-03-11 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599297#comment-13599297
 ] 

Zhijie Shen commented on YARN-378:
--

Sure, I can merge the two patches together and submit one here.

 ApplicationMaster retry times should be set by Client
 -

 Key: YARN-378
 URL: https://issues.apache.org/jira/browse/YARN-378
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
 Environment: suse
Reporter: xieguiming
Assignee: Zhijie Shen
  Labels: usability
 Attachments: YARN-378_1.patch, YARN-378_2.patch, YARN-378_3.patch, 
 YARN-378_4.patch


 We should support different ApplicationMaster retry times for different 
 clients or users. That is to say, yarn.resourcemanager.am.max-retries should 
 be settable by the client. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-464) RM should inform AM how many AM retries it has

2013-03-11 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599299#comment-13599299
 ] 

Zhijie Shen commented on YARN-464:
--

Will be fixed together in YARN-378.

 RM should inform AM how many AM retries it has
 --

 Key: YARN-464
 URL: https://issues.apache.org/jira/browse/YARN-464
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 Currently, MRAppMaster compares the ApplicationAttemptId with the max retries 
 read from the configuration file to determine isLastAMRetry. However, RM 
 should inform AM how many AM retries it has.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (YARN-464) RM should inform AM how many AM retries it has

2013-03-11 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen resolved YARN-464.
--

Resolution: Won't Fix

 RM should inform AM how many AM retries it has
 --

 Key: YARN-464
 URL: https://issues.apache.org/jira/browse/YARN-464
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 Currently, MRAppMaster compares the ApplicationAttemptId with the max retries 
 read from the configuration file to determine isLastAMRetry. However, RM 
 should inform AM how many AM retries it has.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (YARN-464) RM should inform AM how many AM retries it has

2013-03-11 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reopened YARN-464:
--


 RM should inform AM how many AM retries it has
 --

 Key: YARN-464
 URL: https://issues.apache.org/jira/browse/YARN-464
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 Currently, MRAppMaster compares the ApplicationAttemptId with the max retries 
 read from the configuration file to determine isLastAMRetry. However, RM 
 should inform AM how many AM retries it has.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (YARN-464) RM should inform AM how many AM retries it has

2013-03-11 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-464.
--

Resolution: Duplicate

 RM should inform AM how many AM retries it has
 --

 Key: YARN-464
 URL: https://issues.apache.org/jira/browse/YARN-464
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 Currently, MRAppMaster compares the ApplicationAttemptId with the max retries 
 read from the configuration file to determine isLastAMRetry. However, RM 
 should inform AM how many AM retries it has.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers

2013-03-11 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-417:


Attachment: YARN-417-4.patch

The updated patch uses an interrupt to fix the deadlock that was causing the 
test to fail.
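
For context, a minimal sketch of the interrupt-based shutdown pattern mentioned 
above (the loop and names are assumptions, not the actual YARN-417 heartbeater):

{code}
// Illustrative only: a heartbeat thread blocked inside the loop is
// woken up by interrupt() at stop() time instead of deadlocking.
public class HeartbeatLoop {
  private final Thread heartbeater = new Thread(new Runnable() {
    public void run() {
      try {
        while (!Thread.currentThread().isInterrupted()) {
          Thread.sleep(1000); // placeholder for the allocate() heartbeat
        }
      } catch (InterruptedException e) {
        // Interrupted during stop(): fall through and exit cleanly.
      }
    }
  });

  public void start() {
    heartbeater.start();
  }

  public void stop() throws InterruptedException {
    heartbeater.interrupt(); // unblocks a sleeping/waiting heartbeater
    heartbeater.join();
  }
}
{code}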

 Add a poller that allows the AM to receive notifications when it is assigned 
 containers
 ---

 Key: YARN-417
 URL: https://issues.apache.org/jira/browse/YARN-417
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, applications
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, 
 YARN-417-1.patch, YARN-417-2.patch, YARN-417-3.patch, YARN-417-4.patch, 
 YARN-417-4.patch, YARN-417.patch, YarnAppMaster.java, 
 YarnAppMasterListener.java


 Writing AMs would be easier for some if they did not have to handle 
 heartbeating to the RM on their own.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers

2013-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599380#comment-13599380
 ] 

Hadoop QA commented on YARN-417:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573184/YARN-417-4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/496//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/496//console

This message is automatically generated.

 Add a poller that allows the AM to receive notifications when it is assigned 
 containers
 ---

 Key: YARN-417
 URL: https://issues.apache.org/jira/browse/YARN-417
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, applications
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, 
 YARN-417-1.patch, YARN-417-2.patch, YARN-417-3.patch, YARN-417-4.patch, 
 YARN-417-4.patch, YARN-417.patch, YarnAppMaster.java, 
 YarnAppMasterListener.java


 Writing AMs would be easier for some if they did not have to handle 
 heartbeating to the RM on their own.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-03-11 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599383#comment-13599383
 ] 

Luke Lu commented on YARN-18:
-

Preliminary comments:
# We should use net.topology.nodegroup.aware instead of 
net.topology.with.nodegroup for configuration, to be consistent with the 
branch-1 code.
# We should probably use yarn.resourcemanager.topology.aware.factory.impl to 
specify the factory class for topology-aware objects.
# The topology-aware objects should include ScheduledRequests and 
AppSchedulingInfo, which are currently not managed by the factory.
# The pluggable interface/class should be public with the 
InterfaceAudience.Private annotation (see the sketch below).
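
To make points 2 and 4 concrete, a hedged sketch of a pluggable factory wired 
through that key; everything beyond the key name and the annotation is an 
assumption, not the YARN-18 patch:

{code}
// Sketch only: a public pluggable type marked InterfaceAudience.Private,
// loaded via the configuration key proposed above.
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ReflectionUtils;

@InterfaceAudience.Private
public abstract class TopologyAwareFactory {
  public static final String CONF_KEY =
      "yarn.resourcemanager.topology.aware.factory.impl";

  // Factory methods for the topology-aware objects (SchedulerNode,
  // ScheduledRequests, AppSchedulingInfo, ...) would be declared here.

  public static TopologyAwareFactory load(Configuration conf,
      Class<? extends TopologyAwareFactory> defaultImpl) {
    Class<? extends TopologyAwareFactory> clazz =
        conf.getClass(CONF_KEY, defaultImpl, TopologyAwareFactory.class);
    return ReflectionUtils.newInstance(clazz, conf);
  }
}
{code}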

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, 
 YARN-18-v3.2.patch, YARN-18-v3.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality which were updated to give preference 
 to running a container at other locality levels besides node-local and 
 rack-local (like nodegroup-local). This proposes to make these data 
 structures/algorithms pluggable, e.g. SchedulerNode, RMNodeImpl, etc. The 
 inner class ScheduledRequests was made a package-level class so it would be 
 easier to create a subclass, ScheduledRequestsWithNodeGroup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-196) Nodemanager if started before starting Resource manager is getting shutdown. But if both RM and NM are started and then after if RM is going down, NM is retrying for the RM.

2013-03-11 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-196:
---

Attachment: YARN-196.12.patch

 Nodemanager if started before starting Resource manager is getting 
 shutdown. But if both RM and NM are started and then after if RM is going 
 down, NM is retrying for the RM.
 ---

 Key: YARN-196
 URL: https://issues.apache.org/jira/browse/YARN-196
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.0-alpha
Reporter: Ramgopal N
Assignee: Xuan Gong
 Attachments: MAPREDUCE-3676.patch, YARN-196.10.patch, 
 YARN-196.11.patch, YARN-196.12.patch, YARN-196.1.patch, YARN-196.2.patch, 
 YARN-196.3.patch, YARN-196.4.patch, YARN-196.5.patch, YARN-196.6.patch, 
 YARN-196.7.patch, YARN-196.8.patch, YARN-196.9.patch


 If NM is started before starting the RM ,NM is shutting down with the 
 following error
 {code}
 ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting 
 services org.apache.hadoop.yarn.server.nodemanager.NodeManager
 org.apache.avro.AvroRuntimeException: 
 java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149)
   at 
 org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242)
 Caused by: java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145)
   ... 3 more
 Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: 
 Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on 
 connection exception: java.net.ConnectException: Connection refused; For more 
 details see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131)
   at $Proxy23.registerNodeManager(Unknown Source)
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
   ... 5 more
 Caused by: java.net.ConnectException: Call From 
 HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection 
 exception: java.net.ConnectException: Connection refused; For more details 
 see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857)
   at org.apache.hadoop.ipc.Client.call(Client.java:1141)
   at org.apache.hadoop.ipc.Client.call(Client.java:1100)
   at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:128)
   ... 7 more
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:659)
   at 
 org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:469)
   at 
 org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:563)
   at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:211)
   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1247)
   at org.apache.hadoop.ipc.Client.call(Client.java:1117)
   ... 9 more
 2012-01-16 15:04:13,336 WARN org.apache.hadoop.yarn.event.AsyncDispatcher: 
 AsyncDispatcher thread interrupted
 java.lang.InterruptedException
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1899)
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1934)
   at 
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:76)
   at java.lang.Thread.run(Thread.java:619)
 2012-01-16 15:04:13,337 INFO 

[jira] [Updated] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-11 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-71:
--

Attachment: YARN-71.5.patch

 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, 
 YARN.71.4.patch, YARN-71.5.patch


 We have to make sure that NodeManagers clean up their local files on restart.
 It may already be working like that, in which case we should have tests 
 validating this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599423#comment-13599423
 ] 

Hadoop QA commented on YARN-71:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573197/YARN-71.5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/498//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/498//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/498//console

This message is automatically generated.

 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, 
 YARN.71.4.patch, YARN-71.5.patch


 We have to make sure that NodeManagers clean up their local files on restart.
 It may already be working like that, in which case we should have tests 
 validating this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-196) Nodemanager if started before starting Resource manager is getting shutdown. But if both RM and NM are started and then after if RM is going down, NM is retrying for the RM

2013-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599425#comment-13599425
 ] 

Hadoop QA commented on YARN-196:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573196/YARN-196.12.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/497//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/497//console

This message is automatically generated.

 Nodemanager if started before starting Resource manager is getting 
 shutdown. But if both RM and NM are started and then after if RM is going 
 down, NM is retrying for the RM.
 ---

 Key: YARN-196
 URL: https://issues.apache.org/jira/browse/YARN-196
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.0-alpha
Reporter: Ramgopal N
Assignee: Xuan Gong
 Attachments: MAPREDUCE-3676.patch, YARN-196.10.patch, 
 YARN-196.11.patch, YARN-196.12.patch, YARN-196.1.patch, YARN-196.2.patch, 
 YARN-196.3.patch, YARN-196.4.patch, YARN-196.5.patch, YARN-196.6.patch, 
 YARN-196.7.patch, YARN-196.8.patch, YARN-196.9.patch


 If NM is started before starting the RM ,NM is shutting down with the 
 following error
 {code}
 ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting 
 services org.apache.hadoop.yarn.server.nodemanager.NodeManager
 org.apache.avro.AvroRuntimeException: 
 java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149)
   at 
 org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242)
 Caused by: java.lang.reflect.UndeclaredThrowableException
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145)
   ... 3 more
 Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: 
 Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on 
 connection exception: java.net.ConnectException: Connection refused; For more 
 details see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131)
   at $Proxy23.registerNodeManager(Unknown Source)
   at 
 org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
   ... 5 more
 Caused by: java.net.ConnectException: Call From 
 HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection 
 exception: java.net.ConnectException: Connection refused; For more details 
 see:  http://wiki.apache.org/hadoop/ConnectionRefused
   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857)
   at org.apache.hadoop.ipc.Client.call(Client.java:1141)
   at org.apache.hadoop.ipc.Client.call(Client.java:1100)
   at 
 

[jira] [Updated] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-11 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-71:
--

Attachment: YARN-71.6.patch

 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, 
 YARN.71.4.patch, YARN-71.5.patch, YARN-71.6.patch


 We have to make sure that NodeManagers clean up their local files on restart.
 It may already be working like that, in which case we should have tests 
 validating this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599457#comment-13599457
 ] 

Hadoop QA commented on YARN-71:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573201/YARN-71.6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/499//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/499//console

This message is automatically generated.

 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, 
 YARN.71.4.patch, YARN-71.5.patch, YARN-71.6.patch


 We have to make sure that NodeManagers clean up their local files on restart.
 It may already be working like that, in which case we should have tests 
 validating this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-466) Slave hostname mismatches in ResourceManager/Scheduler

2013-03-11 Thread Roger Hoover (JIRA)
Roger Hoover created YARN-466:
-

 Summary: Slave hostname mismatches in ResourceManager/Scheduler
 Key: YARN-466
 URL: https://issues.apache.org/jira/browse/YARN-466
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, scheduler
Reporter: Roger Hoover


The problem is that the ResourceManager learns the hostname of a slave node 
when the NodeManager registers itself, and it seems the NodeManager is getting 
the hostname by asking the OS.  When a job is submitted, I think the 
ApplicationMaster learns the hostname by doing a reverse DNS lookup based on 
the slaves file.

Therefore, the ApplicationMaster submits requests for containers using the 
fully qualified domain name (node1.foo.com) but the scheduler uses the OS 
hostname (node1) when checking to see if any requests are node-local.  The 
result is that node-local requests are never found using this method of 
searching for node-local requests:

ResourceRequest request = application.getResourceRequest(priority, 
node.getHostName());

I think it's unfriendly to ask users to make sure they configure hostnames to 
match fully qualified domain names. There should be a way for the 
ApplicationMaster and NodeManager to agree on the hostname.
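
For illustration, a small standalone check of the two hostname sources that 
disagree here (an assumed sketch, not YARN code):

{code}
// Prints the OS-reported hostname next to the DNS canonical name; on a
// misconfigured slave these differ (node1 vs node1.foo.com), which is
// exactly the mismatch described above.
import java.net.InetAddress;

public class HostnameCheck {
  public static void main(String[] args) throws Exception {
    InetAddress local = InetAddress.getLocalHost();
    System.out.println("OS hostname: " + local.getHostName());
    System.out.println("DNS FQDN:    " + local.getCanonicalHostName());
  }
}
{code}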

Steps to Reproduce:
1) Configure the OS hostname on slaves to differ from the fully qualified 
domain name.  For example, if the FQDN for the slave is node1.foo.com, set 
the hostname on the node to be just node1.
2) On submitting a job, observe that the AM submits resource requests using the 
FQDN (e.g. node1.foo.com).  You can add logging to the allocate() method of 
whatever scheduler you're using:

for (ResourceRequest req : ask) {
  LOG.debug(String.format("Request %s for %d containers on %s", req, 
      req.getNumContainers(), req.getHostName()));
}
3) Observe that when the scheduler checks for node locality (in the handle() 
method) using FiCaSchedulerNode.getHostName(), the hostname it uses is the 
one set in the host OS (e.g. node1).  NOTE: if you're using FifoScheduler, 
this bug needs to be fixed first 
(https://issues.apache.org/jira/browse/YARN-412).  


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-24) Nodemanager fails to start if log aggregation enabled and namenode unavailable

2013-03-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599490#comment-13599490
 ] 

Alejandro Abdelnur commented on YARN-24:


The approach looks good. Would it be possible to have a testcase? Have you 
tested it in a cluster, forcing the non-creation of the root log dir?

 Nodemanager fails to start if log aggregation enabled and namenode unavailable
 --

 Key: YARN-24
 URL: https://issues.apache.org/jira/browse/YARN-24
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Jason Lowe
Assignee: Sandy Ryza
 Attachments: YARN-24-1.patch, YARN-24.patch


 If log aggregation is enabled and the namenode is currently unavailable, the 
 nodemanager fails to start up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-459) DefaultContainerExecutor doesn't log stderr from container launch

2013-03-11 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-459:


Description: 
The DefaultContainerExecutor does not log stderr or add it to the diagnostics 
message if something fails during the container launch.



  was:
The DefaultContainerExecutor does log stderr or add it to the diagnostics 
message it something fails during the container launch.




 DefaultContainerExecutor doesn't log stderr from container launch
 -

 Key: YARN-459
 URL: https://issues.apache.org/jira/browse/YARN-459
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.3-alpha, 0.23.7
Reporter: Thomas Graves

 The DefaultContainerExecutor does not log stderr or add it to the diagnostics 
 message if something fails during the container launch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-378) ApplicationMaster retry times should be set by Client

2013-03-11 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-378:
-

Attachment: YARN-378_5.patch

In the newest patch, I've addressed @Robert's comments: numMaxRetries is now set 
when composing the ApplicationSubmissionContext, and the property name has been 
added to mapred-default.xml.

In addition, I've modified AMRMProtocol to let the AM get numMaxRetries from the 
RM. The number is obtained when the AM registers with the RM. Previously, 
isLastAMRetry was computed in init(). Now, since the number is not available 
until the AM is registered, I postponed the computation until it needs to be used.

One place in the patch that I'm not quite happy with is that I had to add an 
additional member variable and getter for ContainerAllocator to allow the AM to 
read numMaxRetries. Please have a look at the patch. Your comments are appreciated.
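
As an aside, a minimal sketch of how a client might carry the retry count in the 
submission context.  The setter and configuration names below are assumptions for 
illustration; they are not necessarily the ones used in this patch.

// Illustrative sketch: read a per-application retry count from the client config
// and place it in the ApplicationSubmissionContext at submission time.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.util.Records;

public class SubmitWithRetries {
  public static ApplicationSubmissionContext buildContext(Configuration conf) {
    ApplicationSubmissionContext ctx =
        Records.newRecord(ApplicationSubmissionContext.class);
    // Hypothetical client-side property; the RM-wide
    // yarn.resourcemanager.am.max-retries would still act as an upper bound.
    int numMaxRetries = conf.getInt("mapreduce.am.max-attempts", 2);
    ctx.setMaxAppAttempts(numMaxRetries);  // assumed setter name
    return ctx;
  }
}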

 ApplicationMaster retry times should be set by Client
 -

 Key: YARN-378
 URL: https://issues.apache.org/jira/browse/YARN-378
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
 Environment: suse
Reporter: xieguiming
Assignee: Zhijie Shen
  Labels: usability
 Attachments: YARN-378_1.patch, YARN-378_2.patch, YARN-378_3.patch, 
 YARN-378_4.patch, YARN-378_5.patch


 We should support different clients or users having different 
 ApplicationMaster retry times. That is to say, 
 yarn.resourcemanager.am.max-retries should be settable by the client. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-467) Jobs fail during resource localization when directories in file cache reaches to unix directory limit for public cache

2013-03-11 Thread omkar vinit joshi (JIRA)
omkar vinit joshi created YARN-467:
--

 Summary: Jobs fail during resource localization when directories 
in file cache reaches to unix directory limit for public cache
 Key: YARN-467
 URL: https://issues.apache.org/jira/browse/YARN-467
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: omkar vinit joshi
Assignee: omkar vinit joshi


If we have multiple jobs that use the distributed cache with many small files, 
the per-directory limit is hit before the cache size limit is reached, and no 
more directories can be created in the public file cache. The jobs then start 
failing with the exception below.

java.io.IOException: mkdir of /tmp/nm-local-dir/filecache/3901886847734194975 
failed
at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
at 
org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
at 
org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

We need a mechanism that creates a directory hierarchy and limits the number of 
files per directory.
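
A minimal illustrative sketch of the kind of scheme being proposed (not the 
eventual patch): map a monotonically increasing cache id to a nested relative 
path so no single directory holds more than a fixed number of entries. The class 
name and the limit are invented for the example.

// Hypothetical sketch: spread public-cache entries over a two-level directory tree
// so that no single directory holds more than PER_DIR_LIMIT entries.
public class CachePathAllocator {
  // Assumed cap on entries per directory; a real implementation would make this
  // configurable and keep it well under the filesystem's per-directory limit.
  private static final long PER_DIR_LIMIT = 8192;

  // Maps a sequential cache id to a relative path such as "3/1705".
  // Two levels support PER_DIR_LIMIT^2 entries; a deeper tree would be needed beyond that.
  public static String relativePathFor(long cacheId) {
    long leaf = cacheId % PER_DIR_LIMIT;                       // entry name inside the bucket
    long bucket = (cacheId / PER_DIR_LIMIT) % PER_DIR_LIMIT;   // first-level subdirectory
    return bucket + "/" + leaf;
  }
}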

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-237) Refreshing the RM page forgets how many rows I had in my Datatables

2013-03-11 Thread jian he (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599544#comment-13599544
 ] 

jian he commented on YARN-237:
--

Thanks for your comment, Robert. I've updated the patch.

 Refreshing the RM page forgets how many rows I had in my Datatables
 ---

 Key: YARN-237
 URL: https://issues.apache.org/jira/browse/YARN-237
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 0.23.4, 3.0.0
Reporter: Ravi Prakash
Assignee: jian he
  Labels: usability
 Attachments: YARN-237.patch, YARN-237.v2.patch, YARN-237.v3.patch, 
 YARN-237.v4.patch


 If I choose 100 rows, and then refresh the page, DataTables goes back to 
 showing me 20 rows.
 This user preference should be stored in a cookie.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-378) ApplicationMaster retry times should be set by Client

2013-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599552#comment-13599552
 ] 

Hadoop QA commented on YARN-378:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573216/YARN-378_5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.mapreduce.v2.app.TestStagingCleanup

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/500//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/500//console

This message is automatically generated.

 ApplicationMaster retry times should be set by Client
 -

 Key: YARN-378
 URL: https://issues.apache.org/jira/browse/YARN-378
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
 Environment: suse
Reporter: xieguiming
Assignee: Zhijie Shen
  Labels: usability
 Attachments: YARN-378_1.patch, YARN-378_2.patch, YARN-378_3.patch, 
 YARN-378_4.patch, YARN-378_5.patch


 We should support different clients or users having different 
 ApplicationMaster retry times. That is to say, 
 yarn.resourcemanager.am.max-retries should be settable by the client. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-11 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599622#comment-13599622
 ] 

Siddharth Seth commented on YARN-71:


Thanks for the patches, Xuan. A couple of comments on the latest one (71-6.txt):

- The ResourceLocalizationService is already creating a FileContext instance - 
that can be used.
- The rename/exception code is the same across the 3 top-level dirs and can 
move into a function.
- The patch needs some formatting fixes (missing spaces after '+', ',', etc.)
- Haven't looked at the unit test yet, but I don't think it belongs in 
TestNodeStatusUpdater.

In terms of the approach of moving dirs and scheduling them for background 
deletion - this seems reasonable to me. One concern I have, though, is that this 
could cause some load on the NMs after they're restarted (especially on systems 
with a large dist cache and disks). This effectively ends up affecting 
performance. Does anyone else have thoughts on this? Maybe background / inline 
deletion should be configurable?
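
For reference, a minimal sketch of the rename-then-delete-in-background pattern 
discussed above; the class, field, and helper names are illustrative, not the 
ones in the patch.

// Illustrative sketch: on NM restart, move each leftover local dir aside and
// delete it asynchronously so startup is not blocked by a large dist cache.
import java.util.concurrent.ExecutorService;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Options;
import org.apache.hadoop.fs.Path;

public class LeftoverDirCleaner {
  private final FileContext lfs;          // local FileContext, as suggested above
  private final ExecutorService deleter;  // stand-in for the NM DeletionService

  public LeftoverDirCleaner(FileContext lfs, ExecutorService deleter) {
    this.lfs = lfs;
    this.deleter = deleter;
  }

  // Shared by all top-level dirs (e.g. usercache, filecache, nmPrivate).
  public void renameAndScheduleDeletion(Path dir) {
    final Path renamed =
        new Path(dir.getParent(), dir.getName() + "_DEL_" + System.currentTimeMillis());
    try {
      // A cheap rename frees the original name immediately so startup can proceed.
      lfs.rename(dir, renamed, Options.Rename.NONE);
    } catch (Exception e) {
      // Log and continue: a failed rename should not keep the NM from starting.
      return;
    }
    deleter.submit(new Runnable() {
      @Override
      public void run() {
        try {
          lfs.delete(renamed, true);  // recursive delete runs off the startup path
        } catch (Exception ignored) {
          // Best-effort cleanup; failures can be retried on the next restart.
        }
      }
    });
  }
}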

 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, 
 YARN.71.4.patch, YARN-71.5.patch, YARN-71.6.patch


 We have to make sure that NodeManagers clean up their local files on restart.
 It may already be working like that, in which case we should have tests 
 validating this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-237) Refreshing the RM page forgets how many rows I had in my Datatables

2013-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599664#comment-13599664
 ] 

Hadoop QA commented on YARN-237:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573220/YARN-237.v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/501//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/501//console

This message is automatically generated.

 Refreshing the RM page forgets how many rows I had in my Datatables
 ---

 Key: YARN-237
 URL: https://issues.apache.org/jira/browse/YARN-237
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 0.23.4, 3.0.0
Reporter: Ravi Prakash
Assignee: jian he
  Labels: usability
 Attachments: YARN-237.patch, YARN-237.v2.patch, YARN-237.v3.patch, 
 YARN-237.v4.patch


 If I choose 100 rows, and then refresh the page, DataTables goes back to 
 showing me 20 rows.
 This user preference should be stored in a cookie.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-198) If we are navigating to Nodemanager UI from Resourcemanager,then there is not link to navigate back to Resource manager

2013-03-11 Thread jian he (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jian he updated YARN-198:
-

Attachment: (was: YARN-198.patch)

 If we are navigating to Nodemanager UI from Resourcemanager,then there is not 
 link to navigate back to Resource manager
 ---

 Key: YARN-198
 URL: https://issues.apache.org/jira/browse/YARN-198
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Ramgopal N
Assignee: jian he
Priority: Minor
  Labels: usability

 If we navigate to the NodeManager by clicking on the node link in the RM, there 
 is no link provided on the NM to navigate back to the RM.
  It would be good to have a link to navigate back to the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-198) If we are navigating to Nodemanager UI from Resourcemanager,then there is not link to navigate back to Resource manager

2013-03-11 Thread jian he (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jian he updated YARN-198:
-

Attachment: YARN-198.patch

 If we are navigating to Nodemanager UI from Resourcemanager,then there is not 
 link to navigate back to Resource manager
 ---

 Key: YARN-198
 URL: https://issues.apache.org/jira/browse/YARN-198
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Ramgopal N
Assignee: jian he
Priority: Minor
  Labels: usability
 Attachments: YARN-198.patch


 If we navigate to the NodeManager by clicking on the node link in the RM, there 
 is no link provided on the NM to navigate back to the RM.
  It would be good to have a link to navigate back to the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-449) MRAppMaster classpath not set properly for unit tests in downstream projects

2013-03-11 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599719#comment-13599719
 ] 

Siddharth Seth commented on YARN-449:
-

Ted, are all these tests passing against 2.0.2-alpha? That seems like the 
first thing that should be addressed - did something change in 2.0.3 that 
causes them to fail?
Also, does HBase run tests in parallel? Three different tests failing in 
different test runs on the build box could be caused by this. Looking at the 
way MiniMRCluster works, the work-dir ends up being the same across tests - so 
that can be problematic. (Attaching a WIP patch for this, which I've tested on a 
couple of HBase tests, running them individually.) Would it be possible for you to 
try the patch on a box where you're seeing failures?
From running the three tests locally, TestImportExport consistently hangs. 
The other two seem to work OK.
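
As an aside, a minimal sketch of the directory-randomization idea; the property 
name and approach here are assumptions for illustration, not the attached patch.

// Illustrative sketch: give each MiniMRCluster test run its own work directory
// by randomizing the conventional "test.build.data" location before cluster start.
import java.io.File;
import java.util.UUID;

public class RandomClusterDir {
  public static File setRandomBuildDir() {
    File dir = new File(System.getProperty("java.io.tmpdir"),
        "minimr-" + UUID.randomUUID());
    if (!dir.mkdirs()) {
      throw new IllegalStateException("could not create " + dir);
    }
    // Many Hadoop mini-cluster utilities derive their work dir from this property.
    System.setProperty("test.build.data", dir.getAbsolutePath());
    return dir;
  }
}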

 MRAppMaster classpath not set properly for unit tests in downstream projects
 

 Key: YARN-449
 URL: https://issues.apache.org/jira/browse/YARN-449
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Priority: Blocker
 Attachments: 7904-v5.txt, hbase-7904-v3.txt, 
 hbase-TestHFileOutputFormat-wip.txt, hbase-TestingUtility-wip.txt


 Post YARN-429, unit tests for HBase continue to fail since the classpath for 
 the MRAppMaster is not being set correctly.
 Reverting YARN-129 may fix this, but I'm not sure that's the correct 
 solution. My guess is, as Alexandro pointed out in YARN-129, maven 
 classloader magic is messing up java.class.path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-449) MRAppMaster classpath not set properly for unit tests in downstream projects

2013-03-11 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated YARN-449:


Attachment: minimr_randomdir-branch2.txt

Randomizes the directories generated when using MiniMRCluster.

 MRAppMaster classpath not set properly for unit tests in downstream projects
 

 Key: YARN-449
 URL: https://issues.apache.org/jira/browse/YARN-449
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Priority: Blocker
 Attachments: 7904-v5.txt, hbase-7904-v3.txt, 
 hbase-TestHFileOutputFormat-wip.txt, hbase-TestingUtility-wip.txt, 
 minimr_randomdir-branch2.txt


 Post YARN-429, unit tests for HBase continue to fail since the classpath for 
 the MRAppMaster is not being set correctly.
 Reverting YARN-129 may fix this, but I'm not sure that's the correct 
 solution. My guess is, as Alexandro pointed out in YARN-129, maven 
 classloader magic is messing up java.class.path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-198) If we are navigating to Nodemanager UI from Resourcemanager,then there is not link to navigate back to Resource manager

2013-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599724#comment-13599724
 ] 

Hadoop QA commented on YARN-198:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573259/YARN-198.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/502//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/502//console

This message is automatically generated.

 If we are navigating to Nodemanager UI from Resourcemanager,then there is not 
 link to navigate back to Resource manager
 ---

 Key: YARN-198
 URL: https://issues.apache.org/jira/browse/YARN-198
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Ramgopal N
Assignee: jian he
Priority: Minor
  Labels: usability
 Attachments: YARN-198.patch


 If we navigate to the NodeManager by clicking on the node link in the RM, there 
 is no link provided on the NM to navigate back to the RM.
  It would be good to have a link to navigate back to the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-378) ApplicationMaster retry times should be set by Client

2013-03-11 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-378:
-

Attachment: YARN-378_6.patch

Fixed the broken test and the javadoc warning.

 ApplicationMaster retry times should be set by Client
 -

 Key: YARN-378
 URL: https://issues.apache.org/jira/browse/YARN-378
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
 Environment: suse
Reporter: xieguiming
Assignee: Zhijie Shen
  Labels: usability
 Attachments: YARN-378_1.patch, YARN-378_2.patch, YARN-378_3.patch, 
 YARN-378_4.patch, YARN-378_5.patch, YARN-378_6.patch


 We should support different clients or users having different 
 ApplicationMaster retry times. That is to say, 
 yarn.resourcemanager.am.max-retries should be settable by the client. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-449) MRAppMaster classpath not set properly for unit tests in downstream projects

2013-03-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599733#comment-13599733
 ] 

Ted Yu commented on YARN-449:
-

The test failure couldn't be reproduced on Mac. 
On the flubber machine, protoc 2.5 gave me an error. 
I will try to build with your patch on flubber. 

Thanks

 MRAppMaster classpath not set properly for unit tests in downstream projects
 

 Key: YARN-449
 URL: https://issues.apache.org/jira/browse/YARN-449
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Priority: Blocker
 Attachments: 7904-v5.txt, hbase-7904-v3.txt, 
 hbase-TestHFileOutputFormat-wip.txt, hbase-TestingUtility-wip.txt, 
 minimr_randomdir-branch2.txt


 Post YARN-429, unit tests for HBase continue to fail since the classpath for 
 the MRAppMaster is not being set correctly.
 Reverting YARN-129 may fix this, but I'm not sure that's the correct 
 solution. My guess is, as Alexandro pointed out in YARN-129, maven 
 classloader magic is messing up java.class.path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira