[jira] [Commented] (YARN-1678) Fair scheduler gabs incessantly about reservations
[ https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911481#comment-13911481 ] Hudson commented on YARN-1678: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #492 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/492/]) YARN-1678. Fair scheduler gabs incessantly about reservations (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1571468) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java Fair scheduler gabs incessantly about reservations -- Key: YARN-1678 URL: https://issues.apache.org/jira/browse/YARN-1678 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.5.0 Attachments: YARN-1678-1.patch, YARN-1678-1.patch, YARN-1678.patch Come on FS. We really don't need to know every time a node with a reservation on it heartbeats. {code} 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Trying to fulfill reservation for application appattempt_1390547864213_0347_01 on node: host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: Making reservation: node=a2330.halxg.cloudera.com app_id=application_1390547864213_0347 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: Application application_1390547864213_0347 reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8, currently has 6 at priority 0; currentReservation 6144 2014-01-29 03:48:16,044 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode: Updated reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8 for application org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp@1cb01d20 {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
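The patch contents are not reproduced in the notification above, but the change being described amounts to demoting these per-heartbeat reservation messages from INFO to DEBUG in FairScheduler and AppSchedulable. A minimal sketch of that idea (the class wrapper, method name, and accessors below are illustrative assumptions, not the exact code in YARN-1678-1.patch):

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode;

// Sketch only: quiet the per-heartbeat reservation message by guarding it
// with a DEBUG check instead of logging at INFO on every node heartbeat.
public class ReservationLoggingSketch {
  private static final Log LOG = LogFactory.getLog(ReservationLoggingSketch.class);

  void tryFulfillReservation(FSSchedulerNode node) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Trying to fulfill reservation for application "
          + node.getReservedContainer().getApplicationAttemptId()
          + " on node: " + node);
    }
    // ... existing scheduling logic unchanged ...
  }
}
{code}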
[jira] [Commented] (YARN-1686) NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang.
[ https://issues.apache.org/jira/browse/YARN-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911478#comment-13911478 ] Hudson commented on YARN-1686: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #492 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/492/]) YARN-1686. Fixed NodeManager to properly handle any errors during re-registration after a RESYNC and thus avoid hanging. Contributed by Rohith Sharma. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1571474) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang. Key: YARN-1686 URL: https://issues.apache.org/jira/browse/YARN-1686 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.3.0 Reporter: Rohith Assignee: Rohith Fix For: 2.4.0 Attachments: YARN-1686.1.patch, YARN-1686.2.patch, YARN-1686.3.patch During NodeManager startup, if registration with the ResourceManager throws an exception, the NodeManager shuts down. Now consider the case where NM-1 is already registered with the RM and the RM issues a RESYNC to the NM. If any exception is thrown in resyncWithRM (which starts a new thread that does not handle exceptions) during the RESYNC event, that thread is lost and the NodeManager hangs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911482#comment-13911482 ] Hudson commented on YARN-1734: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #492 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/492/]) YARN-1734. Fixed ResourceManager to update the configurations when it transits from standby to active mode so as to assimilate any changes that happened while it was in standby mode. Contributed by Xuan Gong. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1571539) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Fix For: 2.4.0 Attachments: YARN-1734.1.patch, YARN-1734.2.patch, YARN-1734.3.patch, YARN-1734.4.patch, YARN-1734.5.patch, YARN-1734.6.patch, YARN-1734.7.patch Currently, we have ConfigurationProvider which can support LocalConfiguration, and FileSystemBasedConfiguration. When HA is enabled, and FileSystemBasedConfiguration is enabled, RM can not get the updated Configurations when it transits from Standby to Active -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1686) NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang.
[ https://issues.apache.org/jira/browse/YARN-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911578#comment-13911578 ] Hudson commented on YARN-1686: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1684 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1684/]) YARN-1686. Fixed NodeManager to properly handle any errors during re-registration after a RESYNC and thus avoid hanging. Contributed by Rohith Sharma. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1571474) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang. Key: YARN-1686 URL: https://issues.apache.org/jira/browse/YARN-1686 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.3.0 Reporter: Rohith Assignee: Rohith Fix For: 2.4.0 Attachments: YARN-1686.1.patch, YARN-1686.2.patch, YARN-1686.3.patch During NodeManager startup, if registration with the ResourceManager throws an exception, the NodeManager shuts down. Now consider the case where NM-1 is already registered with the RM and the RM issues a RESYNC to the NM. If any exception is thrown in resyncWithRM (which starts a new thread that does not handle exceptions) during the RESYNC event, that thread is lost and the NodeManager hangs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911582#comment-13911582 ] Hudson commented on YARN-1734: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1684 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1684/]) YARN-1734. Fixed ResourceManager to update the configurations when it transits from standby to active mode so as to assimilate any changes that happened while it was in standby mode. Contributed by Xuan Gong. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1571539) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Fix For: 2.4.0 Attachments: YARN-1734.1.patch, YARN-1734.2.patch, YARN-1734.3.patch, YARN-1734.4.patch, YARN-1734.5.patch, YARN-1734.6.patch, YARN-1734.7.patch Currently, we have ConfigurationProvider which can support LocalConfiguration, and FileSystemBasedConfiguration. When HA is enabled, and FileSystemBasedConfiguration is enabled, RM can not get the updated Configurations when it transits from Standby to Active -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1678) Fair scheduler gabs incessantly about reservations
[ https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911581#comment-13911581 ] Hudson commented on YARN-1678: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1684 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1684/]) YARN-1678. Fair scheduler gabs incessantly about reservations (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1571468) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java Fair scheduler gabs incessantly about reservations -- Key: YARN-1678 URL: https://issues.apache.org/jira/browse/YARN-1678 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.5.0 Attachments: YARN-1678-1.patch, YARN-1678-1.patch, YARN-1678.patch Come on FS. We really don't need to know every time a node with a reservation on it heartbeats. {code} 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Trying to fulfill reservation for application appattempt_1390547864213_0347_01 on node: host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: Making reservation: node=a2330.halxg.cloudera.com app_id=application_1390547864213_0347 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: Application application_1390547864213_0347 reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8, currently has 6 at priority 0; currentReservation 6144 2014-01-29 03:48:16,044 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode: Updated reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8 for application org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp@1cb01d20 {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1678) Fair scheduler gabs incessantly about reservations
[ https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911637#comment-13911637 ] Hudson commented on YARN-1678: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1709 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1709/]) YARN-1678. Fair scheduler gabs incessantly about reservations (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1571468) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java Fair scheduler gabs incessantly about reservations -- Key: YARN-1678 URL: https://issues.apache.org/jira/browse/YARN-1678 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.5.0 Attachments: YARN-1678-1.patch, YARN-1678-1.patch, YARN-1678.patch Come on FS. We really don't need to know every time a node with a reservation on it heartbeats. {code} 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Trying to fulfill reservation for application appattempt_1390547864213_0347_01 on node: host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: Making reservation: node=a2330.halxg.cloudera.com app_id=application_1390547864213_0347 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: Application application_1390547864213_0347 reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8, currently has 6 at priority 0; currentReservation 6144 2014-01-29 03:48:16,044 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode: Updated reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8 for application org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp@1cb01d20 {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1686) NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang.
[ https://issues.apache.org/jira/browse/YARN-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911634#comment-13911634 ] Hudson commented on YARN-1686: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1709 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1709/]) YARN-1686. Fixed NodeManager to properly handle any errors during re-registration after a RESYNC and thus avoid hanging. Contributed by Rohith Sharma. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1571474) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang. Key: YARN-1686 URL: https://issues.apache.org/jira/browse/YARN-1686 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.3.0 Reporter: Rohith Assignee: Rohith Fix For: 2.4.0 Attachments: YARN-1686.1.patch, YARN-1686.2.patch, YARN-1686.3.patch During NodeManager startup, if registration with the ResourceManager throws an exception, the NodeManager shuts down. Now consider the case where NM-1 is already registered with the RM and the RM issues a RESYNC to the NM. If any exception is thrown in resyncWithRM (which starts a new thread that does not handle exceptions) during the RESYNC event, that thread is lost and the NodeManager hangs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911638#comment-13911638 ] Hudson commented on YARN-1734: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1709 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1709/]) YARN-1734. Fixed ResourceManager to update the configurations when it transits from standby to active mode so as to assimilate any changes that happened while it was in standby mode. Contributed by Xuan Gong. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1571539) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Fix For: 2.4.0 Attachments: YARN-1734.1.patch, YARN-1734.2.patch, YARN-1734.3.patch, YARN-1734.4.patch, YARN-1734.5.patch, YARN-1734.6.patch, YARN-1734.7.patch Currently, we have ConfigurationProvider which can support LocalConfiguration, and FileSystemBasedConfiguration. When HA is enabled, and FileSystemBasedConfiguration is enabled, RM can not get the updated Configurations when it transits from Standby to Active -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1730) Leveldb timeline store needs simple write locking
[ https://issues.apache.org/jira/browse/YARN-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi updated YARN-1730: - Attachment: YARN-1730.3.patch Rebased patch 1 against trunk. Leveldb timeline store needs simple write locking - Key: YARN-1730 URL: https://issues.apache.org/jira/browse/YARN-1730 Project: Hadoop YARN Issue Type: Sub-task Reporter: Billie Rinaldi Assignee: Billie Rinaldi Attachments: YARN-1730.1.patch, YARN-1730.2.patch, YARN-1730.3.patch The actual data writes are performed atomically in a batch, but a lock should be held while identifying a start time for the entity, which precedes every write. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1757) Auxiliary service support for nodemanager recovery
[ https://issues.apache.org/jira/browse/YARN-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1757: - Attachment: YARN-1757.patch Patch to have the nodemanager create an aux-service-specific path under the specified NM recovery directory where an aux service can store recoverable state. The presence or absence of this path indicates whether NM recovery is enabled (or aux service could check conf directly). Auxiliary service support for nodemanager recovery -- Key: YARN-1757 URL: https://issues.apache.org/jira/browse/YARN-1757 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1757.patch There needs to be a mechanism for communicating to auxiliary services whether nodemanager recovery is enabled and where they should store their state. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
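As a rough illustration of the consumer side of this mechanism (the config key, directory layout, and helper names below are assumptions for the sketch, not the API introduced by the attached patch), an aux service could derive its own path under the NM recovery directory and treat its presence as the recovery-enabled signal:

{code}
import java.io.File;
import org.apache.hadoop.conf.Configuration;

// Sketch only: derive an aux-service-specific recovery path and use its
// presence/absence as the "NM recovery enabled" signal described above.
public class AuxServiceRecoveryPaths {
  // Assumed key; the real key comes from the NM recovery work (YARN-1336).
  private static final String NM_RECOVERY_DIR = "yarn.nodemanager.recovery.dir";

  public static File recoveryPathFor(Configuration conf, String serviceName) {
    String root = conf.get(NM_RECOVERY_DIR);
    if (root == null) {
      return null; // NM recovery not configured
    }
    return new File(new File(root, "aux-services"), serviceName);
  }

  public static boolean recoveryEnabled(Configuration conf, String serviceName) {
    File path = recoveryPathFor(conf, serviceName);
    return path != null && path.isDirectory();
  }
}
{code}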
[jira] [Updated] (YARN-1757) Auxiliary service support for nodemanager recovery
[ https://issues.apache.org/jira/browse/YARN-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1757: - Target Version/s: 2.5.0 (was: 2.4.0) Auxiliary service support for nodemanager recovery -- Key: YARN-1757 URL: https://issues.apache.org/jira/browse/YARN-1757 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1757.patch There needs to be a mechanism for communicating to auxiliary services whether nodemanager recovery is enabled and where they should store their state. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1760) TestRMAdminService assumes the use of CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911797#comment-13911797 ] Sandy Ryza commented on YARN-1760: -- +1 TestRMAdminService assumes the use of CapacityScheduler --- Key: YARN-1760 URL: https://issues.apache.org/jira/browse/YARN-1760 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Trivial Labels: test Attachments: yarn-1760-1.patch, yarn-1760-2.patch, yarn-1760-3.patch YARN-1611 adds TestRMAdminService which assumes the use of CapacityScheduler. {noformat} java.lang.ClassCastException: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler cannot be cast to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler at org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService.testAdminRefreshQueuesWithFileSystemBasedConfigurationProvider(TestRMAdminService.java:115) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
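The attached patches are not shown in this thread, but one way a test can avoid a cast failure like the one above is to pin the scheduler explicitly in its configuration before starting the RM. A minimal sketch of that option (whether yarn-1760 fixes the test this way or by dropping the CapacityScheduler assumption is not stated here):

{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceScheduler;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler;

// Sketch: force CapacityScheduler for a test that later casts the RM's
// scheduler to CapacityScheduler, regardless of the default in yarn-site.xml.
YarnConfiguration conf = new YarnConfiguration();
conf.setClass(YarnConfiguration.RM_SCHEDULER,
    CapacityScheduler.class, ResourceScheduler.class);
// ... start the ResourceManager / MockRM with this conf ...
{code}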
[jira] [Updated] (YARN-1760) TestRMAdminService assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1760: --- Summary: TestRMAdminService assumes CapacityScheduler (was: TestRMAdminService assumes the use of CapacityScheduler) TestRMAdminService assumes CapacityScheduler Key: YARN-1760 URL: https://issues.apache.org/jira/browse/YARN-1760 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Trivial Labels: test Attachments: yarn-1760-1.patch, yarn-1760-2.patch, yarn-1760-3.patch YARN-1611 adds TestRMAdminService which assumes the use of CapacityScheduler. {noformat} java.lang.ClassCastException: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler cannot be cast to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler at org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService.testAdminRefreshQueuesWithFileSystemBasedConfigurationProvider(TestRMAdminService.java:115) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911817#comment-13911817 ] Xuan Gong commented on YARN-1410: - Sounds good to me. For 1) RM fails over after getApplicationID() and *before* submitApplication(). The changes we will make is to let RM accept the “old” applicationId which includes: * make RM accept the applicationId in the context * If there is no applicationId specified in the context, RM will assign a new ApplicationId For 2) RM fail overs *during* the submitApplication call. We have many discussions for this scenario. We can open a separate ticket for it. For 3) RM fails over *after* the submitApplication call and before the subsequent getApplicationReport(). We can mark getApplicationReport() as Idempotent, and need to handle two different cases: * Failover happens after SubmitApplicationResponse is received, but RMStateStore does not save the applicationState. In this case, when the getApplicationReport() is called, we will get an ApplicationNotFoundException. So, we need to catch this exception and submit this application again * Failover happens after SubmitApplicationResponse is received, and RMStateStore saves the applicationState. Nothing need to be changed. Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch, YARN-1410.5.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating appId 2) using that appId to submit an ApplicationSubmissionContext to the user. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
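For scenario 3, the client-side handling described above amounts to something like the following sketch against the YarnClient API (the surrounding retry/failover wiring is omitted; this is an illustration of the proposal, not the final patch):

{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;

// Sketch of scenario 3: the RM failed over after submitApplication() returned
// but before the state store saved the app, so getApplicationReport() throws
// ApplicationNotFoundException and the client resubmits the same context.
ApplicationReport submitAndGetReport(YarnClient yarnClient,
    ApplicationSubmissionContext appContext) throws Exception {
  ApplicationId appId = yarnClient.submitApplication(appContext);
  try {
    return yarnClient.getApplicationReport(appId);
  } catch (ApplicationNotFoundException e) {
    // The RM lost the submission across the failover; submit it again.
    appId = yarnClient.submitApplication(appContext);
    return yarnClient.getApplicationReport(appId);
  }
}
{code}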
[jira] [Commented] (YARN-1730) Leveldb timeline store needs simple write locking
[ https://issues.apache.org/jira/browse/YARN-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911824#comment-13911824 ] Hadoop QA commented on YARN-1730: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630988/YARN-1730.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3176//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3176//console This message is automatically generated. Leveldb timeline store needs simple write locking - Key: YARN-1730 URL: https://issues.apache.org/jira/browse/YARN-1730 Project: Hadoop YARN Issue Type: Sub-task Reporter: Billie Rinaldi Assignee: Billie Rinaldi Attachments: YARN-1730.1.patch, YARN-1730.2.patch, YARN-1730.3.patch The actual data writes are performed atomically in a batch, but a lock should be held while identifying a start time for the entity, which precedes every write. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1758) MiniYARNCluster broken post YARN-1666
[ https://issues.apache.org/jira/browse/YARN-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911827#comment-13911827 ] Tsuyoshi OZAWA commented on YARN-1758: -- Thank you for reporting, [~hitesh]. I cannot reproduce the NPE with a mvn test run under hadoop-yarn-project locally. Could you tell me the case in which this problem occurs? MiniYARNCluster broken post YARN-1666 - Key: YARN-1758 URL: https://issues.apache.org/jira/browse/YARN-1758 Project: Hadoop YARN Issue Type: Bug Reporter: Hitesh Shah NPE seen when trying to use MiniYARNCluster -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1760) TestRMAdminService assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911826#comment-13911826 ] Hudson commented on YARN-1760: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5222 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5222/]) YARN-1760. TestRMAdminService assumes CapacityScheduler. (kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1571777) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java TestRMAdminService assumes CapacityScheduler Key: YARN-1760 URL: https://issues.apache.org/jira/browse/YARN-1760 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Trivial Labels: test Attachments: yarn-1760-1.patch, yarn-1760-2.patch, yarn-1760-3.patch YARN-1611 adds TestRMAdminService which assumes the use of CapacityScheduler. {noformat} java.lang.ClassCastException: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler cannot be cast to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler at org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService.testAdminRefreshQueuesWithFileSystemBasedConfigurationProvider(TestRMAdminService.java:115) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911842#comment-13911842 ] Bikas Saha commented on YARN-1410: -- bq. getApplicationReport() is called, we will get an ApplicationNotFoundException. So, we need to catch this exception and submit this application again It would be good if, via HAUtil, we could get an indication of whether a failover has occurred or not. If it has occurred then it's OK to get this exception, but if it has not then it's a bug. We can defer that to a separate JIRA if it's too much work. Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch, YARN-1410.5.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating appId 2) using that appId to submit an ApplicationSubmissionContext to the user. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911845#comment-13911845 ] Karthik Kambatla commented on YARN-1410: As Vinod suggested, can we limit this JIRA to 1 and open separate JIRAs for 2 and 3. I don't see 3 to be as straight-forward, and suspect would require revisiting the state machine. Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch, YARN-1410.5.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating appId 2) using that appId to submit an ApplicationSubmissionContext to the user. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1758) MiniYARNCluster broken post YARN-1666
[ https://issues.apache.org/jira/browse/YARN-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911863#comment-13911863 ] Hitesh Shah commented on YARN-1758: --- [~ozawa] The problem is due to the loading of yarn-site.xml and other resources in ResourceManager::serviceInit(). It does not show up in yarn tests as the yarn-site.xml is added into yarn-resourcemanager-tests.jar I believe. However, for all downstream projects, they depend on hadoop-yarn-server-tests-tests.jar for MiniYarnCluster which itself does not have the necessary yarn-site, etc. I think the fix might be as simple as moving the required configs from hadoop-yarn-server-resourcemanager/src/test/resources/ to hadoop-yarn-server-tests/src/test/resources/ so that the required confs are bundled in the same tests jar. MiniYARNCluster broken post YARN-1666 - Key: YARN-1758 URL: https://issues.apache.org/jira/browse/YARN-1758 Project: Hadoop YARN Issue Type: Bug Reporter: Hitesh Shah NPE seen when trying to use MiniYARNCluster -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1757) Auxiliary service support for nodemanager recovery
[ https://issues.apache.org/jira/browse/YARN-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911867#comment-13911867 ] Hadoop QA commented on YARN-1757: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630989/YARN-1757.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerReboot org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServer org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerResync {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3175//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3175//console This message is automatically generated. Auxiliary service support for nodemanager recovery -- Key: YARN-1757 URL: https://issues.apache.org/jira/browse/YARN-1757 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1757.patch There needs to be a mechanism for communicating to auxiliary services whether nodemanager recovery is enabled and where they should store their state. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493
[ https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911876#comment-13911876 ] Naren Koneru commented on YARN-1577: Hi Jian, are you working on this issue? If not, I would like to take a look. Can you please comment. Unmanaged AM is broken because of YARN-1493 --- Key: YARN-1577 URL: https://issues.apache.org/jira/browse/YARN-1577 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.3.0 Reporter: Jian He Assignee: Jian He Priority: Blocker Today unmanaged AM client is waiting for app state to be Accepted to launch the AM. This is broken since we changed in YARN-1493 to start the attempt after the application is Accepted. We may need to introduce an attempt state report that client can rely on to query the attempt state and choose to launch the unmanaged AM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1764) Handle RM fail overs after the submitApplication call.
Xuan Gong created YARN-1764: --- Summary: Handle RM fail overs after the submitApplication call. Key: YARN-1764 URL: https://issues.apache.org/jira/browse/YARN-1764 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (YARN-1758) MiniYARNCluster broken post YARN-1666
[ https://issues.apache.org/jira/browse/YARN-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong reassigned YARN-1758: --- Assignee: Xuan Gong MiniYARNCluster broken post YARN-1666 - Key: YARN-1758 URL: https://issues.apache.org/jira/browse/YARN-1758 Project: Hadoop YARN Issue Type: Bug Reporter: Hitesh Shah Assignee: Xuan Gong NPE seen when trying to use MiniYARNCluster -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1713) Implement getnewapplication as part of RM web service
[ https://issues.apache.org/jira/browse/YARN-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-1713: Attachment: (was: yarn-1713.patch) Implement getnewapplication as part of RM web service - Key: YARN-1713 URL: https://issues.apache.org/jira/browse/YARN-1713 Project: Hadoop YARN Issue Type: Sub-task Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-1713.cumulative.patch, apache-yarn-1713.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1713) Implement getnewapplication as part of RM web service
[ https://issues.apache.org/jira/browse/YARN-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-1713: Attachment: apache-yarn-1713.patch Implement getnewapplication as part of RM web service - Key: YARN-1713 URL: https://issues.apache.org/jira/browse/YARN-1713 Project: Hadoop YARN Issue Type: Sub-task Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-1713.cumulative.patch, apache-yarn-1713.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1758) MiniYARNCluster broken post YARN-1666
[ https://issues.apache.org/jira/browse/YARN-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911902#comment-13911902 ] Xuan Gong commented on YARN-1758: - Do not move the configs out of hadoop-yarn-server-resourcemanager/src/test/resources/. TestRMAdminService does not use MiniYARNCluster and needs these configs. MiniYARNCluster broken post YARN-1666 - Key: YARN-1758 URL: https://issues.apache.org/jira/browse/YARN-1758 Project: Hadoop YARN Issue Type: Bug Reporter: Hitesh Shah Assignee: Xuan Gong NPE seen when trying to use MiniYARNCluster -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493
[ https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911903#comment-13911903 ] Jian He commented on YARN-1577: --- Hi [~naren.koneru], sure, you can work on it. Tx Unmanaged AM is broken because of YARN-1493 --- Key: YARN-1577 URL: https://issues.apache.org/jira/browse/YARN-1577 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.3.0 Reporter: Jian He Assignee: Jian He Priority: Blocker Today unmanaged AM client is waiting for app state to be Accepted to launch the AM. This is broken since we changed in YARN-1493 to start the attempt after the application is Accepted. We may need to introduce an attempt state report that client can rely on to query the attempt state and choose to launch the unmanaged AM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1713) Implement getnewapplication and submitapp as part of RM web service
[ https://issues.apache.org/jira/browse/YARN-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911910#comment-13911910 ] Varun Vasudev commented on YARN-1713: - Attached two patch files. Testing the functionality requires running parametrized tests, which are also part of the patch for the kill app functionality (issue 1702). Without the parametrized testing, the submit app testing would be incomplete. apache-yarn-1713.patch contains changes just for the submit app functionality, and apache-yarn-1713.cumulative.patch contains changes for both the kill app and submit app functionality so that it can be applied and tested. Implement getnewapplication and submitapp as part of RM web service --- Key: YARN-1713 URL: https://issues.apache.org/jira/browse/YARN-1713 Project: Hadoop YARN Issue Type: Sub-task Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-1713.cumulative.patch, apache-yarn-1713.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1713) Implement getnewapplication and submitapp as part of RM web service
[ https://issues.apache.org/jira/browse/YARN-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-1713: Summary: Implement getnewapplication and submitapp as part of RM web service (was: Implement getnewapplication as part of RM web service) Implement getnewapplication and submitapp as part of RM web service --- Key: YARN-1713 URL: https://issues.apache.org/jira/browse/YARN-1713 Project: Hadoop YARN Issue Type: Sub-task Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-1713.cumulative.patch, apache-yarn-1713.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (YARN-1577) Unmanaged AM is broken because of YARN-1493
[ https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naren Koneru reassigned YARN-1577: -- Assignee: Naren Koneru (was: Jian He) Unmanaged AM is broken because of YARN-1493 --- Key: YARN-1577 URL: https://issues.apache.org/jira/browse/YARN-1577 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.3.0 Reporter: Jian He Assignee: Naren Koneru Priority: Blocker Today unmanaged AM client is waiting for app state to be Accepted to launch the AM. This is broken since we changed in YARN-1493 to start the attempt after the application is Accepted. We may need to introduce an attempt state report that client can rely on to query the attempt state and choose to launch the unmanaged AM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1730) Leveldb timeline store needs simple write locking
[ https://issues.apache.org/jira/browse/YARN-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911947#comment-13911947 ] Zhijie Shen commented on YARN-1730: --- bq. The hold count only returns the number of holds that have been obtained by the current thread. So as soon as the current thread is done with the lock, it would drop the lock from the lock map, which is not what we want. Makes sense. One minor comment: how about making CountReentrantLock extend ReentrantLock, so that the code would be a bit cleaner? Leveldb timeline store needs simple write locking - Key: YARN-1730 URL: https://issues.apache.org/jira/browse/YARN-1730 Project: Hadoop YARN Issue Type: Sub-task Reporter: Billie Rinaldi Assignee: Billie Rinaldi Attachments: YARN-1730.1.patch, YARN-1730.2.patch, YARN-1730.3.patch The actual data writes are performed atomically in a batch, but a lock should be held while identifying a start time for the entity, which precedes every write. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
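The shape being discussed is roughly this: a ReentrantLock subclass that additionally counts how many threads want the lock for a given entity, so the map entry can be dropped once that count (unlike ReentrantLock's per-thread hold count) reaches zero. A sketch under those assumptions; the names and details below are illustrative, not the actual patch:

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: per-entity locks with a cross-thread user count, so an entry is
// removed from the map only when no thread holds or is waiting for it.
class CountReentrantLock extends ReentrantLock {
  int users; // guarded by EntityLockMap's monitor, not by this lock
}

class EntityLockMap {
  private final Map<String, CountReentrantLock> locks =
      new HashMap<String, CountReentrantLock>();

  synchronized CountReentrantLock getLock(String entityId) {
    CountReentrantLock lock = locks.get(entityId);
    if (lock == null) {
      lock = new CountReentrantLock();
      locks.put(entityId, lock);
    }
    lock.users++;
    return lock;
  }

  synchronized void returnLock(String entityId, CountReentrantLock lock) {
    if (--lock.users == 0) {
      locks.remove(entityId);
    }
  }
}

// Usage around the "identify start time, then write batch" critical section:
//   CountReentrantLock lock = map.getLock(entityId);
//   lock.lock();
//   try { /* resolve start time, perform batched write */ }
//   finally { lock.unlock(); map.returnLock(entityId, lock); }
{code}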
[jira] [Commented] (YARN-1713) Implement getnewapplication and submitapp as part of RM web service
[ https://issues.apache.org/jira/browse/YARN-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911981#comment-13911981 ] Hadoop QA commented on YARN-1713: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12631012/apache-yarn-1713.cumulative.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3177//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3177//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3177//console This message is automatically generated. Implement getnewapplication and submitapp as part of RM web service --- Key: YARN-1713 URL: https://issues.apache.org/jira/browse/YARN-1713 Project: Hadoop YARN Issue Type: Sub-task Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-1713.cumulative.patch, apache-yarn-1713.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1730) Leveldb timeline store needs simple write locking
[ https://issues.apache.org/jira/browse/YARN-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi updated YARN-1730: - Attachment: YARN-1730.4.patch Sounds fine to me. Leveldb timeline store needs simple write locking - Key: YARN-1730 URL: https://issues.apache.org/jira/browse/YARN-1730 Project: Hadoop YARN Issue Type: Sub-task Reporter: Billie Rinaldi Assignee: Billie Rinaldi Attachments: YARN-1730.1.patch, YARN-1730.2.patch, YARN-1730.3.patch, YARN-1730.4.patch The actual data writes are performed atomically in a batch, but a lock should be held while identifying a start time for the entity, which precedes every write. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1757) Auxiliary service support for nodemanager recovery
[ https://issues.apache.org/jira/browse/YARN-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1757: - Attachment: YARN-1757.patch Test failures are of the Bind address already in use variety, and the NM tests run clean for me locally. Uploading the same patch again to see if it was a sporadic failure. Auxiliary service support for nodemanager recovery -- Key: YARN-1757 URL: https://issues.apache.org/jira/browse/YARN-1757 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1757.patch, YARN-1757.patch There needs to be a mechanism for communicating to auxiliary services whether nodemanager recovery is enabled and where they should store their state. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912071#comment-13912071 ] Xuan Gong commented on YARN-1410: - Create https://issues.apache.org/jira/browse/YARN-1763 to track 2) RM fail overs *during* the submitApplication call. Create https://issues.apache.org/jira/browse/YARN-1764 to track 3) RM fails over *after* the submitApplication call and before the subsequent getApplicationReport(). This ticket is used to track 1) RM fails over after getApplicationID() and *before* submitApplication(). Create a patch which includes : * make RM accept the applicationId in the context. Nothing need to be changed here. * If there is no applicationId specified in the context, RM will assign a new ApplicationId. Also added two testcases to test AppSubmissionWithApplicationId and AppSubmissionWithoutApplicationId Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch, YARN-1410.5.patch, YARN-1410.6.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating appId 2) using that appId to submit an ApplicationSubmissionContext to the user. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
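On the RM side, the two bullets above reduce to a check of the submission context; the snippet below is only an illustrative sketch (the id-minting helper and the surrounding submission path are assumptions, not the actual ClientRMService code in the patch):

{code}
// Sketch of scenario 1: honor an ApplicationId the client already obtained
// from the previous active RM, and only mint a new one when none is supplied.
ApplicationId appId = submissionContext.getApplicationId();
if (appId == null) {
  appId = newApplicationId();          // hypothetical helper for a fresh id
  submissionContext.setApplicationId(appId);
}
// ... continue the normal submission path with appId ...
{code}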
[jira] [Updated] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1410: Attachment: YARN-1410.6.patch Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch, YARN-1410.5.patch, YARN-1410.6.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating appId 2) using that appId to submit an ApplicationSubmissionContext to the user. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1730) Leveldb timeline store needs simple write locking
[ https://issues.apache.org/jira/browse/YARN-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912090#comment-13912090 ] Hadoop QA commented on YARN-1730: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12631034/YARN-1730.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3179//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3179//console This message is automatically generated. Leveldb timeline store needs simple write locking - Key: YARN-1730 URL: https://issues.apache.org/jira/browse/YARN-1730 Project: Hadoop YARN Issue Type: Sub-task Reporter: Billie Rinaldi Assignee: Billie Rinaldi Attachments: YARN-1730.1.patch, YARN-1730.2.patch, YARN-1730.3.patch, YARN-1730.4.patch The actual data writes are performed atomically in a batch, but a lock should be held while identifying a start time for the entity, which precedes every write. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1757) Auxiliary service support for nodemanager recovery
[ https://issues.apache.org/jira/browse/YARN-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912107#comment-13912107 ] Hadoop QA commented on YARN-1757: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12631035/YARN-1757.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3178//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3178//console This message is automatically generated. Auxiliary service support for nodemanager recovery -- Key: YARN-1757 URL: https://issues.apache.org/jira/browse/YARN-1757 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1757.patch, YARN-1757.patch There needs to be a mechanism for communicating to auxiliary services whether nodemanager recovery is enabled and where they should store their state. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912144#comment-13912144 ] Karthik Kambatla commented on YARN-1492: Thanks for sharing this, [~ctrezzo]. The document is nicely written. A few comments: * Would SCM be a single point of failure? If yes, would any one of the following approaches make sense? ** Make SCM an AM. From YARN-896, the only sub-task that affects this would be the delegation tokens. ** Add an SCMMonitorService to the RM. If SCM is enabled, this service would start the SCM on one of the nodes and monitor it. * SCM Cleaner Service - the doc mentions the tension between the cleaner's frequency and the load on the RM. Can you elaborate? I was of the opinion that the RM is not involved in the caching at all. * Cleaner protocol doesn't mention when the cleaner lock is cleared. I assume it is cleared on each exit path. * Nit: ZK-based store - we can maybe do this in the JIRA corresponding to the sub-task - what would this look like? * More nit-picking: The rationale for not using in-memory state and reconstructing it seems to come from long-running applications. Given long-running applications don't benefit from the shared cache as much as the shorter ones, is this a huge concern? truly shared cache for jars (jobjar/libjar) --- Key: YARN-1492 URL: https://issues.apache.org/jira/browse/YARN-1492 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.0.4-alpha Reporter: Sangjin Lee Assignee: Sangjin Lee Attachments: shared_cache_design.pdf, shared_cache_design_v2.pdf, shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, shared_cache_design_v5.pdf Currently there is the distributed cache that enables you to cache jars and files so that attempts from the same job can reuse them. However, sharing is limited with the distributed cache because it is normally on a per-job basis. On a large cluster, sometimes copying of jobjars and libjars becomes so prevalent that it consumes a large portion of the network bandwidth, not to mention defeating the purpose of bringing compute to where the data is. This is wasteful because in most cases code doesn't change much across many jobs. I'd like to propose and discuss the feasibility of introducing a truly shared cache so that multiple jobs from multiple users can share and cache jars. This JIRA is to open the discussion. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912167#comment-13912167 ] Hadoop QA commented on YARN-1410: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12631041/YARN-1410.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: org.apache.hadoop.yarn.client.TestRMFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3180//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3180//console This message is automatically generated. Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch, YARN-1410.5.patch, YARN-1410.6.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating appId 2) using that appId to submit an ApplicationSubmissionContext to the user. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
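One way a client could cope with the two-step submission problem, shown purely as a hedged sketch (RmClient, StaleApplicationIdException and the retry loop are invented for illustration and are neither the YarnClient API nor necessarily the approach of the attached patches): if the new RM rejects an application id minted by the old RM, obtain a fresh id and resubmit.
{code}
// Hypothetical client-side retry sketch.
interface RmClient {
  String getNewApplicationId();
  void submitApplication(String appId) throws StaleApplicationIdException;
}

class StaleApplicationIdException extends Exception {}

class FailoverAwareSubmitter {
  private final RmClient client;

  FailoverAwareSubmitter(RmClient client) { this.client = client; }

  String submit(int maxAttempts) throws Exception {
    String appId = client.getNewApplicationId();
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        client.submitApplication(appId);
        return appId;
      } catch (StaleApplicationIdException e) {
        // The RM that issued appId failed over; its successor uses a different
        // cluster timestamp, so ask the current RM for a fresh id and retry.
        appId = client.getNewApplicationId();
      }
    }
    throw new Exception("giving up after " + maxAttempts + " attempts");
  }
}
{code}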
[jira] [Updated] (YARN-1740) Redirection from AM-URL is broken with HTTPS_ONLY policy
[ https://issues.apache.org/jira/browse/YARN-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1740: -- Attachment: YARN-1740.2.patch Redirection from AM-URL is broken with HTTPS_ONLY policy Key: YARN-1740 URL: https://issues.apache.org/jira/browse/YARN-1740 Project: Hadoop YARN Issue Type: Sub-task Reporter: Yesha Vora Assignee: Jian He Attachments: YARN-1740.1.patch, YARN-1740.2.patch Steps to reproduce: 1) Run a sleep job 2) Run: yarn application -list command to find AM URL. root@host1:~# yarn application -list Total number of applications (application-types: [] and states: SUBMITTED, ACCEPTED, RUNNING):1 Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL application_1383251398986_0003 Sleep job MAPREDUCE hdfs default RUNNING UNDEFINED 5% http://host1:40653 3) Try to access http://host1:40653/ws/v1/mapreduce/info; url. This URL redirects to http://RM_host:RM_https_port/proxy/application_1383251398986_0003/ws/v1/mapreduce/info Here, Http protocol is used with HTTPS port for RM. The expected Url is https://RM_host:RM_https_port/proxy/application_1383251398986_0003/ws/v1/mapreduce/info -- This message was sent by Atlassian JIRA (v6.1.5#6160)
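For reference, a simplified sketch of the scheme-aware URL construction the YARN-1740 report implies is needed; the boolean flag here is only a stand-in for the actual Hadoop http-policy configuration lookup.
{code}
// Simplified illustration of scheme-aware proxy URL construction.
public final class ProxyUrl {
  public static String trackingUrl(boolean httpsOnly, String rmHost, int rmPort,
      String appId, String path) {
    String scheme = httpsOnly ? "https://" : "http://";
    return scheme + rmHost + ":" + rmPort + "/proxy/" + appId + path;
  }

  public static void main(String[] args) {
    // With HTTPS_ONLY, the redirect target must use https together with the RM https port.
    System.out.println(trackingUrl(true, "RM_host", 8090,
        "application_1383251398986_0003", "/ws/v1/mapreduce/info"));
  }
}
{code}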
[jira] [Commented] (YARN-1740) Redirection from AM-URL is broken with HTTPS_ONLY policy
[ https://issues.apache.org/jira/browse/YARN-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912176#comment-13912176 ] Jian He commented on YARN-1740: --- Added a test case verifying that the MR web app explicitly disables SSL. The AM-URL redirection by amIpFilter to the web proxy is not easily tested, since there is another issue: amIpFilter is not able to differentiate a request coming from the web proxy from one coming from localhost when the cluster is set up on a single local machine. Redirection from AM-URL is broken with HTTPS_ONLY policy Key: YARN-1740 URL: https://issues.apache.org/jira/browse/YARN-1740 Project: Hadoop YARN Issue Type: Sub-task Reporter: Yesha Vora Assignee: Jian He Attachments: YARN-1740.1.patch, YARN-1740.2.patch Steps to reproduce: 1) Run a sleep job 2) Run: yarn application -list command to find AM URL. root@host1:~# yarn application -list Total number of applications (application-types: [] and states: SUBMITTED, ACCEPTED, RUNNING):1 Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL application_1383251398986_0003 Sleep job MAPREDUCE hdfs default RUNNING UNDEFINED 5% http://host1:40653 3) Try to access http://host1:40653/ws/v1/mapreduce/info; url. This URL redirects to http://RM_host:RM_https_port/proxy/application_1383251398986_0003/ws/v1/mapreduce/info Here, Http protocol is used with HTTPS port for RM. The expected Url is https://RM_host:RM_https_port/proxy/application_1383251398986_0003/ws/v1/mapreduce/info -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1410: Attachment: YARN-1410.7.patch Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch, YARN-1410.5.patch, YARN-1410.6.patch, YARN-1410.7.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating appId 2) using that appId to submit an ApplicationSubmissionContext to the user. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912188#comment-13912188 ] Xuan Gong commented on YARN-1410: - Test case is passing locally. Added verifyConnections() and try again... Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch, YARN-1410.5.patch, YARN-1410.6.patch, YARN-1410.7.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating appId 2) using that appId to submit an ApplicationSubmissionContext to the user. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1729) ATSWebServices always passes primary and secondary filters as strings
[ https://issues.apache.org/jira/browse/YARN-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi updated YARN-1729: - Attachment: YARN-1729.3.patch Updated patch for trunk. ATSWebServices always passes primary and secondary filters as strings - Key: YARN-1729 URL: https://issues.apache.org/jira/browse/YARN-1729 Project: Hadoop YARN Issue Type: Sub-task Reporter: Billie Rinaldi Assignee: Billie Rinaldi Attachments: YARN-1729.1.patch, YARN-1729.2.patch, YARN-1729.3.patch Primary filters and secondary filter values can be arbitrary json-compatible Object. The web services should determine if the filters specified as query parameters are objects or strings before passing them to the store. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
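A minimal sketch of the behavior the YARN-1729 description asks for, using Jackson purely for illustration: try to read the query-parameter value as JSON and fall back to the raw string when that fails. With this, a value like {"count":3} would reach the store as a Map while a plain token stays a String.
{code}
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;

// Illustrative only: interpret a filter value as a JSON object/number/boolean
// when possible, otherwise keep it as a plain string.
public final class FilterValueParser {
  private static final ObjectMapper MAPPER = new ObjectMapper();

  public static Object parse(String raw) {
    try {
      return MAPPER.readValue(raw, Object.class); // e.g. {"k":1} -> Map, 42 -> Integer
    } catch (IOException e) {
      return raw; // not valid JSON, treat the filter value as a string
    }
  }
}
{code}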
[jira] [Updated] (YARN-1658) Webservice should redirect to active RM when HA is enabled.
[ https://issues.apache.org/jira/browse/YARN-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cindy Li updated YARN-1658: --- Attachment: YARN1658.patch Initial patch, based on YARN1525 patch. Webservice should redirect to active RM when HA is enabled. --- Key: YARN-1658 URL: https://issues.apache.org/jira/browse/YARN-1658 Project: Hadoop YARN Issue Type: Sub-task Reporter: Cindy Li Assignee: Cindy Li Labels: YARN Attachments: YARN1658.patch When HA is enabled, web service to standby RM should be redirected to the active RM. This is a related Jira to YARN-1525. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1740) Redirection from AM-URL is broken with HTTPS_ONLY policy
[ https://issues.apache.org/jira/browse/YARN-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912364#comment-13912364 ] Hadoop QA commented on YARN-1740: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12631076/YARN-1740.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3181//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3181//console This message is automatically generated. Redirection from AM-URL is broken with HTTPS_ONLY policy Key: YARN-1740 URL: https://issues.apache.org/jira/browse/YARN-1740 Project: Hadoop YARN Issue Type: Sub-task Reporter: Yesha Vora Assignee: Jian He Attachments: YARN-1740.1.patch, YARN-1740.2.patch Steps to reproduce: 1) Run a sleep job 2) Run: yarn application -list command to find AM URL. root@host1:~# yarn application -list Total number of applications (application-types: [] and states: SUBMITTED, ACCEPTED, RUNNING):1 Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL application_1383251398986_0003 Sleep job MAPREDUCE hdfs default RUNNING UNDEFINED 5% http://host1:40653 3) Try to access http://host1:40653/ws/v1/mapreduce/info; url. This URL redirects to http://RM_host:RM_https_port/proxy/application_1383251398986_0003/ws/v1/mapreduce/info Here, Http protocol is used with HTTPS port for RM. The expected Url is https://RM_host:RM_https_port/proxy/application_1383251398986_0003/ws/v1/mapreduce/info -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912382#comment-13912382 ] Hadoop QA commented on YARN-1410: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12631077/YARN-1410.7.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: org.apache.hadoop.yarn.client.api.impl.TestNMClient {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3182//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3182//console This message is automatically generated. Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch, YARN-1410.5.patch, YARN-1410.6.patch, YARN-1410.7.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating appId 2) using that appId to submit an ApplicationSubmissionContext to the user. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1700) AHS records non-launched containers
[ https://issues.apache.org/jira/browse/YARN-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912383#comment-13912383 ] Zhijie Shen commented on YARN-1700: --- Log url is nullable. In this scenario, the container is not launched. It is also possible that the container is completed, but the finish information is not written into the history store. The correct fix should be correcting AppAttemptBlock and ContainerBlock to handle the case that the log url is null. This is what YARN-1685 is supposed to do. On the other hand, even if a container is not launched, we still want to record it, though the information we currently collect cannot tell whether the container finished after running some executable or was never started. However, we're going to improve the exposed information to let users see this difference. Moreover, we are aiming to provide integrated access to the information for both running and finished containers, via both RPC and web interfaces. Once this is done, users will be able to monitor containers before launch, while running, after completion, etc. [~jira.shegalov], if you're fine with it, we can close the ticket as a duplicate of YARN-1685. Thanks! AHS records non-launched containers --- Key: YARN-1700 URL: https://issues.apache.org/jira/browse/YARN-1700 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: YARN-1700.v01.patch, YARN-1700.v02.patch When testing AHS with a MR sleep job, AHS sometimes threw NPE out of AppAttemptBlock.render because logUrl in container report was null. I realized that this is because AHS may record containers that never launch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1729) ATSWebServices always passes primary and secondary filters as strings
[ https://issues.apache.org/jira/browse/YARN-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912388#comment-13912388 ] Hadoop QA commented on YARN-1729: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12631092/YARN-1729.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3183//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3183//console This message is automatically generated. ATSWebServices always passes primary and secondary filters as strings - Key: YARN-1729 URL: https://issues.apache.org/jira/browse/YARN-1729 Project: Hadoop YARN Issue Type: Sub-task Reporter: Billie Rinaldi Assignee: Billie Rinaldi Attachments: YARN-1729.1.patch, YARN-1729.2.patch, YARN-1729.3.patch Primary filters and secondary filter values can be arbitrary json-compatible Object. The web services should determine if the filters specified as query parameters are objects or strings before passing them to the store. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1713) Implement getnewapplication and submitapp as part of RM web service
[ https://issues.apache.org/jira/browse/YARN-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912394#comment-13912394 ] Jian He commented on YARN-1713: --- Some comments on patch apache-yarn-1713.cumulative.patch: - Styling issue: please follow the convention of the 80-column limit. - appIdToRMApp is not actually adding an Id to an RMApp; a better name is more likely getRMAppFromRMContext()? - We have created a factory method in each user-facing record for instantiating the record, e.g. ApplicationSubmissionContext.newInstance; you can use that. {code} createAppSubmissionContext(AppSubmissionInfo newApp) {code} - createNewApplication should be a GET request as it only returns the applicationId etc., just like the one in ClientRMService.getNewApplication. - You can attach a name to the XmlRootElement, like @XmlRootElement(name = "appAttempt"), to specify the element name, so we can do @XmlRootElement(name = "newApplication"). - Note that RMWebServices.hasAccess() is checked against the VIEW_APP permission; in the case of submit/kill, we should check against MODIFY_APP. Implement getnewapplication and submitapp as part of RM web service --- Key: YARN-1713 URL: https://issues.apache.org/jira/browse/YARN-1713 Project: Hadoop YARN Issue Type: Sub-task Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-1713.cumulative.patch, apache-yarn-1713.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
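To make the element-name review comment concrete, a hedged sketch of a JAXB DAO carrying the new application id (the class and its single field are invented for illustration; the real patch defines its own):
{code}
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

// Illustrative DAO only; serialized as <newApplication>...</newApplication>.
@XmlRootElement(name = "newApplication")
@XmlAccessorType(XmlAccessType.FIELD)
public class NewApplicationInfo {
  private String applicationId;

  public NewApplicationInfo() {} // JAXB needs a no-arg constructor

  public NewApplicationInfo(String applicationId) {
    this.applicationId = applicationId;
  }

  public String getApplicationId() {
    return applicationId;
  }
}
{code}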
[jira] [Updated] (YARN-1588) Rebind NM tokens for previous attempt's running containers to the new attempt
[ https://issues.apache.org/jira/browse/YARN-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1588: -- Attachment: YARN-1588.5.patch Refactored some logging in the new patch Rebind NM tokens for previous attempt's running containers to the new attempt - Key: YARN-1588 URL: https://issues.apache.org/jira/browse/YARN-1588 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-1588.1.patch, YARN-1588.1.patch, YARN-1588.2.patch, YARN-1588.3.patch, YARN-1588.4.patch, YARN-1588.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1429) YARN_CLASSPATH is referenced in yarn command comments but doesn't do anything
[ https://issues.apache.org/jira/browse/YARN-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jarek Jarcec Cecho updated YARN-1429: - Attachment: YARN-1429.patch Attaching an updated version that changes the name to {{YARN_USER_CLASSPATH}} and also introduces a second variable, {{YARN_USER_CLASSPATH_FIRST}}, that enables the user to put the content at the beginning of the final classpath. I do feel that those names are quite descriptive, but please do not hesitate to let me know if you have better names in mind! YARN_CLASSPATH is referenced in yarn command comments but doesn't do anything - Key: YARN-1429 URL: https://issues.apache.org/jira/browse/YARN-1429 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Sandy Ryza Assignee: Jarek Jarcec Cecho Priority: Trivial Labels: newbie Attachments: YARN-1429.patch, YARN-1429.patch YARN_CLASSPATH is referenced in the comments in ./hadoop-yarn-project/hadoop-yarn/bin/yarn and ./hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd, but doesn't do anything. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1588) Rebind NM tokens for previous attempt's running containers to the new attempt
[ https://issues.apache.org/jira/browse/YARN-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912453#comment-13912453 ] Hadoop QA commented on YARN-1588: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12631127/YARN-1588.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.api.impl.TestNMClient {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3184//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3184//console This message is automatically generated. Rebind NM tokens for previous attempt's running containers to the new attempt - Key: YARN-1588 URL: https://issues.apache.org/jira/browse/YARN-1588 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-1588.1.patch, YARN-1588.1.patch, YARN-1588.2.patch, YARN-1588.3.patch, YARN-1588.4.patch, YARN-1588.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.
[ https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912455#comment-13912455 ] Hadoop QA commented on YARN-1506: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629031/YARN-1506-v7.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3185//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3185//console This message is automatically generated. Replace set resource change on RMNode/SchedulerNode directly with event notification. - Key: YARN-1506 URL: https://issues.apache.org/jira/browse/YARN-1506 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Priority: Blocker Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, YARN-1506-v5.patch, YARN-1506-v6.patch, YARN-1506-v7.patch According to Vinod's comments on YARN-312 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087), we should replace RMNode.setResourceOption() with some resource change event. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1561) Fix a generic type warning in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912476#comment-13912476 ] Hadoop QA commented on YARN-1561: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12624155/yarn-1561.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3186//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3186//console This message is automatically generated. Fix a generic type warning in FairScheduler --- Key: YARN-1561 URL: https://issues.apache.org/jira/browse/YARN-1561 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Junping Du Assignee: Chen He Priority: Minor Labels: newbie Fix For: 2.4.0 Attachments: yarn-1561.patch The Comparator below should be specified with type: private Comparator nodeAvailableResourceComparator = new NodeAvailableResourceComparator(); -- This message was sent by Atlassian JIRA (v6.1.5#6160)
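For illustration, the general shape of the YARN-1561 fix in a self-contained example; the concrete type parameter in FairScheduler is whatever element type the scheduler actually sorts, which the patch specifies.
{code}
import java.util.Comparator;

// Self-contained illustration of the raw-type warning and its fix.
class NodeStats {
  final int availableMemoryMb;
  NodeStats(int availableMemoryMb) { this.availableMemoryMb = availableMemoryMb; }
}

class NodeAvailableResourceComparator implements Comparator<NodeStats> {
  @Override
  public int compare(NodeStats a, NodeStats b) {
    // Descending by available memory, so the emptiest node sorts first.
    return Integer.compare(b.availableMemoryMb, a.availableMemoryMb);
  }
}

class Example {
  // A raw "private Comparator c = new NodeAvailableResourceComparator();"
  // triggers the generic type warning; parameterizing it does not:
  private Comparator<NodeStats> nodeAvailableResourceComparator =
      new NodeAvailableResourceComparator();
}
{code}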
[jira] [Updated] (YARN-1429) YARN_CLASSPATH is referenced in yarn command comments but doesn't do anything
[ https://issues.apache.org/jira/browse/YARN-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jarek Jarcec Cecho updated YARN-1429: - Attachment: YARN-1429.linux.patch I've noticed that my generated patch can't be easily applied with the {{patch}} utility. I'm having trouble with the CRLF line endings of the original file versus the LF line endings of the generated patch. {{git apply}} seems to be smart enough to get over that, but the unix {{patch}} fails on it. Is there anything like svn apply? YARN_CLASSPATH is referenced in yarn command comments but doesn't do anything - Key: YARN-1429 URL: https://issues.apache.org/jira/browse/YARN-1429 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Sandy Ryza Assignee: Jarek Jarcec Cecho Priority: Trivial Labels: newbie Attachments: YARN-1429.linux.patch, YARN-1429.patch, YARN-1429.patch YARN_CLASSPATH is referenced in the comments in ./hadoop-yarn-project/hadoop-yarn/bin/yarn and ./hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd, but doesn't do anything. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1363) Get / Cancel / Renew delegation token api should be non blocking
[ https://issues.apache.org/jira/browse/YARN-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1363: -- Hadoop Flags: Incompatible change Get / Cancel / Renew delegation token api should be non blocking Key: YARN-1363 URL: https://issues.apache.org/jira/browse/YARN-1363 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Zhijie Shen Attachments: YARN-1363.1.patch, YARN-1363.2.patch, YARN-1363.3.patch, YARN-1363.4.patch, YARN-1363.5.patch, YARN-1363.6.patch, YARN-1363.7.patch Today GetDelgationToken, CancelDelegationToken and RenewDelegationToken are all blocking apis. * As a part of these calls we try to update RMStateStore and that may slow it down. * Now as we have limited number of client request handlers we may fill up client handlers quickly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (YARN-1752) Unexpected Unregistered event at Attempt Launched state
[ https://issues.apache.org/jira/browse/YARN-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith reassigned YARN-1752: Assignee: Rohith Unexpected Unregistered event at Attempt Launched state --- Key: YARN-1752 URL: https://issues.apache.org/jira/browse/YARN-1752 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Rohith {code} 2014-02-21 14:56:03,453 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Can't handle this event at current state org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: UNREGISTERED at LAUNCHED at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:647) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:103) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:733) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:714) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:695) {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1752) Unexpected Unregistered event at Attempt Launched state
[ https://issues.apache.org/jira/browse/YARN-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912501#comment-13912501 ] Rohith commented on YARN-1752: -- I reproduced this case using a debug point. This needs to be fixed on the MapReduce side; it is better handled in the MapReduce project. Unexpected Unregistered event at Attempt Launched state --- Key: YARN-1752 URL: https://issues.apache.org/jira/browse/YARN-1752 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Rohith {code} 2014-02-21 14:56:03,453 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Can't handle this event at current state org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: UNREGISTERED at LAUNCHED at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:647) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:103) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:733) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:714) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:695) {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1588) Rebind NM tokens for previous attempt's running containers to the new attempt
[ https://issues.apache.org/jira/browse/YARN-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912516#comment-13912516 ] Jian He commented on YARN-1588: --- Can't reproduce test failure locally, it can be similar to YARN-1657 Rebind NM tokens for previous attempt's running containers to the new attempt - Key: YARN-1588 URL: https://issues.apache.org/jira/browse/YARN-1588 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-1588.1.patch, YARN-1588.1.patch, YARN-1588.2.patch, YARN-1588.3.patch, YARN-1588.4.patch, YARN-1588.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1700) AHS records non-launched containers
[ https://issues.apache.org/jira/browse/YARN-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912515#comment-13912515 ] Gera Shegalov commented on YARN-1700: - Log URL is physically nullable, but in the current source code, as it is, it never changes after the launch and is not null. If the intention is to record/display even containers that have not launched, I am fine treating this as a dup of YARN-1685. AHS records non-launched containers --- Key: YARN-1700 URL: https://issues.apache.org/jira/browse/YARN-1700 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: YARN-1700.v01.patch, YARN-1700.v02.patch When testing AHS with a MR sleep job, AHS sometimes threw NPE out of AppAttemptBlock.render because logUrl in container report was null. I realized that this is because AHS may record containers that never launch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1429) YARN_CLASSPATH is referenced in yarn command comments but doesn't do anything
[ https://issues.apache.org/jira/browse/YARN-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912519#comment-13912519 ] Hadoop QA commented on YARN-1429: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12631149/YARN-1429.linux.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3187//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3187//console This message is automatically generated. YARN_CLASSPATH is referenced in yarn command comments but doesn't do anything - Key: YARN-1429 URL: https://issues.apache.org/jira/browse/YARN-1429 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Sandy Ryza Assignee: Jarek Jarcec Cecho Priority: Trivial Labels: newbie Attachments: YARN-1429.linux.patch, YARN-1429.patch, YARN-1429.patch YARN_CLASSPATH is referenced in the comments in ./hadoop-yarn-project/hadoop-yarn/bin/yarn and ./hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd, but doesn't do anything. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (YARN-1700) AHS records non-launched containers
[ https://issues.apache.org/jira/browse/YARN-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen resolved YARN-1700. --- Resolution: Duplicate AHS records non-launched containers --- Key: YARN-1700 URL: https://issues.apache.org/jira/browse/YARN-1700 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: YARN-1700.v01.patch, YARN-1700.v02.patch When testing AHS with a MR sleep job, AHS sometimes threw NPE out of AppAttemptBlock.render because logUrl in container report was null. I realized that this is because AHS may record containers that never launch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1700) AHS records non-launched containers
[ https://issues.apache.org/jira/browse/YARN-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912551#comment-13912551 ] Zhijie Shen commented on YARN-1700: --- bq. Log URL is physically nullable, but in the current source code as is it never changes after the launch and is not null. Good catch. This is another issue of the current code, which should be fixed in YARN-1685 as well. See my prior comment in YARN-1413: https://issues.apache.org/jira/browse/YARN-1413?focusedCommentId=13866844page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13866844 When the container is running, the log url should point to the NM web page, which serves the running container log; when it is finished, the log url should then be updated (See TODO in RMContainerImpl.java), and point to the AHS web page, which serves the aggregated log. AHS records non-launched containers --- Key: YARN-1700 URL: https://issues.apache.org/jira/browse/YARN-1700 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: YARN-1700.v01.patch, YARN-1700.v02.patch When testing AHS with a MR sleep job, AHS sometimes threw NPE out of AppAttemptBlock.render because logUrl in container report was null. I realized that this is because AHS may record containers that never launch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
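A toy sketch of the routing rule described in the comment above (the paths and addresses are made up; the real URLs come from the NM and AHS web apps): point at the NM while the container is running and at the AHS once it has finished.
{code}
// Illustration only: choose the log URL based on whether the container finished.
public final class LogUrl {
  public static String forContainer(boolean finished, String nmHttpAddress,
      String ahsHttpAddress, String containerId) {
    return finished
        ? "http://" + ahsHttpAddress + "/applicationhistory/logs/" + containerId
        : "http://" + nmHttpAddress + "/node/containerlogs/" + containerId;
  }
}
{code}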
[jira] [Updated] (YARN-1685) Log URL should be different when the container is running and finished, and null case needs to be handled
[ https://issues.apache.org/jira/browse/YARN-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1685: -- Summary: Log URL should be different when the container is running and finished, and null case needs to be handled (was: [YARN-321] Logs link can be null so avoid NPE) Log URL should be different when the container is running and finished, and null case needs to be handled - Key: YARN-1685 URL: https://issues.apache.org/jira/browse/YARN-1685 Project: Hadoop YARN Issue Type: Sub-task Reporter: Mayank Bansal Assignee: Mayank Bansal Fix For: YARN-321 Attachments: YARN-1685-1.patch https://issues.apache.org/jira/browse/YARN-1413?focusedCommentId=13866416page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13866416 https://issues.apache.org/jira/browse/YARN-1413?focusedCommentId=13866844page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13866844 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1765) Write test cases to verify that killApplication API works in RM HA
Xuan Gong created YARN-1765: --- Summary: Write test cases to verify that killApplication API works in RM HA Key: YARN-1765 URL: https://issues.apache.org/jira/browse/YARN-1765 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1765) Write test cases to verify that killApplication API works in RM HA
[ https://issues.apache.org/jira/browse/YARN-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1765: Attachment: YARN-1765.1.patch Write test cases to verify that killApplication API works in RM HA -- Key: YARN-1765 URL: https://issues.apache.org/jira/browse/YARN-1765 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1765.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1561) Fix a generic type warning in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912565#comment-13912565 ] Hudson commented on YARN-1561: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5226 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5226/]) YARN-1561. Fix a generic type warning in FairScheduler. (Chen He via junping_du) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1571924) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java Fix a generic type warning in FairScheduler --- Key: YARN-1561 URL: https://issues.apache.org/jira/browse/YARN-1561 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Junping Du Assignee: Chen He Priority: Minor Labels: newbie Fix For: 2.5.0 Attachments: yarn-1561.patch The Comparator below should be specified with type: private Comparator nodeAvailableResourceComparator = new NodeAvailableResourceComparator(); -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1765) Write test cases to verify that killApplication API works in RM HA
[ https://issues.apache.org/jira/browse/YARN-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912590#comment-13912590 ] Hadoop QA commented on YARN-1765: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12631158/YARN-1765.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3188//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3188//console This message is automatically generated. Write test cases to verify that killApplication API works in RM HA -- Key: YARN-1765 URL: https://issues.apache.org/jira/browse/YARN-1765 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-1765.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)