[jira] [Commented] (YARN-3443) Create a 'ResourceHandler' subsystem to ease addition of support for new resource types on the NM

2015-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394200#comment-14394200
 ] 

Hadoop QA commented on YARN-3443:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12709194/YARN-3443.002.patch
  against trunk revision 72f6bd4.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1150 javac 
compiler warnings (more than the trunk's current 1148 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7211//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7211//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7211//console

This message is automatically generated.

 Create a 'ResourceHandler' subsystem to ease addition of support for new 
 resource types on the NM
 -

 Key: YARN-3443
 URL: https://issues.apache.org/jira/browse/YARN-3443
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Sidharta Seethana
Assignee: Sidharta Seethana
 Attachments: YARN-3443.001.patch, YARN-3443.002.patch


 The current cgroups implementation is closely tied to supporting CPU as a 
 resource. We need to separate out CGroups support as well as provide a simple 
 ResourceHandler subsystem that will enable us to add support for new resource 
 types on the NM, e.g. network, disk, etc.
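
As an illustration of the kind of abstraction this asks for, a minimal sketch 
(hypothetical interface, not necessarily the API introduced by this patch) of a 
per-resource-type handler that a cgroups-backed implementation could plug into:

{code}
import java.util.List;

// Hedged sketch only -- hypothetical names, not the committed YARN-3443 API.
// Each resource type (CPU, network, disk, ...) gets its own handler with a common
// lifecycle instead of wiring cgroups logic directly into the CPU code path.
public interface ResourceHandler {

  /** One-time setup on NM start, e.g. mount or verify a cgroup hierarchy. */
  void bootstrap() throws ResourceHandlerException;

  /** Called before a container launches; returns privileged operations to apply,
   *  e.g. create a cgroup for the container and write its limits. */
  List<String> preStart(String containerId) throws ResourceHandlerException;

  /** Called after a container finishes; cleans up per-container state. */
  List<String> postComplete(String containerId) throws ResourceHandlerException;

  /** Exception type used by this sketch. */
  class ResourceHandlerException extends Exception {
    public ResourceHandlerException(String message) {
      super(message);
    }
  }
}
{code}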



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3443) Create a 'ResourceHandler' subsystem to ease addition of support for new resource types on the NM

2015-04-03 Thread Sidharta Seethana (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sidharta Seethana updated YARN-3443:

Attachment: YARN-3443.002.patch

Reattaching the patch with the findbugs warning fixed. Not sure what to make of 
the javac warnings here, however: 
https://builds.apache.org/job/PreCommit-YARN-Build/7210//artifact/patchprocess/diffJavacWarnings.txt

 Create a 'ResourceHandler' subsystem to ease addition of support for new 
 resource types on the NM
 -

 Key: YARN-3443
 URL: https://issues.apache.org/jira/browse/YARN-3443
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Sidharta Seethana
Assignee: Sidharta Seethana
 Attachments: YARN-3443.001.patch, YARN-3443.002.patch


 The current cgroups implementation is closely tied to supporting CPU as a 
 resource. We need to separate out CGroups support as well as provide a simple 
 ResourceHandler subsystem that will enable us to add support for new resource 
 types on the NM, e.g. network, disk, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3446) FairScheduler HeadRoom calculation should exclude nodes in the blacklist.

2015-04-03 Thread zhihai xu (JIRA)
zhihai xu created YARN-3446:
---

 Summary: FairScheduler HeadRoom calculation should exclude nodes 
in the blacklist.
 Key: YARN-3446
 URL: https://issues.apache.org/jira/browse/YARN-3446
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: zhihai xu
Assignee: zhihai xu


FairScheduler headroom calculation should exclude nodes in the blacklist.
MRAppMaster does not preempt reducers because the headroom used in the reducer 
preemption calculation includes blacklisted nodes. This makes jobs hang forever 
(the ResourceManager does not assign any new containers on blacklisted nodes, 
but the availableResources the AM gets from the RM still includes the available 
resources of blacklisted nodes).
This issue is similar to YARN-1680, which covers the Capacity Scheduler.
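
A minimal sketch of the arithmetic this implies (illustrative only, not the 
FairScheduler patch itself): the headroom reported to the AM should subtract 
resources that sit on nodes the application has blacklisted.

{code}
import java.util.Map;
import java.util.Set;

// Illustrative only; resources are reduced to memory in MB for brevity.
public class HeadroomSketch {
  /**
   * @param queueAvailableMb  memory the queue could still hand to this app
   * @param nodeAvailableMb   per-node available memory in the cluster
   * @param blacklistedNodes  nodes the AM has blacklisted
   */
  static long headroomMb(long queueAvailableMb,
                         Map<String, Long> nodeAvailableMb,
                         Set<String> blacklistedNodes) {
    long blacklistedAvailable = 0;
    for (Map.Entry<String, Long> e : nodeAvailableMb.entrySet()) {
      if (blacklistedNodes.contains(e.getKey())) {
        blacklistedAvailable += e.getValue();
      }
    }
    // Resources on blacklisted nodes can never be assigned to this app,
    // so they must not be reported as headroom.
    return Math.max(0, queueAvailableMb - blacklistedAvailable);
  }
}
{code}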



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3410) YARN admin should be able to remove individual application records from RMStateStore

2015-04-03 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394249#comment-14394249
 ] 

Rohith commented on YARN-3410:
--

Attached the initial patch for removing individual applications from state 
store.

 YARN admin should be able to remove individual application records from 
 RMStateStore
 

 Key: YARN-3410
 URL: https://issues.apache.org/jira/browse/YARN-3410
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, yarn
Reporter: Wangda Tan
Assignee: Rohith
Priority: Critical
 Attachments: 0001-YARN-3410-v1.patch


 When the RM state store enters an unexpected state (one example is YARN-2340, 
 where an attempt is not in a final state but the app has already completed), 
 the RM can never come up unless the RMStateStore is formatted.
 I think we should support removing individual application records from the 
 RMStateStore, so that the RM admin can choose between waiting for a fix and 
 formatting the state store.
 In addition, the RM should be able to report all fatal errors (which will shut 
 down the RM) during app recovery; this can save the admin some time when 
 removing apps in a bad state.
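
To make the proposal concrete, here is a rough sketch (hypothetical layout and 
names; the attached patch may be structured quite differently) of an offline 
admin operation that deletes a single application's record instead of 
formatting the whole store:

{code}
// Hypothetical sketch only -- not the API of the attached patch. The point is that
// an admin tool removes exactly one application's record, so the RM can recover the
// remaining applications instead of requiring a full state-store format.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class RemoveAppFromStateStore {
  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println("Usage: RemoveAppFromStateStore <stateStoreRoot> <appId>");
      System.exit(1);
    }
    // For a filesystem-based store, assume each app's state lives under its own
    // directory; this layout is illustrative, not the real RMStateStore layout.
    Path appDir = Paths.get(args[0], "RMAppRoot", args[1]);
    if (Files.exists(appDir)) {
      // Delete attempt files first, then the application directory itself.
      Files.walk(appDir)
           .sorted((a, b) -> b.compareTo(a))   // children before parents
           .forEach(p -> p.toFile().delete());
      System.out.println("Removed stored state for " + args[1]);
    } else {
      System.out.println("No stored state found for " + args[1]);
    }
  }
}
{code}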



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3410) YARN admin should be able to remove individual application records from RMStateStore

2015-04-03 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394248#comment-14394248
 ] 

Rohith commented on YARN-3410:
--

bq. what's the use case of using rmadmin removing a state while RM is running?
Practically, rmadmin does not need to remove the RM state store while the RM is 
running. I was thinking of the case where an exception happens during recovery, 
as in YARN-2340: the RM never exits, it keeps switching to standby and trying 
to become active. In this case, the admin can format the state store without 
stopping the RM.

bq. it's better that RM can log all errors of applications recovering before 
die. With this, admin can know which application states caused RM die.
I think it will be hard to tell which application caused the problem in the 
case of RuntimeExceptions. The admin would need to backtrack through the 
exceptions in the logs to identify it.

 YARN admin should be able to remove individual application records from 
 RMStateStore
 

 Key: YARN-3410
 URL: https://issues.apache.org/jira/browse/YARN-3410
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, yarn
Reporter: Wangda Tan
Assignee: Rohith
Priority: Critical
 Attachments: 0001-YARN-3410-v1.patch


 When the RM state store enters an unexpected state (one example is YARN-2340, 
 where an attempt is not in a final state but the app has already completed), 
 the RM can never come up unless the RMStateStore is formatted.
 I think we should support removing individual application records from the 
 RMStateStore, so that the RM admin can choose between waiting for a fix and 
 formatting the state store.
 In addition, the RM should be able to report all fatal errors (which will shut 
 down the RM) during app recovery; this can save the admin some time when 
 removing apps in a bad state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3410) YARN admin should be able to remove individual application records from RMStateStore

2015-04-03 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3410:
-
Attachment: 0001-YARN-3410-v1.patch

 YARN admin should be able to remove individual application records from 
 RMStateStore
 

 Key: YARN-3410
 URL: https://issues.apache.org/jira/browse/YARN-3410
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, yarn
Reporter: Wangda Tan
Assignee: Rohith
Priority: Critical
 Attachments: 0001-YARN-3410-v1.patch


 When the RM state store enters an unexpected state (one example is YARN-2340, 
 where an attempt is not in a final state but the app has already completed), 
 the RM can never come up unless the RMStateStore is formatted.
 I think we should support removing individual application records from the 
 RMStateStore, so that the RM admin can choose between waiting for a fix and 
 formatting the state store.
 In addition, the RM should be able to report all fatal errors (which will shut 
 down the RM) during app recovery; this can save the admin some time when 
 removing apps in a bad state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2901) Add errors and warning metrics page to RM, NM web UI

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394305#comment-14394305
 ] 

Hudson commented on YARN-2901:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #152 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/152/])
YARN-2901. Add errors and warning metrics page to RM, NM web UI. (Varun Vasudev 
via wangda) (wangda: rev bad070fe15a642cc6f3a165612fbd272187e03cb)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLog4jWarningErrorMetricsAppender.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Log4jWarningErrorMetricsAppender.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/WebServer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/ErrorsAndWarningsBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMErrorsAndWarningsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMErrorsAndWarningsPage.java
* hadoop-common-project/hadoop-common/src/main/conf/log4j.properties


 Add errors and warning metrics page to RM, NM web UI
 

 Key: YARN-2901
 URL: https://issues.apache.org/jira/browse/YARN-2901
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Fix For: 2.8.0

 Attachments: Exception collapsed.png, Exception expanded.jpg, Screen 
 Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch, 
 apache-yarn-2901.1.patch, apache-yarn-2901.2.patch, apache-yarn-2901.3.patch, 
 apache-yarn-2901.4.patch, apache-yarn-2901.5.patch


 It would be really useful to have statistics on the number of errors and 
 warnings in the RM and NM web UI. I'm thinking about -
 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day
 2. The top 'n'(20?) most common exceptions in the past 5 min/1 hour/12 
 hours/day
 By errors and warnings I'm referring to the log level.
 I suspect we can probably achieve this by writing a custom appender?(I'm open 
 to suggestions on alternate mechanisms for implementing this).
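
Since the description suggests a custom appender, a minimal sketch of that 
approach follows (hypothetical class name; the Log4jWarningErrorMetricsAppender 
listed in the commit above is considerably more elaborate and is wired up via 
log4j.properties):

{code}
import java.util.concurrent.atomic.AtomicLong;

import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.Level;
import org.apache.log4j.spi.LoggingEvent;

// Minimal sketch of the "custom appender" idea: count WARN and ERROR events as
// they are logged so a web UI block can render the totals later.
public class WarningErrorCountingAppender extends AppenderSkeleton {
  private static final AtomicLong errors = new AtomicLong();
  private static final AtomicLong warnings = new AtomicLong();

  @Override
  protected void append(LoggingEvent event) {
    if (Level.ERROR.equals(event.getLevel())) {
      errors.incrementAndGet();
    } else if (Level.WARN.equals(event.getLevel())) {
      warnings.incrementAndGet();
    }
  }

  public static long getErrorCount() {
    return errors.get();
  }

  public static long getWarningCount() {
    return warnings.get();
  }

  @Override
  public void close() {
    // nothing to release
  }

  @Override
  public boolean requiresLayout() {
    return false;
  }
}
{code}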



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3365) Add support for using the 'tc' tool via container-executor

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394307#comment-14394307
 ] 

Hudson commented on YARN-3365:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #152 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/152/])
YARN-3365. Enhanced NodeManager to support using the 'tc' tool via 
container-executor for outbound network traffic control. Contributed by 
Sidharta Seethana. (vinodkv: rev b21c72777ae664b08fd1a93b4f88fa43f2478d94)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c


 Add support for using the 'tc' tool via container-executor
 --

 Key: YARN-3365
 URL: https://issues.apache.org/jira/browse/YARN-3365
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Sidharta Seethana
Assignee: Sidharta Seethana
 Fix For: 2.8.0

 Attachments: YARN-3365.001.patch, YARN-3365.002.patch, 
 YARN-3365.003.patch


 We need the following functionality :
 1) modify network interface traffic shaping rules - to be able to attach a 
 qdisc, create child classes etc
 2) read existing rules in place 
 3) read stats for the various classes 
 Using tc requires elevated privileges - hence this functionality is to be 
 made available via container-executor. 
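
For readers unfamiliar with tc, the three pieces of functionality above map to 
invocations of roughly the following shape (device names, handles and rates are 
illustrative; the actual argument lists are assembled inside container-executor):

{code}
import java.util.Arrays;
import java.util.List;

// Illustrative only: example 'tc' argument lists a privileged helper might run.
public class TcCommandSketch {
  /** 1) Attach a root HTB qdisc to a device so child classes can be created. */
  static List<String> attachRootQdisc(String device) {
    return Arrays.asList("tc", "qdisc", "add", "dev", device,
        "root", "handle", "42:", "htb", "default", "2");
  }

  /** 1) Create a child class with a bandwidth limit for a group of containers. */
  static List<String> addChildClass(String device, String classId, String rate) {
    return Arrays.asList("tc", "class", "add", "dev", device,
        "parent", "42:", "classid", classId, "htb", "rate", rate);
  }

  /** 2) Read the rules currently in place on the device. */
  static List<String> showClasses(String device) {
    return Arrays.asList("tc", "class", "show", "dev", device);
  }

  /** 3) Read per-class statistics (bytes/packets sent, drops, etc.). */
  static List<String> showClassStats(String device) {
    return Arrays.asList("tc", "-s", "class", "show", "dev", device);
  }
}
{code}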



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler queue

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394313#comment-14394313
 ] 

Hudson commented on YARN-3415:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #152 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/152/])
YARN-3415. Non-AM containers can be counted towards amResourceUsage of a 
fairscheduler queue (Zhihai Xu via Sandy Ryza) (sandy: rev 
6a6a59db7f1bfda47c3c14fb49676a7b22d2eb06)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java


 Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler 
 queue
 --

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical
 Fix For: 2.8.0

 Attachments: YARN-3415.000.patch, YARN-3415.001.patch, 
 YARN-3415.002.patch


 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high and then the cluster got 
 deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]
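
A minimal, self-contained sketch of the more robust rule the summary hints at 
(hypothetical names; the attached patches work on FSAppAttempt/FSLeafQueue 
directly): charge a container to amResourceUsage only when it is explicitly the 
AM container, rather than inferring that from the live-container count.

{code}
// Hypothetical sketch only, not the attached patch.
public class AmShareTrackerSketch {
  private long amResourceUsageMb = 0;

  /** Called when the scheduler hands a container to an application attempt. */
  public void onContainerAllocated(boolean isAmContainer, long containerMb) {
    if (isAmContainer) {          // explicit flag, not liveContainers.size() == 1
      amResourceUsageMb += containerMb;
    }
  }

  /** Called when a container completes or is released. */
  public void onContainerReleased(boolean isAmContainer, long containerMb) {
    if (isAmContainer) {
      amResourceUsageMb -= containerMb;
    }
  }

  public long getAmResourceUsageMb() {
    return amResourceUsageMb;
  }
}
{code}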



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2729) Support script based NodeLabelsProvider Interface in Distributed Node Label Configuration Setup

2015-04-03 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394149#comment-14394149
 ] 

Naganarasimha G R commented on YARN-2729:
-


bq. Revisited interval, I think it's better to make it to be provider 
configuration instead of script-provider-only configuration. Since 
config/script will share it (I remember I have some back-and-forth opinions 
here).
:) Agree, I don't mind redoing it, as long as it's for a better reason, and I 
was expecting changes here anyway.
The other comments on configuration will be addressed.

bq. I feel like ScriptBased and ConfigBased can share some implementations, 
they will all init a time task, get interval and run, check timeout 
(meaningless for config-based), etc. Can you make an abstract class and 
inherited by ScriptBased?
I can do this (which I feel is correct), but if we do this then it might not be 
possible to generalize much between NodeHealthScriptRunner and 
ScriptBasedNodeLabelsProvider, which I feel should be OK. A rough sketch of 
such an abstract base class is below.

bq. checkAndThrowLabelName should be called in NodeStatusUpdaterImpl
In a way it would be better in NodeStatusUpdaterImpl, since we support an 
external class as a provider, but I had earlier thought it would not be good to 
add extra checks to the heartbeat flow.

bq. label need to be trim() when called checkAndThrowLabelName(...)
Not required, as checkAndThrowLabelName takes care of it, but the test case is 
missing; I will add it for NodeStatusUpdaterImpl.
Other issues will be reworked in the next patch.
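
A rough sketch (hypothetical names, not the committed code) of the abstract 
base class discussed above, where the timer and fetch interval are handled once 
and a config-based or script-based provider only implements how labels are 
obtained:

{code}
import java.util.Set;
import java.util.Timer;
import java.util.TimerTask;

// Hedged sketch only: a shared base for script-based and config-based providers.
public abstract class AbstractNodeLabelsProvider {
  private final long intervalMs;
  private volatile Set<String> nodeLabels;
  private Timer timer;

  protected AbstractNodeLabelsProvider(long intervalMs) {
    this.intervalMs = intervalMs;
  }

  /** Subclasses fetch labels from a script, a config file, etc. */
  protected abstract Set<String> fetchNodeLabels() throws Exception;

  public void start() {
    timer = new Timer("NodeLabelsProvider", true);
    timer.scheduleAtFixedRate(new TimerTask() {
      @Override
      public void run() {
        try {
          nodeLabels = fetchNodeLabels();
        } catch (Exception e) {
          // keep the previously fetched labels on failure
        }
      }
    }, 0, intervalMs);
  }

  public void stop() {
    if (timer != null) {
      timer.cancel();
    }
  }

  public Set<String> getNodeLabels() {
    return nodeLabels;
  }
}
{code}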

 Support script based NodeLabelsProvider Interface in Distributed Node Label 
 Configuration Setup
 ---

 Key: YARN-2729
 URL: https://issues.apache.org/jira/browse/YARN-2729
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
 Fix For: 2.8.0

 Attachments: YARN-2729.20141023-1.patch, YARN-2729.20141024-1.patch, 
 YARN-2729.20141031-1.patch, YARN-2729.20141120-1.patch, 
 YARN-2729.20141210-1.patch, YARN-2729.20150309-1.patch, 
 YARN-2729.20150322-1.patch, YARN-2729.20150401-1.patch, 
 YARN-2729.20150402-1.patch


 Support script based NodeLabelsProvider Interface in Distributed Node Label 
 Configuration Setup . 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2901) Add errors and warning metrics page to RM, NM web UI

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394327#comment-14394327
 ] 

Hudson commented on YARN-2901:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #886 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/886/])
YARN-2901. Add errors and warning metrics page to RM, NM web UI. (Varun Vasudev 
via wangda) (wangda: rev bad070fe15a642cc6f3a165612fbd272187e03cb)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLog4jWarningErrorMetricsAppender.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/WebServer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMErrorsAndWarningsPage.java
* hadoop-common-project/hadoop-common/src/main/conf/log4j.properties
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/ErrorsAndWarningsBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Log4jWarningErrorMetricsAppender.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMErrorsAndWarningsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java


 Add errors and warning metrics page to RM, NM web UI
 

 Key: YARN-2901
 URL: https://issues.apache.org/jira/browse/YARN-2901
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Fix For: 2.8.0

 Attachments: Exception collapsed.png, Exception expanded.jpg, Screen 
 Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch, 
 apache-yarn-2901.1.patch, apache-yarn-2901.2.patch, apache-yarn-2901.3.patch, 
 apache-yarn-2901.4.patch, apache-yarn-2901.5.patch


 It would be really useful to have statistics on the number of errors and 
 warnings in the RM and NM web UI. I'm thinking about -
 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day
 2. The top 'n'(20?) most common exceptions in the past 5 min/1 hour/12 
 hours/day
 By errors and warnings I'm referring to the log level.
 I suspect we can probably achieve this by writing a custom appender?(I'm open 
 to suggestions on alternate mechanisms for implementing this).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3365) Add support for using the 'tc' tool via container-executor

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394329#comment-14394329
 ] 

Hudson commented on YARN-3365:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #886 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/886/])
YARN-3365. Enhanced NodeManager to support using the 'tc' tool via 
container-executor for outbound network traffic control. Contributed by 
Sidharta Seethana. (vinodkv: rev b21c72777ae664b08fd1a93b4f88fa43f2478d94)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c


 Add support for using the 'tc' tool via container-executor
 --

 Key: YARN-3365
 URL: https://issues.apache.org/jira/browse/YARN-3365
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Sidharta Seethana
Assignee: Sidharta Seethana
 Fix For: 2.8.0

 Attachments: YARN-3365.001.patch, YARN-3365.002.patch, 
 YARN-3365.003.patch


 We need the following functionality :
 1) modify network interface traffic shaping rules - to be able to attach a 
 qdisc, create child classes etc
 2) read existing rules in place 
 3) read stats for the various classes 
 Using tc requires elevated privileges - hence this functionality is to be 
 made available via container-executor. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler queue

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394335#comment-14394335
 ] 

Hudson commented on YARN-3415:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #886 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/886/])
YARN-3415. Non-AM containers can be counted towards amResourceUsage of a 
fairscheduler queue (Zhihai Xu via Sandy Ryza) (sandy: rev 
6a6a59db7f1bfda47c3c14fb49676a7b22d2eb06)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* hadoop-yarn-project/CHANGES.txt


 Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler 
 queue
 --

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical
 Fix For: 2.8.0

 Attachments: YARN-3415.000.patch, YARN-3415.001.patch, 
 YARN-3415.002.patch


 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high and then the cluster got 
 deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers

2015-04-03 Thread Do Hoai Nam (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394546#comment-14394546
 ] 

Do Hoai Nam commented on YARN-2140:
---

For the case of ingress traffic, you can check our solution in YARN-2681 
(Support bandwidth enforcement for containers while reading from HDFS), 
https://issues.apache.org/jira/browse/YARN-2681, and the related paper 
(http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf).

 Add support for network IO isolation/scheduling for containers
 --

 Key: YARN-2140
 URL: https://issues.apache.org/jira/browse/YARN-2140
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: NetworkAsAResourceDesign.pdf






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler queue

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394507#comment-14394507
 ] 

Hudson commented on YARN-3415:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #143 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/143/])
YARN-3415. Non-AM containers can be counted towards amResourceUsage of a 
fairscheduler queue (Zhihai Xu via Sandy Ryza) (sandy: rev 
6a6a59db7f1bfda47c3c14fb49676a7b22d2eb06)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java


 Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler 
 queue
 --

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical
 Fix For: 2.8.0

 Attachments: YARN-3415.000.patch, YARN-3415.001.patch, 
 YARN-3415.002.patch


 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high and then the cluster got 
 deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2901) Add errors and warning metrics page to RM, NM web UI

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394499#comment-14394499
 ] 

Hudson commented on YARN-2901:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #143 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/143/])
YARN-2901. Add errors and warning metrics page to RM, NM web UI. (Varun Vasudev 
via wangda) (wangda: rev bad070fe15a642cc6f3a165612fbd272187e03cb)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Log4jWarningErrorMetricsAppender.java
* hadoop-common-project/hadoop-common/src/main/conf/log4j.properties
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/WebServer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/ErrorsAndWarningsBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMErrorsAndWarningsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMErrorsAndWarningsPage.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLog4jWarningErrorMetricsAppender.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMController.java


 Add errors and warning metrics page to RM, NM web UI
 

 Key: YARN-2901
 URL: https://issues.apache.org/jira/browse/YARN-2901
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Fix For: 2.8.0

 Attachments: Exception collapsed.png, Exception expanded.jpg, Screen 
 Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch, 
 apache-yarn-2901.1.patch, apache-yarn-2901.2.patch, apache-yarn-2901.3.patch, 
 apache-yarn-2901.4.patch, apache-yarn-2901.5.patch


 It would be really useful to have statistics on the number of errors and 
 warnings in the RM and NM web UI. I'm thinking about -
 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day
 2. The top 'n'(20?) most common exceptions in the past 5 min/1 hour/12 
 hours/day
 By errors and warnings I'm referring to the log level.
 I suspect we can probably achieve this by writing a custom appender?(I'm open 
 to suggestions on alternate mechanisms for implementing this).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3365) Add support for using the 'tc' tool via container-executor

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394501#comment-14394501
 ] 

Hudson commented on YARN-3365:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #143 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/143/])
YARN-3365. Enhanced NodeManager to support using the 'tc' tool via 
container-executor for outbound network traffic control. Contributed by 
Sidharta Seethana. (vinodkv: rev b21c72777ae664b08fd1a93b4f88fa43f2478d94)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c


 Add support for using the 'tc' tool via container-executor
 --

 Key: YARN-3365
 URL: https://issues.apache.org/jira/browse/YARN-3365
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Sidharta Seethana
Assignee: Sidharta Seethana
 Fix For: 2.8.0

 Attachments: YARN-3365.001.patch, YARN-3365.002.patch, 
 YARN-3365.003.patch


 We need the following functionality :
 1) modify network interface traffic shaping rules - to be able to attach a 
 qdisc, create child classes etc
 2) read existing rules in place 
 3) read stats for the various classes 
 Using tc requires elevated privileges - hence this functionality is to be 
 made available via container-executor. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler queue

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394522#comment-14394522
 ] 

Hudson commented on YARN-3415:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2084 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2084/])
YARN-3415. Non-AM containers can be counted towards amResourceUsage of a 
fairscheduler queue (Zhihai Xu via Sandy Ryza) (sandy: rev 
6a6a59db7f1bfda47c3c14fb49676a7b22d2eb06)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java


 Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler 
 queue
 --

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical
 Fix For: 2.8.0

 Attachments: YARN-3415.000.patch, YARN-3415.001.patch, 
 YARN-3415.002.patch


 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high and then the cluster got 
 deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2901) Add errors and warning metrics page to RM, NM web UI

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394514#comment-14394514
 ] 

Hudson commented on YARN-2901:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2084 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2084/])
YARN-2901. Add errors and warning metrics page to RM, NM web UI. (Varun Vasudev 
via wangda) (wangda: rev bad070fe15a642cc6f3a165612fbd272187e03cb)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMErrorsAndWarningsPage.java
* hadoop-common-project/hadoop-common/src/main/conf/log4j.properties
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLog4jWarningErrorMetricsAppender.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMErrorsAndWarningsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/ErrorsAndWarningsBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/WebServer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Log4jWarningErrorMetricsAppender.java
* hadoop-yarn-project/CHANGES.txt


 Add errors and warning metrics page to RM, NM web UI
 

 Key: YARN-2901
 URL: https://issues.apache.org/jira/browse/YARN-2901
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Fix For: 2.8.0

 Attachments: Exception collapsed.png, Exception expanded.jpg, Screen 
 Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch, 
 apache-yarn-2901.1.patch, apache-yarn-2901.2.patch, apache-yarn-2901.3.patch, 
 apache-yarn-2901.4.patch, apache-yarn-2901.5.patch


 It would be really useful to have statistics on the number of errors and 
 warnings in the RM and NM web UI. I'm thinking about -
 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day
 2. The top 'n'(20?) most common exceptions in the past 5 min/1 hour/12 
 hours/day
 By errors and warnings I'm referring to the log level.
 I suspect we can probably achieve this by writing a custom appender?(I'm open 
 to suggestions on alternate mechanisms for implementing this).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3365) Add support for using the 'tc' tool via container-executor

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394516#comment-14394516
 ] 

Hudson commented on YARN-3365:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2084 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2084/])
YARN-3365. Enhanced NodeManager to support using the 'tc' tool via 
container-executor for outbound network traffic control. Contributed by 
Sidharta Seethana. (vinodkv: rev b21c72777ae664b08fd1a93b4f88fa43f2478d94)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h


 Add support for using the 'tc' tool via container-executor
 --

 Key: YARN-3365
 URL: https://issues.apache.org/jira/browse/YARN-3365
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Sidharta Seethana
Assignee: Sidharta Seethana
 Fix For: 2.8.0

 Attachments: YARN-3365.001.patch, YARN-3365.002.patch, 
 YARN-3365.003.patch


 We need the following functionality :
 1) modify network interface traffic shaping rules - to be able to attach a 
 qdisc, create child classes etc
 2) read existing rules in place 
 3) read stats for the various classes 
 Using tc requires elevated privileges - hence this functionality is to be 
 made available via container-executor. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3444) Fixed typo (capability)

2015-04-03 Thread Gabor Liptak (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Liptak updated YARN-3444:
---
Attachment: YARN-3444.patch

 Fixed typo (capability)
 ---

 Key: YARN-3444
 URL: https://issues.apache.org/jira/browse/YARN-3444
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: applications/distributed-shell
Reporter: Gabor Liptak
Priority: Minor
 Attachments: YARN-3444.patch


 Fixed typo (capability)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.

2015-04-03 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-1680:

Component/s: capacityscheduler

 availableResources sent to applicationMaster in heartbeat should exclude 
 blacklistedNodes free memory.
 --

 Key: YARN-1680
 URL: https://issues.apache.org/jira/browse/YARN-1680
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Affects Versions: 2.2.0, 2.3.0
 Environment: SuSE 11 SP2 + Hadoop-2.3 
Reporter: Rohith
Assignee: Chen He
 Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, 
 YARN-1680-v2.patch, YARN-1680.patch


 There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. 
 Cluster slow start is set to 1.
 A job is running whose reducer tasks occupy 29GB of the cluster. One 
 NodeManager (NM-4) became unstable (3 map tasks got killed), so MRAppMaster 
 blacklisted the unstable NodeManager (NM-4). All reducer tasks are now running 
 in the cluster.
 MRAppMaster does not preempt the reducers because the headroom used in the 
 reducer preemption calculation includes blacklisted nodes' memory. This makes 
 jobs hang forever (the ResourceManager does not assign any new containers on 
 blacklisted nodes, but the availableResource it returns still considers the 
 whole cluster's free memory).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler queue

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394822#comment-14394822
 ] 

Hudson commented on YARN-3415:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #153 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/153/])
YARN-3415. Non-AM containers can be counted towards amResourceUsage of a 
fairscheduler queue (Zhihai Xu via Sandy Ryza) (sandy: rev 
6a6a59db7f1bfda47c3c14fb49676a7b22d2eb06)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java


 Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler 
 queue
 --

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical
 Fix For: 2.8.0

 Attachments: YARN-3415.000.patch, YARN-3415.001.patch, 
 YARN-3415.002.patch


 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high and then the cluster got 
 deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3365) Add support for using the 'tc' tool via container-executor

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394816#comment-14394816
 ] 

Hudson commented on YARN-3365:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #153 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/153/])
YARN-3365. Enhanced NodeManager to support using the 'tc' tool via 
container-executor for outbound network traffic control. Contributed by 
Sidharta Seethana. (vinodkv: rev b21c72777ae664b08fd1a93b4f88fa43f2478d94)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* hadoop-yarn-project/CHANGES.txt


 Add support for using the 'tc' tool via container-executor
 --

 Key: YARN-3365
 URL: https://issues.apache.org/jira/browse/YARN-3365
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Sidharta Seethana
Assignee: Sidharta Seethana
 Fix For: 2.8.0

 Attachments: YARN-3365.001.patch, YARN-3365.002.patch, 
 YARN-3365.003.patch


 We need the following functionality :
 1) modify network interface traffic shaping rules - to be able to attach a 
 qdisc, create child classes etc
 2) read existing rules in place 
 3) read stats for the various classes 
 Using tc requires elevated privileges - hence this functionality is to be 
 made available via container-executor. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2901) Add errors and warning metrics page to RM, NM web UI

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394814#comment-14394814
 ] 

Hudson commented on YARN-2901:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #153 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/153/])
YARN-2901. Add errors and warning metrics page to RM, NM web UI. (Varun Vasudev 
via wangda) (wangda: rev bad070fe15a642cc6f3a165612fbd272187e03cb)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Log4jWarningErrorMetricsAppender.java
* hadoop-common-project/hadoop-common/src/main/conf/log4j.properties
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/WebServer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/ErrorsAndWarningsBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLog4jWarningErrorMetricsAppender.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMErrorsAndWarningsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMErrorsAndWarningsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java


 Add errors and warning metrics page to RM, NM web UI
 

 Key: YARN-2901
 URL: https://issues.apache.org/jira/browse/YARN-2901
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Fix For: 2.8.0

 Attachments: Exception collapsed.png, Exception expanded.jpg, Screen 
 Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch, 
 apache-yarn-2901.1.patch, apache-yarn-2901.2.patch, apache-yarn-2901.3.patch, 
 apache-yarn-2901.4.patch, apache-yarn-2901.5.patch


 It would be really useful to have statistics on the number of errors and 
 warnings in the RM and NM web UI. I'm thinking about -
 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day
 2. The top 'n'(20?) most common exceptions in the past 5 min/1 hour/12 
 hours/day
 By errors and warnings I'm referring to the log level.
 I suspect we can probably achieve this by writing a custom appender?(I'm open 
 to suggestions on alternate mechanisms for implementing this).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3444) Fixed typo (capability)

2015-04-03 Thread Gabor Liptak (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Liptak updated YARN-3444:
---
Target Version/s: 2.6.1

 Fixed typo (capability)
 ---

 Key: YARN-3444
 URL: https://issues.apache.org/jira/browse/YARN-3444
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: applications/distributed-shell
Reporter: Gabor Liptak
Priority: Minor
 Attachments: YARN-3444.patch


 Fixed typo (capability)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2140) Add support for network IO isolation/scheduling for containers

2015-04-03 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-2140:
--
Assignee: Sidharta Seethana  (was: Wei Yan)

 Add support for network IO isolation/scheduling for containers
 --

 Key: YARN-2140
 URL: https://issues.apache.org/jira/browse/YARN-2140
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Wei Yan
Assignee: Sidharta Seethana
 Attachments: NetworkAsAResourceDesign.pdf






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-04-03 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-3411:
-
Attachment: ATSv2BackendHBaseSchemaproposal.pdf


Attaching the schema proposal for storing ATS information in HBase. I also have 
example queries listed and a basic UI design explanation. Feedback is welcome! 

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-81) Make sure YARN declares correct set of dependencies

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-81?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-81:
---
Attachment: YARN-81.patch

Uploading a patch to fix most of the warnings. Some "Unused declared dependencies 
found" warnings are still there because Maven fails to detect the usage of those 
dependencies, and removing them would cause compile/test failures.

 Make sure YARN declares correct set of dependencies
 ---

 Key: YARN-81
 URL: https://issues.apache.org/jira/browse/YARN-81
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Tom White
Assignee: Junping Du
 Attachments: YARN-81.patch


 This is the equivalent of HADOOP-8278 for YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2666) TestFairScheduler.testContinuousScheduling fails Intermittently

2015-04-03 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394748#comment-14394748
 ] 

zhihai xu commented on YARN-2666:
-

thanks [~ozawa]!

 TestFairScheduler.testContinuousScheduling fails Intermittently
 ---

 Key: YARN-2666
 URL: https://issues.apache.org/jira/browse/YARN-2666
 Project: Hadoop YARN
  Issue Type: Test
  Components: scheduler
Reporter: Tsuyoshi Ozawa
Assignee: zhihai xu
 Attachments: YARN-2666.000.patch


 The test fails on trunk.
 {code}
 Tests run: 79, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 8.698 sec 
 <<< FAILURE! - in 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
 testContinuousScheduling(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler)
   Time elapsed: 0.582 sec  <<< FAILURE!
 java.lang.AssertionError: expected:<2> but was:<1>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testContinuousScheduling(TestFairScheduler.java:3372)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3365) Add support for using the 'tc' tool via container-executor

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394869#comment-14394869
 ] 

Hudson commented on YARN-3365:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2102 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2102/])
YARN-3365. Enhanced NodeManager to support using the 'tc' tool via 
container-executor for outbound network traffic control. Contributed by 
Sidharta Seethana. (vinodkv: rev b21c72777ae664b08fd1a93b4f88fa43f2478d94)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h


 Add support for using the 'tc' tool via container-executor
 --

 Key: YARN-3365
 URL: https://issues.apache.org/jira/browse/YARN-3365
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Sidharta Seethana
Assignee: Sidharta Seethana
 Fix For: 2.8.0

 Attachments: YARN-3365.001.patch, YARN-3365.002.patch, 
 YARN-3365.003.patch


 We need the following functionality:
 1) modify network interface traffic shaping rules - to be able to attach a 
 qdisc, create child classes, etc.
 2) read the existing rules in place 
 3) read stats for the various classes 
 Using tc requires elevated privileges - hence this functionality is to be 
 made available via container-executor (an illustrative sketch of the tc 
 invocations involved follows below). 
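For context, a small sketch of the kinds of tc invocations this implies (illustrative only: the Java class and helper names are hypothetical, and the actual implementation is routed through the native, setuid container-executor rather than Java):
{code}
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the tc command lines involved; not the real code path.
public class TrafficControlCommands {

  // 1) Modify shaping rules: attach an HTB root qdisc and a child class with a rate cap.
  public static List<String[]> shapeCommands(String device) {
    return Arrays.asList(
        new String[] {"tc", "qdisc", "add", "dev", device, "root", "handle", "42:", "htb"},
        new String[] {"tc", "class", "add", "dev", device, "parent", "42:",
            "classid", "42:1", "htb", "rate", "100mbit"});
  }

  // 2) Read the rules currently in place, and 3) read per-class statistics.
  public static List<String[]> readCommands(String device) {
    return Arrays.asList(
        new String[] {"tc", "qdisc", "show", "dev", device},
        new String[] {"tc", "-s", "class", "show", "dev", device});
  }

  // tc needs elevated privileges, which is why the real invocations go through
  // the container-executor binary instead of being exec'd directly like this.
  public static int run(String[] command) throws IOException, InterruptedException {
    Process p = new ProcessBuilder(command).inheritIO().start();
    return p.waitFor();
  }
}
{code}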



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler queue

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394876#comment-14394876
 ] 

Hudson commented on YARN-3415:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2102 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2102/])
YARN-3415. Non-AM containers can be counted towards amResourceUsage of a 
fairscheduler queue (Zhihai Xu via Sandy Ryza) (sandy: rev 
6a6a59db7f1bfda47c3c14fb49676a7b22d2eb06)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java


 Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler 
 queue
 --

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical
 Fix For: 2.8.0

 Attachments: YARN-3415.000.patch, YARN-3415.001.patch, 
 YARN-3415.002.patch


 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high and then the cluster got 
 deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary - the condition for adding a container's memory towards 
 amResourceUsage is fragile. It depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then the memory of one of those 
 containers was counted towards amResourceUsage.
 cc - [~sandyr]
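To make the fragility concrete, a hypothetical simplification of the two approaches (not the actual FSAppAttempt/FSLeafQueue code; the method and flag names are invented for illustration):
{code}
// Hypothetical simplification, not the scheduler's real logic.
public class AmShareAccounting {

  // Fragile: infer "this allocation is the AM container" from the live-container count.
  // If the AM exits without releasing its containers, a later allocation can again see
  // a zero count and have its memory wrongly added to the queue's amResourceUsage.
  static boolean countTowardsAmUsageFragile(int liveContainers, boolean unmanagedAm) {
    return !unmanagedAm && liveContainers == 0;
  }

  // More robust: track explicitly whether this attempt's AM container was already allocated.
  static boolean countTowardsAmUsageExplicit(boolean amContainerAllocated, boolean unmanagedAm) {
    return !unmanagedAm && !amContainerAllocated;
  }
}
{code}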



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-81) Make sure YARN declares correct set of dependencies

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-81?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-81:
---
Attachment: YARN-81-v2.patch

Fix minor format issue in v2 patch.

 Make sure YARN declares correct set of dependencies
 ---

 Key: YARN-81
 URL: https://issues.apache.org/jira/browse/YARN-81
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Tom White
Assignee: Junping Du
 Attachments: YARN-81-v2.patch, YARN-81.patch


 This is the equivalent of HADOOP-8278 for YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-81) Make sure YARN declares correct set of dependencies

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-81?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du reassigned YARN-81:
--

Assignee: Junping Du

 Make sure YARN declares correct set of dependencies
 ---

 Key: YARN-81
 URL: https://issues.apache.org/jira/browse/YARN-81
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Tom White
Assignee: Junping Du

 This is the equivalent of HADOOP-8278 for YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-81) Make sure YARN declares correct set of dependencies

2015-04-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394619#comment-14394619
 ] 

Junping Du commented on YARN-81:


It has been open for a long time. Assigning it to myself to work on it.

 Make sure YARN declares correct set of dependencies
 ---

 Key: YARN-81
 URL: https://issues.apache.org/jira/browse/YARN-81
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Tom White

 This is the equivalent of HADOOP-8278 for YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3447) Dodgy code Warnings in org.apache.hadoop.yarn.util.Log4jWarningErrorMetricsAppender

2015-04-03 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula moved MAPREDUCE-6306 to YARN-3447:
---

Key: YARN-3447  (was: MAPREDUCE-6306)
Project: Hadoop YARN  (was: Hadoop Map/Reduce)

 Dodgy code Warnings in  
 org.apache.hadoop.yarn.util.Log4jWarningErrorMetricsAppender
 

 Key: YARN-3447
 URL: https://issues.apache.org/jira/browse/YARN-3447
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula

  *Dodgy code Warnings* 
 UrF   Unread public/protected field: 
 org.apache.hadoop.yarn.util.Log4jWarningErrorMetricsAppender$Element.count
 Bug type URF_UNREAD_PUBLIC_OR_PROTECTED_FIELD (click for details) 
 In class org.apache.hadoop.yarn.util.Log4jWarningErrorMetricsAppender$Element
 Field 
 org.apache.hadoop.yarn.util.Log4jWarningErrorMetricsAppender$Element.count
 At Log4jWarningErrorMetricsAppender.java:[line 44]
 UrF   Unread public/protected field: 
 org.apache.hadoop.yarn.util.Log4jWarningErrorMetricsAppender$Element.timestampSeconds
 Bug type URF_UNREAD_PUBLIC_OR_PROTECTED_FIELD (click for details) 
 In class org.apache.hadoop.yarn.util.Log4jWarningErrorMetricsAppender$Element
 Field 
 org.apache.hadoop.yarn.util.Log4jWarningErrorMetricsAppender$Element.timestampSeconds
 At Log4jWarningErrorMetricsAppender.java:[line 45]
 Please find more details here...
 https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5371//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
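For context, a minimal hypothetical example of the pattern that the URF_UNREAD_PUBLIC_OR_PROTECTED_FIELD detector flags, along with the usual ways to address it (illustrative only, not the actual Log4jWarningErrorMetricsAppender$Element code):
{code}
// Hypothetical illustration of the flagged pattern.
public class Element {
  // Findbugs URF_UNREAD_PUBLIC_OR_PROTECTED_FIELD: assigned but never read in analyzed code.
  public long count;
  public long timestampSeconds;

  public Element(long count, long timestampSeconds) {
    this.count = count;
    this.timestampSeconds = timestampSeconds;
  }

  // Typical fixes: read the fields through accessors like these, reduce their
  // visibility, or suppress the warning if they are intentionally write-only
  // (for example, populated only for a serializer that reads them reflectively).
  public long getCount() {
    return count;
  }

  public long getTimestampSeconds() {
    return timestampSeconds;
  }
}
{code}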



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2901) Add errors and warning metrics page to RM, NM web UI

2015-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394867#comment-14394867
 ] 

Hudson commented on YARN-2901:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2102 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2102/])
YARN-2901. Add errors and warning metrics page to RM, NM web UI. (Varun Vasudev 
via wangda) (wangda: rev bad070fe15a642cc6f3a165612fbd272187e03cb)
* hadoop-common-project/hadoop-common/src/main/conf/log4j.properties
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMErrorsAndWarningsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLog4jWarningErrorMetricsAppender.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/ErrorsAndWarningsBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/WebServer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMErrorsAndWarningsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Log4jWarningErrorMetricsAppender.java


 Add errors and warning metrics page to RM, NM web UI
 

 Key: YARN-2901
 URL: https://issues.apache.org/jira/browse/YARN-2901
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Fix For: 2.8.0

 Attachments: Exception collapsed.png, Exception expanded.jpg, Screen 
 Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch, 
 apache-yarn-2901.1.patch, apache-yarn-2901.2.patch, apache-yarn-2901.3.patch, 
 apache-yarn-2901.4.patch, apache-yarn-2901.5.patch


 It would be really useful to have statistics on the number of errors and 
 warnings in the RM and NM web UI. I'm thinking about -
 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day
 2. The top 'n' (20?) most common exceptions in the past 5 min/1 hour/12 
 hours/day
 By errors and warnings I'm referring to the log level.
 I suspect we can probably achieve this by writing a custom appender? (I'm open 
 to suggestions on alternate mechanisms for implementing this.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2003) Support to process Job priority from Submission Context in AppAttemptAddedSchedulerEvent [RM side]

2015-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394891#comment-14394891
 ] 

Hadoop QA commented on YARN-2003:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708208/0005-YARN-2003.patch
  against trunk revision db80e42.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7213//console

This message is automatically generated.

 Support to process Job priority from Submission Context in 
 AppAttemptAddedSchedulerEvent [RM side]
 --

 Key: YARN-2003
 URL: https://issues.apache.org/jira/browse/YARN-2003
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-2003.patch, 0002-YARN-2003.patch, 
 0003-YARN-2003.patch, 0004-YARN-2003.patch, 0005-YARN-2003.patch


 AppAttemptAddedSchedulerEvent should be able to receive the Job Priority from 
 Submission Context and store.
 Later this can be used by Scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2004) Priority scheduling support in Capacity scheduler

2015-04-03 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-2004:
--
Attachment: 0005-YARN-2004.patch

Uploading CS changes.

Hi [~leftnoteasy],
YARN-2004 needs some changes in CS and LeafQueue, but dummy implementations of the 
same methods are added in YARN-2003. Hence this patch will depend on YARN-2003, but 
not the opposite. Kindly share your opinion.

 Priority scheduling support in Capacity scheduler
 -

 Key: YARN-2004
 URL: https://issues.apache.org/jira/browse/YARN-2004
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-2004.patch, 0002-YARN-2004.patch, 
 0003-YARN-2004.patch, 0004-YARN-2004.patch, 0005-YARN-2004.patch


 Based on the priority of the application, Capacity Scheduler should be able 
 to give preference to applications while doing scheduling.
 Comparator<FiCaSchedulerApp> applicationComparator can be changed as below.
 
 1. Check for application priority. If priority is available, then return 
 the highest-priority job.
 2. Otherwise continue with the existing logic, such as App ID comparison and 
 then timestamp comparison (a sketch of such a comparator follows below).
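A minimal sketch of such a comparator (hypothetical: the SchedulableApp interface and its accessors are assumed for illustration and are not taken from the attached patches):
{code}
import java.util.Comparator;

// Hypothetical sketch of the proposed ordering: priority first, then the existing
// application-id / timestamp comparison.
public class PriorityThenFifoComparator<T extends SchedulableApp> implements Comparator<T> {
  @Override
  public int compare(T a1, T a2) {
    // Higher priority value is scheduled first.
    int byPriority = Integer.compare(a2.getPriority(), a1.getPriority());
    if (byPriority != 0) {
      return byPriority;
    }
    // Fall back to the existing FIFO-style ordering.
    int byAppId = a1.getApplicationId().compareTo(a2.getApplicationId());
    if (byAppId != 0) {
      return byAppId;
    }
    return Long.compare(a1.getStartTime(), a2.getStartTime());
  }
}

// Assumed minimal interface, for the sketch only.
interface SchedulableApp {
  int getPriority();
  String getApplicationId();
  long getStartTime();
}
{code}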



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-329) yarn CHANGES.txt link missing from docs Reference

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-329.
-
   Resolution: Fixed
Fix Version/s: 2.6.0

This got fixed as of the 2.6.0 release. Marking it as resolved.

 yarn CHANGES.txt link missing from docs Reference
 -

 Key: YARN-329
 URL: https://issues.apache.org/jira/browse/YARN-329
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Thomas Graves
Priority: Minor
 Fix For: 2.6.0


 Looking at the hadoop 0.23 docs: http://hadoop.apache.org/docs/r0.23.5/
 There is no link to the yarn CHANGES.txt in the Reference menu on the left 
 side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-329) yarn CHANGES.txt link missing from docs Reference

2015-04-03 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-329:
--
Fix Version/s: (was: 2.6.0)

 yarn CHANGES.txt link missing from docs Reference
 -

 Key: YARN-329
 URL: https://issues.apache.org/jira/browse/YARN-329
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Thomas Graves
Priority: Minor

 Looking at the hadoop 0.23 docs: http://hadoop.apache.org/docs/r0.23.5/
 There is no link to the yarn CHANGES.txt in the Reference menu on the left 
 side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-2969) allocate resource on different nodes for task

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-2969.
--
Resolution: Duplicate

 allocate resource on different nodes for task
 -

 Key: YARN-2969
 URL: https://issues.apache.org/jira/browse/YARN-2969
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Yang Hao

 With the help of Slider, YARN will be a common resource-managing OS, and some 
 applications would like to place containers (or components, in Slider terms) on 
 different nodes, so a configuration for allocating resources on different nodes 
 would be helpful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-81) Make sure YARN declares correct set of dependencies

2015-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395043#comment-14395043
 ] 

Hadoop QA commented on YARN-81:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12709287/YARN-81.patch
  against trunk revision db80e42.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy:

  
org.apache.hadoop.yarn.util.TestLog4jWarningErrorMetricsAppender
  
org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices
  
org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices
  
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesContainers
  
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices
  
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesApps
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification
  org.apache.hadoop.yarn.server.resourcemanager.TestRM
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler

  The test build failed in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher
 

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7214//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7214//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7214//console

This message is automatically generated.

 Make sure YARN declares correct set of dependencies
 ---

 Key: YARN-81
 URL: https://issues.apache.org/jira/browse/YARN-81
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Tom White

[jira] [Commented] (YARN-81) Make sure YARN declares correct set of dependencies

2015-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395056#comment-14395056
 ] 

Hadoop QA commented on YARN-81:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12709290/YARN-81-v2.patch
  against trunk revision db80e42.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy:

  
org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices
  
org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices
  
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesApps
  
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices
  
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesContainers
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler
  org.apache.hadoop.yarn.server.resourcemanager.TestRM
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens

  The test build failed in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher
 

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7215//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7215//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7215//console

This message is automatically generated.

 Make sure YARN declares correct set of dependencies
 ---

 Key: YARN-81
 URL: https://issues.apache.org/jira/browse/YARN-81
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Tom White

[jira] [Resolved] (YARN-374) Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-374.
-
Resolution: Not a Problem

 Job History Server doesn't show jobs which killed by 
 ClientRMProtocol.forceKillApplication
 --

 Key: YARN-374
 URL: https://issues.apache.org/jira/browse/YARN-374
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client, resourcemanager
Affects Versions: 2.0.1-alpha
Reporter: Nemon Lou

 After I kill an app by typing bin/yarn rmadmin app -kill APP_ID,
 no job info is kept on the JHS web page.
 However, when I kill a job by typing  bin/mapred  job -kill JOB_ID ,
 I can see the killed job left on JHS.
 Some Hive users are confused that their jobs have been killed but nothing is left 
 on JHS, and the killed app's info on the RM web page is not enough. (They kill 
 jobs via ClientRMProtocol.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-374) Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication

2015-04-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395158#comment-14395158
 ] 

Junping Du commented on YARN-374:
-

Given we already have the generic history server (now the timeline server), which 
tracks YARN applications that get killed, I will resolve this issue.

 Job History Server doesn't show jobs which killed by 
 ClientRMProtocol.forceKillApplication
 --

 Key: YARN-374
 URL: https://issues.apache.org/jira/browse/YARN-374
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client, resourcemanager
Affects Versions: 2.0.1-alpha
Reporter: Nemon Lou

 After I kill an app by typing bin/yarn rmadmin app -kill APP_ID,
 no job info is kept on the JHS web page.
 However, when I kill a job by typing  bin/mapred  job -kill JOB_ID ,
 I can see the killed job left on JHS.
 Some Hive users are confused that their jobs have been killed but nothing is left 
 on JHS, and the killed app's info on the RM web page is not enough. (They kill 
 jobs via ClientRMProtocol.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2969) allocate resource on different nodes for task

2015-04-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395175#comment-14395175
 ] 

Junping Du commented on YARN-2969:
--

Duplicate of YARN-1042: add ability to specify affinity/anti-affinity in 
container requests.

 allocate resource on different nodes for task
 -

 Key: YARN-2969
 URL: https://issues.apache.org/jira/browse/YARN-2969
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Yang Hao

 With the help of Slider, YARN will be a common resource-managing OS, and some 
 applications would like to place containers (or components, in Slider terms) on 
 different nodes, so a configuration for allocating resources on different nodes 
 would be helpful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-197) Add a separate log server

2015-04-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394939#comment-14394939
 ] 

Junping Du commented on YARN-197:
-

Hi [~seth.siddha...@gmail.com], given that we already have the generic Application 
History Server (deprecated), the timeline server (v1) is there, and timeline 
service v2 is in development, it sounds unnecessary to have a separate log 
server now. Can we close it?

 Add a separate log server
 -

 Key: YARN-197
 URL: https://issues.apache.org/jira/browse/YARN-197
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Siddharth Seth

 Currently, the job history server is being used for log serving. A separate 
 log server can be added which can deal with serving logs, along with other 
 functionality like log retention, merging, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-329) yarn CHANGES.txt link missing from docs Reference

2015-04-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395063#comment-14395063
 ] 

Allen Wittenauer commented on YARN-329:
---

Removing the fix version because we need an actual patch to point to...

 yarn CHANGES.txt link missing from docs Reference
 -

 Key: YARN-329
 URL: https://issues.apache.org/jira/browse/YARN-329
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Thomas Graves
Priority: Minor

 Looking at the hadoop 0.23 docs: http://hadoop.apache.org/docs/r0.23.5/
 There is no link to the yarn CHANGES.txt in the Reference menu on the left 
 side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-463) Show explicitly excluded nodes on the UI

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-463.
-
Resolution: Implemented

We already show decommissioned nodes on the UI page, so resolving this JIRA.

 Show explicitly excluded nodes on the UI
 

 Key: YARN-463
 URL: https://issues.apache.org/jira/browse/YARN-463
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Vinod Kumar Vavilapalli
  Labels: usability

 Nodes can be explicitly excluded via the config 
 yarn.resourcemanager.nodes.exclude-path. We should have a way of displaying 
 this list via web and command line UIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3319) Implement a FairOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Summary: Implement a FairOrderingPolicy  (was: Implement a Fair 
SchedulerOrderingPolicy)

 Implement a FairOrderingPolicy
 --

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch


 Implement a Fair Comparator for the Scheduler Comparator Ordering Policy 
 which prefers to allocate to SchedulerProcesses with least current usage, 
 very similar to the FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 An implementation of a Scheduler Comparator for use with the Scheduler 
 Comparator Ordering Policy will be built with the below comparison for 
 ordering applications for container assignment (ascending) and for preemption 
 (descending)
 Current resource usage - less usage is lesser
 Submission time - earlier is lesser
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 name, which is lexically FIFO for that comparison (first submitted is lesser)
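A minimal sketch of that comparison (hypothetical: the Schedulable interface and accessor names are assumed for illustration, not taken from the attached patches):
{code}
import java.util.Comparator;

// Hypothetical sketch of the described fair ordering; assignment uses this ascending
// order, preemption uses the reverse of it.
public class FairOrderingComparator implements Comparator<Schedulable> {

  private final boolean sizeBasedWeight;

  public FairOrderingComparator(boolean sizeBasedWeight) {
    this.sizeBasedWeight = sizeBasedWeight;
  }

  @Override
  public int compare(Schedulable s1, Schedulable s2) {
    // 1. Current resource usage - less usage is lesser.
    int byUsage = Double.compare(weightedUsage(s1), weightedUsage(s2));
    if (byUsage != 0) {
      return byUsage;
    }
    // 2. Submission time - earlier is lesser.
    int byTime = Long.compare(s1.getSubmissionTime(), s2.getSubmissionTime());
    if (byTime != 0) {
      return byTime;
    }
    // 3. Tie-break lexically on name, which is FIFO-like for application ids.
    return s1.getName().compareTo(s2.getName());
  }

  // Optional sizeBasedWeight adjustment: boost larger applications by dividing
  // usage by Math.log1p(demand) / Math.log(2).
  private double weightedUsage(Schedulable s) {
    double usage = s.getCurrentMemoryUsage();
    if (sizeBasedWeight && s.getMemoryDemand() > 0) {
      usage = usage / (Math.log1p(s.getMemoryDemand()) / Math.log(2));
    }
    return usage;
  }
}

// Assumed minimal interface, for the sketch only.
interface Schedulable {
  double getCurrentMemoryUsage();
  double getMemoryDemand();
  long getSubmissionTime();
  String getName();
}
{code}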



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Summary: Create Initial OrderingPolicy Framework and FifoOrderingPolicy  
(was: Create Initial OrderingPolicy Framework)

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch


 Create the initial framework required for using OrderingPolicies



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Description: Create the initial framework required for using 
OrderingPolicies and an initial FifoOrderingPolicy  (was: Create the initial 
framework required for using OrderingPolicies)

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395508#comment-14395508
 ] 

Craig Welch commented on YARN-3318:
---

[~vinodkv]

bq. ...We can strictly focus on the policy framework here...

Sure, limited patch to framework

bq. ...You could also say SchedulableProcess...

SchedulableProcess it is, done

bq. I agree to this, but we are not in a position to support the APIs, CLI, 
config names in a supportable manner yet. They may or may not change depending 
on how parent queue policies, limit policies evolve. For that reason alone, I 
am saying that (1) Don't make the configurations public yet, or put a warning 
saying that they are unstable and (2) don't expose them in CLI , REST APIs yet. 
It's okay to put in the web UI, web UI scraping is not a contract.

You can't see it, because it's part of Capacity Scheduler Integration, but 
removed CLI and proto related change.  There was no rest api change, the web UI 
change is still present.  Will warn unstable when added to config files in the 
scheduler integration patch

bq. SchedulerApplicationAttempt.getDemand() should be private

Done

bq. updateCaches() - updateState() / updateSchedulingState() as that is what 
it is doing?  getCachedConsumption() / getCachedDemand(): simply getCurrent*() 
? What is the need for reorderOnContainerAllocate () / 
reorderOnContainerRelease()?

Is now getSchedulingConsumption(); getSchedulingDemand(); 
updateSchedulingState();

This is needed because mutable values which are used for ordering cannot be allowed 
to change for an item while it is in the tree; otherwise it will not be found in some 
cases during the delete-before-reinsert process which occurs when a schedulable's 
mutable values used in comparison change (for fairness, changes to consumption and 
potentially demand). Not all OrderingPolicies require reordering on these events; for 
efficiency they get to decide whether they do or not, hence the reorderOn methods. 
The reorderOn methods are now reorderForContainerAllocation and 
reorderForContainerRelease (a small illustration of the remove/reinsert pattern 
follows below).
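For illustration, a minimal sketch of the remove-before-mutate-reinsert pattern described above (hypothetical code, not part of the patch):
{code}
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical illustration: an ordered collection keyed on a mutable value. If the
// value changes while the element is in the set, remove() can fail to find it, so the
// pattern is remove -> mutate -> re-add whenever a comparison value changes.
public class ReorderOnChange {

  static class Proc {
    final String name;
    long usage; // mutable value used for ordering

    Proc(String name, long usage) {
      this.name = name;
      this.usage = usage;
    }
  }

  static final Comparator<Proc> BY_USAGE_THEN_NAME = new Comparator<Proc>() {
    @Override
    public int compare(Proc p1, Proc p2) {
      if (p1.usage != p2.usage) {
        return p1.usage < p2.usage ? -1 : 1;
      }
      return p1.name.compareTo(p2.name);
    }
  };

  public static void main(String[] args) {
    TreeSet<Proc> order = new TreeSet<Proc>(BY_USAGE_THEN_NAME);
    Proc a = new Proc("app-1", 10);
    order.add(a);
    order.add(new Proc("app-2", 20));

    // Wrong: mutating a.usage in place leaves the element mis-positioned, and a later
    // order.remove(a) may not find it.

    // Right: remove before the mutation, re-add afterwards.
    order.remove(a);
    a.usage = 30;
    order.add(a);

    System.out.println(order.first().name); // prints app-2, which now has the least usage
  }
}
{code}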

bq. Move all the comparator related classed into their own package
No longer needed as comparators are now just a property of policies, see below 
for details

bq. This is really a ComparatorBasedOrderingPolicy. Do we really see 
non-comparator based ordering-policy. We are unnecessarily adding two 
abstractions - adding policies and comparators

Originally, there was a perceived need to be able to support a more flexible 
interface than the comparator one, but also a desire to build up a simpler, 
composible abstraction to be used with an instance of the former which had 
most of the hard stuff done.  Given that all of the policies we've 
contemplated building fit the latter abstraction and the level of flexibility 
does not appear to actually be that different, I think it's fair to say that we 
only need what was previously the SchedulerComparator abstraction as a 
plugin-point.  Given that, a slightly refactored version of the 
SchedulerComparator abstraction is now the only plugin point and is now what 
goes by the name of OrderingPolicy.  What was previously the OrderingPolicy 
is now a single concrete class implementing the surrounding logic, meant to be 
usable from any scheduler, named SchedulingOrder.  So, one abstraction, a 
comparator-based ordering-policy.  If we really do find we need a flexibility 
we don't have some day, the SchedulingOrder class could be abstracted to 
provide that higher level abstraction - but as we see no need for it now, and 
it appears probably never will, there's no reason to do so at present

bq. ...Use className.getName()...

Done

[~leftnoteasy]

bq. ...I prefer what Vinod suggested, split SchedulerProcess to be 
QueueSchedulable and AppSchedulable ...

I don't see that he has suggested that.  In any case, with the removal of 
*Serial* and the move to compareInputOrderTo() I don't at present see a need 
to have separate subtypes for app and queue to avoid dangling properties.  
And, I think if we do it right we won't end up introducing them.  By splitting 
in the suggested way we commit ourselves to either multiple comparators (to use 
the differing functionality) or awkward testing of subtype/etc logic in one 
comparator - so it basically moves the complexity/awkwardness, it doesn't 
eliminate it.  I've refactored such that the Policy now provides a Comparator 
as opposed to extending it, so there is now room for it to provide multiple 
comparators and handle subtypes if need be, but I think we should wait until we 
see that we must do that before doing so, as I don't believe we will end up 
needing to (but if we do, existing code should need little change, and 
implementing what you suggest should be essentially additive...)

bq. ...About inherit relationships between interfaces/classes...

Policies will be composed to achieve combined capabilities yet the collection 
of 

[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-04-03 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395412#comment-14395412
 ] 

zhihai xu commented on YARN-2893:
-

Hi [~jira.shegalov], I can catch the exception for all of the code:
{code}
try {
  Credentials credentials = parseCredentials(submissionContext);
  if (UserGroupInformation.isSecurityEnabled()) {
    this.rmContext.getDelegationTokenRenewer().addApplicationAsync(appId,
        credentials, submissionContext.getCancelTokensWhenComplete(),
        application.getUser());
  } else {
    this.rmContext.getDispatcher().getEventHandler()
        .handle(new RMAppEvent(applicationId, RMAppEventType.START));
  }
} catch (Exception e) {
  LOG.warn("Unable to parse credentials.", e);
  // Sending APP_REJECTED is fine, since we assume that the
  // RMApp is in NEW state and thus we haven't yet informed the
  // scheduler about the existence of the application
  assert application.getState() == RMAppState.NEW;
  this.rmContext.getDispatcher().getEventHandler()
      .handle(new RMAppRejectedEvent(applicationId, e.getMessage()));
  throw RPCUtil.getRemoteException(e);
}
{code}
Are you OK with the above change?
I think it will be better to call parseCredentials and catch the exception for the 
security-not-enabled case as well, so we can find corrupted credentials from the 
client earlier.

 AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
 --

 Key: YARN-2893
 URL: https://issues.apache.org/jira/browse/YARN-2893
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: zhihai xu
 Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
 YARN-2893.002.patch


 MapReduce jobs on our clusters experience sporadic failures due to corrupt 
 tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-505) NPE at AsyncDispatcher$GenericEventHandler

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-505.
-
Resolution: Won't Fix

 NPE at AsyncDispatcher$GenericEventHandler
 --

 Key: YARN-505
 URL: https://issues.apache.org/jira/browse/YARN-505
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 3.0.0
Reporter: Przemyslaw Pretki
Priority: Minor

 Steps to reproduce:
 {code}
 @Test
 public void testAsyncDispatcher() 
 {
 AsyncDispatcher dispatcher = new AsyncDispatcher();
 EventHandler handler = dispatcher.getEventHandler();
 handler.handle(null);
 }
 {code}
 Moreover, an event taken from the *BlockingQueue* will never be *null*, so it 
 seems that the following condition is not necessary 
 (in the AsyncDispatcher.createThread() method): 
 {code}
 if (event != null) {
 dispatch(event);
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-551) The option shell_command of DistributedShell had better support compound command

2015-04-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395475#comment-14395475
 ] 

Junping Du commented on YARN-551:
-

DistributedShell supports the --shell_args option to put extra args after the shell 
command. Resolving this JIRA as not a problem.

 The option shell_command of DistributedShell had better support  compound 
 command
 -

 Key: YARN-551
 URL: https://issues.apache.org/jira/browse/YARN-551
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: rainy Yu

 The option shell_command of DistributedShell must be a single command such as 
 'ls'; it cannot be a compound command such as 'ps -ef' that includes a blank character.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-551) The option shell_command of DistributedShell had better support compound command

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-551.
-
Resolution: Not a Problem

 The option shell_command of DistributedShell had better support  compound 
 command
 -

 Key: YARN-551
 URL: https://issues.apache.org/jira/browse/YARN-551
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: rainy Yu

 The option shell_command of DistributedShell must be a single command such as 
 'ls'; it cannot be a compound command such as 'ps -ef' that includes a blank character.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-375) FIFO scheduler may crash due to bugg app

2015-04-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395503#comment-14395503
 ] 

Junping Du commented on YARN-375:
-

This won't happen because after the AM sends resource requests to the RM (via 
ApplicationMasterProtocol) in allocate(AllocateRequest request), the RM will do a 
sanity check against them, which includes checking memory > 0.
Related code pieces:
In ApplicationMasterService.java,
{code}
RMServerUtils.validateResourceRequests(ask,
rScheduler.getMaximumResourceCapability());
{code}

In RMServerUtils.java,
{code}
 public static void validateResourceRequest(ResourceRequest resReq,
      Resource maximumResource) throws InvalidResourceRequestException {
    if (resReq.getCapability().getMemory() < 0 ||
        resReq.getCapability().getMemory() > maximumResource.getMemory()) {
      throw new InvalidResourceRequestException("Invalid resource request"
          + ", requested memory < 0"
          + ", or requested memory > max configured"
          + ", requestedMemory=" + resReq.getCapability().getMemory()
          + ", maxMemory=" + maximumResource.getMemory());
    }
...
{code}
Will resolve this JIRA as not a problem.

 FIFO scheduler may crash due to bugg app  
 --

 Key: YARN-375
 URL: https://issues.apache.org/jira/browse/YARN-375
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Assignee: Arun C Murthy
Priority: Critical

 The following code should check for a 0 return value rather than crash!
 {code}
 int availableContainers = 
   node.getAvailableResource().getMemory() / capability.getMemory();
   // TODO: A buggy application with this zero would crash the scheduler.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-375) FIFO scheduler may crash due to bugg app

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-375.
-
Resolution: Not a Problem

 FIFO scheduler may crash due to bugg app  
 --

 Key: YARN-375
 URL: https://issues.apache.org/jira/browse/YARN-375
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Assignee: Arun C Murthy
Priority: Critical

 The following code should check for a 0 return value rather than crash!
 {code}
 int availableContainers = 
   node.getAvailableResource().getMemory() / capability.getMemory();
   // TODO: A buggy application with this zero would crash the scheduler.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3448) Add Rolling Time To Lives Level DB Plugin Capabilities

2015-04-03 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-3448:
--
Description: 
For large applications, the majority of the time in LeveldbTimelineStore is 
spent deleting old entities one record at a time. An exclusive write lock is held 
during the entire deletion phase, which in practice can be hours. If we are to 
relax some of the consistency constraints, other performance-enhancing 
techniques can be employed to maximize the throughput and minimize locking time.

Split the 5 sections of the leveldb database (domain, owner, start time, 
entity, index) into 5 separate databases. This allows each database to maximize 
the read cache effectiveness based on the unique usage patterns of each 
database. With 5 separate databases each lookup is much faster. This can also 
help with I/O to have the entity and index databases on separate disks.

Rolling DBs for entity and index DBs. 99.9% of the data are in these two 
sections, at a 4:1 ratio (index to entity), at least for Tez. We replace DB record 
removal with file system removal if we create a rolling set of databases that 
age out and can be efficiently removed. To do this we must place a constraint 
to always place an entity's events into its correct rolling DB instance based 
on start time. This allows us to stitch the data back together while reading 
and to do artificial paging.

Relax the synchronous write constraints. If we are willing to accept losing 
some records that were not flushed by the operating system during a crash, we can 
use async writes, which can be much faster.

Prefer sequential writes. Sequential writes can be several times faster than 
random writes. Spend some small effort arranging the writes in such a way that 
they trend towards sequential write performance over random write performance.

  was:For large applications, the majority of the time in LeveldbTimelineStore 
is spent deleting old entities record at a time. A write lock is held during 
the entire deletion phase which in practice can be hours. An alternative is to 
create a rolling set of databases that age out and can be efficiently removed 
via a recursive directory delete. This removes the lock in the deletion thread, 
and clients and servers can share access to the underlying database, which 
already implements its own internal locking mechanism.


 Add Rolling Time To Lives Level DB Plugin Capabilities
 --

 Key: YARN-3448
 URL: https://issues.apache.org/jira/browse/YARN-3448
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jonathan Eagles

 For large applications, the majority of the time in LeveldbTimelineStore is 
 spent deleting old entities one record at a time. An exclusive write lock is held 
 during the entire deletion phase, which in practice can be hours. If we are to 
 relax some of the consistency constraints, other performance-enhancing 
 techniques can be employed to maximize the throughput and minimize locking 
 time.
 Split the 5 sections of the leveldb database (domain, owner, start time, 
 entity, index) into 5 separate databases. This allows each database to 
 maximize the read cache effectiveness based on the unique usage patterns of 
 each database. With 5 separate databases each lookup is much faster. This can 
 also help with I/O to have the entity and index databases on separate disks.
 Rolling DBs for entity and index DBs. 99.9% of the data are in these two 
 sections, at a 4:1 ratio (index to entity), at least for Tez. We replace DB record 
 removal with file system removal if we create a rolling set of databases that 
 age out and can be efficiently removed. To do this we must place a constraint 
 to always place an entity's events into its correct rolling DB instance 
 based on start time. This allows us to stitch the data back together while 
 reading and to do artificial paging.
 Relax the synchronous write constraints. If we are willing to accept losing 
 some records that were not flushed by the operating system during a crash, we 
 can use async writes, which can be much faster.
 Prefer sequential writes. Sequential writes can be several times faster than 
 random writes. Spend some small effort arranging the writes in such a way 
 that they trend towards sequential write performance over random write 
 performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3448) Add Rolling Time To Lives Level DB Plugin Capabilities

2015-04-03 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles reassigned YARN-3448:
-

Assignee: Jonathan Eagles

 Add Rolling Time To Lives Level DB Plugin Capabilities
 --

 Key: YARN-3448
 URL: https://issues.apache.org/jira/browse/YARN-3448
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles

 For large applications, the majority of the time in LeveldbTimelineStore is 
 spent deleting old entities one record at a time. An exclusive write lock is 
 held during the entire deletion phase, which in practice can take hours. If we 
 are to relax some of the consistency constraints, other performance-enhancing 
 techniques can be employed to maximize throughput and minimize locking time.
 Split the 5 sections of the leveldb database (domain, owner, start time, 
 entity, index) into 5 separate databases. This allows each database to 
 maximize read cache effectiveness based on the unique usage patterns of each 
 database. With 5 separate databases, each lookup is much faster. Keeping the 
 entity and index databases on separate disks can also help with I/O.
 Rolling DBs for entity and index DBs. 99.9% of the data is in these two 
 sections, at roughly a 4:1 ratio (index to entity), at least for Tez. We 
 replace DB record removal with file system removal if we create a rolling set 
 of databases that age out and can be efficiently removed. To do this we must 
 place a constraint to always place an entity's events into its correct rolling 
 DB instance based on start time. This allows us to stitch the data back 
 together while reading and to do artificial paging.
 Relax the synchronous write constraints. If we are willing to accept losing 
 some records that were not flushed by the operating system during a crash, we 
 can use async writes, which can be much faster.
 Prefer sequential writes. Sequential writes can be several times faster than 
 random writes. Spend some small effort arranging the writes in a way that 
 trends towards sequential write performance over random write performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Summary: Create Initial OrderingPolicy Framework  (was: Create Initial 
OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting 
present behavior)

 Create Initial OrderingPolicy Framework
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch


 Create the initial framework required for using OrderingPolicies



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Description: Create the initial framework required for using 
OrderingPolicies  (was: Create the initial framework required for using 
OrderingPolicies with SchedulerApplicaitonAttempts and integrate with the 
CapacityScheduler.   This will include an implementation which is compatible 
with current FIFO behavior.)

 Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
 LeafQueue supporting present behavior
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch


 Create the initial framework required for using OrderingPolicies



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.45.patch

 Create Initial OrderingPolicy Framework
 ---

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch


 Create the initial framework required for using OrderingPolicies



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-505) NPE at AsyncDispatcher$GenericEventHandler

2015-04-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395456#comment-14395456
 ] 

Junping Du commented on YARN-505:
-

We should never handle a null event in AsyncDispatcher; this is a basic 
assumption for every call of handle(event) on AsyncDispatcher in YARN. Our 
practice is that we don't check for null when an object is not supposed to be 
null. If it does become null in some situation, then we fix that situation, 
because that is not what we expect. In this case, an NPE is a warning that 
something unexpected has happened. I will resolve this JIRA as won't fix.
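
As a hedged illustration of that stance (not a proposed change), a caller that 
wants an explicit failure at the call site rather than a later NPE on the 
dispatcher thread can validate its own argument before handing it over:
{code}
import java.util.Objects;

import org.apache.hadoop.yarn.event.Event;
import org.apache.hadoop.yarn.event.EventHandler;

// Illustrative only: the dispatcher stays free of null checks; the caller fails
// fast at the point where the bad event was produced.
final class SafeCaller {
  static void submit(EventHandler<Event> handler, Event event) {
    handler.handle(Objects.requireNonNull(event, "event must not be null"));
  }
}
{code}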

 NPE at AsyncDispatcher$GenericEventHandler
 --

 Key: YARN-505
 URL: https://issues.apache.org/jira/browse/YARN-505
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 3.0.0
Reporter: Przemyslaw Pretki
Priority: Minor

 Steps to reproduce:
 {code}
 @Test
 public void testAsyncDispatcher() {
   AsyncDispatcher dispatcher = new AsyncDispatcher();
   EventHandler handler = dispatcher.getEventHandler();
   handler.handle(null);
 }
 {code}
 Moreover, event taken from *BlockingQueue* will never be *null*, so that it 
 seems that the following condition is not necessary 
 (AsyncDispatcher.createThread() method): 
 {code}
 if (event != null) {
   dispatch(event);
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3435) AM container to be allocated Appattempt AM container shown as null

2015-04-03 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-3435:
---
Component/s: resourcemanager

 AM container to be allocated Appattempt AM container shown as null
 --

 Key: YARN-3435
 URL: https://issues.apache.org/jira/browse/YARN-3435
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
 Environment: 1RM,1DN
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Trivial
 Attachments: Screenshot.png, YARN-3435.001.patch


 Submit yarn application
 Open http://rm:8088/cluster/appattempt/appattempt_1427984982805_0003_01 
 Before the AM container is allocated 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3436) Doc WebServicesIntro.html Example Rest API url wrong

2015-04-03 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-3436:
---
Component/s: resourcemanager
 documentation

 Doc WebServicesIntro.html Example Rest API url wrong
 

 Key: YARN-3436
 URL: https://issues.apache.org/jira/browse/YARN-3436
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation, resourcemanager
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Minor
 Attachments: YARN-3436.001.patch


 /docs/current/hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html
 {quote}
 Response Examples
 JSON response with single resource
 HTTP Request: GET 
 http://rmhost.domain:8088/ws/v1/cluster/{color:red}app{color}/application_1324057493980_0001
 Response Status Line: HTTP/1.1 200 OK
 {quote}
 The URL should be ws/v1/cluster/{color:red}apps{color}.
 Two examples on the same page are wrong.
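
A hedged sketch that exercises the corrected path; the host, port, and 
application id are the illustrative values from the doc example, not a live 
cluster:
{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Illustrative only: GET a single application through the corrected
// /ws/v1/cluster/apps/{appid} path and print the JSON response.
public class RestPathCheck {
  public static void main(String[] args) throws Exception {
    URL url = new URL(
        "http://rmhost.domain:8088/ws/v1/cluster/apps/application_1324057493980_0001");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/json");
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}
{code}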



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-800) Clicking on an AM link for a running app leads to a HTTP 500

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-800.
-
Resolution: Duplicate

 Clicking on an AM link for a running app leads to a HTTP 500
 

 Key: YARN-800
 URL: https://issues.apache.org/jira/browse/YARN-800
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Arpit Gupta
Priority: Minor

 Clicking the AM link tries to open up a page with url like
 http://hostname:8088/proxy/application_1370886527995_0645/
 and this leads to an HTTP 500



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395518#comment-14395518
 ] 

Hadoop QA commented on YARN-3318:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12709373/YARN-3318.47.patch
  against trunk revision ef591b1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1149 javac 
compiler warnings (more than the trunk's current 1148 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7217//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7217//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7217//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7217//console

This message is automatically generated.

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, YARN-3318.47.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-04-03 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395255#comment-14395255
 ] 

Gera Shegalov commented on YARN-2893:
-

Thanks [~zxu] for the patch, and apologies for the delay. I skimmed over the 
patch, and it looks good overall.

Can you keep your logic in {{RMAppManager#submitApplication}} (the move with 
parseCredentials) but put it back under {{if 
(UserGroupInformation.isSecurityEnabled()) {}}?

 AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
 --

 Key: YARN-2893
 URL: https://issues.apache.org/jira/browse/YARN-2893
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: zhihai xu
 Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
 YARN-2893.002.patch


 MapReduce jobs on our clusters experience sporadic failures due to corrupt 
 tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3319) Implement a FairOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.45.patch

 Implement a FairOrderingPolicy
 --

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch, YARN-3319.45.patch


 Implement a FairOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with least current usage, very similar to the 
 FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 id, which is generally lexically FIFO for that comparison



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-520) webservices API ws/v1/cluster/nodes doesn't return LOST nodes

2015-04-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-520.
-
Resolution: Duplicate

 webservices API ws/v1/cluster/nodes doesn't return LOST nodes
 -

 Key: YARN-520
 URL: https://issues.apache.org/jira/browse/YARN-520
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 0.23.6
Reporter: Nathan Roberts

 webservices API ws/v1/cluster/nodes doesn't return LOST nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-520) webservices API ws/v1/cluster/nodes doesn't return LOST nodes

2015-04-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395465#comment-14395465
 ] 

Junping Du commented on YARN-520:
-

This is already addressed and resolved in YARN-642. Marked it as duplicated.

 webservices API ws/v1/cluster/nodes doesn't return LOST nodes
 -

 Key: YARN-520
 URL: https://issues.apache.org/jira/browse/YARN-520
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 0.23.6
Reporter: Nathan Roberts

 webservices API ws/v1/cluster/nodes doesn't return LOST nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3319) Implement a FairOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Description: 
Implement a FairOrderingPolicy which prefers to allocate to SchedulerProcesses 
with least current usage, very similar to the FairScheduler's FairSharePolicy.  

The Policy will offer allocations to applications in a queue in order of least 
resources used, and preempt applications in reverse order (from most resources 
used). This will include conditional support for sizeBasedWeight style 
adjustment

Optionally, based on a conditional configuration to enable sizeBasedWeight 
(default false), an adjustment to boost larger applications (to offset the 
natural preference for smaller applications) will adjust the resource usage 
value based on demand, dividing it by the below value:

Math.log1p(app memory demand) / Math.log(2);

In cases where the above is indeterminate (two applications are equal after 
this comparison), behavior falls back to comparison based on the application 
id, which is generally lexically FIFO for that comparison



  was:
Implement a Fair Comparator for the Scheduler Comparator Ordering Policy which 
prefers to allocate to SchedulerProcesses with least current usage, very 
similar to the FairScheduler's FairSharePolicy.  

The Policy will offer allocations to applications in a queue in order of least 
resources used, and preempt applications in reverse order (from most resources 
used). This will include conditional support for sizeBasedWeight style 
adjustment

An implementation of a Scheduler Comparator for use with the Scheduler 
Comparator Ordering Policy will be built with the below comparison for ordering 
applications for container assignment (ascending) and for preemption 
(descending)

Current resource usage - less usage is lesser
Submission time - earlier is lesser

Optionally, based on a conditional configuration to enable sizeBasedWeight 
(default false), an adjustment to boost larger applications (to offset the 
natural preference for smaller applications) will adjust the resource usage 
value based on demand, dividing it by the below value:

Math.log1p(app memory demand) / Math.log(2);

In cases where the above is indeterminate (two applications are equal after 
this comparison), behavior falls back to comparison based on the application 
name, which is lexically FIFO for that comparison (first submitted is lesser)
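
A hedged sketch of the ordering described above, including the optional 
sizeBasedWeight adjustment; the SchedulableApp interface and its accessors are 
illustrative stand-ins for the real scheduler types:
{code}
import java.util.Comparator;

// Hedged sketch: order applications by (optionally weighted) current usage,
// falling back to application id for a FIFO-ish tie-break.
final class FairOrderingSketch {
  interface SchedulableApp {
    long getCurrentUsedMemory();   // current resource usage (MB)
    long getDemandMemory();        // current memory demand (MB)
    String getApplicationId();
  }

  static Comparator<SchedulableApp> comparator(boolean sizeBasedWeight) {
    return (a, b) -> {
      int cmp = Double.compare(weightedUsage(a, sizeBasedWeight),
          weightedUsage(b, sizeBasedWeight));
      // Tie-break on application id, which is generally lexically FIFO.
      return cmp != 0 ? cmp : a.getApplicationId().compareTo(b.getApplicationId());
    };
  }

  // With sizeBasedWeight enabled, divide usage by log2(1 + demand) to offset the
  // natural preference for smaller applications.
  private static double weightedUsage(SchedulableApp app, boolean sizeBasedWeight) {
    double usage = app.getCurrentUsedMemory();
    if (sizeBasedWeight && app.getDemandMemory() > 0) {
      usage /= Math.log1p(app.getDemandMemory()) / Math.log(2);
    }
    return usage;
  }
}
{code}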




 Implement a FairOrderingPolicy
 --

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch


 Implement a FairOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with least current usage, very similar to the 
 FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 id, which is generally lexically FIFO for that comparison



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-402) Dispatcher warn message is too late

2015-04-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395402#comment-14395402
 ] 

Junping Du commented on YARN-402:
-

Thanks [~lohit] for reporting this issue.
I think it could be a little too aggressive to warn when the queue is only half 
full. By default, the capacity of a LinkedBlockingQueue is Integer.MAX_VALUE, 
which is 2^31-1. Half full means there are still ~2^30 slots available, so the 
warning could come far too early.
Do we want a configurable value here? I think that could be a little overkill. 
If so, we may need to pick a more reasonable fixed value instead.
IMO, rmDispatcher could be the busiest AsyncDispatcher in YARN today: 
RMNodeEvent, SchedulerEvent, RMAppEvent, RMAppAttemptEvent, 
NodeListManagerEvent, AMLauncherEvent, etc. all get broadcast on this single 
dispatcher. Among these, SchedulerEvent seems to be the most active. Let's 
assume thousands of node events and thousands of application attempt events are 
generated per second (the default heartbeat interval for NM-RM heartbeats and 
AMRMClientAsync heartbeats to the RM) in a large cluster; then roughly 10*1000 
scheduler events could hit rmDispatcher, and we can estimate up to 10*(10*1000) 
events per second there (including events other than SchedulerEvent). Based on 
this assumption, if we want to warn 10 seconds ahead of the queue getting full 
(assuming peek operations get slow), maybe 10 (seconds) * 10 (event types on 
rmDispatcher) * (10*1000) (scale of nodes and apps per interval) is a reasonable 
value here?
In addition, I think we should fix a tiny issue in the code below: 
(qSize % 1000 == 0) doesn't make much sense when the queue capacity defaults to 
2^31-1:
{code}
  int qSize = eventQueue.size();
  if (qSize != 0 && qSize % 1000 == 0) {
    LOG.info("Size of event-queue is " + qSize);
  }
  int remCapacity = eventQueue.remainingCapacity();
  if (remCapacity < 1000) {
    LOG.warn("Very low remaining capacity in the event-queue: "
        + remCapacity);
  }
{code}
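
As a hedged sketch of the earlier-warning idea (the threshold constant and class 
name are assumptions derived from the 10 s * 10 event types * 10,000 events/s 
estimate above, not a proposed patch):
{code}
import java.util.concurrent.BlockingQueue;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Illustrative only: warn once the backlog crosses a fixed threshold derived from
// the rough estimate above, instead of waiting until only 1000 of the
// Integer.MAX_VALUE slots remain.
final class BacklogCheck {
  private static final Log LOG = LogFactory.getLog(BacklogCheck.class);
  private static final int WARN_THRESHOLD = 10 * 10 * 10000; // ~1,000,000 events

  static void check(BlockingQueue<?> eventQueue) {
    int qSize = eventQueue.size();
    if (qSize > WARN_THRESHOLD) {
      LOG.warn("Event-queue backlog is " + qSize
          + "; dispatcher may not be keeping up");
    }
  }
}
{code}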

 Dispatcher warn message is too late
 ---

 Key: YARN-402
 URL: https://issues.apache.org/jira/browse/YARN-402
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Lohit Vijayarenu
Priority: Minor

 AsyncDispatcher throws out a warning when the remaining capacity is less than 1000:
 {noformat}
 if (remCapacity < 1000) {
   LOG.warn("Very low remaining capacity in the event-queue: "
       + remCapacity);
 }
 {noformat}
 What would be useful is to warn much earlier than that, maybe at half full 
 instead of when the queue is almost completely full. I see that the eventQueue 
 capacity is an int value. So, if by the time we warn the queue has only 1000 
 capacity left, then the service definitely has a serious problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395427#comment-14395427
 ] 

Hadoop QA commented on YARN-3318:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12709343/YARN-3318.45.patch
  against trunk revision 023133c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1149 javac 
compiler warnings (more than the trunk's current 1148 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7216//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7216//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7216//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7216//console

This message is automatically generated.

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3449) Recover appTokenKeepAliveMap upon nodemanager restart

2015-04-03 Thread Junping Du (JIRA)
Junping Du created YARN-3449:


 Summary: Recover appTokenKeepAliveMap upon nodemanager restart
 Key: YARN-3449
 URL: https://issues.apache.org/jira/browse/YARN-3449
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.6.0, 2.7.0
Reporter: Junping Du
Assignee: Junping Du


appTokenKeepAliveMap in NodeStatusUpdaterImpl is used to keep an application 
alive after the application has finished but the NM still needs the app token 
to do log aggregation (when security and log aggregation are enabled).
Applications are only inserted into this map when getApplicationsToCleanup() is 
received in the RM heartbeat response, and the RM only sends this info once, in 
RMNodeImpl.updateNodeHeartbeatResponseForCleanup(). Work-preserving NM restart 
should put appTokenKeepAliveMap into the NMStateStore and recover it after 
restart. Without doing this, the RM could terminate the application earlier, so 
log aggregation could fail if security is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3319) Implement a FairOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.47.patch

 Implement a FairOrderingPolicy
 --

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch, 
 YARN-3319.45.patch, YARN-3319.47.patch


 Implement a FairOrderingPolicy which prefers to allocate to 
 SchedulerProcesses with least current usage, very similar to the 
 FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 id, which is generally lexically FIFO for that comparison



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.47.patch

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, YARN-3318.47.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3448) Add Rolling Time To Lives Level DB Plugin Capabilities

2015-04-03 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created YARN-3448:
-

 Summary: Add Rolling Time To Lives Level DB Plugin Capabilities
 Key: YARN-3448
 URL: https://issues.apache.org/jira/browse/YARN-3448
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jonathan Eagles


For large applications, the majority of the time in LeveldbTimelineStore is 
spent deleting old entities one record at a time. A write lock is held during 
the entire deletion phase, which in practice can be hours. An alternative is to 
create a rolling set of databases that age out and can be efficiently removed 
via a recursive directory delete. This removes the lock in the deletion thread, 
and clients and servers can share access to the underlying database, which 
already implements its own internal locking mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-402) Dispatcher warn message is too late

2015-04-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395408#comment-14395408
 ] 

Junping Du commented on YARN-402:
-

I forgot that the queue can be constructed with a queue type other than 
LinkedBlockingQueue. So maybe the threshold could be the smaller of half the 
queue's capacity (if the queue is not the default LinkedBlockingQueue) and 
1000*1000 (as estimated above).

 Dispatcher warn message is too late
 ---

 Key: YARN-402
 URL: https://issues.apache.org/jira/browse/YARN-402
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Lohit Vijayarenu
Priority: Minor

 AsyncDispatcher throws out a warning when the remaining capacity is less than 1000:
 {noformat}
 if (remCapacity < 1000) {
   LOG.warn("Very low remaining capacity in the event-queue: "
       + remCapacity);
 }
 {noformat}
 What would be useful is to warn much earlier than that, maybe at half full 
 instead of when the queue is almost completely full. I see that the eventQueue 
 capacity is an int value. So, if by the time we warn the queue has only 1000 
 capacity left, then the service definitely has a serious problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-04-03 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395409#comment-14395409
 ] 

zhihai xu commented on YARN-2893:
-

[~jira.shegalov], thanks for the review.
I can put back catching the exception under {{if 
(UserGroupInformation.isSecurityEnabled()) {}}. I will keep the change to 
parseCredentials for the security-not-enabled case, so we can reject an 
application with corrupted credentials in the non-secure case as well.
Are you ok with that?
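
A hedged sketch of the idea being discussed, parsing the submission's tokens up 
front so a corrupted buffer is rejected at submission time whether or not 
security is enabled; this mirrors the approach rather than the exact 
RMAppManager change:
{code}
import java.io.IOException;
import java.nio.ByteBuffer;

import org.apache.hadoop.io.DataInputByteBuffer;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;

// Hedged sketch: deserialize the AM launch-context tokens during submission, so a
// truncated or corrupt buffer surfaces as a rejection (EOFException -> IOException)
// instead of a sporadic AM launch failure later.
final class CredentialsCheck {
  static Credentials parse(ApplicationSubmissionContext submission) throws IOException {
    Credentials credentials = new Credentials();
    ByteBuffer tokens = submission.getAMContainerSpec().getTokens();
    if (tokens != null) {
      DataInputByteBuffer dibb = new DataInputByteBuffer();
      dibb.reset(tokens);
      credentials.readTokenStorageStream(dibb);
      tokens.rewind();
    }
    return credentials;
  }
}
{code}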

 AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
 --

 Key: YARN-2893
 URL: https://issues.apache.org/jira/browse/YARN-2893
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: zhihai xu
 Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
 YARN-2893.002.patch


 MapReduce jobs on our clusters experience sporadic failures due to corrupt 
 tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3448) Add Rolling Time To Lives Level DB Plugin Capabilities

2015-04-03 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-3448:
--
Attachment: YARN-3448.1.patch

 Add Rolling Time To Lives Level DB Plugin Capabilities
 --

 Key: YARN-3448
 URL: https://issues.apache.org/jira/browse/YARN-3448
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
 Attachments: YARN-3448.1.patch


 For large applications, the majority of the time in LeveldbTimelineStore is 
 spent deleting old entities one record at a time. An exclusive write lock is 
 held during the entire deletion phase, which in practice can take hours. If we 
 are to relax some of the consistency constraints, other performance-enhancing 
 techniques can be employed to maximize throughput and minimize locking time.
 Split the 5 sections of the leveldb database (domain, owner, start time, 
 entity, index) into 5 separate databases. This allows each database to 
 maximize read cache effectiveness based on the unique usage patterns of each 
 database. With 5 separate databases, each lookup is much faster. Keeping the 
 entity and index databases on separate disks can also help with I/O.
 Rolling DBs for entity and index DBs. 99.9% of the data is in these two 
 sections, at roughly a 4:1 ratio (index to entity), at least for Tez. We 
 replace DB record removal with file system removal if we create a rolling set 
 of databases that age out and can be efficiently removed. To do this we must 
 place a constraint to always place an entity's events into its correct rolling 
 DB instance based on start time. This allows us to stitch the data back 
 together while reading and to do artificial paging.
 Relax the synchronous write constraints. If we are willing to accept losing 
 some records that were not flushed by the operating system during a crash, we 
 can use async writes, which can be much faster.
 Prefer sequential writes. Sequential writes can be several times faster than 
 random writes. Spend some small effort arranging the writes in a way that 
 trends towards sequential write performance over random write performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.48.patch

The javac error looks bogus; the existing error has simply moved.
The findbugs warning looks bogus; the class it's complaining about is static. 
Uploading a new version to see if it notices now.
TestFairScheduler passes on my box with the patch, and I can't see any way it 
would be affected. Tests will rerun with the new patch, so we'll see.

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework and FifoOrderingPolicy

2015-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395566#comment-14395566
 ] 

Hadoop QA commented on YARN-3318:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12709391/YARN-3318.48.patch
  against trunk revision ef591b1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1149 javac 
compiler warnings (more than the trunk's current 1148 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7218//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7218//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7218//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7218//console

This message is automatically generated.

 Create Initial OrderingPolicy Framework and FifoOrderingPolicy
 --

 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3318.13.patch, YARN-3318.14.patch, 
 YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, 
 YARN-3318.36.patch, YARN-3318.39.patch, YARN-3318.45.patch, 
 YARN-3318.47.patch, YARN-3318.48.patch


 Create the initial framework required for using OrderingPolicies and an 
 initial FifoOrderingPolicy



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3437) convert load test driver to timeline service v.2

2015-04-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394437#comment-14394437
 ] 

Junping Du commented on YARN-3437:
--

Thanks [~sjlee0] for delivering a patch here! 
Quickly going through the patch, it looks like we are generating one app 
collector per map task. I think this is good for a scalability test on the 
backend storage, which can be the bottleneck in mainstream cases. In addition, 
do we want to address some extreme cases, e.g. a huge application with hundreds 
of thousands or even millions of tasks? If so, then maybe we want to know a 
single app collector's bottleneck as well for accepting/forwarding messages from 
hundreds of thousands of maps. Also, in a real cluster, the mappings from 
cluster to apps and from apps to tasks are all 1-N. Maybe making the number of 
app aggregators configurable (just like the map task number, bytes per map, 
etc.) is something we can do as a next step?
BTW, it has some code duplicated with YARN-2556 (like 
TimelineServerPerformance.java). YARN-2556 looks to be in pretty good shape and 
could go to trunk and branch-2 quickly. I would recommend keeping an eye on that 
JIRA's status and doing the necessary rebase work if that patch goes in; we may 
want to merge it into the YARN-2928 branch soon.

 convert load test driver to timeline service v.2
 

 Key: YARN-3437
 URL: https://issues.apache.org/jira/browse/YARN-3437
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-3437.001.patch


 This subtask covers the work for converting the proposed patch for the load 
 test driver (YARN-2556) to work with the timeline service v.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3444) Fixed typo (capability)

2015-04-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394583#comment-14394583
 ] 

Hadoop QA commented on YARN-3444:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12709238/YARN-3444.patch
  against trunk revision 72f6bd4.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7212//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7212//console

This message is automatically generated.

 Fixed typo (capability)
 ---

 Key: YARN-3444
 URL: https://issues.apache.org/jira/browse/YARN-3444
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: applications/distributed-shell
Reporter: Gabor Liptak
Priority: Minor
 Attachments: YARN-3444.patch


 Fixed typo (capability)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)