[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-aggregator service discovery

2015-03-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366950#comment-14366950
 ] 

Junping Du commented on YARN-3039:
--

Thanks [~zjshen] and [~sjlee0] for review!

 [Aggregator wireup] Implement ATS app-aggregator service discovery
 ---

 Key: YARN-3039
 URL: https://issues.apache.org/jira/browse/YARN-3039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Junping Du
 Fix For: YARN-2928

 Attachments: Service Binding for applicationaggregator of ATS 
 (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, 
 YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, 
 YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch, YARN-3039-v5.patch, 
 YARN-3039-v6.patch, YARN-3039-v7.patch, YARN-3039-v8.patch, YARN-3039.9.patch


 Per design in YARN-2928, implement ATS writer service discovery. This is 
 essential for off-node clients to send writes to the right ATS writer. This 
 should also handle the case of AM failures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle

2015-03-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366917#comment-14366917
 ] 

Varun Saxena commented on YARN-3047:


Thanks a lot [~zjshen] for the review.

bq. 1. No need to change timeline/TimelineEvents.java.
Ok.

bq. 2. In YarnConfiguration, how about we still reuse the existing timeline 
service config? I propose config reuse because there doesn't exist a use case 
where we start the old timeline server and the new timeline reader server 
together. And the change in WebAppUtils should not be necessary either.
The same config is used by the aggregator as well; that's why I kept a new 
config. I guess it is possible that the reader runs on the same node as the 
aggregator.

bq. 3. NameValuePair is for internal usage only. Let's keep it in the timeline 
service module?
It's in the timeline service package itself, i.e. 
{{hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/NameValuePair.java}}.
 Did you mean something else?

bq. Rename TimelineReaderStore to TimelineReader. 
Ok.

bq. I think we don't need to have NullTimelineReader. Instead, we should have a 
POC implementation based on the local FS, like FileSystemTimelineWriterImpl. But 
we can defer this work to a separate jira if the implementation is not 
straightforward.
Yes, NullTimelineReader was there just to make the code compile, as 
TimelineReader would be an interface. I plan to have an FS-based implementation 
as part of YARN-3051 and will update a patch for it once this goes in. Probably 
the store-related code can be removed from this JIRA and handled completely as 
part of YARN-3051 to keep the review focused. Thoughts?

bq. 5. TimelineReaderServer -> TimelineWebServer? For 
startTimelineReaderWebApp, can we do something similar to 
TimelineAggregatorsCollection#startWebApp.
The intention for TimelineReaderServer was not to have it merely act as a REST 
endpoint, hence not the name TimelineWebServer. TimelineReaderServer would use 
RPC as well, for instance to serve requests coming from the YARN CLI. Commands 
such as {{yarn application}} used to contact the AHS if the app was not found in 
the RM; this should now be handled by the Timeline Reader. For this, I plan to 
raise another JIRA once this one goes in.
 
bq. 6. Add the command in yarn and yarn.cmd to start the server.
As per the discussion with Sangjin, this will be done as part of YARN-3048.

I will probably update a document regarding TimelineReader as soon as possible.
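To make the server structure discussed above concrete, here is a minimal sketch of a reader server composed as a Hadoop CompositeService with room for both a web app and an RPC endpoint. This is an editor's illustration under stated assumptions, not the YARN-3047 patch; the class name and the commented-out start methods are hypothetical.

{code}
// Hypothetical sketch only -- not the actual YARN-3047 patch. It illustrates a
// reader server that is more than a bare REST endpoint, composed as a Hadoop
// service with room for both a web app and an RPC endpoint.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.service.CompositeService;

public class TimelineReaderServerSketch extends CompositeService {

  public TimelineReaderServerSketch() {
    super(TimelineReaderServerSketch.class.getName());
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    // A TimelineReader (storage-facing) implementation would be added as a
    // child service here, e.g. a filesystem-backed reader (see YARN-3051).
    super.serviceInit(conf);
  }

  @Override
  protected void serviceStart() throws Exception {
    super.serviceStart();
    // startTimelineReaderWebApp();  // REST endpoint
    // startRpcServer();             // e.g. to serve "yarn application" CLI queries
  }
}
{code}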



 [Data Serving] Set up ATS reader with basic request serving structure and 
 lifecycle
 ---

 Key: YARN-3047
 URL: https://issues.apache.org/jira/browse/YARN-3047
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Varun Saxena
 Attachments: YARN-3047.001.patch, YARN-3047.02.patch


 Per design in YARN-2928, set up the ATS reader as a service and implement the 
 basic structure as a service. It includes lifecycle management, request 
 serving, and so on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3181) FairScheduler: Fix up outdated findbugs issues

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366993#comment-14366993
 ] 

Hudson commented on YARN-3181:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #136 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/136/])
Revert YARN-3181. FairScheduler: Fix up outdated findbugs issues. (kasha) 
(kasha: rev 32b43304563c2430c00bc3e142a962d2bc5f4d58)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSOpDurations.java
* hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java


 FairScheduler: Fix up outdated findbugs issues
 --

 Key: YARN-3181
 URL: https://issues.apache.org/jira/browse/YARN-3181
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Brahma Reddy Battula
 Attachments: yarn-3181-1.patch


 In FairScheduler, we have excluded some findbugs-reported errors. Some of 
 them aren't applicable anymore, and there are a few that can be easily fixed 
 without needing an exclusion. It would be nice to fix them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366995#comment-14366995
 ] 

Hudson commented on YARN-3243:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #136 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/136/])
YARN-3243. CapacityScheduler should pass headroom from parent to children to 
make sure ParentQueue obey its capacity limits. Contributed by Wangda Tan. 
(jianhe: rev 487374b7fe0c92fc7eb1406c568952722b5d5b15)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java


 CapacityScheduler should pass headroom from parent to children to make sure 
 ParentQueue obey its capacity limits.
 -

 Key: YARN-3243
 URL: https://issues.apache.org/jira/browse/YARN-3243
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Fix For: 2.8.0

 Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, 
 YARN-3243.4.patch, YARN-3243.5.patch


 Now CapacityScheduler has some issues in making sure ParentQueue always obeys 
 its capacity limits, for example:
 1) When allocating a container of a parent queue, it will only check 
 parentQueue.usage < parentQueue.max. If a leaf queue allocates a container.size 
 > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max 
 resource limit, as in the following example:
 {code}
          A (usage=54, max=55)
         /  \
        A1    A2 (usage=1, max=55)
 (usage=53, max=53)
 {code}
 Queue-A2 is able to allocate a container since its usage < max, but if we do 
 that, A's usage can exceed A.max.
 2) When doing the continuous reservation check, the parent queue will only tell 
 its children "you need to unreserve *some* resource, so that I will be less 
 than my maximum resource", but it will not tell how much resource needs to be 
 unreserved. This may lead to the parent queue exceeding its configured maximum 
 capacity as well.
 With YARN-3099/YARN-3124, we now have the {{ResourceUsage}} class in each 
 queue; *here is my proposal*:
 - ParentQueue will set its children's ResourceUsage.headroom, which means the 
 *maximum resource its children can allocate*.
 - ParentQueue will set its children's headroom to (saying the parent's name is 
 qA): min(qA.headroom, qA.max - qA.used). This makes sure qA's ancestors' 
 capacity is enforced as well (qA.headroom is set by qA's parent); see the 
 sketch below.
 - {{needToUnReserve}} is not necessary; instead, children can get how much 
 resource 
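To make the min(qA.headroom, qA.max - qA.used) rule above concrete, here is a minimal, self-contained sketch. It is an editor's illustration of the proposal, not code from the attached patches, and plain longs stand in for YARN's Resource type.

{code}
// Editor's sketch (not from the YARN-3243 patches): the headroom a parent queue
// qA would hand down to each child, per the proposal above.
public final class HeadroomSketch {

  /** headroom(child) = min(headroom(qA), qA.max - qA.used) */
  static long childHeadroom(long parentHeadroom, long parentMax, long parentUsed) {
    return Math.min(parentHeadroom, parentMax - parentUsed);
  }

  public static void main(String[] args) {
    // Numbers from the example above: A has usage=54, max=55, and (assumed) no
    // tighter limit from A's own ancestors.
    long headroomForA1AndA2 = childHeadroom(Long.MAX_VALUE, 55, 54);
    // Each child may allocate at most 1 more unit, so A2 can no longer push A
    // past A.max.
    System.out.println("Headroom passed to A1/A2: " + headroomForA1AndA2); // 1
  }
}
{code}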

[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366996#comment-14366996
 ] 

Hudson commented on YARN-3273:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #136 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/136/])
YARN-3273. Improve scheduler UI to facilitate scheduling analysis and 
debugging. Contributed Rohith Sharmaks (jianhe: rev 
658097d6da1b1aac8e01db459f0c3b456e99652f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestContinuousScheduling.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/MetricsOverviewTable.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/UserInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java


 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for 

[jira] [Commented] (YARN-3197) Confusing log generated by CapacityScheduler

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366998#comment-14366998
 ] 

Hudson commented on YARN-3197:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #136 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/136/])
YARN-3197. Confusing log generated by CapacityScheduler. Contributed by 
(devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* hadoop-yarn-project/CHANGES.txt


 Confusing log generated by CapacityScheduler
 

 Key: YARN-3197
 URL: https://issues.apache.org/jira/browse/YARN-3197
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3197.001.patch, YARN-3197.002.patch, 
 YARN-3197.003.patch, YARN-3197.004.patch


 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366997#comment-14366997
 ] 

Hudson commented on YARN-3205:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #136 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/136/])
YARN-3205. FileSystemRMStateStore should disable FileSystem Cache to avoid get 
a Filesystem with an old configuration. Contributed by Zhihai Xu. (ozawa: rev 
3bc72cc16d8c7b8addd8f565523001dfcc32b891)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* hadoop-yarn-project/CHANGES.txt


 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting a 
 FileSystem with an old configuration. The old configuration may not have all 
 the customized DFS_CLIENT configurations needed for FileSystemRMStateStore.
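For context on the fix direction, here is a minimal sketch of two standard ways to avoid the shared FileSystem cache in Hadoop. It is an editor's illustration, not the YARN-3205 patch, and the namenode URI and store path are hypothetical.

{code}
// Editor's sketch, not the YARN-3205 patch: two standard ways to avoid the shared
// FileSystem cache so the store sees its own, freshly configured FileSystem.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsCacheBypassSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path storeRoot = new Path("hdfs://namenode:8020/rmstore"); // hypothetical path

    // Option 1: disable the cache for the scheme via configuration.
    conf.setBoolean("fs.hdfs.impl.disable.cache", true);
    FileSystem fs1 = storeRoot.getFileSystem(conf);

    // Option 2: FileSystem.newInstance() always bypasses the cache.
    FileSystem fs2 = FileSystem.newInstance(URI.create("hdfs://namenode:8020"), conf);

    System.out.println(fs1.getUri() + " / " + fs2.getUri());
  }
}
{code}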



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3357) Move TestFifoScheduler to FIFO package

2015-03-18 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3357:
-
Attachment: 0001-YARN-3357.patch

 Move TestFifoScheduler to FIFO package
 --

 Key: YARN-3357
 URL: https://issues.apache.org/jira/browse/YARN-3357
 Project: Hadoop YARN
  Issue Type: Task
  Components: scheduler
Reporter: Rohith
Assignee: Rohith
 Attachments: 0001-YARN-3357.patch


 There are 2 test classes for the fifo scheduler, i.e.
 # org.apache.hadoop.yarn.server.resourcemanager.TestFifoScheduler
 # 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler.
 Some test cases are common to both and verify the same functionality, e.g. 
 testBlackListNodes. Tests from the package 
 org.apache.hadoop.yarn.server.resourcemanager can be merged into the package 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo, eliminating the 
 duplicate tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3034) [Aggregator wireup] Implement RM starting its ATS writer

2015-03-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367031#comment-14367031
 ] 

Junping Du commented on YARN-3034:
--

bq. As a further clarification, my problem is mainly with the distributed shell 
test. Right now we're using very ad hoc ways to set which version of the 
timeline service we're using. Currently we're using test names to distinguish 
timeline V1 and V2, and since both versions work on the same port, we need to 
explicitly disable one version to use the other. Instead of doing this in the 
test script each time, I'd hope that there are some global settings/logic on the 
server side to decide which exact version of the timeline service to launch. All 
the tests need to do is check (and set) the version of the active timeline 
service and launch the mini YARN cluster. It's a little bit off topic here, so 
let's move the rest of the discussion to YARN-3352.
Thanks [~gtCarrera9] for clarifying this further. I agree that we should have a 
cleaner way to launch the v1 and v2 services in unit tests. Maybe launch both on 
different ports? Anyway, let's continue the discussion on YARN-3352.

Back to the latest patch, mostly looks fine to me. Two minor comments:
{code}
+  public static final String TIMELINE_SERVICE_VERSION = YARN_PREFIX
+      + "timeline-service.version";
{code} 
Can we replace this with TIMELINE_SERVICE_PREFIX + "version"?

{code}
+YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)
+ && conf.getBoolean(
+YarnConfiguration.SYSTEM_METRICS_PUBLISHER_ENABLED,
+YarnConfiguration.DEFAULT_SYSTEM_METRICS_PUBLISHER_ENABLED)
+ && YarnConfiguration.TIMELINE_SERVICE_VERSION_ONE.equals(conf.get(
+YarnConfiguration.TIMELINE_SERVICE_VERSION,
+YarnConfiguration.DEFAULT_TIMELINE_SERVICE_VERSION));
{code}
equals -> equalsIgnoreCase, as the user may input v1 or v2 (in lower case), 
which should also be accepted. Also, we should add a warning message to the log 
if the user puts something illegal here; otherwise it is just silently ignored 
without any warning.
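A minimal sketch of the equalsIgnoreCase-plus-warning suggestion above. This is an editor's illustration, not the actual patch; the VERSION_ONE/VERSION_TWO constants stand in for the YarnConfiguration ones referenced in the snippet.

{code}
// Editor's sketch of the suggestion above (not the actual patch): accept
// "v1"/"V1" case-insensitively and warn on unrecognized values instead of
// failing silently.
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class TimelineVersionCheckSketch {
  private static final Log LOG = LogFactory.getLog(TimelineVersionCheckSketch.class);

  // Hypothetical constants standing in for the YarnConfiguration ones.
  static final String VERSION_ONE = "v1";
  static final String VERSION_TWO = "v2";

  static boolean isV1(String configuredVersion) {
    if (VERSION_ONE.equalsIgnoreCase(configuredVersion)) {
      return true;
    }
    if (!VERSION_TWO.equalsIgnoreCase(configuredVersion)) {
      LOG.warn("Unrecognized timeline service version '" + configuredVersion
          + "', expected " + VERSION_ONE + " or " + VERSION_TWO);
    }
    return false;
  }
}
{code}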

BTW, [~sjlee0] has a refactor patch on YARN- which should get in quickly. 
This patch may need to be rebased once that one is in.

 [Aggregator wireup] Implement RM starting its ATS writer
 

 Key: YARN-3034
 URL: https://issues.apache.org/jira/browse/YARN-3034
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Naganarasimha G R
 Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, 
 YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch


 Per design in YARN-2928, implement resource managers starting their own ATS 
 writers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3181) FairScheduler: Fix up outdated findbugs issues

2015-03-18 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-3181:
---
Attachment: YARN-3181-002.patch

 FairScheduler: Fix up outdated findbugs issues
 --

 Key: YARN-3181
 URL: https://issues.apache.org/jira/browse/YARN-3181
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Brahma Reddy Battula
 Attachments: YARN-3181-002.patch, yarn-3181-1.patch


 In FairScheduler, we have excluded some findbugs-reported errors. Some of 
 them aren't applicable anymore, and there are a few that can be easily fixed 
 without needing an exclusion. It would be nice to fix them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle

2015-03-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367023#comment-14367023
 ] 

Varun Saxena commented on YARN-3047:


Did you mean NameValuePair can have package level access instead of public?

 [Data Serving] Set up ATS reader with basic request serving structure and 
 lifecycle
 ---

 Key: YARN-3047
 URL: https://issues.apache.org/jira/browse/YARN-3047
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Varun Saxena
 Attachments: YARN-3047.001.patch, YARN-3047.02.patch


 Per design in YARN-2928, set up the ATS reader as a service and implement the 
 basic structure as a service. It includes lifecycle management, request 
 serving, and so on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3357) Move TestFifoScheduler to FIFO package

2015-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367108#comment-14367108
 ] 

Hadoop QA commented on YARN-3357:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705333/0001-YARN-3357.patch
  against trunk revision 3411732.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7011//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7011//console

This message is automatically generated.

 Move TestFifoScheduler to FIFO package
 --

 Key: YARN-3357
 URL: https://issues.apache.org/jira/browse/YARN-3357
 Project: Hadoop YARN
  Issue Type: Task
  Components: scheduler
Reporter: Rohith
Assignee: Rohith
 Attachments: 0001-YARN-3357.patch


 There are 2 test classes for the fifo scheduler, i.e.
 # org.apache.hadoop.yarn.server.resourcemanager.TestFifoScheduler
 # 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler.
 Some test cases are common to both and verify the same functionality, e.g. 
 testBlackListNodes. Tests from the package 
 org.apache.hadoop.yarn.server.resourcemanager can be merged into the package 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo, eliminating the 
 duplicate tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367007#comment-14367007
 ] 

Hudson commented on YARN-3243:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #870 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/870/])
YARN-3243. CapacityScheduler should pass headroom from parent to children to 
make sure ParentQueue obey its capacity limits. Contributed by Wangda Tan. 
(jianhe: rev 487374b7fe0c92fc7eb1406c568952722b5d5b15)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java


 CapacityScheduler should pass headroom from parent to children to make sure 
 ParentQueue obey its capacity limits.
 -

 Key: YARN-3243
 URL: https://issues.apache.org/jira/browse/YARN-3243
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Fix For: 2.8.0

 Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, 
 YARN-3243.4.patch, YARN-3243.5.patch


 Now CapacityScheduler has some issues in making sure ParentQueue always obeys 
 its capacity limits, for example:
 1) When allocating a container of a parent queue, it will only check 
 parentQueue.usage < parentQueue.max. If a leaf queue allocates a container.size 
 > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max 
 resource limit, as in the following example:
 {code}
          A (usage=54, max=55)
         /  \
        A1    A2 (usage=1, max=55)
 (usage=53, max=53)
 {code}
 Queue-A2 is able to allocate a container since its usage < max, but if we do 
 that, A's usage can exceed A.max.
 2) When doing the continuous reservation check, the parent queue will only tell 
 its children "you need to unreserve *some* resource, so that I will be less 
 than my maximum resource", but it will not tell how much resource needs to be 
 unreserved. This may lead to the parent queue exceeding its configured maximum 
 capacity as well.
 With YARN-3099/YARN-3124, we now have the {{ResourceUsage}} class in each 
 queue; *here is my proposal*:
 - ParentQueue will set its children's ResourceUsage.headroom, which means the 
 *maximum resource its children can allocate*.
 - ParentQueue will set its children's headroom to (saying the parent's name is 
 qA): min(qA.headroom, qA.max - qA.used). This makes sure qA's ancestors' 
 capacity is enforced as well (qA.headroom is set by qA's parent).
 - {{needToUnReserve}} is not necessary; instead, children can get how much 
 resource need to be 

[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367009#comment-14367009
 ] 

Hudson commented on YARN-3205:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #870 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/870/])
YARN-3205. FileSystemRMStateStore should disable FileSystem Cache to avoid get 
a Filesystem with an old configuration. Contributed by Zhihai Xu. (ozawa: rev 
3bc72cc16d8c7b8addd8f565523001dfcc32b891)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
* hadoop-yarn-project/CHANGES.txt


 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting a 
 FileSystem with an old configuration. The old configuration may not have all 
 the customized DFS_CLIENT configurations needed for FileSystemRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3357) Move TestFifoScheduler to FIFO package

2015-03-18 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367030#comment-14367030
 ] 

Rohith commented on YARN-3357:
--

Attaching the patch with the following changes:
# Moved all the tests from 
org.apache.hadoop.yarn.server.resourcemanager.TestFifoScheduler to 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler
# Removed the duplicated test that verified the same functionality in both 
classes, i.e. {{testBlackListNodes}}
# Two other test classes were using TestFifoScheduler as the class name for 
logging in tests; I corrected them to use their own classes (see the sketch 
below).

Kindly review the patch.
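A minimal sketch of what change #3 looks like. This is an editor's illustration; the class name used here is hypothetical, not necessarily one of the classes touched by the patch.

{code}
// Editor's sketch of change #3 above: each test class should log under its own
// name rather than reusing TestFifoScheduler's.
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class TestCapacityScheduler {  // hypothetical example class
  // Before (copied from another test):
  // private static final Log LOG = LogFactory.getLog(TestFifoScheduler.class);
  // After: use the class's own name so log lines are attributed correctly.
  private static final Log LOG = LogFactory.getLog(TestCapacityScheduler.class);
}
{code}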

 Move TestFifoScheduler to FIFO package
 --

 Key: YARN-3357
 URL: https://issues.apache.org/jira/browse/YARN-3357
 Project: Hadoop YARN
  Issue Type: Task
  Components: scheduler
Reporter: Rohith
Assignee: Rohith
 Attachments: 0001-YARN-3357.patch


 There are 2 test classes for the fifo scheduler, i.e.
 # org.apache.hadoop.yarn.server.resourcemanager.TestFifoScheduler
 # 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler.
 Some test cases are common to both and verify the same functionality, e.g. 
 testBlackListNodes. Tests from the package 
 org.apache.hadoop.yarn.server.resourcemanager can be merged into the package 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo, eliminating the 
 duplicate tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367077#comment-14367077
 ] 

Hudson commented on YARN-3243:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2068 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2068/])
YARN-3243. CapacityScheduler should pass headroom from parent to children to 
make sure ParentQueue obey its capacity limits. Contributed by Wangda Tan. 
(jianhe: rev 487374b7fe0c92fc7eb1406c568952722b5d5b15)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java


 CapacityScheduler should pass headroom from parent to children to make sure 
 ParentQueue obey its capacity limits.
 -

 Key: YARN-3243
 URL: https://issues.apache.org/jira/browse/YARN-3243
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Fix For: 2.8.0

 Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, 
 YARN-3243.4.patch, YARN-3243.5.patch


 Now CapacityScheduler has some issues in making sure ParentQueue always obeys 
 its capacity limits, for example:
 1) When allocating a container of a parent queue, it will only check 
 parentQueue.usage < parentQueue.max. If a leaf queue allocates a container.size 
 > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max 
 resource limit, as in the following example:
 {code}
          A (usage=54, max=55)
         /  \
        A1    A2 (usage=1, max=55)
 (usage=53, max=53)
 {code}
 Queue-A2 is able to allocate a container since its usage < max, but if we do 
 that, A's usage can exceed A.max.
 2) When doing the continuous reservation check, the parent queue will only tell 
 its children "you need to unreserve *some* resource, so that I will be less 
 than my maximum resource", but it will not tell how much resource needs to be 
 unreserved. This may lead to the parent queue exceeding its configured maximum 
 capacity as well.
 With YARN-3099/YARN-3124, we now have the {{ResourceUsage}} class in each 
 queue; *here is my proposal*:
 - ParentQueue will set its children's ResourceUsage.headroom, which means the 
 *maximum resource its children can allocate*.
 - ParentQueue will set its children's headroom to (saying the parent's name is 
 qA): min(qA.headroom, qA.max - qA.used). This makes sure qA's ancestors' 
 capacity is enforced as well (qA.headroom is set by qA's parent).
 - {{needToUnReserve}} is not necessary; instead, children can get how much 
 resource need to be 

[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367079#comment-14367079
 ] 

Hudson commented on YARN-3205:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2068 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2068/])
YARN-3205. FileSystemRMStateStore should disable FileSystem Cache to avoid get 
a Filesystem with an old configuration. Contributed by Zhihai Xu. (ozawa: rev 
3bc72cc16d8c7b8addd8f565523001dfcc32b891)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java


 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting a 
 FileSystem with an old configuration. The old configuration may not have all 
 the customized DFS_CLIENT configurations needed for FileSystemRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3305) AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is less than minimumAllocation

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367076#comment-14367076
 ] 

Hudson commented on YARN-3305:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2068 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2068/])
YARN-3305. Normalize AM resource request on app submission. Contributed by 
Rohith Sharmaks (jianhe: rev 968425e9f7b850ff9c2ab8ca37a64c3fdbe77dbf)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java


 AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is 
 less than minimumAllocation
 

 Key: YARN-3305
 URL: https://issues.apache.org/jira/browse/YARN-3305
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3305.patch, 0001-YARN-3305.patch, 
 0002-YARN-3305.patch, 0003-YARN-3305.patch


 For any given ResourceRequest, {{CS#allocate}} normalizes the request to 
 minimumAllocation if the requested memory is less than minimumAllocation.
 But AM-used resource is updated with the actual ResourceRequest made by the 
 user. This results in AM container allocations exceeding the Max 
 ApplicationMaster Resource.
 This is because AM-Used is updated with the actual ResourceRequest made by the 
 user while activating the applications, whereas during container allocation the 
 ResourceRequest is normalized to minimumAllocation.
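As an editor's illustration of the normalization described above (not the YARN-3305 patch), here is a memory-only sketch; the real scheduler normalizes via a ResourceCalculator over memory and vcores.

{code}
// Editor's sketch (not the YARN-3305 patch): normalize the AM resource request
// on submission, so queue accounting and the eventual allocation use the same
// value.
public final class AmRequestNormalizationSketch {

  /** Round the request up to the scheduler minimum (and to a multiple of it). */
  static long normalizeMemory(long requestedMb, long minimumAllocationMb) {
    long rounded = Math.max(requestedMb, minimumAllocationMb);
    long steps = (rounded + minimumAllocationMb - 1) / minimumAllocationMb;
    return steps * minimumAllocationMb;
  }

  public static void main(String[] args) {
    // e.g. the user asks for 100 MB and the scheduler minimum is 1024 MB:
    // AM-used must be accounted as 1024 MB, matching what gets allocated.
    System.out.println(normalizeMemory(100, 1024)); // prints 1024
  }
}
{code}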



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367078#comment-14367078
 ] 

Hudson commented on YARN-3273:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2068 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2068/])
YARN-3273. Improve scheduler UI to facilitate scheduling analysis and 
debugging. Contributed Rohith Sharmaks (jianhe: rev 
658097d6da1b1aac8e01db459f0c3b456e99652f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/MetricsOverviewTable.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestContinuousScheduling.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/UserInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java


 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for reasons 

[jira] [Commented] (YARN-3197) Confusing log generated by CapacityScheduler

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367010#comment-14367010
 ] 

Hudson commented on YARN-3197:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #870 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/870/])
YARN-3197. Confusing log generated by CapacityScheduler. Contributed by 
(devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java


 Confusing log generated by CapacityScheduler
 

 Key: YARN-3197
 URL: https://issues.apache.org/jira/browse/YARN-3197
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3197.001.patch, YARN-3197.002.patch, 
 YARN-3197.003.patch, YARN-3197.004.patch


 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367008#comment-14367008
 ] 

Hudson commented on YARN-3273:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #870 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/870/])
YARN-3273. Improve scheduler UI to facilitate scheduling analysis and 
debugging. Contributed Rohith Sharmaks (jianhe: rev 
658097d6da1b1aac8e01db459f0c3b456e99652f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestContinuousScheduling.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/MetricsOverviewTable.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/UserInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java


 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for reasons such 

[jira] [Commented] (YARN-3305) AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is less than minimumAllocation

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367006#comment-14367006
 ] 

Hudson commented on YARN-3305:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #870 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/870/])
YARN-3305. Normalize AM resource request on app submission. Contributed by 
Rohith Sharmaks (jianhe: rev 968425e9f7b850ff9c2ab8ca37a64c3fdbe77dbf)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java


 AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is 
 less than minimumAllocation
 

 Key: YARN-3305
 URL: https://issues.apache.org/jira/browse/YARN-3305
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3305.patch, 0001-YARN-3305.patch, 
 0002-YARN-3305.patch, 0003-YARN-3305.patch


 For any given ResourceRequest, {{CS#allocate}} normalizes the request to 
 minimumAllocation if the requested memory is less than minimumAllocation.
 But the AM-used resource is updated with the actual ResourceRequest made by 
 the user. This can result in AM container allocations exceeding the Max 
 ApplicationMaster Resource limit.
 This happens because AM-Used is updated with the actual ResourceRequest made 
 by the user while activating applications, whereas during container 
 allocation the ResourceRequest is normalized to minimumAllocation.
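To make the accounting mismatch concrete, below is a minimal, hypothetical Java sketch of the kind of rounding described above; the class, method, and parameter names are illustrative only and are not the actual RMAppManager or CapacityScheduler code.

{code}
// Hypothetical illustration of request normalization; not the real scheduler code.
public final class AmRequestNormalizer {

  // Round the requested memory up to a multiple of the minimum allocation,
  // which is what the scheduler effectively hands out.
  static int normalizeMemory(int requestedMb, int minimumAllocationMb) {
    if (requestedMb <= minimumAllocationMb) {
      return minimumAllocationMb;
    }
    int multiples = (requestedMb + minimumAllocationMb - 1) / minimumAllocationMb;
    return multiples * minimumAllocationMb;
  }

  public static void main(String[] args) {
    System.out.println(normalizeMemory(100, 1024));   // 1024, not 100
    System.out.println(normalizeMemory(1500, 1024));  // 2048
  }
}
{code}

In this toy example the AM asks for 100 MB while the cluster minimum allocation is 1024 MB, so the container actually handed out is 1024 MB; if the queue's AM-used accounting records only the requested 100 MB, the Max ApplicationMaster Resource check is effectively bypassed.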



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3181) FairScheduler: Fix up outdated findbugs issues

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367005#comment-14367005
 ] 

Hudson commented on YARN-3181:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #870 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/870/])
Revert YARN-3181. FairScheduler: Fix up outdated findbugs issues. (kasha) 
(kasha: rev 32b43304563c2430c00bc3e142a962d2bc5f4d58)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSOpDurations.java
* hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml


 FairScheduler: Fix up outdated findbugs issues
 --

 Key: YARN-3181
 URL: https://issues.apache.org/jira/browse/YARN-3181
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Brahma Reddy Battula
 Attachments: yarn-3181-1.patch


 In FairScheduler, we have excluded some findbugs-reported errors. Some of 
 them aren't applicable anymore, and there are a few that can be easily fixed 
 without needing an exclusion. It would be nice to fix them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3364) Clarify Naming of yarn.client.nodemanager-connect.max-wait-ms and yarn.resourcemanager.connect.max-wait.ms

2015-03-18 Thread Andrew Johnson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367167#comment-14367167
 ] 

Andrew Johnson commented on YARN-3364:
--

No, I did not have YARN-3238 applied.  Thanks for that!

Given that and HADOOP-11398, I think this can be closed.

 Clarify Naming of yarn.client.nodemanager-connect.max-wait-ms and 
 yarn.resourcemanager.connect.max-wait.ms 
 ---

 Key: YARN-3364
 URL: https://issues.apache.org/jira/browse/YARN-3364
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: Andrew Johnson

 I encountered an issue recently where the ApplicationMaster for MapReduce 
 jobs would spend hours attempting to connect to a node in my cluster that had 
 died due to a hardware fault.  After debugging this, I found that the 
 yarn.client.nodemanager-connect.max-wait-ms property did not behave as I had 
 expected.  Based on the name I had thought this would set a maximum time 
 limit for attempting to connect to a NodeManager.  The code in 
 org.apache.hadoop.yarn.client.NMProxy corroborated this thought - it used a 
 RetryUpToMaximumTimeWithFixedSleep policy when a  ConnectTimeoutException was 
 thrown, as it was in my case with a dead node.
 However, the RetryUpToMaximumTimeWithFixedSleep policy doesn't actually set a 
 time limit, but instead divides the maximum time by the sleep period to set a 
 total number of retries, regardless of how long those retries take.  As such 
 I was seeing the ApplicationMaster spend much longer attempting to make a 
 connection than I had anticipated.
 The yarn.resourcemanager.connect.max-wait.ms would have the same behavior.  
 These properties would be better named like 
 yarn.client.nodemanager-connect.max.retries and 
 yarn.resourcemanager.connect.max.retries to better align with the actual 
 behavior.
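As a rough illustration of the behavior described above, the sketch below shows how a time budget plus a fixed sleep turn into a retry count; this is a simplified model with made-up numbers, not the actual org.apache.hadoop.io.retry implementation.

{code}
// Simplified model of RetryUpToMaximumTimeWithFixedSleep-style accounting;
// the numbers are invented for illustration.
public final class RetryBudgetExample {

  // The "max wait" is divided by the sleep period to get a retry count;
  // the time each attempt itself takes is not charged against the budget.
  static long retriesFor(long maxWaitMs, long sleepMs) {
    return maxWaitMs / sleepMs;
  }

  public static void main(String[] args) {
    long maxWaitMs = 15 * 60 * 1000;  // a nominal 15 minute "max wait"
    long sleepMs = 10 * 1000;         // 10 second sleep between attempts
    long retries = retriesFor(maxWaitMs, sleepMs);  // 90 retries

    // If each attempt blocks on a 20 second connect timeout before failing,
    // the wall-clock time is retries * (attempt + sleep), well past 15 minutes.
    long attemptMs = 20 * 1000;
    long wallClockMs = retries * (attemptMs + sleepMs);
    System.out.println(retries + " retries, ~" + (wallClockMs / 60000) + " minutes");
  }
}
{code}

With these made-up numbers the nominal 15 minute budget becomes 90 retries and roughly 45 minutes of wall-clock time, which matches the behavior reported above.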



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3305) AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is less than minimumAllocation

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367300#comment-14367300
 ] 

Hudson commented on YARN-3305:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2086 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2086/])
YARN-3305. Normalize AM resource request on app submission. Contributed by 
Rohith Sharmaks (jianhe: rev 968425e9f7b850ff9c2ab8ca37a64c3fdbe77dbf)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java


 AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is 
 less than minimumAllocation
 

 Key: YARN-3305
 URL: https://issues.apache.org/jira/browse/YARN-3305
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3305.patch, 0001-YARN-3305.patch, 
 0002-YARN-3305.patch, 0003-YARN-3305.patch


 For any given ResourceRequest, {{CS#allocate}} normalizes the request to 
 minimumAllocation if the requested memory is less than minimumAllocation.
 But the AM-used resource is updated with the actual ResourceRequest made by 
 the user. This can result in AM container allocations exceeding the Max 
 ApplicationMaster Resource limit.
 This happens because AM-Used is updated with the actual ResourceRequest made 
 by the user while activating applications, whereas during container 
 allocation the ResourceRequest is normalized to minimumAllocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3181) FairScheduler: Fix up outdated findbugs issues

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367240#comment-14367240
 ] 

Hudson commented on YARN-3181:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #127 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/127/])
Revert YARN-3181. FairScheduler: Fix up outdated findbugs issues. (kasha) 
(kasha: rev 32b43304563c2430c00bc3e142a962d2bc5f4d58)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSOpDurations.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java


 FairScheduler: Fix up outdated findbugs issues
 --

 Key: YARN-3181
 URL: https://issues.apache.org/jira/browse/YARN-3181
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Brahma Reddy Battula
 Attachments: YARN-3181-002.patch, yarn-3181-1.patch


 In FairScheduler, we have excluded some findbugs-reported errors. Some of 
 them aren't applicable anymore, and there are a few that can be easily fixed 
 without needing an exclusion. It would be nice to fix them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-3364) Clarify Naming of yarn.client.nodemanager-connect.max-wait-ms and yarn.resourcemanager.connect.max-wait.ms

2015-03-18 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved YARN-3364.
--
Resolution: Duplicate

Closing this as a duplicate of HADOOP-11398.

 Clarify Naming of yarn.client.nodemanager-connect.max-wait-ms and 
 yarn.resourcemanager.connect.max-wait.ms 
 ---

 Key: YARN-3364
 URL: https://issues.apache.org/jira/browse/YARN-3364
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: Andrew Johnson

 I encountered an issue recently where the ApplicationMaster for MapReduce 
 jobs would spend hours attempting to connect to a node in my cluster that had 
 died due to a hardware fault.  After debugging this, I found that the 
 yarn.client.nodemanager-connect.max-wait-ms property did not behave as I had 
 expected.  Based on the name I had thought this would set a maximum time 
 limit for attempting to connect to a NodeManager.  The code in 
 org.apache.hadoop.yarn.client.NMProxy corroborated this thought - it used a 
 RetryUpToMaximumTimeWithFixedSleep policy when a  ConnectTimeoutException was 
 thrown, as it was in my case with a dead node.
 However, the RetryUpToMaximumTimeWithFixedSleep policy doesn't actually set a 
 time limit, but instead divides the maximum time by the sleep period to set a 
 total number of retries, regardless of how long those retries take.  As such 
 I was seeing the ApplicationMaster spend much longer attempting to make a 
 connection than I had anticipated.
 The yarn.resourcemanager.connect.max-wait.ms would have the same behavior.  
 These properties would be better named like 
 yarn.client.nodemanager-connect.max.retries and 
 yarn.resourcemanager.connect.max.retries to better align with the actual 
 behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367242#comment-14367242
 ] 

Hudson commented on YARN-3243:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #127 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/127/])
YARN-3243. CapacityScheduler should pass headroom from parent to children to 
make sure ParentQueue obey its capacity limits. Contributed by Wangda Tan. 
(jianhe: rev 487374b7fe0c92fc7eb1406c568952722b5d5b15)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java


 CapacityScheduler should pass headroom from parent to children to make sure 
 ParentQueue obey its capacity limits.
 -

 Key: YARN-3243
 URL: https://issues.apache.org/jira/browse/YARN-3243
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Fix For: 2.8.0

 Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, 
 YARN-3243.4.patch, YARN-3243.5.patch


 Now CapacityScheduler has some issues ensuring that a ParentQueue always 
 obeys its capacity limits, for example:
 1) When allocating a container under a parent queue, it only checks 
 parentQueue.usage < parentQueue.max. If a leaf queue allocates a container of 
 size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its 
 max resource limit, as in the following example:
 {code}
 A (usage=54, max=55)
 +-- A1 (usage=53, max=53)
 +-- A2 (usage=1, max=55)
 {code}
 Queue-A2 is able to allocate a container since its usage < max, but if it 
 does, A's usage can exceed A.max.
 2) When doing the continuous reservation check, the parent queue only tells 
 its children "you need to unreserve *some* resource so that I stay below my 
 maximum resource", but it does not tell them how much resource needs to be 
 unreserved. This may also lead to the parent queue exceeding its configured 
 maximum capacity.
 With YARN-3099/YARN-3124, each queue class now has a {{ResourceUsage}} 
 object; *here is my proposal*:
 - ParentQueue will set its children's ResourceUsage.headroom, which means the 
 *maximum resource its children can allocate*.
 - ParentQueue will set its children's headroom to (saying the parent's name 
 is qA): min(qA.headroom, qA.max - qA.used), as sketched below. This makes 
 sure the capacity of qA's ancestors is enforced as well (qA.headroom is set 
 by qA's parent).
 - {{needToUnReserve}} is not necessary; instead, children can get how much 
 resource 
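Below is a small, hypothetical Java sketch of the proposed headroom propagation; the class and field names are placeholders, not the actual CapacityScheduler queue classes.

{code}
// Hypothetical sketch of the proposed headroom propagation; illustrative only.
import java.util.ArrayList;
import java.util.List;

final class QueueNode {
  final String name;
  long max;       // configured maximum resource
  long used;      // currently used resource
  long headroom;  // maximum resource this queue may still hand out
  final List<QueueNode> children = new ArrayList<>();

  QueueNode(String name, long max, long used) {
    this.name = name;
    this.max = max;
    this.used = used;
    this.headroom = max - used;  // the root starts from its own limit
  }

  // Push headroom down the tree: a child may never allocate more than
  // min(parent's headroom, parent's max - parent's used).
  void propagateHeadroom() {
    for (QueueNode child : children) {
      child.headroom = Math.min(this.headroom, this.max - this.used);
      child.propagateHeadroom();
    }
  }

  public static void main(String[] args) {
    QueueNode a = new QueueNode("A", 55, 54);
    QueueNode a1 = new QueueNode("A1", 53, 53);
    QueueNode a2 = new QueueNode("A2", 55, 1);
    a.children.add(a1);
    a.children.add(a2);
    a.propagateHeadroom();
    // A2's own usage (1) is far below its own max (55), but the propagated
    // headroom is min(1, 55 - 54) = 1, so A cannot be pushed past its max.
    System.out.println("A2 headroom = " + a2.headroom);  // 1
  }
}
{code}

With the numbers from the example above, A's headroom is 55 - 54 = 1, so A2 is told it may allocate at most 1 more unit even though its own usage is far below its own maximum.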

[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367244#comment-14367244
 ] 

Hudson commented on YARN-3205:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #127 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/127/])
YARN-3205. FileSystemRMStateStore should disable FileSystem Cache to avoid get 
a Filesystem with an old configuration. Contributed by Zhihai Xu. (ozawa: rev 
3bc72cc16d8c7b8addd8f565523001dfcc32b891)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
* hadoop-yarn-project/CHANGES.txt


 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting 
 a FileSystem with an old configuration. The old configuration may not have 
 all of the customized DFS_CLIENT configurations intended for 
 FileSystemRMStateStore.
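For context, here is a short sketch of two common ways a Hadoop client can avoid the FileSystem cache so that a freshly built Configuration actually takes effect; the URI and config values are placeholders, and this is only an illustration, not the YARN-3205 patch itself.

{code}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public final class FreshFileSystemExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // A store-specific client setting that a cached FileSystem, created
    // earlier from an older Configuration, would never pick up.
    conf.setInt("ipc.client.connect.max.retries", 10);

    URI storeUri = URI.create("hdfs://namenode:8020/rmstore");  // placeholder

    // Option 1: bypass the cache for this call only.
    FileSystem fs1 = FileSystem.newInstance(storeUri, conf);

    // Option 2: disable caching for the scheme, so FileSystem.get() also
    // returns a fresh instance built from this Configuration.
    conf.setBoolean("fs." + storeUri.getScheme() + ".impl.disable.cache", true);
    FileSystem fs2 = FileSystem.get(storeUri, conf);

    fs1.close();
    fs2.close();
  }
}
{code}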



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367243#comment-14367243
 ] 

Hudson commented on YARN-3273:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #127 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/127/])
YARN-3273. Improve scheduler UI to facilitate scheduling analysis and 
debugging. Contributed Rohith Sharmaks (jianhe: rev 
658097d6da1b1aac8e01db459f0c3b456e99652f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/MetricsOverviewTable.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestContinuousScheduling.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/UserInfo.java


 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for 

[jira] [Commented] (YARN-3305) AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is less than minimumAllocation

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367241#comment-14367241
 ] 

Hudson commented on YARN-3305:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #127 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/127/])
YARN-3305. Normalize AM resource request on app submission. Contributed by 
Rohith Sharmaks (jianhe: rev 968425e9f7b850ff9c2ab8ca37a64c3fdbe77dbf)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java


 AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is 
 less than minimumAllocation
 

 Key: YARN-3305
 URL: https://issues.apache.org/jira/browse/YARN-3305
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3305.patch, 0001-YARN-3305.patch, 
 0002-YARN-3305.patch, 0003-YARN-3305.patch


 For any given ResourceRequest, {{CS#allocate}} normalizes the request to 
 minimumAllocation if the requested memory is less than minimumAllocation.
 But the AM-used resource is updated with the actual ResourceRequest made by 
 the user. This can result in AM container allocations exceeding the Max 
 ApplicationMaster Resource limit.
 This happens because AM-Used is updated with the actual ResourceRequest made 
 by the user while activating applications, whereas during container 
 allocation the ResourceRequest is normalized to minimumAllocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3364) Clarify Naming of yarn.client.nodemanager-connect.max-wait-ms and yarn.resourcemanager.connect.max-wait.ms

2015-03-18 Thread Andrew Johnson (JIRA)
Andrew Johnson created YARN-3364:


 Summary: Clarify Naming of 
yarn.client.nodemanager-connect.max-wait-ms and 
yarn.resourcemanager.connect.max-wait.ms 
 Key: YARN-3364
 URL: https://issues.apache.org/jira/browse/YARN-3364
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: Andrew Johnson


I encountered an issue recently where the ApplicationMaster for MapReduce jobs 
would spend hours attempting to connect to a node in my cluster that had died 
due to a hardware fault.  After debugging this, I found that the 
yarn.client.nodemanager-connect.max-wait-ms property did not behave as I had 
expected.  Based on the name I had thought this would set a maximum time limit 
for attempting to connect to a NodeManager.  The code in 
org.apache.hadoop.yarn.client.NMProxy corroborated this thought - it used a 
RetryUpToMaximumTimeWithFixedSleep policy when a  ConnectTimeoutException was 
thrown, as it was in my case with a dead node.

However, the RetryUpToMaximumTimeWithFixedSleep policy doesn't actually set a 
time limit, but instead divides the maximum time by the sleep period to set a 
total number of retries, regardless of how long those retries take.  As such I 
was seeing the ApplicationMaster spend much longer attempting to make a 
connection than I had anticipated.

The yarn.resourcemanager.connect.max-wait.ms would have the same behavior.  
These properties would be better named like 
yarn.client.nodemanager-connect.max.retries and 
yarn.resourcemanager.connect.max.retries to better align with the actual 
behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3364) Clarify Naming of yarn.client.nodemanager-connect.max-wait-ms and yarn.resourcemanager.connect.max-wait.ms

2015-03-18 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367164#comment-14367164
 ] 

Jason Lowe commented on YARN-3364:
--

Does your Hadoop build have the fix for YARN-3238?  If not, that would explain 
the long retries you were seeing.  Also, the "it's not a maximum time but a 
hacked-up guess at a number of retries" issue is being tracked in HADOOP-11398.

 Clarify Naming of yarn.client.nodemanager-connect.max-wait-ms and 
 yarn.resourcemanager.connect.max-wait.ms 
 ---

 Key: YARN-3364
 URL: https://issues.apache.org/jira/browse/YARN-3364
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: Andrew Johnson

 I encountered an issue recently where the ApplicationMaster for MapReduce 
 jobs would spend hours attempting to connect to a node in my cluster that had 
 died due to a hardware fault.  After debugging this, I found that the 
 yarn.client.nodemanager-connect.max-wait-ms property did not behave as I had 
 expected.  Based on the name I had thought this would set a maximum time 
 limit for attempting to connect to a NodeManager.  The code in 
 org.apache.hadoop.yarn.client.NMProxy corroborated this thought - it used a 
 RetryUpToMaximumTimeWithFixedSleep policy when a  ConnectTimeoutException was 
 thrown, as it was in my case with a dead node.
 However, the RetryUpToMaximumTimeWithFixedSleep policy doesn't actually set a 
 time limit, but instead divides the maximum time by the sleep period to set a 
 total number of retries, regardless of how long those retries take.  As such 
 I was seeing the ApplicationMaster spend much longer attempting to make a 
 connection than I had anticipated.
 The yarn.resourcemanager.connect.max-wait.ms would have the same behavior.  
 These properties would be better named like 
 yarn.client.nodemanager-connect.max.retries and 
 yarn.resourcemanager.connect.max.retries to better align with the actual 
 behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367302#comment-14367302
 ] 

Hudson commented on YARN-3273:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2086 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2086/])
YARN-3273. Improve scheduler UI to facilitate scheduling analysis and 
debugging. Contributed Rohith Sharmaks (jianhe: rev 
658097d6da1b1aac8e01db459f0c3b456e99652f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/MetricsOverviewTable.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/UserInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestContinuousScheduling.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptMetrics.java


 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for 

[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367303#comment-14367303
 ] 

Hudson commented on YARN-3205:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2086 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2086/])
YARN-3205. FileSystemRMStateStore should disable FileSystem Cache to avoid get 
a Filesystem with an old configuration. Contributed by Zhihai Xu. (ozawa: rev 
3bc72cc16d8c7b8addd8f565523001dfcc32b891)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* hadoop-yarn-project/CHANGES.txt


 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting 
 a FileSystem with an old configuration. The old configuration may not have 
 all of the customized DFS_CLIENT configurations intended for 
 FileSystemRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367301#comment-14367301
 ] 

Hudson commented on YARN-3243:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2086 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2086/])
YARN-3243. CapacityScheduler should pass headroom from parent to children to 
make sure ParentQueue obey its capacity limits. Contributed by Wangda Tan. 
(jianhe: rev 487374b7fe0c92fc7eb1406c568952722b5d5b15)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java


 CapacityScheduler should pass headroom from parent to children to make sure 
 ParentQueue obey its capacity limits.
 -

 Key: YARN-3243
 URL: https://issues.apache.org/jira/browse/YARN-3243
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Fix For: 2.8.0

 Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, 
 YARN-3243.4.patch, YARN-3243.5.patch


 Now CapacityScheduler has some issues ensuring that a ParentQueue always 
 obeys its capacity limits, for example:
 1) When allocating a container under a parent queue, it only checks 
 parentQueue.usage < parentQueue.max. If a leaf queue allocates a container of 
 size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its 
 max resource limit, as in the following example:
 {code}
 A (usage=54, max=55)
 +-- A1 (usage=53, max=53)
 +-- A2 (usage=1, max=55)
 {code}
 Queue-A2 is able to allocate a container since its usage < max, but if it 
 does, A's usage can exceed A.max.
 2) When doing the continuous reservation check, the parent queue only tells 
 its children "you need to unreserve *some* resource so that I stay below my 
 maximum resource", but it does not tell them how much resource needs to be 
 unreserved. This may also lead to the parent queue exceeding its configured 
 maximum capacity.
 With YARN-3099/YARN-3124, each queue class now has a {{ResourceUsage}} 
 object; *here is my proposal*:
 - ParentQueue will set its children's ResourceUsage.headroom, which means the 
 *maximum resource its children can allocate*.
 - ParentQueue will set its children's headroom to (saying the parent's name 
 is qA): min(qA.headroom, qA.max - qA.used). This makes sure the capacity of 
 qA's ancestors is enforced as well (qA.headroom is set by qA's parent).
 - {{needToUnReserve}} is not necessary; instead, children can get how much 
 resource 

[jira] [Commented] (YARN-3181) FairScheduler: Fix up outdated findbugs issues

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367299#comment-14367299
 ] 

Hudson commented on YARN-3181:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2086 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2086/])
Revert YARN-3181. FairScheduler: Fix up outdated findbugs issues. (kasha) 
(kasha: rev 32b43304563c2430c00bc3e142a962d2bc5f4d58)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSOpDurations.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* hadoop-yarn-project/CHANGES.txt


 FairScheduler: Fix up outdated findbugs issues
 --

 Key: YARN-3181
 URL: https://issues.apache.org/jira/browse/YARN-3181
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Brahma Reddy Battula
 Attachments: YARN-3181-002.patch, yarn-3181-1.patch


 In FairScheduler, we have excluded some findbugs-reported errors. Some of 
 them aren't applicable anymore, and there are a few that can be easily fixed 
 without needing an exclusion. It would be nice to fix them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367317#comment-14367317
 ] 

Hudson commented on YARN-3243:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #136 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/136/])
YARN-3243. CapacityScheduler should pass headroom from parent to children to 
make sure ParentQueue obey its capacity limits. Contributed by Wangda Tan. 
(jianhe: rev 487374b7fe0c92fc7eb1406c568952722b5d5b15)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java


 CapacityScheduler should pass headroom from parent to children to make sure 
 ParentQueue obey its capacity limits.
 -

 Key: YARN-3243
 URL: https://issues.apache.org/jira/browse/YARN-3243
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Fix For: 2.8.0

 Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, 
 YARN-3243.4.patch, YARN-3243.5.patch


 Now CapacityScheduler has some issues ensuring that a ParentQueue always 
 obeys its capacity limits, for example:
 1) When allocating a container under a parent queue, it only checks 
 parentQueue.usage < parentQueue.max. If a leaf queue allocates a container of 
 size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its 
 max resource limit, as in the following example:
 {code}
 A (usage=54, max=55)
 +-- A1 (usage=53, max=53)
 +-- A2 (usage=1, max=55)
 {code}
 Queue-A2 is able to allocate a container since its usage < max, but if it 
 does, A's usage can exceed A.max.
 2) When doing the continuous reservation check, the parent queue only tells 
 its children "you need to unreserve *some* resource so that I stay below my 
 maximum resource", but it does not tell them how much resource needs to be 
 unreserved. This may also lead to the parent queue exceeding its configured 
 maximum capacity.
 With YARN-3099/YARN-3124, each queue class now has a {{ResourceUsage}} 
 object; *here is my proposal*:
 - ParentQueue will set its children's ResourceUsage.headroom, which means the 
 *maximum resource its children can allocate*.
 - ParentQueue will set its children's headroom to (saying the parent's name 
 is qA): min(qA.headroom, qA.max - qA.used). This makes sure the capacity of 
 qA's ancestors is enforced as well (qA.headroom is set by qA's parent).
 - {{needToUnReserve}} is not necessary; instead, children can get how much 
 

[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367319#comment-14367319
 ] 

Hudson commented on YARN-3205:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #136 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/136/])
YARN-3205. FileSystemRMStateStore should disable FileSystem Cache to avoid get 
a Filesystem with an old configuration. Contributed by Zhihai Xu. (ozawa: rev 
3bc72cc16d8c7b8addd8f565523001dfcc32b891)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java


 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting 
 a FileSystem with an old configuration. The old configuration may not have 
 all of the customized DFS_CLIENT configurations intended for 
 FileSystemRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle

2015-03-18 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367413#comment-14367413
 ] 

Zhijie Shen commented on YARN-3047:
---

bq. Did you mean NameValuePair can have package level access instead of public?

It shouldn't be part of the api module, but of the timeline service module.

 [Data Serving] Set up ATS reader with basic request serving structure and 
 lifecycle
 ---

 Key: YARN-3047
 URL: https://issues.apache.org/jira/browse/YARN-3047
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Varun Saxena
 Attachments: YARN-3047.001.patch, YARN-3047.02.patch


 Per design in YARN-2938, set up the ATS reader as a service and implement the 
 basic structure as a service. It includes lifecycle management, request 
 serving, and so on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3305) AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is less than minimumAllocation

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367316#comment-14367316
 ] 

Hudson commented on YARN-3305:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #136 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/136/])
YARN-3305. Normalize AM resource request on app submission. Contributed by 
Rohith Sharmaks (jianhe: rev 968425e9f7b850ff9c2ab8ca37a64c3fdbe77dbf)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java


 AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is 
 less than minimumAllocation
 

 Key: YARN-3305
 URL: https://issues.apache.org/jira/browse/YARN-3305
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3305.patch, 0001-YARN-3305.patch, 
 0002-YARN-3305.patch, 0003-YARN-3305.patch


 For any given ResourceRequest, {{CS#allocate}} normalizes the request to 
 minimumAllocation if the requested memory is less than minimumAllocation.
 But the AM-used resource is updated with the actual ResourceRequest made by 
 the user. This can result in AM container allocations exceeding the Max 
 ApplicationMaster Resource limit.
 This happens because AM-Used is updated with the actual ResourceRequest made 
 by the user while activating applications, whereas during container 
 allocation the ResourceRequest is normalized to minimumAllocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367318#comment-14367318
 ] 

Hudson commented on YARN-3273:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #136 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/136/])
YARN-3273. Improve scheduler UI to facilitate scheduling analysis and 
debugging. Contributed Rohith Sharmaks (jianhe: rev 
658097d6da1b1aac8e01db459f0c3b456e99652f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/MetricsOverviewTable.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestContinuousScheduling.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/UserInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java


 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be 

[jira] [Commented] (YARN-3181) FairScheduler: Fix up outdated findbugs issues

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367315#comment-14367315
 ] 

Hudson commented on YARN-3181:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #136 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/136/])
Revert YARN-3181. FairScheduler: Fix up outdated findbugs issues. (kasha) 
(kasha: rev 32b43304563c2430c00bc3e142a962d2bc5f4d58)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSOpDurations.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml


 FairScheduler: Fix up outdated findbugs issues
 --

 Key: YARN-3181
 URL: https://issues.apache.org/jira/browse/YARN-3181
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Brahma Reddy Battula
 Attachments: YARN-3181-002.patch, yarn-3181-1.patch


 In FairScheduler, we have excluded some findbugs-reported errors. Some of 
 them aren't applicable anymore, and there are a few that can be easily fixed 
 without needing an exclusion. It would be nice to fix them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3333) rename TimelineAggregator etc. to TimelineCollector

2015-03-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367410#comment-14367410
 ] 

Sangjin Lee commented on YARN-3333:
---

Back in progress.

 rename TimelineAggregator etc. to TimelineCollector
 ---

 Key: YARN-3333
 URL: https://issues.apache.org/jira/browse/YARN-3333
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-3333.001.patch


 Per discussions on YARN-2928, let's rename TimelineAggregator, etc. to 
 TimelineCollector, etc.
 There are also several minor issues on the current branch, which can be fixed 
 as part of this:
 - fixing some imports
 - missing license in TestTimelineServerClientIntegration.java
 - whitespaces
 - missing direct dependency



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler

2015-03-18 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3241:

Attachment: YARN-3241.001.patch

 Leading space, trailing space and empty sub queue name may cause 
 MetricsException for fair scheduler
 

 Key: YARN-3241
 URL: https://issues.apache.org/jira/browse/YARN-3241
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3241.000.patch, YARN-3241.001.patch


 Leading space, trailing space and empty sub queue names may cause a 
 MetricsException (Metrics source XXX already exists!) when an application is 
 added to FairScheduler.
 The reason is that QueueMetrics parses the queue name differently from 
 QueueManager.
 QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and 
 trailing spaces in each sub queue name and also drops empty sub queue names.
 {code}
   static final Splitter Q_SPLITTER =
   Splitter.on('.').omitEmptyStrings().trimResults(); 
 {code}
 But QueueManager won't remove leading spaces, trailing spaces or empty sub 
 queue names.
 This causes FSQueue and FSQueueMetrics to go out of sync:
 QueueManager considers the two queue names different, so it will try to 
 create a new queue, but FSQueueMetrics treats them as the same queue, which 
 raises the Metrics source XXX already exists! MetricsException.
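
 A small stand-alone illustration of that mismatch (Guava only; the queue 
 names are hypothetical):
 {code}
import com.google.common.base.Splitter;
import com.google.common.collect.Lists;

public class QueueNameMismatchSketch {
  public static void main(String[] args) {
    // Same splitter as quoted above.
    Splitter qSplitter = Splitter.on('.').omitEmptyStrings().trimResults();

    String withSpaces = "root. queueA ";
    String clean = "root.queueA";

    // QueueMetrics-style parsing: both collapse to [root, queueA] -> one metrics source.
    System.out.println(Lists.newArrayList(qSplitter.split(withSpaces)));
    System.out.println(Lists.newArrayList(qSplitter.split(clean)));

    // QueueManager-style comparison: the raw names differ -> two FSQueue objects.
    System.out.println(withSpaces.equals(clean)); // false
  }
}
 {code}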



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle

2015-03-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367523#comment-14367523
 ] 

Varun Saxena commented on YARN-3047:


Yes, it is part of that. I have kept it inside 
{{hadoop-yarn-server-timelineservice}}

 [Data Serving] Set up ATS reader with basic request serving structure and 
 lifecycle
 ---

 Key: YARN-3047
 URL: https://issues.apache.org/jira/browse/YARN-3047
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Varun Saxena
 Attachments: YARN-3047.001.patch, YARN-3047.02.patch


 Per design in YARN-2938, set up the ATS reader as a service and implement the 
 basic structure as a service. It includes lifecycle management, request 
 serving, and so on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework

2015-03-18 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367546#comment-14367546
 ] 

Hitesh Shah commented on YARN-2375:
---

[~jeagles] just pointed me to this jira. 

Firstly, this seems like an incompatible change for 2.6.0. 

Second, the semantics of the property yarn.timeline-service.enabled have 
changed. Earlier, this seemed like a global/admin flag at the YARN level that 
controlled whether ATS was enabled or disabled. Now, the assumption seems to be 
that every application framework needs to check a YARN config property before 
deciding whether to use ATS? 

There is also an inconsistency in how YarnClient behaves as compared to 
TimelineClient: YarnClient obeys the yarn.timeline-service.enabled flag, but 
TimelineClient does not. 

[~zjshen] [~jeagles] [~vinodkv] [~mitdesai] Comments? 
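
For reference, a minimal sketch of the per-framework check being discussed, 
i.e. what a framework (or YarnClient) would now do itself instead of relying 
on TimelineClientImpl to check the flag:
 {code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class PerFrameworkAtsCheck {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    // Consult the global flag before creating/starting a timeline client.
    if (conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
        YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
      TimelineClient client = TimelineClient.createTimelineClient();
      client.init(conf);
      client.start();
      // ... publish entities, then client.stop() on shutdown.
    }
  }
}
 {code}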

 Allow enabling/disabling timeline server per framework
 --

 Key: YARN-2375
 URL: https://issues.apache.org/jira/browse/YARN-2375
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Fix For: 2.7.0, 2.6.1

 Attachments: YARN-2375.1.patch, YARN-2375.patch, YARN-2375.patch, 
 YARN-2375.patch, YARN-2375.patch


 This JIRA is to remove the ats enabled flag check within the 
 TimelineClientImpl. Example where this fails is below.
 While running secure timeline server with ats flag set to disabled on 
 resource manager, Timeline delegation token renewer throws an NPE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3111) Fix ratio problem on FairScheduler page

2015-03-18 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367563#comment-14367563
 ] 

Ashwin Shankar commented on YARN-3111:
--

I don't think 1 is done in the patch. My bad, I wasn't clear. In 1, what I was 
suggesting was: represent, on the bar, whichever resource of steady/instant/max 
is dominant in the usage or used resources, so that steady/instant/max/usage on 
the bar all finally represent ONE dimension for that bar/queue. So let's say 
the usage of the queue is (20% mem, 60% vcore); then steady/instant/max/usage 
would all display only vcore.

bq. 3 is good, but one question is that parent queue has no tooltip now, but it 
has its own bar.
Parent queues (except root) have a tooltip, I just checked in trunk. Can you 
check again?

bq. And think over 3 & 4, what about listing all resources' usage percent on 
the text on the right of each bar? Maybe color red for dominant resource? or 
just judge it by comparing percent number?
It would be nice to have that with the color, however I'm concerned that it 
might look ugly from a UX perspective.

bq. And also what do you think of the issue I mentioned above? I think it still 
can happen after 1 & 2, cause for one queue: steady, fair, max, usage resource 
may have different dominant resource type. If I make a mistake here, please let 
me know.
I believe this is clarified in the first paragraph of this comment. Let me know 
if you still have this concern.
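
A tiny numeric sketch of the dominant-dimension idea in the first paragraph 
(the 20%/60% figures are the hypothetical usage from the comment):
 {code}
public class DominantDimensionSketch {
  public static void main(String[] args) {
    double memUsedPct = 0.20;    // 20% of cluster memory used by the queue
    double vcoreUsedPct = 0.60;  // 60% of cluster vcores used by the queue

    // Pick the dominant dimension once, then draw steady/instant/max/usage
    // for that single dimension so every segment of the bar is comparable.
    String dominant = vcoreUsedPct >= memUsedPct ? "vcores" : "memory";
    double usageShare = Math.max(memUsedPct, vcoreUsedPct);
    System.out.println(dominant + " bar, usage = " + (usageShare * 100) + "%");
  }
}
 {code}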

 Fix ratio problem on FairScheduler page
 ---

 Key: YARN-3111
 URL: https://issues.apache.org/jira/browse/YARN-3111
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Peng Zhang
Assignee: Peng Zhang
Priority: Minor
 Attachments: YARN-3111.1.patch, YARN-3111.png


 Found 3 problems on the FairScheduler page:
 1. Only memory is used to compute the ratio, even when the queue 
 schedulingPolicy is DRF.
 2. When min resources is configured larger than the real resources, the 
 steady fair share ratio bar is so long that it runs off the page.
 3. When cluster resources are 0 (no nodemanager started), the ratio is 
 displayed as NaN% used.
 The attached image shows a snapshot of the above problems. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2693) Priority Label Manager in RM to manage application priority based on configuration

2015-03-18 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367477#comment-14367477
 ] 

Sunil G commented on YARN-2693:
---

Thank you [~wangda] for sharing comments.

As we move the queue-specific config inside scheduler.Queue, are we also 
moving the ACLs back to the scheduler (the ACLs w.r.t. priority)? It may be 
better to control the ACLs from outside, via YarnAuthorizer, and keep only the 
config w.r.t. the scheduler. Please share your thoughts.

Regarding the methods in ApplicationPriorityManager, it looks fine overall, but 
I suggest we may also need:
* getClusterApplicationPriorities (if it is a range, that can be sent back); a 
rough interface sketch follows below.
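
A rough sketch of the interface shape under discussion; only 
getClusterApplicationPriorities comes from the suggestion above, and every 
other method name and signature here is illustrative rather than the actual 
patch:
 {code}
import org.apache.hadoop.yarn.api.records.Priority;

public interface ApplicationPriorityManager {
  /** Cluster-wide priority range, e.g. {lowest, highest}. */
  int[] getClusterApplicationPriorities();

  /** Default priority label configured for the given queue. */
  Priority getDefaultPriority(String queueName);

  /** Validate a priority label submitted for the given queue. */
  boolean isValidPriority(String queueName, Priority priority);
}
 {code}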


 Priority Label Manager in RM to manage application priority based on 
 configuration
 --

 Key: YARN-2693
 URL: https://issues.apache.org/jira/browse/YARN-2693
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-2693.patch, 0002-YARN-2693.patch, 
 0003-YARN-2693.patch, 0004-YARN-2693.patch, 0005-YARN-2693.patch, 
 0006-YARN-2693.patch


 Focus of this JIRA is to have a centralized service to handle priority labels.
 Support operations such as
 * Add/Delete priority label to a specified queue
 * Manage integer mapping associated with each priority label
 * Support managing default priority label of a given queue
 * Expose interface to RM to validate priority label
 To keep the interface simple, the Priority Manager will support only a 
 configuration file, in contrast with admin CLI and REST. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler

2015-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367628#comment-14367628
 ] 

Hadoop QA commented on YARN-3241:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705388/YARN-3241.001.patch
  against trunk revision 9d72f93.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7012//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7012//console

This message is automatically generated.

 Leading space, trailing space and empty sub queue name may cause 
 MetricsException for fair scheduler
 

 Key: YARN-3241
 URL: https://issues.apache.org/jira/browse/YARN-3241
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3241.000.patch, YARN-3241.001.patch


 Leading space, trailing space and empty sub queue names may cause a 
 MetricsException (Metrics source XXX already exists!) when an application is 
 added to FairScheduler.
 The reason is that QueueMetrics parses the queue name differently from 
 QueueManager.
 QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and 
 trailing spaces in each sub queue name and also drops empty sub queue names.
 {code}
   static final Splitter Q_SPLITTER =
   Splitter.on('.').omitEmptyStrings().trimResults(); 
 {code}
 But QueueManager won't remove leading spaces, trailing spaces or empty sub 
 queue names.
 This causes FSQueue and FSQueueMetrics to go out of sync:
 QueueManager considers the two queue names different, so it will try to 
 create a new queue, but FSQueueMetrics treats them as the same queue, which 
 raises the Metrics source XXX already exists! MetricsException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3351) AppMaster tracking URL is broken in HA

2015-03-18 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3351:

Attachment: YARN-3351.002.patch

Addressed comments 1 and 3. For 2, I meant any IP address; I made it obvious by 
changing it to 1.2.3.4.
Also made the test not leave behind any mapping and restore any mapping it 
might affect.
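
For context, the failure in the stack trace below can be reproduced outside 
YARN: the proxy binds its outgoing socket to an address that may not be local 
on the proxying host, and binding to a non-local address is what produces this 
exception (the placeholder 1.2.3.4 is the same one used in the test):
 {code}
import java.net.InetSocketAddress;
import java.net.Socket;

public class BindRepro {
  public static void main(String[] args) throws Exception {
    try (Socket s = new Socket()) {
      // Binding the client socket to an address that is not local to this host
      // fails with "java.net.BindException: Cannot assign requested address".
      s.bind(new InetSocketAddress("1.2.3.4", 0));
    }
  }
}
 {code}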

 AppMaster tracking URL is broken in HA
 --

 Key: YARN-3351
 URL: https://issues.apache.org/jira/browse/YARN-3351
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3351.001.patch, YARN-3351.002.patch


 After YARN-2713, the AppMaster link is broken in HA.  To repro 
 a) setup RM HA and ensure the first RM is not active,
 b) run a long sleep job and view the tracking url on the RM applications page
 The log and full stack trace is shown below
 {noformat}
 2015-02-05 20:47:43,478 WARN org.mortbay.log: 
 /proxy/application_1423182188062_0002/: java.net.BindException: Cannot assign 
 requested address
 {noformat}
 {noformat}
 java.net.BindException: Cannot assign requested address
   at java.net.PlainSocketImpl.socketBind(Native Method)
   at 
 java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)
   at java.net.Socket.bind(Socket.java:631)
   at java.net.Socket.init(Socket.java:423)
   at java.net.Socket.init(Socket.java:280)
   at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
   at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
   at 
 org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
   at 
 org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
   at 
 org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
   at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
   at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:346)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.proxyLink(WebAppProxyServlet.java:188)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:345)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
   at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
   at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler

2015-03-18 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367484#comment-14367484
 ] 

zhihai xu commented on YARN-3241:
-

Hi [~kasha], thanks for the review; your suggestion sounds reasonable to me. I 
uploaded a new patch, YARN-3241.001.patch, which addresses your comment.
I also found that we need to check the queue names in the FairScheduler config 
file to avoid a similar issue, so I added code to check the queue names in the 
config file and a test case for it. Please review it.

 Leading space, trailing space and empty sub queue name may cause 
 MetricsException for fair scheduler
 

 Key: YARN-3241
 URL: https://issues.apache.org/jira/browse/YARN-3241
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3241.000.patch, YARN-3241.001.patch


 Leading space, trailing space and empty sub queue names may cause a 
 MetricsException (Metrics source XXX already exists!) when an application is 
 added to FairScheduler.
 The reason is that QueueMetrics parses the queue name differently from 
 QueueManager.
 QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and 
 trailing spaces in each sub queue name and also drops empty sub queue names.
 {code}
   static final Splitter Q_SPLITTER =
   Splitter.on('.').omitEmptyStrings().trimResults(); 
 {code}
 But QueueManager won't remove leading spaces, trailing spaces or empty sub 
 queue names.
 This causes FSQueue and FSQueueMetrics to go out of sync:
 QueueManager considers the two queue names different, so it will try to 
 create a new queue, but FSQueueMetrics treats them as the same queue, which 
 raises the Metrics source XXX already exists! MetricsException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler

2015-03-18 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3241:

Attachment: (was: YARN-3241.001.patch)

 Leading space, trailing space and empty sub queue name may cause 
 MetricsException for fair scheduler
 

 Key: YARN-3241
 URL: https://issues.apache.org/jira/browse/YARN-3241
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3241.000.patch, YARN-3241.001.patch


 Leading space, trailing space and empty sub queue names may cause a 
 MetricsException (Metrics source XXX already exists!) when an application is 
 added to FairScheduler.
 The reason is that QueueMetrics parses the queue name differently from 
 QueueManager.
 QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and 
 trailing spaces in each sub queue name and also drops empty sub queue names.
 {code}
   static final Splitter Q_SPLITTER =
   Splitter.on('.').omitEmptyStrings().trimResults(); 
 {code}
 But QueueManager won't remove leading spaces, trailing spaces or empty sub 
 queue names.
 This causes FSQueue and FSQueueMetrics to go out of sync:
 QueueManager considers the two queue names different, so it will try to 
 create a new queue, but FSQueueMetrics treats them as the same queue, which 
 raises the Metrics source XXX already exists! MetricsException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp

2015-03-18 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367536#comment-14367536
 ] 

Yongjun Zhang commented on YARN-3021:
-

Hi [~jianhe] and all,

I resumed working on this and found an obstacle here. 

See org.apache.hadoop.security.token.Token:

{code}
 private synchronized TokenRenewer getRenewer() throws IOException {
if (renewer != null) {
  return renewer;
}
renewer = TRIVIAL_RENEWER;
synchronized (renewers) {
  for (TokenRenewer canidate : renewers) {
if (canidate.handleKind(this.kind)) {
  renewer = canidate;
  return renewer;
}
  }
}
LOG.warn("No TokenRenewer defined for token kind " + this.kind);
return renewer;
  }

 public boolean isManaged() throws IOException {
return getRenewer().isManaged(this);
  }

  public long renew(Configuration conf
) throws IOException, InterruptedException {
return getRenewer().renew(this, conf);
  }
  
  public void cancel(Configuration conf
 ) throws IOException, InterruptedException {
getRenewer().cancel(this, conf);
  }

{code}

We can see that {{getRenewer()}} does more work than simply returning the 
renewer, and a non-null renewer is currently guaranteed to be returned. The 
other methods listed above, which are called on the server side, count on this 
behavior.

If we set the renewer to null at the client side and expect the server to pick 
it up, we need to do one of the following:

1. change the behaviour of {{getRenewer()}} to return whatever renewer the 
client set, or
2. change the token's {{kind}} to make {{getRenewer}} return null, which would 
be really hacky.

Making this kind of change seems to have a wider impact than expected, and 
things will likely be broken by it.

Any thoughts?

Thanks a lot.


 YARN's delegation-token handling disallows certain trust setups to operate 
 properly over DistCp
 ---

 Key: YARN-3021
 URL: https://issues.apache.org/jira/browse/YARN-3021
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Affects Versions: 2.3.0
Reporter: Harsh J
 Attachments: YARN-3021.001.patch, YARN-3021.002.patch, 
 YARN-3021.003.patch, YARN-3021.patch


 Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
 and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN 
 clusters.
 Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
 needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
 as it attempts a renewDelegationToken(…) synchronously during application 
 submission (to validate the managed token before it adds it to a scheduler 
 for automatic renewal). The call obviously fails because B realm will not trust 
 A's credentials (here, the RM's principal is the renewer).
 In the 1.x JobTracker the same call is present, but it is done asynchronously 
 and once the renewal attempt failed we simply ceased to schedule any further 
 attempts of renewals, rather than fail the job immediately.
 We should change the logic such that we attempt the renewal but go easy on 
 the failure and skip the scheduling alone, rather than bubble back an error 
 to the client, failing the app submission. This way the old behaviour is 
 retained.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3366) Outbound network bandwidth : classify/shape traffic originating from YARN containers

2015-03-18 Thread Sidharta Seethana (JIRA)
Sidharta Seethana created YARN-3366:
---

 Summary: Outbound network bandwidth : classify/shape traffic 
originating from YARN containers
 Key: YARN-3366
 URL: https://issues.apache.org/jira/browse/YARN-3366
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Sidharta Seethana
Assignee: Sidharta Seethana


In order to be able to isolate based on, and enforce, outbound traffic 
bandwidth limits, we need a mechanism to classify/shape network traffic in the 
nodemanager. For more information on the design, please see the design document 
attached to the parent JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp

2015-03-18 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367635#comment-14367635
 ] 

Jian He commented on YARN-3021:
---

Hi [~yzhangal], I think what we should do is: in 
{{TokenCache#obtainTokensForNamenodesInternal}}, change the 
{{delegTokenRenewer}} to be null for name nodes listed in 
mapreduce.job.hdfs-servers.token-renewal.exclude.
And on the server side, decode the {{identifier}} field in {{Token}} and check 
whether the {{renewer}} in {{AbstractDelegationTokenIdentifier}} is null or 
not. Does that make sense?
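
A minimal sketch of that server-side check, assuming an HDFS delegation token 
(DelegationTokenIdentifier); whether an empty renewer is the right signal is 
exactly what is being proposed here, so this is not existing RM behavior:
 {code}
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier;
import org.apache.hadoop.security.token.Token;

public class RenewerCheckSketch {
  /** Decode the token's identifier and report whether it carries no renewer. */
  static boolean hasNoRenewer(Token<DelegationTokenIdentifier> token) throws IOException {
    DelegationTokenIdentifier id = new DelegationTokenIdentifier();
    id.readFields(new DataInputStream(new ByteArrayInputStream(token.getIdentifier())));
    return id.getRenewer() == null || id.getRenewer().toString().isEmpty();
  }
}
 {code}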

 YARN's delegation-token handling disallows certain trust setups to operate 
 properly over DistCp
 ---

 Key: YARN-3021
 URL: https://issues.apache.org/jira/browse/YARN-3021
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Affects Versions: 2.3.0
Reporter: Harsh J
 Attachments: YARN-3021.001.patch, YARN-3021.002.patch, 
 YARN-3021.003.patch, YARN-3021.patch


 Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
 and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN 
 clusters.
 Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
 needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
 as it attempts a renewDelegationToken(…) synchronously during application 
 submission (to validate the managed token before it adds it to a scheduler 
 for automatic renewal). The call obviously fails because B realm will not trust 
 A's credentials (here, the RM's principal is the renewer).
 In the 1.x JobTracker the same call is present, but it is done asynchronously 
 and once the renewal attempt failed we simply ceased to schedule any further 
 attempts of renewals, rather than fail the job immediately.
 We should change the logic such that we attempt the renewal but go easy on 
 the failure and skip the scheduling alone, rather than bubble back an error 
 to the client, failing the app submission. This way the old behaviour is 
 retained.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-914) (Umbrella) Support graceful decommission of nodemanager

2015-03-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367873#comment-14367873
 ] 

Junping Du commented on YARN-914:
-

Hi, can someone on the watch list help review the patch in sub-JIRA YARN-3212? 
Thanks!

 (Umbrella) Support graceful decommission of nodemanager
 ---

 Key: YARN-914
 URL: https://issues.apache.org/jira/browse/YARN-914
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Luke Lu
Assignee: Junping Du
 Attachments: Gracefully Decommission of NodeManager (v1).pdf, 
 Gracefully Decommission of NodeManager (v2).pdf, 
 GracefullyDecommissionofNodeManagerv3.pdf


 When NMs are decommissioned for non-fault reasons (capacity change etc.), 
 it's desirable to minimize the impact on running applications.
 Currently, if an NM is decommissioned, all running containers on the NM need 
 to be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
 map output has not been fetched by the reducers of the job, these map tasks 
 will need to be rerun as well.
 We propose to introduce a mechanism to optionally gracefully decommission a 
 node manager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp

2015-03-18 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367642#comment-14367642
 ] 

Jian He commented on YARN-3021:
---

Yongjun, thanks for taking this up! I just assigned the jira to you. 

 YARN's delegation-token handling disallows certain trust setups to operate 
 properly over DistCp
 ---

 Key: YARN-3021
 URL: https://issues.apache.org/jira/browse/YARN-3021
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Affects Versions: 2.3.0
Reporter: Harsh J
Assignee: Yongjun Zhang
 Attachments: YARN-3021.001.patch, YARN-3021.002.patch, 
 YARN-3021.003.patch, YARN-3021.patch


 Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
 and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN 
 clusters.
 Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
 needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
 as it attempts a renewDelegationToken(…) synchronously during application 
 submission (to validate the managed token before it adds it to a scheduler 
 for automatic renewal). The call obviously fails because B realm will not trust 
 A's credentials (here, the RM's principal is the renewer).
 In the 1.x JobTracker the same call is present, but it is done asynchronously 
 and once the renewal attempt failed we simply ceased to schedule any further 
 attempts of renewals, rather than fail the job immediately.
 We should change the logic such that we attempt the renewal but go easy on 
 the failure and skip the scheduling alone, rather than bubble back an error 
 to the client, failing the app submission. This way the old behaviour is 
 retained.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp

2015-03-18 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-3021:
--
Assignee: Yongjun Zhang

 YARN's delegation-token handling disallows certain trust setups to operate 
 properly over DistCp
 ---

 Key: YARN-3021
 URL: https://issues.apache.org/jira/browse/YARN-3021
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Affects Versions: 2.3.0
Reporter: Harsh J
Assignee: Yongjun Zhang
 Attachments: YARN-3021.001.patch, YARN-3021.002.patch, 
 YARN-3021.003.patch, YARN-3021.patch


 Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
 and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN 
 clusters.
 Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
 needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
 as it attempts a renewDelegationToken(…) synchronously during application 
 submission (to validate the managed token before it adds it to a scheduler 
 for automatic renewal). The call obviously fails because B realm will not trust 
 A's credentials (here, the RM's principal is the renewer).
 In the 1.x JobTracker the same call is present, but it is done asynchronously 
 and once the renewal attempt failed we simply ceased to schedule any further 
 attempts of renewals, rather than fail the job immediately.
 We should change the logic such that we attempt the renewal but go easy on 
 the failure and skip the scheduling alone, rather than bubble back an error 
 to the client, failing the app submission. This way the old behaviour is 
 retained.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp

2015-03-18 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367692#comment-14367692
 ] 

Yongjun Zhang commented on YARN-3021:
-

Hi Jian, looking closer at what you suggested, I think I was wrong about 
setting the TokenRenewer object in the token to null; instead, we want to set 
the renewer string to null. :-) Thanks.
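
A client-side sketch of that idea, assuming the property name from the 
discussion above (mapreduce.job.hdfs-servers.token-renewal.exclude) and a 
hypothetical helper; whether passing a null renewer to addDelegationTokens is 
the final mechanism is still open:
 {code}
import java.net.URI;
import java.util.Collection;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;

public class ExcludedRenewerSketch {
  /** Fetch tokens for the given namenode host, with no renewer if it is excluded. */
  static Token<?>[] fetchTokens(Configuration conf, String nnHost, String rmPrincipal)
      throws Exception {
    Collection<String> exclude =
        conf.getTrimmedStringCollection("mapreduce.job.hdfs-servers.token-renewal.exclude");
    String renewer = exclude.contains(nnHost) ? null : rmPrincipal;
    FileSystem fs = FileSystem.get(URI.create("hdfs://" + nnHost + ":8020"), conf);
    // With no renewer recorded, the RM has nothing to renew and can skip scheduling renewals.
    return fs.addDelegationTokens(renewer, new Credentials());
  }
}
 {code}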


 YARN's delegation-token handling disallows certain trust setups to operate 
 properly over DistCp
 ---

 Key: YARN-3021
 URL: https://issues.apache.org/jira/browse/YARN-3021
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Affects Versions: 2.3.0
Reporter: Harsh J
Assignee: Yongjun Zhang
 Attachments: YARN-3021.001.patch, YARN-3021.002.patch, 
 YARN-3021.003.patch, YARN-3021.patch


 Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
 and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN 
 clusters.
 Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
 needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
 as it attempts a renewDelegationToken(…) synchronously during application 
 submission (to validate the managed token before it adds it to a scheduler 
 for automatic renewal). The call obviously fails because B realm will not trust 
 A's credentials (here, the RM's principal is the renewer).
 In the 1.x JobTracker the same call is present, but it is done asynchronously 
 and once the renewal attempt failed we simply ceased to schedule any further 
 attempts of renewals, rather than fail the job immediately.
 We should change the logic such that we attempt the renewal but go easy on 
 the failure and skip the scheduling alone, rather than bubble back an error 
 to the client, failing the app submission. This way the old behaviour is 
 retained.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework

2015-03-18 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367697#comment-14367697
 ] 

Zhijie Shen commented on YARN-2375:
---

bq. Firstly, this seems like an incompatible change for 2.6.0.

Do you mean it is semantically incompatible?

bq. Second, the semantics of the property yarn.timeline-service.enabled have 
changed. 

IMHO, yarn.timeline-service.enabled is still the global config. The 
difference is that previously it was checked inside TimelineClient, but now it 
is checked by the user.

Jon commented on the reason for doing this: 
https://issues.apache.org/jira/browse/YARN-2375?focusedCommentId=14212964page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14212964

It sounds reasonable. Does it break any dependent project?

 Allow enabling/disabling timeline server per framework
 --

 Key: YARN-2375
 URL: https://issues.apache.org/jira/browse/YARN-2375
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Fix For: 2.7.0, 2.6.1

 Attachments: YARN-2375.1.patch, YARN-2375.patch, YARN-2375.patch, 
 YARN-2375.patch, YARN-2375.patch


 This JIRA is to remove the ats enabled flag check within the 
 TimelineClientImpl. Example where this fails is below.
 While running secure timeline server with ats flag set to disabled on 
 resource manager, Timeline delegation token renewer throws an NPE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-03-18 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3363:

Labels: metrics supportability  (was: )

 add localization and container launch time to ContainerMetrics at NM to show 
 these timing information for each active container.
 

 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: zhihai xu
Assignee: zhihai xu
  Labels: metrics, supportability

 Add localization and container launch time to ContainerMetrics at the NM to 
 show this timing information for each active container.
 Currently ContainerMetrics has the container's actual memory usage 
 (YARN-2984), actual CPU usage (YARN-3122), resource and pid (YARN-3022). It 
 would be better to also have localization and container launch time in 
 ContainerMetrics for each active container.
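
 A hedged sketch of how such timings could be exposed, following the gauge 
 pattern the existing ContainerMetrics uses; the metric names and the 
 stand-alone registry below are assumptions, not the actual patch:
 {code}
import org.apache.hadoop.metrics2.lib.MetricsRegistry;
import org.apache.hadoop.metrics2.lib.MutableGaugeLong;

public class ContainerTimingMetricsSketch {
  public static void main(String[] args) {
    MetricsRegistry registry = new MetricsRegistry("ContainerResource_container_01");

    // Hypothetical gauges for the two timings proposed in this JIRA.
    MutableGaugeLong localizationMs = registry.newGauge("LocalizationDurationMs",
        "Time spent localizing resources for the container", 0L);
    MutableGaugeLong launchMs = registry.newGauge("LaunchDurationMs",
        "Time from container start request to process launch", 0L);

    localizationMs.set(1234L);
    launchMs.set(87L);
  }
}
 {code}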



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-03-18 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3363:

Component/s: nodemanager

 add localization and container launch time to ContainerMetrics at NM to show 
 these timing information for each active container.
 

 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
  Labels: metrics, supportability

 Add localization and container launch time to ContainerMetrics at the NM to 
 show this timing information for each active container.
 Currently ContainerMetrics has the container's actual memory usage 
 (YARN-2984), actual CPU usage (YARN-3122), resource and pid (YARN-3022). It 
 would be better to also have localization and container launch time in 
 ContainerMetrics for each active container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3040) [Data Model] Implement client-side API for handling flows

2015-03-18 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368040#comment-14368040
 ] 

Zhijie Shen commented on YARN-3040:
---

Take it over. Thanks! - Zhijie

 [Data Model] Implement client-side API for handling flows
 -

 Key: YARN-3040
 URL: https://issues.apache.org/jira/browse/YARN-3040
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Zhijie Shen
 Attachments: YARN-3040.1.patch


 Per design in YARN-2928, implement client-side API for handling *flows*. 
 Frameworks should be able to define and pass in all attributes of flows and 
 flow runs to YARN, and they should be passed into ATS writers.
 YARN tags were discussed as a way to handle this piece of information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.

2015-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368122#comment-14368122
 ] 

Hadoop QA commented on YARN-3356:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705446/YARN-3356.4.patch
  against trunk revision c239b6d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7018//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7018//console

This message is automatically generated.

 Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track 
 used-resources-by-label.
 --

 Key: YARN-3356
 URL: https://issues.apache.org/jira/browse/YARN-3356
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-3356.1.patch, YARN-3356.2.patch, YARN-3356.3.patch, 
 YARN-3356.4.patch


 Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp 
 should use ResourceUsage to track resource-usage/pending by label for 
 better resource tracking and preemption. 
 Also, when an application's pending resource changes (container allocated, 
 app completed, moved, etc.), we need to update the ResourceUsage of the queue 
 hierarchy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3284) Expose more ApplicationMetrics and ApplicationAttemptMetrics through YARN command

2015-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368208#comment-14368208
 ] 

Hadoop QA commented on YARN-3284:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705440/YARN-3284.4.patch
  against trunk revision c239b6d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 13 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
  org.apache.hadoop.mapreduce.v2.TestMRJobs
  org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService
  org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart

  The following test timeouts occurred in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.mapreduce.v2.TestNonExistentJob
org.apache.hadoop.mapreduce.v2.TestRMNMInfo
org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution
org.apache.hadoop.mapreduce.v2.TestUberAM

  The test build failed in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7017//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7017//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7017//console

This message is automatically generated.

 Expose more ApplicationMetrics and ApplicationAttemptMetrics through YARN 
 command
 -

 Key: YARN-3284
 URL: https://issues.apache.org/jira/browse/YARN-3284
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-3284.1.patch, YARN-3284.1.patch, YARN-3284.2.patch, 
 YARN-3284.3.patch, YARN-3284.3.rebase.patch, YARN-3284.4.patch


 Currently, we have some extra metrics about the application and the current 
 attempt in the RM Web UI. We should expose that information through YARN 
 commands, too.
 1. Preemption metrics
 2. application outstanding resource requests
 3. container locality info



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3351) AppMaster tracking URL is broken in HA

2015-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368209#comment-14368209
 ] 

Hudson commented on YARN-3351:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7365 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7365/])
YARN-3351. AppMaster tracking URL is broken in HA. (Anubhav Dhoot via kasha) 
(kasha: rev 20b49224eb90c796f042ac4251508f3979fd4787)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestWebAppUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/WebAppUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NavBlock.java


 AppMaster tracking URL is broken in HA
 --

 Key: YARN-3351
 URL: https://issues.apache.org/jira/browse/YARN-3351
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Fix For: 2.8.0

 Attachments: YARN-3351.001.patch, YARN-3351.002.patch, 
 YARN-3351.003.patch


 After YARN-2713, the AppMaster link is broken in HA.  To repro 
 a) setup RM HA and ensure the first RM is not active,
 b) run a long sleep job and view the tracking url on the RM applications page
 The log and full stack trace is shown below
 {noformat}
 2015-02-05 20:47:43,478 WARN org.mortbay.log: 
 /proxy/application_1423182188062_0002/: java.net.BindException: Cannot assign 
 requested address
 {noformat}
 {noformat}
 java.net.BindException: Cannot assign requested address
   at java.net.PlainSocketImpl.socketBind(Native Method)
   at 
 java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)
   at java.net.Socket.bind(Socket.java:631)
   at java.net.Socket.init(Socket.java:423)
   at java.net.Socket.init(Socket.java:280)
   at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
   at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
   at 
 org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
   at 
 org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
   at 
 org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
   at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
   at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:346)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.proxyLink(WebAppProxyServlet.java:188)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:345)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
   at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
   at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3368) Improve YARN web UI

2015-03-18 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368269#comment-14368269
 ] 

Jian He commented on YARN-3368:
---

We may expose the information through the web service and have the client make 
REST calls to retrieve the data and render it in the UI.

 Improve YARN web UI
 ---

 Key: YARN-3368
 URL: https://issues.apache.org/jira/browse/YARN-3368
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He

 The goal is to improve the YARN UI for better usability.
 We may take advantage of some existing front-end frameworks to build a 
 fancier, easier-to-use UI. 
 The old UI will continue to exist until we feel the new one is ready to flip 
 to.
 This serves as an umbrella jira to track the tasks; we can do this in a 
 branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3371) TTL for YARN Registry SRV records

2015-03-18 Thread Gopal V (JIRA)
Gopal V created YARN-3371:
-

 Summary: TTL for YARN Registry SRV records
 Key: YARN-3371
 URL: https://issues.apache.org/jira/browse/YARN-3371
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: yarn
Reporter: Gopal V


YARN service records do not have any stale indicators.

The SRV records need a TTL equivalent for ephemeral services, which tend to be 
reconfigured occasionally, so that clients can hold onto them without 
authoritative lookups.
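
As a hedged sketch only: the registry's ServiceRecord has no TTL field today 
(that gap is the point of this JIRA), so the best a service can do right now is 
publish an arbitrary attribute that clients agree on out of band, e.g.:
 {code}
import org.apache.hadoop.registry.client.types.ServiceRecord;

public class RegistryTtlSketch {
  public static void main(String[] args) {
    ServiceRecord record = new ServiceRecord();
    record.description = "ephemeral service instance";
    // "ttl.seconds" is a hypothetical attribute name, not part of the record schema.
    record.set("ttl.seconds", "300");
    System.out.println(record.get("ttl.seconds"));
  }
}
 {code}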



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (YARN-3040) [Data Model] Implement client-side API for handling flows

2015-03-18 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368064#comment-14368064
 ] 

Zhijie Shen edited comment on YARN-3040 at 3/18/15 10:42 PM:
-

I've just uploaded a patch. It's an e2e modification so that the context 
information can be passed from the client to the backend storage. The context 
information includes *clusterId*, *userId*, *flowId*, *flowRunId* and *appId*. 
According to YARN-3240, the new TimelineClient is constructed per application, 
and in the context of one application we can reasonably assume this context 
information is unchanged. Therefore, it just needs to be specified when the 
client is constructed. The context information should be gathered or passed to 
the AM and NM to construct the timeline client properly. For example, for the 
AM, this information can be passed via env inside the CLC. Anyway, it's out of 
the scope of this Jira; we will cover that integration once we make some 
particular framework AM use the new timeline client.

Back to the context information: some of it can be null, and some of it doesn't 
need to be specified explicitly:

* *clusterId*: The application should specify a unique cluster ID, or by 
default the cluster ID will be the cluster start timestamp of the RM.
* *userId*: The user doesn't need to specify this information. Instead, it will 
be obtained from the current UGI of the client.
* *flowId*: The user either passes in a flowId, or if it is an orphan 
application, the flowId will be the appId with the prefix replaced by "flow".
* *flowRunId*: If it is an orphan application, it's 0. The reason it should be 
0 instead of the current timestamp when creating the timeline client is that 
there may be multiple clients in the AM and NMs, constructed at different 
times. They need to be synced on the same flowRunId.
* *appId*: It's the only mandatory context information, as we defined before. 
The client is constructed to work with only one application.

I changed the web service endpoint accordingly to make it RESTful, and changed 
the writer interface accordingly to pass in the context information when 
putting the entity. In addition, I've modified the FS-based writer 
implementation to reflect the change. The entity file will be put in the dir 
{{root/entities/clusterId/userId/flowId/flowRunId/appId/entityType/entityId.thist}}.
It has been verified by TestDistributedShell and 
TestFileSystemTimelineWriterImpl.
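
For illustration, a hypothetical helper that builds the entity-file location in 
the layout described above (only the directory scheme comes from the comment; 
the helper itself is not part of the patch):
 {code}
import org.apache.hadoop.fs.Path;

public class EntityPathSketch {
  /** root/entities/clusterId/userId/flowId/flowRunId/appId/entityType/entityId.thist */
  static Path entityFile(String root, String clusterId, String userId, String flowId,
      long flowRunId, String appId, String entityType, String entityId) {
    return new Path(root + "/entities/" + clusterId + "/" + userId + "/" + flowId
        + "/" + flowRunId + "/" + appId + "/" + entityType + "/" + entityId + ".thist");
  }
}
 {code}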





 [Data Model] Implement client-side API for handling flows
 -

 Key: YARN-3040
 

[jira] [Commented] (YARN-3351) AppMaster tracking URL is broken in HA

2015-03-18 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368126#comment-14368126
 ] 

Karthik Kambatla commented on YARN-3351:


+1

 AppMaster tracking URL is broken in HA
 --

 Key: YARN-3351
 URL: https://issues.apache.org/jira/browse/YARN-3351
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3351.001.patch, YARN-3351.002.patch, 
 YARN-3351.003.patch


 After YARN-2713, the AppMaster link is broken in HA.  To repro 
 a) setup RM HA and ensure the first RM is not active,
 b) run a long sleep job and view the tracking url on the RM applications page
 The log and full stack trace is shown below
 {noformat}
 2015-02-05 20:47:43,478 WARN org.mortbay.log: 
 /proxy/application_1423182188062_0002/: java.net.BindException: Cannot assign 
 requested address
 {noformat}
 {noformat}
 java.net.BindException: Cannot assign requested address
   at java.net.PlainSocketImpl.socketBind(Native Method)
   at 
 java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)
   at java.net.Socket.bind(Socket.java:631)
   at java.net.Socket.<init>(Socket.java:423)
   at java.net.Socket.<init>(Socket.java:280)
   at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
   at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
   at 
 org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
   at 
 org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
   at 
 org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
   at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
   at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:346)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.proxyLink(WebAppProxyServlet.java:188)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:345)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
   at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
   at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3370) don't show the exception message before showing container logs in UI

2015-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368193#comment-14368193
 ] 

Sergey Shelukhin commented on YARN-3370:


[~vinodkv] fyi

 don't show the exception message before showing container logs in UI
 

 Key: YARN-3370
 URL: https://issues.apache.org/jira/browse/YARN-3370
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin

 When you click on e.g. AM attempt logs, an "Exception: Unknown container ..." 
 message is shown, then the page refreshes to the logs. The message should not be 
 shown by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3370) don't show the exception message before showing container logs in UI

2015-03-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created YARN-3370:
--

 Summary: don't show the exception message before showing container 
logs in UI
 Key: YARN-3370
 URL: https://issues.apache.org/jira/browse/YARN-3370
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin


When you click on e.g. AM attempt logs, an "Exception: Unknown container ..." 
message is shown, then the page refreshes to the logs. The message should not be 
shown by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3369) Missing NullPointer check in AppSchedulingInfo causes RM to die

2015-03-18 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-3369:

Description: 
In AppSchedulingInfo.java the method checkForDeactivation() has these 2 
consecutive lines:
{code}
ResourceRequest request = getResourceRequest(priority, ResourceRequest.ANY);
if (request.getNumContainers() > 0) {
{code}
the first line calls getResourceRequest and it can return null.
{code}
synchronized public ResourceRequest getResourceRequest(
Priority priority, String resourceName) {
Map<String, ResourceRequest> nodeRequests = requests.get(priority);
return (nodeRequests == null) ? null : nodeRequests.get(resourceName);
}
{code}
The second line dereferences the pointer directly without a check.
If the pointer is null, the RM dies. 

{quote}2015-03-17 14:14:04,757 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.checkForDeactivation(AppSchedulingInfo.java:383)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.decrementOutstanding(AppSchedulingInfo.java:375)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateOffSwitch(AppSchedulingInfo.java:360)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:270)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:142)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1559)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1384)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1263)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:816)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:588)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:449)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1017)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1059)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:114)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:739)
at java.lang.Thread.run(Thread.java:722)
{color:red} *2015-03-17 14:14:04,758 INFO 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, 
bbye..*{color} {quote}

  was:
In AppSchedulingInfo.java the method checkForDeactivation() has these 2 
consecutive lines:
{quote} 
{color:red}  ResourceRequest request = getResourceRequest(priority, 
ResourceRequest.ANY);
  if (request.getNumContainers()  0) {
{color}
{quote}
the first line calls getResourceRequest and it can return null.
{quote}
synchronized public ResourceRequest getResourceRequest(
Priority priority, String resourceName) {
MapString, ResourceRequest nodeRequests = requests.get(priority);
{color:red} *return* {color}  (nodeRequests == null) ? {color:red} *null* 
{color} : nodeRequests.get(resourceName);
}
{quote}
The second line dereferences the pointer directly without a check.
If the pointer is null, the RM dies. 

{quote}2015-03-17 14:14:04,757 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.checkForDeactivation(AppSchedulingInfo.java:383)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.decrementOutstanding(AppSchedulingInfo.java:375)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateOffSwitch(AppSchedulingInfo.java:360)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:270)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:142)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1559)
at 

[jira] [Commented] (YARN-3212) RMNode State Transition Update with DECOMMISSIONING state

2015-03-18 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368118#comment-14368118
 ] 

Ming Ma commented on YARN-3212:
---

bq. Do we want to consider DECOMMISSIONING nodes as not active? There are 
containers actively running on them, and in that sense they are participating 
in the cluster (and contributing to the overall cluster resource). I think they 
should still be considered active, but I could be persuaded otherwise.

Do we need to support the scenario where the NM becomes dead while it is being 
decommissioned? Say the decommission timeout is 30 minutes, larger than the NM 
liveness timeout. The node drops out of the cluster for some time and rejoins 
later, all within the decommission timeout. Will YARN show the status as just a 
dead node, or as {dead, decommissioning}? It seems useful for admins to know 
about it. If we need that, we can consider two types of NodeState: one is the 
liveness state, the other is the admin state. Then you will have different 
combinations, as sketched below.
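
For illustration, a toy sketch of that two-dimensional idea; the enum and class 
names are hypothetical, not the actual RMNode code:

{code}
public class NodeStatusSketch {

  enum LivenessState { RUNNING, UNHEALTHY, LOST }              // liveness dimension
  enum AdminState { NORMAL, DECOMMISSIONING, DECOMMISSIONED }  // admin dimension

  private final LivenessState liveness;
  private final AdminState admin;

  NodeStatusSketch(LivenessState liveness, AdminState admin) {
    this.liveness = liveness;
    this.admin = admin;
  }

  @Override
  public String toString() {
    return "{" + liveness + ", " + admin + "}";
  }

  public static void main(String[] args) {
    // A node that dropped out of the cluster while being decommissioned:
    System.out.println(
        new NodeStatusSketch(LivenessState.LOST, AdminState.DECOMMISSIONING));
  }
}
{code}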

 RMNode State Transition Update with DECOMMISSIONING state
 -

 Key: YARN-3212
 URL: https://issues.apache.org/jira/browse/YARN-3212
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Junping Du
Assignee: Junping Du
 Attachments: RMNodeImpl - new.png, YARN-3212-v1.patch, 
 YARN-3212-v2.patch


 As proposed in YARN-914, a new state of “DECOMMISSIONING” will be added, which 
 can be transitioned to from the “running” state, triggered by a new event - 
 “decommissioning”. 
 This new state can transition to “decommissioned” on Resource_Update if there 
 are no running apps on this NM or when the NM reconnects after a restart, or 
 when it receives a DECOMMISSIONED event (after the timeout from the CLI).
 In addition, it can go back to “running” if the user decides to cancel the 
 previous decommission by calling recommission on the same node. The reaction to 
 other events is similar to the RUNNING state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3369) Missing NullPointer check in AppSchedulingInfo causes RM to die

2015-03-18 Thread Giovanni Matteo Fumarola (JIRA)
Giovanni Matteo Fumarola created YARN-3369:
--

 Summary: Missing NullPointer check in AppSchedulingInfo causes RM 
to die 
 Key: YARN-3369
 URL: https://issues.apache.org/jira/browse/YARN-3369
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Giovanni Matteo Fumarola


In AppSchedulingInfo.java the method checkForDeactivation() has these 2 
consecutive lines:
{quote} 
{color:red}  ResourceRequest request = getResourceRequest(priority, 
ResourceRequest.ANY);
  if (request.getNumContainers() > 0) {
{color}
{quote}
the first line calls getResourceRequest and it can return null.
{quote}
synchronized public ResourceRequest getResourceRequest(
Priority priority, String resourceName) {
Map<String, ResourceRequest> nodeRequests = requests.get(priority);
{color:red} *return* {color}  (nodeRequests == null) ? {color:red} *null* 
{color} : nodeRequests.get(resourceName);
}
{quote}
The second line dereferences the pointer directly without a check.
If the pointer is null, the RM dies. 

{quote}2015-03-17 14:14:04,757 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.checkForDeactivation(AppSchedulingInfo.java:383)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.decrementOutstanding(AppSchedulingInfo.java:375)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateOffSwitch(AppSchedulingInfo.java:360)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:270)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:142)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1559)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1384)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1263)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:816)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:588)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:449)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1017)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1059)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:114)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:739)
at java.lang.Thread.run(Thread.java:722)
{color:red} *2015-03-17 14:14:04,758 INFO 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, 
bbye..*{color} {quote}
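
For illustration, a minimal sketch of the kind of null guard being asked for 
(illustrative only, not an actual patch; it reuses the method names quoted 
above):

{code}
// Guard the ANY request before dereferencing it in checkForDeactivation().
ResourceRequest request = getResourceRequest(priority, ResourceRequest.ANY);
if (request != null && request.getNumContainers() > 0) {
  // ... existing deactivation logic that previously dereferenced request
  // unconditionally ...
}
{code}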



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3345) Add non-exclusive node label RMAdmin CLI/API

2015-03-18 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368089#comment-14368089
 ] 

Jian He commented on YARN-3345:
---

- public/unstable annotations for the newly added records, e.g. 
SetNodeLabelsAttributesRequest, NodeLabelAttributes#getAttributes, getNodeLabel
- NodeLabelAttributes -> NodeLabel, so that AddToClusterNodeLabelsRequest can 
later on use the same data structure.
- for node exclusiveness - I think we may use NodeLabel#(get/set)IsExclusive
- “an un existed node-label=%s” -> “non-existing node-label”
- throw YarnException instead of IOException
- in the code below, what if the user wants to set the attributes to be empty?
{code}
if (attr.getAttributes().isEmpty()) {
  // simply ignore
  continue;
}
{code}
- add a newInstance method in SetNodeLabelsAttributesResponse and use that 
{code}
SetNodeLabelsAttributesResponse response =

recordFactory.newRecordInstance(SetNodeLabelsAttributesResponse.class);
{code}
- revert RMNodeLabelsManager change

 Add non-exclusive node label RMAdmin CLI/API
 

 Key: YARN-3345
 URL: https://issues.apache.org/jira/browse/YARN-3345
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-3345.1.patch, YARN-3345.2.patch, YARN-3345.3.patch, 
 YARN-3345.4.patch


 As described in YARN-3214 (see design doc attached to that JIRA), we need add 
 non-exclusive node label RMAdmin API and CLI implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3368) Improve YARN web UI

2015-03-18 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-3368:
--
Issue Type: Improvement  (was: Bug)

 Improve YARN web UI
 ---

 Key: YARN-3368
 URL: https://issues.apache.org/jira/browse/YARN-3368
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He

 The goal is to improve YARN UI for better usability.
 We may take advantage of some existing front-end frameworks to build a 
 fancier, easier-to-use UI. 
 The old UI continues to exist until we feel it's ready to flip to the new UI.
 This serves as an umbrella jira to track the tasks. We can do this in a 
 branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3369) Missing NullPointer check in AppSchedulingInfo causes RM to die

2015-03-18 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368347#comment-14368347
 ] 

Brahma Reddy Battula commented on YARN-3369:


[~giovanni.fumarola] thanks for reporting..I would like to work on this jira, 
If you have patch, you can reassign to yourself...thanks

 Missing NullPointer check in AppSchedulingInfo causes RM to die 
 

 Key: YARN-3369
 URL: https://issues.apache.org/jira/browse/YARN-3369
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Giovanni Matteo Fumarola
Assignee: Brahma Reddy Battula

 In AppSchedulingInfo.java the method checkForDeactivation() has these 2 
 consecutive lines:
 {code}
 ResourceRequest request = getResourceRequest(priority, ResourceRequest.ANY);
 if (request.getNumContainers() > 0) {
 {code}
 the first line calls getResourceRequest and it can return null.
 {code}
 synchronized public ResourceRequest getResourceRequest(
 Priority priority, String resourceName) {
 Map<String, ResourceRequest> nodeRequests = requests.get(priority);
 return (nodeRequests == null) ? null : nodeRequests.get(resourceName);
 }
 {code}
 The second line dereferences the pointer directly without a check.
 If the pointer is null, the RM dies. 
 {quote}2015-03-17 14:14:04,757 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type NODE_UPDATE to the scheduler
 java.lang.NullPointerException
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.checkForDeactivation(AppSchedulingInfo.java:383)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.decrementOutstanding(AppSchedulingInfo.java:375)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateOffSwitch(AppSchedulingInfo.java:360)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:270)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:142)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1559)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1384)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1263)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:816)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:588)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:449)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1017)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1059)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:114)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:739)
 at java.lang.Thread.run(Thread.java:722)
 {color:red} *2015-03-17 14:14:04,758 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, 
 bbye..*{color} {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3040) [Data Model] Implement client-side API for handling flows

2015-03-18 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-3040:
--
Assignee: Zhijie Shen  (was: Robert Kanter)

 [Data Model] Implement client-side API for handling flows
 -

 Key: YARN-3040
 URL: https://issues.apache.org/jira/browse/YARN-3040
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Zhijie Shen
 Attachments: YARN-3040.1.patch


 Per design in YARN-2928, implement client-side API for handling *flows*. 
 Frameworks should be able to define and pass in all attributes of flows and 
 flow runs to YARN, and they should be passed into ATS writers.
 YARN tags were discussed as a way to handle this piece of information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3040) [Data Model] Implement client-side API for handling flows

2015-03-18 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-3040:
--
Attachment: YARN-3040.1.patch

 [Data Model] Implement client-side API for handling flows
 -

 Key: YARN-3040
 URL: https://issues.apache.org/jira/browse/YARN-3040
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Robert Kanter
 Attachments: YARN-3040.1.patch


 Per design in YARN-2928, implement client-side API for handling *flows*. 
 Frameworks should be able to define and pass in all attributes of flows and 
 flow runs to YARN, and they should be passed into ATS writers.
 YARN tags were discussed as a way to handle this piece of information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3351) AppMaster tracking URL is broken in HA

2015-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368090#comment-14368090
 ] 

Hadoop QA commented on YARN-3351:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705449/YARN-3351.003.patch
  against trunk revision c239b6d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7019//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7019//console

This message is automatically generated.

 AppMaster tracking URL is broken in HA
 --

 Key: YARN-3351
 URL: https://issues.apache.org/jira/browse/YARN-3351
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3351.001.patch, YARN-3351.002.patch, 
 YARN-3351.003.patch


 After YARN-2713, the AppMaster link is broken in HA.  To repro 
 a) setup RM HA and ensure the first RM is not active,
 b) run a long sleep job and view the tracking url on the RM applications page
 The log and full stack trace is shown below
 {noformat}
 2015-02-05 20:47:43,478 WARN org.mortbay.log: 
 /proxy/application_1423182188062_0002/: java.net.BindException: Cannot assign 
 requested address
 {noformat}
 {noformat}
 java.net.BindException: Cannot assign requested address
   at java.net.PlainSocketImpl.socketBind(Native Method)
   at 
 java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)
   at java.net.Socket.bind(Socket.java:631)
   at java.net.Socket.<init>(Socket.java:423)
   at java.net.Socket.<init>(Socket.java:280)
   at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
   at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
   at 
 org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
   at 
 org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
   at 
 org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
   at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
   at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:346)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.proxyLink(WebAppProxyServlet.java:188)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:345)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
   at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
   at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2828) Enable auto refresh of web pages (using http parameter)

2015-03-18 Thread Vijay Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay Bhat updated YARN-2828:
-
Attachment: YARN-2828.005.patch

 Enable auto refresh of web pages (using http parameter)
 ---

 Key: YARN-2828
 URL: https://issues.apache.org/jira/browse/YARN-2828
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Tim Robertson
Assignee: Vijay Bhat
Priority: Minor
 Attachments: YARN-2828.001.patch, YARN-2828.002.patch, 
 YARN-2828.003.patch, YARN-2828.004.patch, YARN-2828.005.patch


 The MR1 Job Tracker had a useful HTTP parameter of e.g. refresh=3 that 
 could be appended to URLs which enabled a page reload.  This was very useful 
 when developing mapreduce jobs, especially to watch counters changing.  This 
 is lost in the Yarn interface.
 Could be implemented as a page element (e.g. drop down or so), but I'd 
 recommend that the page not be more cluttered, and simply bring back the 
 optional refresh HTTP param.  It worked really nicely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-18 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Description: 
Implement a Fair Comparator for the Scheduler Comparator Ordering Policy which 
prefers to allocate to SchedulerProcesses with least current usage, very 
similar to the FairScheduler's FairSharePolicy.  

The Policy will offer allocations to applications in a queue in order of least 
resources used, and preempt applications in reverse order (from most resources 
used). This will include conditional support for sizeBasedWeight style 
adjustment

An implementation of a Scheduler Comparator for use with the Scheduler 
Comparator Ordering Policy will be built with the below comparison for ordering 
applications for container assignment (ascending) and for preemption 
(descending)

Current resource usage - less usage is lesser
Submission time - earlier is lesser

Optionally, based on a conditional configuration to enable sizeBasedWeight 
(default false), an adjustment to boost larger applications (to offset the 
natural preference for smaller applications) will adjust the resource usage 
value based on demand, dividing it by the below value:

Math.log1p(app memory demand) / Math.log(2);

In cases where the above is indeterminate (two applications are equal after 
this comparison), behavior falls back to comparison based on the application 
name, which is lexically FIFO for that comparison (first submitted is lesser)
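
For illustration, a minimal comparator sketch of the ordering described above. 
The Proc interface and its accessors are hypothetical stand-ins for the real 
SchedulerProcess API; this is not the attached patch.

{code}
import java.util.Comparator;

public class FairOrderingSketch {

  /** Hypothetical stand-in for a schedulable process. */
  public interface Proc {
    long getCurrentUsageMemory();   // current resource usage
    long getDemandMemory();         // outstanding demand
    long getSubmissionTime();
    String getName();
  }

  /** Assignment order (ascending); reverse it to get the preemption order. */
  public static Comparator<Proc> assignmentOrder(final boolean sizeBasedWeight) {
    return (a, b) -> {
      // Less (adjusted) usage is lesser.
      int cmp = Double.compare(adjustedUsage(a, sizeBasedWeight),
                               adjustedUsage(b, sizeBasedWeight));
      if (cmp == 0) {
        cmp = Long.compare(a.getSubmissionTime(), b.getSubmissionTime()); // earlier is lesser
      }
      if (cmp == 0) {
        cmp = a.getName().compareTo(b.getName()); // lexical FIFO fallback
      }
      return cmp;
    };
  }

  private static double adjustedUsage(Proc p, boolean sizeBasedWeight) {
    double usage = p.getCurrentUsageMemory();
    if (sizeBasedWeight && p.getDemandMemory() > 0) {
      // Boost larger applications by dividing usage by log1p(demand) / log(2).
      usage /= Math.log1p(p.getDemandMemory()) / Math.log(2);
    }
    return usage;
  }
}
{code}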



  was:Implement a Fair SchedulerOrderingPolicy which prefers to allocate to 
SchedulerProcesses with least current usage, very similar to the 
FairScheduler's FairSharePolicy.  


 Implement a Fair SchedulerOrderingPolicy
 

 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3319.13.patch, YARN-3319.14.patch, 
 YARN-3319.17.patch


 Implement a Fair Comparator for the Scheduler Comparator Ordering Policy 
 which prefers to allocate to SchedulerProcesses with least current usage, 
 very similar to the FairScheduler's FairSharePolicy.  
 The Policy will offer allocations to applications in a queue in order of 
 least resources used, and preempt applications in reverse order (from most 
 resources used). This will include conditional support for sizeBasedWeight 
 style adjustment
 An implementation of a Scheduler Comparator for use with the Scheduler 
 Comparator Ordering Policy will be built with the below comparison for 
 ordering applications for container assignment (ascending) and for preemption 
 (descending)
 Current resource usage - less usage is lesser
 Submission time - earlier is lesser
 Optionally, based on a conditional configuration to enable sizeBasedWeight 
 (default false), an adjustment to boost larger applications (to offset the 
 natural preference for smaller applications) will adjust the resource usage 
 value based on demand, dividing it by the below value:
 Math.log1p(app memory demand) / Math.log(2);
 In cases where the above is indeterminate (two applications are equal after 
 this comparison), behavior falls back to comparison based on the application 
 name, which is lexically FIFO for that comparison (first submitted is lesser)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3368) Improve YARN web UI

2015-03-18 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-3368:
--
Description: 
The goal is to improve YARN UI for better usability.

We may take advantage of some existing front-end frameworks to build a fancier, 
easier-to-use UI. 

The old UI continues to exist until we feel it's ready to flip to the new UI.
This serves as an umbrella jira to track the tasks. We can do this in a branch.

  was:
The goal is to improve YARN UI for better usability.

We may take advantage of some existing front-end frameworks to build a fancier, 
easier-to-use UI. 

The old UI continue to exist until  we feel it's ready to flip to the new UI.


 Improve YARN web UI
 ---

 Key: YARN-3368
 URL: https://issues.apache.org/jira/browse/YARN-3368
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He

 The goal is to improve YARN UI for better usability.
 We may take advantage of some existing front-end frameworks to build a 
 fancier, easier-to-use UI. 
 The old UI continues to exist until we feel it's ready to flip to the new UI.
 This serves as an umbrella jira to track the tasks. We can do this in a 
 branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3369) Missing NullPointer check in AppSchedulingInfo causes RM to die

2015-03-18 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned YARN-3369:
--

Assignee: Brahma Reddy Battula

 Missing NullPointer check in AppSchedulingInfo causes RM to die 
 

 Key: YARN-3369
 URL: https://issues.apache.org/jira/browse/YARN-3369
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Giovanni Matteo Fumarola
Assignee: Brahma Reddy Battula

 In AppSchedulingInfo.java the method checkForDeactivation() has these 2 
 consecutive lines:
 {code}
 ResourceRequest request = getResourceRequest(priority, ResourceRequest.ANY);
 if (request.getNumContainers() > 0) {
 {code}
 the first line calls getResourceRequest and it can return null.
 {code}
 synchronized public ResourceRequest getResourceRequest(
 Priority priority, String resourceName) {
 Map<String, ResourceRequest> nodeRequests = requests.get(priority);
 return (nodeRequests == null) ? null : nodeRequests.get(resourceName);
 }
 {code}
 The second line dereferences the pointer directly without a check.
 If the pointer is null, the RM dies. 
 {quote}2015-03-17 14:14:04,757 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type NODE_UPDATE to the scheduler
 java.lang.NullPointerException
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.checkForDeactivation(AppSchedulingInfo.java:383)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.decrementOutstanding(AppSchedulingInfo.java:375)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateOffSwitch(AppSchedulingInfo.java:360)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:270)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:142)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1559)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1384)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1263)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:816)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:588)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:449)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1017)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1059)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:114)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:739)
 at java.lang.Thread.run(Thread.java:722)
 {color:red} *2015-03-17 14:14:04,758 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, 
 bbye..*{color} {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2828) Enable auto refresh of web pages (using http parameter)

2015-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368339#comment-14368339
 ] 

Hadoop QA commented on YARN-2828:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705458/YARN-2828.005.patch
  against trunk revision 20b4922.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebApp

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7020//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7020//console

This message is automatically generated.

 Enable auto refresh of web pages (using http parameter)
 ---

 Key: YARN-2828
 URL: https://issues.apache.org/jira/browse/YARN-2828
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Tim Robertson
Assignee: Vijay Bhat
Priority: Minor
 Attachments: YARN-2828.001.patch, YARN-2828.002.patch, 
 YARN-2828.003.patch, YARN-2828.004.patch, YARN-2828.005.patch


 The MR1 Job Tracker had a useful HTTP parameter of e.g. refresh=3 that 
 could be appended to URLs which enabled a page reload.  This was very useful 
 when developing mapreduce jobs, especially to watch counters changing.  This 
 is lost in the Yarn interface.
 Could be implemented as a page element (e.g. drop down or so), but I'd 
 recommend that the page not be more cluttered, and simply bring back the 
 optional refresh HTTP param.  It worked really nicely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3372) Collision-free unique bindings & refresh APIs for service records

2015-03-18 Thread Gopal V (JIRA)
Gopal V created YARN-3372:
-

 Summary: Collision-free unique bindings & refresh APIs for service 
records
 Key: YARN-3372
 URL: https://issues.apache.org/jira/browse/YARN-3372
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: yarn
Reporter: Gopal V


The current bind() operation binds to a fixed entry name for the service record, 
which makes it impossible for a truly distributed application without a 
centralized service to register without pre-determined naming conventions.

The uniqueness does not need to guarantee ordering or any other leakage of 
abstractions, merely that each bind() returns a unique path the record was 
bound to, and that the TTL refresh can periodically update that exact record as 
an active API.

These are stateless auto-configuration mechanisms inspired by the IPv6 
improvements over DNS for resolution. Instead of relying on ICMPv6, this uses the 
registry to keep a collective memory of unique identities to which endpoints 
are delegated.

This is only obliquely related to the Slider registration, as even those do not 
track the generational ids of restarted daemons from the same container-id.
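
For illustration, one possible shape for such an API; the interface and method 
names below are hypothetical and not the existing registry client API:

{code}
public interface UniqueBindingRegistrySketch {

  /**
   * Bind a service record without supplying an entry name; the registry picks
   * a collision-free child path under parentPath and returns it. Uniqueness is
   * the only guarantee - no ordering, no other leaked abstractions.
   */
  String bindUnique(String parentPath, byte[] serviceRecord, long ttlMillis);

  /**
   * Refresh the TTL of the exact record previously returned by bindUnique(),
   * so a live daemon keeps its entry from expiring.
   */
  void refresh(String boundPath, long ttlMillis);
}
{code}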



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3373) TTL & identity aware read cache for the SRV records

2015-03-18 Thread Gopal V (JIRA)
Gopal V created YARN-3373:
-

 Summary: TTL & identity aware read cache for the SRV records
 Key: YARN-3373
 URL: https://issues.apache.org/jira/browse/YARN-3373
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: yarn
Reporter: Gopal V


The freshness/staleness checks of the SRV record should be an abstracted 
implementation detail of the service registry.

Without such an abstraction, every client has to call listServiceRecords each 
time it requires a list of the records, which would be incredibly expensive if 
it involved network round-trips during normal tight-loop operations.

The combination of unique binding records and the TTL provides the equivalent 
of the DNS (fixed CNAME -> unique A) roll-over mechanisms used to cache-bust 
effectively on the client-side.
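
For illustration, a rough sketch of a TTL-aware read cache over such lookups; 
the class, interface and method names are hypothetical:

{code}
import java.util.List;
import java.util.concurrent.TimeUnit;

public class TtlRecordCacheSketch<T> {

  /** Stand-in for the expensive authoritative lookup, e.g. listServiceRecords(path). */
  public interface Lookup<T> {
    List<T> fetch();
  }

  private final Lookup<T> lookup;
  private final long ttlNanos;
  private List<T> cached;
  private long fetchedAtNanos;

  public TtlRecordCacheSketch(Lookup<T> lookup, long ttl, TimeUnit unit) {
    this.lookup = lookup;
    this.ttlNanos = unit.toNanos(ttl);
  }

  /** Returns cached records, hitting the registry only when the TTL has expired. */
  public synchronized List<T> get() {
    long now = System.nanoTime();
    if (cached == null || now - fetchedAtNanos > ttlNanos) {
      cached = lookup.fetch();
      fetchedAtNanos = now;
    }
    return cached;
  }
}
{code}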



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3331) NodeManager should use directory other than tmp for extracting and loading leveldbjni

2015-03-18 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3331:

Attachment: YARN-3331.002.patch

Addressed feedback. Thanks [~aw] for the very specific feedback

 NodeManager should use directory other than tmp for extracting and loading 
 leveldbjni
 -

 Key: YARN-3331
 URL: https://issues.apache.org/jira/browse/YARN-3331
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3331.001.patch, YARN-3331.002.patch


 /tmp can be  required to be noexec in many environments. This causes a 
 problem when  nodemanager tries to load the leveldbjni library which can get 
 unpacked and executed from /tmp.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3284) Expose more ApplicationMetrics and ApplicationAttemptMetrics through YARN command

2015-03-18 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368482#comment-14368482
 ] 

Rohith commented on YARN-3284:
--

Thanks [~xgong] for working on this jira and [~leftnoteasy] for the review.
Since the patch size is huge, I think this task can be logically divided into 3 
sub-tasks, which would help the reviewer do a granular review and the 
implementer rebase the code.
# API changes includes proto's
# Web UI  includes updating metrics
# Application CLI 
Any thoughts ? 

 Expose more ApplicationMetrics and ApplicationAttemptMetrics through YARN 
 command
 -

 Key: YARN-3284
 URL: https://issues.apache.org/jira/browse/YARN-3284
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-3284.1.patch, YARN-3284.1.patch, YARN-3284.2.patch, 
 YARN-3284.3.patch, YARN-3284.3.rebase.patch, YARN-3284.4.patch


 Current, we have some extra metrics about the application and current attempt 
 in RM Web UI. We should expose that information through YARN Command, too.
 1. Preemption metrics
 2. application outstanding resource requests
 3. container locality info



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-03-18 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368463#comment-14368463
 ] 

Naganarasimha G R commented on YARN-3362:
-

Thanks [~wangda],
Regarding the approach to display, I had a few concerns:
* There will be some common queue metrics across the labels; won't they get 
repeated for each label if a queue is mapped to multiple labels?
* IIUC most of the queue metrics might not be specific to a label, like 
capacity, absolute max capacity, max apps, max AMs per user, etc. Correct 
me if my understanding on this is wrong.
* Apart from the label-specific queue metrics (label capacity, label abs 
capacity, used), are there any new label-specific queue metrics you have in 
mind?
* Would it be better to list them like:
{noformat}
+ root [=] 30% used
  + a  [===] 75% used
+ a1 [=]  30% used
   -
  |  Queue Metrics |
  ||
  |   metrics1   |value1   |
  |   metrics2   |value2   |
   -
  |  Active Users info  (yarn-3273)|
  ||
  |   user1   |info|
  |   user2   |info|
   -
  | Label Resource usage info  |
  ||
  | label_x  [=] 30% used  |
  | label_y  [] 20% used   |
  --
+ a2 [=]  30% used
...
{noformat}
* Also, if required, we can have a separate page (or in the labels page, or 
appended at the end of the CS page) like:
{noformat}
+ label_x  [=] 30% used [Actual Resource - Used resource ]
+ root [=] 30% used [Actual Resource - Used 
resource ]
  + a  [===] 75% used [Actual 
Resource - Used resource ]
+ a1 [=]  30% used [Actual Resource - Used resource 
]
+ label_y
+ root [...]
+ ...
+ label_z
+ root [...]
{noformat}

YARN-3273 has added more info to the CS page, so we need to consider the size 
of the page and its usability.
Please provide your thoughts on the same.

 Add node label usage in RM CapacityScheduler web UI
 ---

 Key: YARN-3362
 URL: https://issues.apache.org/jira/browse/YARN-3362
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R

 We don't have node label usage in RM CapacityScheduler web UI now, without 
 this, user will be hard to understand what happened to nodes have labels 
 assign to it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp

2015-03-18 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367669#comment-14367669
 ] 

Yongjun Zhang commented on YARN-3021:
-

Hi [~jianhe],

Thanks for your comment. I'm actually aligned with what you suggested. 

The problem I was trying to point out is that we will have to change the 
behavior of the code I pasted above to deal with a null renewer. E.g., the 
{{getRenewer()}} method will return a non-null value in the current 
implementation (if not set or found, TRIVIAL_RENEWER will be returned); after 
making the suggested change for this jira, the renewer can be null, so we should 
return null from {{getRenewer()}}.

My question was that I'm not sure about the impact of this behavior change; I 
expect some applications may count on the current behavior.

More comments?

Thanks.



 YARN's delegation-token handling disallows certain trust setups to operate 
 properly over DistCp
 ---

 Key: YARN-3021
 URL: https://issues.apache.org/jira/browse/YARN-3021
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Affects Versions: 2.3.0
Reporter: Harsh J
Assignee: Yongjun Zhang
 Attachments: YARN-3021.001.patch, YARN-3021.002.patch, 
 YARN-3021.003.patch, YARN-3021.patch


 Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
 and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN 
 clusters.
 Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
 needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
 as it attempts a renewDelegationToken(…) synchronously during application 
 submission (to validate the managed token before it adds it to a scheduler 
 for automatic renewal). The call obviously fails because realm B will not 
 trust A's credentials (here, the RM's principal is the renewer).
 In the 1.x JobTracker the same call is present, but it is done asynchronously 
 and once the renewal attempt failed we simply ceased to schedule any further 
 attempts of renewals, rather than fail the job immediately.
 We should change the logic such that we attempt the renewal but go easy on 
 the failure and skip the scheduling alone, rather than bubble back an error 
 to the client, failing the app submission. This way the old behaviour is 
 retained.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler

2015-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367668#comment-14367668
 ] 

Hadoop QA commented on YARN-3241:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705391/YARN-3241.001.patch
  against trunk revision 9d72f93.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7013//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7013//console

This message is automatically generated.

 Leading space, trailing space and empty sub queue name may cause 
 MetricsException for fair scheduler
 

 Key: YARN-3241
 URL: https://issues.apache.org/jira/browse/YARN-3241
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3241.000.patch, YARN-3241.001.patch


 Leading spaces, trailing spaces and empty sub queue names may cause a 
 MetricsException (Metrics source XXX already exists!) when adding an 
 application to the FairScheduler.
 The reason is that QueueMetrics parses the queue name differently from the 
 QueueManager.
 QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and 
 trailing spaces in the sub queue names and also removes empty sub queue names.
 {code}
   static final Splitter Q_SPLITTER =
   Splitter.on('.').omitEmptyStrings().trimResults(); 
 {code}
 But QueueManager won't remove leading spaces, trailing spaces or empty sub 
 queue names.
 This causes FSQueue and FSQueueMetrics to go out of sync: QueueManager will 
 think two queue names are different, so it will try to create a new queue, 
 but FSQueueMetrics will treat the two names as the same queue, which triggers 
 the Metrics source XXX already exists! MetricsException.
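
 For illustration, a small self-contained demo of the mismatch; it assumes Guava 
 on the classpath and uses a made-up queue name:

 {code}
 import com.google.common.base.Splitter;

 public class QueueNameSplitDemo {
   // Same splitter as QueueMetrics: drops empty segments and trims whitespace.
   static final Splitter Q_SPLITTER =
       Splitter.on('.').omitEmptyStrings().trimResults();

   public static void main(String[] args) {
     String raw = "root. queueA .";          // hypothetical queue name with padding
     // The trimming splitter sees: [root], [queueA]
     for (String part : Q_SPLITTER.split(raw)) {
       System.out.println("[" + part + "]");
     }
     // A plain split keeps the raw segments: [root], [ queueA ], []
     for (String part : raw.split("\\.", -1)) {
       System.out.println("[" + part + "]");
     }
   }
 }
 {code}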



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3334) [Event Producers] NM start to posting some app related metrics in early POC stage of phase 2.

2015-03-18 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3334:
-
Attachment: YARN-3334-demo.patch

Updated a demo patch for putting some metrics info into the new TimelineService. 
It doesn't include any tests yet, but I will add them soon.

 [Event Producers] NM start to posting some app related metrics in early POC 
 stage of phase 2.
 -

 Key: YARN-3334
 URL: https://issues.apache.org/jira/browse/YARN-3334
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: YARN-2928
Reporter: Junping Du
Assignee: Junping Du
 Attachments: YARN-3334-demo.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3351) AppMaster tracking URL is broken in HA

2015-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367719#comment-14367719
 ] 

Hadoop QA commented on YARN-3351:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705406/YARN-3351.002.patch
  against trunk revision 402817c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  org.apache.hadoop.yarn.util.TestWebAppUtils

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7015//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7015//console

This message is automatically generated.

 AppMaster tracking URL is broken in HA
 --

 Key: YARN-3351
 URL: https://issues.apache.org/jira/browse/YARN-3351
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3351.001.patch, YARN-3351.002.patch


 After YARN-2713, the AppMaster link is broken in HA.  To repro 
 a) setup RM HA and ensure the first RM is not active,
 b) run a long sleep job and view the tracking url on the RM applications page
 The log and full stack trace is shown below
 {noformat}
 2015-02-05 20:47:43,478 WARN org.mortbay.log: 
 /proxy/application_1423182188062_0002/: java.net.BindException: Cannot assign 
 requested address
 {noformat}
 {noformat}
 java.net.BindException: Cannot assign requested address
   at java.net.PlainSocketImpl.socketBind(Native Method)
   at 
 java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)
   at java.net.Socket.bind(Socket.java:631)
   at java.net.Socket.<init>(Socket.java:423)
   at java.net.Socket.<init>(Socket.java:280)
   at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
   at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
   at 
 org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
   at 
 org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
   at 
 org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
   at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
   at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:346)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.proxyLink(WebAppProxyServlet.java:188)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:345)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
   at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
   at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2003) Support to process Job priority from Submission Context in AppAttemptAddedSchedulerEvent [RM side]

2015-03-18 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367465#comment-14367465
 ] 

Sunil G commented on YARN-2003:
---

Thank you [~leftnoteasy] for sharing the comments.

Yes, YARN-2003 will focus on the RM-side changes, excluding changes to the 
Scheduler.
I will rearrange the code accordingly and update the patch.

 Support to process Job priority from Submission Context in 
 AppAttemptAddedSchedulerEvent [RM side]
 --

 Key: YARN-2003
 URL: https://issues.apache.org/jira/browse/YARN-2003
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-2003.patch, 0002-YARN-2003.patch, 
 0003-YARN-2003.patch, 0004-YARN-2003.patch


 AppAttemptAddedSchedulerEvent should be able to receive the Job Priority from 
 the Submission Context and store it.
 Later this can be used by the Scheduler.
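To illustrate the idea only (this is a rough sketch, not the actual patch; the constructor shape and field names here are hypothetical), the event could carry the priority pulled from the ApplicationSubmissionContext so the scheduler can read it later:

{code}
// Rough sketch only -- the real AppAttemptAddedSchedulerEvent has a different
// constructor; this just shows the "carry the submission priority" idea.
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.SchedulerEvent;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.SchedulerEventType;

public class AppAttemptAddedSchedulerEvent extends SchedulerEvent {
  private final ApplicationAttemptId applicationAttemptId;
  private final Priority appPriority;   // taken from ApplicationSubmissionContext

  public AppAttemptAddedSchedulerEvent(ApplicationAttemptId attemptId,
      Priority appPriority) {
    super(SchedulerEventType.APP_ATTEMPT_ADDED);
    this.applicationAttemptId = attemptId;
    this.appPriority = appPriority;     // stored on the event for the scheduler to use later
  }

  public ApplicationAttemptId getApplicationAttemptId() {
    return applicationAttemptId;
  }

  public Priority getAppPriority() {
    return appPriority;
  }
}
{code}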



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler

2015-03-18 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3241:

Attachment: YARN-3241.001.patch

 Leading space, trailing space and empty sub queue name may cause 
 MetricsException for fair scheduler
 

 Key: YARN-3241
 URL: https://issues.apache.org/jira/browse/YARN-3241
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3241.000.patch, YARN-3241.001.patch


 Leading space, trailing space and empty sub queue name may cause a 
 MetricsException (Metrics source XXX already exists!) when adding an application 
 to the FairScheduler.
 The reason is that QueueMetrics parses the queue name differently from the 
 QueueManager.
 QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and 
 trailing spaces from sub queue names and also drops empty sub queue names.
 {code}
   static final Splitter Q_SPLITTER =
   Splitter.on('.').omitEmptyStrings().trimResults(); 
 {code}
 But QueueManager won't remove leading spaces, trailing spaces or empty sub 
 queue names.
 This causes FSQueue and FSQueueMetrics to get out of sync:
 QueueManager will think the two queue names are different, so it will try to 
 create a new queue.
 But FSQueueMetrics will treat the two names as the same queue, which raises the 
 Metrics source XXX already exists! MetricsException.
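For illustration only (not part of the patch), the mismatch can be reproduced by parsing the same queue name both ways; the queue name below is made up, and String.split is just a rough stand-in for QueueManager's raw parsing:

{code}
// Illustrative only: shows how the two parsing styles disagree on the same queue name.
import java.util.Arrays;
import java.util.List;

import com.google.common.base.Splitter;
import com.google.common.collect.Lists;

public class QueueNameSplitDemo {
  static final Splitter Q_SPLITTER =
      Splitter.on('.').omitEmptyStrings().trimResults();

  public static void main(String[] args) {
    String queueName = "root. parent .leaf";   // note the spaces around "parent"

    // QueueMetrics-style parsing: spaces trimmed, empty parts dropped.
    List<String> metricsView = Lists.newArrayList(Q_SPLITTER.split(queueName));
    System.out.println(metricsView);                       // [root, parent, leaf]

    // Rough approximation of QueueManager-style parsing: raw sub queue names kept.
    String[] managerView = queueName.split("\\.");
    System.out.println(Arrays.toString(managerView));      // [root,  parent , leaf]
  }
}
{code}

Because the raw parsing keeps " parent " as-is while QueueMetrics collapses it to "parent", the two components end up treating different queue names as the same metrics source.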



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp

2015-03-18 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367539#comment-14367539
 ] 

Yongjun Zhang commented on YARN-3021:
-

Could we possibly introduce a dummy renewer class and make its methods no-ops, 
instead of setting the renewer to null?

I wonder whether this would be a compatible change ...
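If the renewer here is along the lines of Hadoop's generic TokenRenewer hook, a no-op variant might look roughly like the sketch below (class name and return values are purely illustrative, not a proposal for the actual patch):

{code}
// Purely illustrative no-op renewer; not the actual fix proposed in this JIRA.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenRenewer;

public class NoOpTokenRenewer extends TokenRenewer {
  @Override
  public boolean handleKind(Text kind) {
    return true;              // claim every token kind for this demo
  }

  @Override
  public boolean isManaged(Token<?> token) throws IOException {
    return false;             // report that the token does not need renewal
  }

  @Override
  public long renew(Token<?> token, Configuration conf) throws IOException {
    return Long.MAX_VALUE;    // pretend the token never expires; no remote call is made
  }

  @Override
  public void cancel(Token<?> token, Configuration conf) throws IOException {
    // deliberately a no-op
  }
}
{code}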


 YARN's delegation-token handling disallows certain trust setups to operate 
 properly over DistCp
 ---

 Key: YARN-3021
 URL: https://issues.apache.org/jira/browse/YARN-3021
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Affects Versions: 2.3.0
Reporter: Harsh J
 Attachments: YARN-3021.001.patch, YARN-3021.002.patch, 
 YARN-3021.003.patch, YARN-3021.patch


 Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
 and B trusts COMMON (one-way trust in both cases), and both A and B run HDFS + 
 YARN clusters.
 Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
 needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
 as it attempts a renewDelegationToken(…) synchronously during application 
 submission (to validate the managed token before it adds it to a scheduler 
 for automatic renewal). The call obviously fails because realm B will not trust 
 A's credentials (here, the RM's principal is the renewer).
 In the 1.x JobTracker the same call is present, but it is done asynchronously, 
 and once the renewal attempt failed we simply ceased to schedule any further 
 renewal attempts, rather than fail the job immediately.
 We should change the logic such that we attempt the renewal but go easy on 
 the failure and only skip the scheduling, rather than bubble an error back 
 to the client and fail the app submission. This way the old behaviour is 
 retained.
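A rough sketch of that flow (the helper names renewToken, scheduleRenewalTimer and LOG are hypothetical placeholders, not the actual DelegationTokenRenewer members):

{code}
// Sketch only: attempt the renewal, but on failure skip scheduling instead of
// failing the application submission.
private void handleTokenOnSubmission(Token<?> token, Configuration conf) {
  try {
    long expiry = renewToken(token, conf);   // still attempt the renewal up front
    scheduleRenewalTimer(token, expiry);     // schedule recurring renewals only if it worked
  } catch (IOException e) {
    // Renewal failed, e.g. the remote realm does not trust the RM's credentials.
    // Do not bubble the error back to the client; just skip automatic renewal
    // for this token so the submission itself still succeeds.
    LOG.warn("Could not renew token " + token + "; skipping automatic renewal", e);
  }
}
{code}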



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3365) Add support for using the 'tc' tool via container-executor

2015-03-18 Thread Sidharta Seethana (JIRA)
Sidharta Seethana created YARN-3365:
---

 Summary: Add support for using the 'tc' tool via container-executor
 Key: YARN-3365
 URL: https://issues.apache.org/jira/browse/YARN-3365
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Sidharta Seethana
Assignee: Sidharta Seethana


We need the following functionality:

1) modify network interface traffic shaping rules - be able to attach a qdisc, 
create child classes, etc.
2) read the existing rules in place
3) read stats for the various classes

Using tc requires elevated privileges, hence this functionality needs to be made 
available via container-executor (some example tc invocations are sketched below).
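For reference, the three items map onto tc invocations of roughly the following shape, shown here as command strings a caller might hand to the privileged executor; the device name, handles and rate are only examples, and how container-executor actually exposes these operations is exactly what this JIRA is about:

{code}
// Illustrative only: the command strings show the kind of tc operations meant above.
String device = "eth0";   // example interface name

// 1) modify traffic shaping rules: attach a root qdisc and add a child class
String attachQdisc = "tc qdisc add dev " + device + " root handle 1: htb";
String addChild    = "tc class add dev " + device + " parent 1: classid 1:1 htb rate 100mbit";

// 2) read the rules currently in place
String showRules   = "tc qdisc show dev " + device;

// 3) read per-class statistics
String showStats   = "tc -s class show dev " + device;
{code}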



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

