[jira] [Commented] (YARN-2394) FairScheduler: Configure fairSharePreemptionThreshold per queue
[ https://issues.apache.org/jira/browse/YARN-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121255#comment-14121255 ] Hudson commented on YARN-2394: -- FAILURE: Integrated in Hadoop-Yarn-trunk #670 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/670/]) YARN-2394. FairScheduler: Configure fairSharePreemptionThreshold per queue. (Wei Yan via kasha) (kasha: rev 1dcaba9a7aa27f7ca4ba693e3abb56ab3c59c8a7) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSLeafQueue.java * hadoop-yarn-project/CHANGES.txt FairScheduler: Configure fairSharePreemptionThreshold per queue --- Key: YARN-2394 URL: https://issues.apache.org/jira/browse/YARN-2394 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Ashwin Shankar Assignee: Wei Yan Fix For: 2.6.0 Attachments: YARN-2394-1.patch, YARN-2394-2.patch, YARN-2394-3.patch, YARN-2394-4.patch, YARN-2394-5.patch, YARN-2394-6.patch, YARN-2394-7.patch Preemption based on fair share starvation happens when usage of a queue is less than 50% of its fair share. This 50% is hardcoded. We'd like to make this configurable on a per queue basis, so that we can choose the threshold at which we want to preempt. Calling this config fairSharePreemptionThreshold. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
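Editor's note: a minimal sketch of how the new per-queue threshold is expected to appear in the FairScheduler allocation file. Element placement, the queue name, and the cluster-wide default element are illustrative assumptions; see FairScheduler.apt.vm in the commit above for the authoritative syntax.
{code}
<?xml version="1.0"?>
<!-- fair-scheduler.xml (allocation file) - illustrative sketch only -->
<allocations>
  <queue name="analytics">
    <!-- preempt on behalf of this queue once its usage drops below 80% of its fair share -->
    <fairSharePreemptionThreshold>0.8</fairSharePreemptionThreshold>
  </queue>
  <!-- cluster-wide default (element name assumed), overridden per queue above -->
  <defaultFairSharePreemptionThreshold>0.5</defaultFairSharePreemptionThreshold>
</allocations>
{code}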
[jira] [Commented] (YARN-2394) FairScheduler: Configure fairSharePreemptionThreshold per queue
[ https://issues.apache.org/jira/browse/YARN-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121357#comment-14121357 ] Hudson commented on YARN-2394: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1861 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1861/]) YARN-2394. FairScheduler: Configure fairSharePreemptionThreshold per queue. (Wei Yan via kasha) (kasha: rev 1dcaba9a7aa27f7ca4ba693e3abb56ab3c59c8a7) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java FairScheduler: Configure fairSharePreemptionThreshold per queue --- Key: YARN-2394 URL: https://issues.apache.org/jira/browse/YARN-2394 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Ashwin Shankar Assignee: Wei Yan Fix For: 2.6.0 Attachments: YARN-2394-1.patch, YARN-2394-2.patch, YARN-2394-3.patch, YARN-2394-4.patch, YARN-2394-5.patch, YARN-2394-6.patch, YARN-2394-7.patch Preemption based on fair share starvation happens when usage of a queue is less than 50% of its fair share. This 50% is hardcoded. We'd like to make this configurable on a per queue basis, so that we can choose the threshold at which we want to preempt. Calling this config fairSharePreemptionThreshold. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-415) Capture aggregate memory allocation at the app-level for chargeback
[ https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated YARN-415: Attachment: YARN-415.201409040036.txt [~kkambatl], thank you for taking the time to review this patch. bq. The patch doesn't apply anymore, Upmerged patch to latest branch-2 and trunk. {quote} 1. ResourceManagerRest.apt.vm documents the memory and vcores as utilized. We should update this to allocated. {quote} Changed the text. {quote} 2. Methods added to ApplicationAttemptStateData should probably be explicitly marked Public-Unstable. {quote} Done {quote} 3. Annotate {{AggregateAppResourceUsage}} Private {quote} Annotated {quote} By the way, there was an offline discussion (documented on YARN-1530) about storing similar app-related metrics in the ATS. It would be nice for parties involved here to think about it and follow up on another JIRA. {quote} I am looking into this now. In the meantime, can you please let me know if this current patch resolves your concerns? Thank you, -Eric Payne Capture aggregate memory allocation at the app-level for chargeback --- Key: YARN-415 URL: https://issues.apache.org/jira/browse/YARN-415 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Affects Versions: 2.5.0 Reporter: Kendall Thrapp Assignee: Andrey Klochkov Attachments: YARN-415--n10.patch, YARN-415--n2.patch, YARN-415--n3.patch, YARN-415--n4.patch, YARN-415--n5.patch, YARN-415--n6.patch, YARN-415--n7.patch, YARN-415--n8.patch, YARN-415--n9.patch, YARN-415.201405311749.txt, YARN-415.201406031616.txt, YARN-415.201406262136.txt, YARN-415.201407042037.txt, YARN-415.201407071542.txt, YARN-415.201407171553.txt, YARN-415.201407172144.txt, YARN-415.201407232237.txt, YARN-415.201407242148.txt, YARN-415.201407281816.txt, YARN-415.201408062232.txt, YARN-415.201408080204.txt, YARN-415.201408092006.txt, YARN-415.201408132109.txt, YARN-415.201408150030.txt, YARN-415.201408181938.txt, YARN-415.201408181938.txt, YARN-415.201408212033.txt, YARN-415.201409040036.txt, YARN-415.patch For the purpose of chargeback, I'd like to be able to compute the cost of an application in terms of cluster resource usage. To start out, I'd like to get the memory utilization of an application. The unit should be MB-seconds or something similar and, from a chargeback perspective, the memory amount should be the memory reserved for the application, as even if the app didn't use all that memory, no one else was able to use it. (reserved ram for container 1 * lifetime of container 1) + (reserved ram for container 2 * lifetime of container 2) + ... + (reserved ram for container n * lifetime of container n) It'd be nice to have this at the app level instead of the job level because: 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't appear on the job history server). 2. We'd be able to get memory usage for future non-MR jobs (e.g. Storm). This new metric should be available both through the RM UI and RM Web Services REST API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
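Editor's note: a minimal sketch of the MB-seconds aggregation described in the summary above. The AllocatedContainer type and its fields are hypothetical stand-ins used only to illustrate the arithmetic; they are not the RM's actual classes.
{code}
import java.util.List;

// Hypothetical stand-in for per-container allocation data; not a YARN class.
class AllocatedContainer {
    long reservedMb;        // memory reserved for the container, in MB
    long lifetimeSeconds;   // how long the container was allocated, in seconds

    AllocatedContainer(long reservedMb, long lifetimeSeconds) {
        this.reservedMb = reservedMb;
        this.lifetimeSeconds = lifetimeSeconds;
    }
}

class ChargebackSketch {
    // memorySeconds = sum over containers of (reserved MB * lifetime in seconds)
    static long memoryMbSeconds(List<AllocatedContainer> containers) {
        long total = 0;
        for (AllocatedContainer c : containers) {
            total += c.reservedMb * c.lifetimeSeconds;
        }
        return total;
    }

    public static void main(String[] args) {
        // e.g. two 2048 MB containers that each ran for 600 s -> 2,457,600 MB-seconds
        System.out.println(memoryMbSeconds(List.of(
                new AllocatedContainer(2048, 600),
                new AllocatedContainer(2048, 600))));
    }
}
{code}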
[jira] [Updated] (YARN-2509) Enable Cross Origin Filter for timeline server only and not all Yarn servers
[ https://issues.apache.org/jira/browse/YARN-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-2509: Attachment: YARN-2509.patch Updating the patch. [~jeagles], with the new {{modifiedInitializers}} flag, this new patch will work fine. Enable Cross Origin Filter for timeline server only and not all Yarn servers Key: YARN-2509 URL: https://issues.apache.org/jira/browse/YARN-2509 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2509.patch, YARN-2509.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2511) Allow All Origins by default when Cross Origin Filter is enabled
[ https://issues.apache.org/jira/browse/YARN-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121493#comment-14121493 ] Jonathan Eagles commented on YARN-2511: --- [~zjshen], Can you give a review? This just makes the default cross-origin pattern allow all origins (same as jetty 7 Cross origin filter) making us more compatible and giving the users a better default option. Allow All Origins by default when Cross Origin Filter is enabled Key: YARN-2511 URL: https://issues.apache.org/jira/browse/YARN-2511 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Jonathan Eagles Attachments: YARN-2511-v1.patch This is the default for jetty 7 cross origin filter -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2509) Enable Cross Origin Filter for timeline server only and not all Yarn servers
[ https://issues.apache.org/jira/browse/YARN-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121512#comment-14121512 ] Hadoop QA commented on YARN-2509: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666497/YARN-2509.patch against trunk revision 8f1a668. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4824//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4824//console This message is automatically generated. Enable Cross Origin Filter for timeline server only and not all Yarn servers Key: YARN-2509 URL: https://issues.apache.org/jira/browse/YARN-2509 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2509.patch, YARN-2509.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-415) Capture aggregate memory allocation at the app-level for chargeback
[ https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121514#comment-14121514 ] Hadoop QA commented on YARN-415: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666493/YARN-415.201409040036.txt against trunk revision 8f1a668. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4823//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4823//console This message is automatically generated. Capture aggregate memory allocation at the app-level for chargeback --- Key: YARN-415 URL: https://issues.apache.org/jira/browse/YARN-415 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Affects Versions: 2.5.0 Reporter: Kendall Thrapp Assignee: Andrey Klochkov Attachments: YARN-415--n10.patch, YARN-415--n2.patch, YARN-415--n3.patch, YARN-415--n4.patch, YARN-415--n5.patch, YARN-415--n6.patch, YARN-415--n7.patch, YARN-415--n8.patch, YARN-415--n9.patch, YARN-415.201405311749.txt, YARN-415.201406031616.txt, YARN-415.201406262136.txt, YARN-415.201407042037.txt, YARN-415.201407071542.txt, YARN-415.201407171553.txt, YARN-415.201407172144.txt, YARN-415.201407232237.txt, YARN-415.201407242148.txt, YARN-415.201407281816.txt, YARN-415.201408062232.txt, YARN-415.201408080204.txt, YARN-415.201408092006.txt, YARN-415.201408132109.txt, YARN-415.201408150030.txt, YARN-415.201408181938.txt, YARN-415.201408181938.txt, YARN-415.201408212033.txt, YARN-415.201409040036.txt, YARN-415.patch For the purpose of chargeback, I'd like to be able to compute the cost of an application in terms of cluster resource usage. To start out, I'd like to get the memory utilization of an application. The unit should be MB-seconds or something similar and, from a chargeback perspective, the memory amount should be the memory reserved for the application, as even if the app didn't use all that memory, no one else was able to use it. (reserved ram for container 1 * lifetime of container 1) + (reserved ram for container 2 * lifetime of container 2) + ... + (reserved ram for container n * lifetime of container n) It'd be nice to have this at the app level instead of the job level because: 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't appear on the job history server). 2. 
We'd be able to get memory usage for future non-MR jobs (e.g. Storm). This new metric should be available both through the RM UI and RM Web Services REST API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2394) FairScheduler: Configure fairSharePreemptionThreshold per queue
[ https://issues.apache.org/jira/browse/YARN-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121541#comment-14121541 ] Hudson commented on YARN-2394: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1886 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1886/]) YARN-2394. FairScheduler: Configure fairSharePreemptionThreshold per queue. (Wei Yan via kasha) (kasha: rev 1dcaba9a7aa27f7ca4ba693e3abb56ab3c59c8a7) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSLeafQueue.java FairScheduler: Configure fairSharePreemptionThreshold per queue --- Key: YARN-2394 URL: https://issues.apache.org/jira/browse/YARN-2394 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Ashwin Shankar Assignee: Wei Yan Fix For: 2.6.0 Attachments: YARN-2394-1.patch, YARN-2394-2.patch, YARN-2394-3.patch, YARN-2394-4.patch, YARN-2394-5.patch, YARN-2394-6.patch, YARN-2394-7.patch Preemption based on fair share starvation happens when usage of a queue is less than 50% of its fair share. This 50% is hardcoded. We'd like to make this configurable on a per queue basis, so that we can choose the threshold at which we want to preempt. Calling this config fairSharePreemptionThreshold. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2509) Enable Cross Origin Filter for timeline server only and not all Yarn servers
[ https://issues.apache.org/jira/browse/YARN-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121544#comment-14121544 ] Jonathan Eagles commented on YARN-2509: --- +1. Thanks, Mit. Committing this to trunk and branch-2 Enable Cross Origin Filter for timeline server only and not all Yarn servers Key: YARN-2509 URL: https://issues.apache.org/jira/browse/YARN-2509 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2509.patch, YARN-2509.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2512) Allow for origin pattern matching in cross origin filter
Jonathan Eagles created YARN-2512: - Summary: Allow for origin pattern matching in cross origin filter Key: YARN-2512 URL: https://issues.apache.org/jira/browse/YARN-2512 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2509) Enable Cross Origin Filter for timeline server only and not all Yarn servers
[ https://issues.apache.org/jira/browse/YARN-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated YARN-2509: -- Fix Version/s: 2.6.0 Enable Cross Origin Filter for timeline server only and not all Yarn servers Key: YARN-2509 URL: https://issues.apache.org/jira/browse/YARN-2509 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Mit Desai Fix For: 2.6.0 Attachments: YARN-2509.patch, YARN-2509.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2513) Host framework UIs in YARN for use with the ATS
Jonathan Eagles created YARN-2513: - Summary: Host framework UIs in YARN for use with the ATS Key: YARN-2513 URL: https://issues.apache.org/jira/browse/YARN-2513 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Allow for pluggable UIs as described by TEZ-8. YARN can provide the infrastructure to host JavaScript and possibly Java UIs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-2513) Host framework UIs in YARN for use with the ATS
[ https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles reassigned YARN-2513: - Assignee: Jonathan Eagles Host framework UIs in YARN for use with the ATS --- Key: YARN-2513 URL: https://issues.apache.org/jira/browse/YARN-2513 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Jonathan Eagles Allow for pluggable UIs as described by TEZ-8. YARN can provide the infrastructure to host JavaScript and possibly Java UIs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2509) Enable Cross Origin Filter for timeline server only and not all Yarn servers
[ https://issues.apache.org/jira/browse/YARN-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121600#comment-14121600 ] Zhijie Shen commented on YARN-2509: --- What if CrossOriginFilterInitializer is configured in core-site.xml, but http-cross-origin.enabled = false? Enable Cross Origin Filter for timeline server only and not all Yarn servers Key: YARN-2509 URL: https://issues.apache.org/jira/browse/YARN-2509 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Mit Desai Fix For: 2.6.0 Attachments: YARN-2509.patch, YARN-2509.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
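Editor's note: the configuration combination being asked about looks roughly like the core-site.xml fragment below. The fully qualified property key for the enabled flag and the initializer's package are assumptions inferred from the "http-cross-origin.enabled" shorthand in the comment; the actual names are defined by the patch.
{code}
<!-- Illustrative core-site.xml fragment; property and class names partly assumed. -->
<property>
  <name>hadoop.http.filter.initializers</name>
  <value>org.apache.hadoop.yarn.server.timeline.webapp.CrossOriginFilterInitializer</value>
</property>
<property>
  <!-- shorthand used in the comment above; full key assumed -->
  <name>yarn.timeline-service.http-cross-origin.enabled</name>
  <value>false</value>
</property>
{code}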
[jira] [Commented] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121687#comment-14121687 ] Eric Payne commented on YARN-2056: -- [~leftnoteasy], have you had a chance to look at the hierarchical queue test that I added? I am grateful for your help. Thanks Eric Payne Disable preemption at Queue level - Key: YARN-2056 URL: https://issues.apache.org/jira/browse/YARN-2056 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.4.0 Reporter: Mayank Bansal Assignee: Eric Payne Attachments: YARN-2056.201408202039.txt, YARN-2056.201408260128.txt, YARN-2056.201408310117.txt, YARN-2056.201409022208.txt We need to be able to disable preemption at individual queue level -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2514) The elevated WSCE LRPC should grant access to the job to the namenode
Remus Rusanu created YARN-2514: -- Summary: The elevated WSCE LRPC should grant access to the job to the namenode Key: YARN-2514 URL: https://issues.apache.org/jira/browse/YARN-2514 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Remus Rusanu Assignee: Remus Rusanu The job created by winutils task createAsUser must be accessible/controllable/killable by the namenode, or winutils task list/kill will fail later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1708) Add a public API to reserve resources (part of YARN-1051)
[ https://issues.apache.org/jira/browse/YARN-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121737#comment-14121737 ] Vinod Kumar Vavilapalli commented on YARN-1708: --- Tx for the updated patch, [~subru]! This looks so much better! A few minor comments - All the newInstance methods and setters in the response objects should be marked as private, for e.g in ReservationSubmissionResponse. Similarly in other objects too. We don't expect users to call them because responses are generated only by the platform. - ReservationId: It's likely that IDEs generate a better hashCode instead of us doing the long-to-int conversions? - ReservationRequests.{set|get}Type - {set|get}Interpretor? Similarly ReservationRequestsProto.type. - Rename ReservationRequest.leaseDuration to be simply duration inline with ReservationRequestProto.duration Add a public API to reserve resources (part of YARN-1051) - Key: YARN-1708 URL: https://issues.apache.org/jira/browse/YARN-1708 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Carlo Curino Assignee: Subramaniam Krishnan Attachments: YARN-1708.patch, YARN-1708.patch This JIRA tracks the definition of a new public API for YARN, which allows users to reserve resources (think of time-bounded queues). This is part of the admission control enhancement proposed in YARN-1051. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
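Editor's note: to make the hashCode suggestion above concrete, an IDE-generated implementation for an id made of two long fields typically looks like the sketch below. The field names are assumptions about ReservationId, used only for illustration.
{code}
// Illustrative only: what an IDE-generated hashCode for a (clusterTimestamp, id) pair
// typically looks like. Field names are assumed, not the actual ReservationId API.
class ReservationIdSketch {
    private final long clusterTimestamp;
    private final long id;

    ReservationIdSketch(long clusterTimestamp, long id) {
        this.clusterTimestamp = clusterTimestamp;
        this.id = id;
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + (int) (clusterTimestamp ^ (clusterTimestamp >>> 32));
        result = prime * result + (int) (id ^ (id >>> 32));
        return result;
    }
}
{code}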
[jira] [Commented] (YARN-2468) Log handling for LRS
[ https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121778#comment-14121778 ] Zhijie Shen commented on YARN-2468: --- I'm afraid it may not be fair to compare log file creation between a single long running service and a short-term application. I'm thinking about the file problem in a different direction. Let's see how many log files will be created for a YARN cluster. For example, a long running service takes 10% of the cluster's resources, runs for 10 days, and produces 1 log file per day. On the other hand, a normal application also takes 10% of the cluster's resources, runs for 1 day, and produces 1 log file. Suppose the application is started every day. Over 10 days, the number of log files produced by the long running service and by the 10 iterations of the application is 10 in both cases. So from the point of view of the cluster, the number of logs is proportional to the resource usage rather than to the number of applications. Similar resource usage may result in a similar number of log files, so the situation may not become worse if we take the whole cluster into account. However, I agree we lose the opportunity to make a long running service use a single log file, reducing the total log file number. To completely resolve the too-many-files problem, we may think of the timeline server, which has a store layer to deal with the real I/O on your behalf. Another optimization may be log retention; I'm not sure whether that feature already exists or has been proposed together in this solution. Log handling for LRS Key: YARN-2468 URL: https://issues.apache.org/jira/browse/YARN-2468 Project: Hadoop YARN Issue Type: Sub-task Components: log-aggregation, nodemanager, resourcemanager Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-2468.1.patch Currently, when an application is finished, the NM will start to do the log aggregation. But for long running service applications, this is not ideal. The problems we have are: 1) LRS applications are expected to run for a long time (weeks, months). 2) Currently, all the container logs (from one NM) will be written into a single file. The files could become larger and larger. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
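Editor's note: a back-of-the-envelope restatement of the arithmetic in the comment above, with purely illustrative numbers.
{code}
// Illustrative arithmetic only: log-file count tracks resource-usage-time, not app count.
public class LogCountSketch {
    public static void main(String[] args) {
        int lrsLogFiles   = 10 /* days running */ * 1 /* file rolled per day */;
        int batchLogFiles = 10 /* daily runs   */ * 1 /* file per run        */;
        System.out.println(lrsLogFiles == batchLogFiles); // true: 10 files either way
    }
}
{code}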
[jira] [Updated] (YARN-2448) RM should expose the name of the ResourceCalculator being used when AMs register
[ https://issues.apache.org/jira/browse/YARN-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-2448: Attachment: apache-yarn-2448.2.patch Uploaded a new patch to address the concerns raised by Karthik, Sandy and Vinod. Instead of exposing the resource calculator, the patch now exposes the resources used by the scheduler. RM should expose the name of the ResourceCalculator being used when AMs register Key: YARN-2448 URL: https://issues.apache.org/jira/browse/YARN-2448 Project: Hadoop YARN Issue Type: Improvement Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-2448.0.patch, apache-yarn-2448.1.patch, apache-yarn-2448.2.patch The RM should expose the name of the ResourceCalculator being used when AMs register, as part of the RegisterApplicationMasterResponse. This will allow applications to make better decisions when scheduling. MapReduce, for example, only looks at memory when deciding its scheduling, even though the RM could potentially be using the DominantResourceCalculator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2448) RM should expose the resource types considered during scheduling when AMs register
[ https://issues.apache.org/jira/browse/YARN-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-2448: Summary: RM should expose the resource types considered during scheduling when AMs register (was: RM should expose the name of the ResourceCalculator being used when AMs register) RM should expose the resource types considered during scheduling when AMs register -- Key: YARN-2448 URL: https://issues.apache.org/jira/browse/YARN-2448 Project: Hadoop YARN Issue Type: Improvement Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-2448.0.patch, apache-yarn-2448.1.patch, apache-yarn-2448.2.patch The RM should expose the name of the ResourceCalculator being used when AMs register, as part of the RegisterApplicationMasterResponse. This will allow applications to make better decisions when scheduling. MapReduce for example, only looks at memory when deciding it's scheduling, even though the RM could potentially be using the DominantResourceCalculator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2448) RM should expose the resource types considered during scheduling when AMs register
[ https://issues.apache.org/jira/browse/YARN-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121786#comment-14121786 ] Varun Vasudev commented on YARN-2448: - Updated title to reflect what the latest patch is fixing. RM should expose the resource types considered during scheduling when AMs register -- Key: YARN-2448 URL: https://issues.apache.org/jira/browse/YARN-2448 Project: Hadoop YARN Issue Type: Improvement Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-2448.0.patch, apache-yarn-2448.1.patch, apache-yarn-2448.2.patch The RM should expose the name of the ResourceCalculator being used when AMs register, as part of the RegisterApplicationMasterResponse. This will allow applications to make better decisions when scheduling. MapReduce for example, only looks at memory when deciding it's scheduling, even though the RM could potentially be using the DominantResourceCalculator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2431) NM restart: cgroup is not removed for reacquired containers
[ https://issues.apache.org/jira/browse/YARN-2431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121796#comment-14121796 ] Thomas Graves commented on YARN-2431: - +1. Thanks Jason! Feel free to check it in. NM restart: cgroup is not removed for reacquired containers --- Key: YARN-2431 URL: https://issues.apache.org/jira/browse/YARN-2431 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-2431.patch The cgroup for a reacquired container is not being removed when the container exits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2391) Windows Secure Container Executor helper service should assign launched process to the NM job
[ https://issues.apache.org/jira/browse/YARN-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated YARN-2391: --- Issue Type: Sub-task (was: Improvement) Parent: YARN-2198 Windows Secure Container Executor helper service should assign launched process to the NM job - Key: YARN-2391 URL: https://issues.apache.org/jira/browse/YARN-2391 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Critical Labels: security, windows The YARN-2198 NM helper service needs to make sure the launched process is added to the NM job ('job' as in Windows NT job objects, not Hadoop jobs). This ensures that NM termination ensures launched process termination. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2448) RM should expose the resource types considered during scheduling when AMs register
[ https://issues.apache.org/jira/browse/YARN-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121927#comment-14121927 ] Hadoop QA commented on YARN-2448: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666543/apache-yarn-2448.2.patch against trunk revision b44b2ee. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4826//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4826//console This message is automatically generated. RM should expose the resource types considered during scheduling when AMs register -- Key: YARN-2448 URL: https://issues.apache.org/jira/browse/YARN-2448 Project: Hadoop YARN Issue Type: Improvement Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-2448.0.patch, apache-yarn-2448.1.patch, apache-yarn-2448.2.patch The RM should expose the name of the ResourceCalculator being used when AMs register, as part of the RegisterApplicationMasterResponse. This will allow applications to make better decisions when scheduling. MapReduce for example, only looks at memory when deciding it's scheduling, even though the RM could potentially be using the DominantResourceCalculator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2431) NM restart: cgroup is not removed for reacquired containers
[ https://issues.apache.org/jira/browse/YARN-2431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121980#comment-14121980 ] Jason Lowe commented on YARN-2431: -- Thanks for the reviews, Nathan and Tom! Committing this. NM restart: cgroup is not removed for reacquired containers --- Key: YARN-2431 URL: https://issues.apache.org/jira/browse/YARN-2431 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-2431.patch The cgroup for a reacquired container is not being removed when the container exits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1712) Admission Control: plan follower
[ https://issues.apache.org/jira/browse/YARN-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subramaniam Krishnan updated YARN-1712: --- Attachment: YARN-1712.2.patch [~leftnoteasy], good to hear that you got the full context. Thanks for reviewing the patch. I am uploading a new patch that has the following changes: * Fix the log message. * Replace stale references to sessions with reservations, good catch. The currentReservations might contain new reservations which just start now and so were not active before. These will not yet have corresponding reservation queues in the CapacityScheduler, as we create them after sorting. This is done to ensure what you highlighted earlier: we never exceed total capacity. Admission Control: plan follower Key: YARN-1712 URL: https://issues.apache.org/jira/browse/YARN-1712 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager Reporter: Carlo Curino Assignee: Carlo Curino Labels: reservations, scheduler Attachments: YARN-1712.1.patch, YARN-1712.2.patch, YARN-1712.patch This JIRA tracks a thread that continuously propagates the current state of an inventory subsystem to the scheduler. As the inventory subsystem stores the plan of how the resources should be subdivided, the work we propose in this JIRA realizes such a plan by dynamically instructing the CapacityScheduler to add/remove/resize queues to follow the plan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1712) Admission Control: plan follower
[ https://issues.apache.org/jira/browse/YARN-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122007#comment-14122007 ] Hadoop QA commented on YARN-1712: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666581/YARN-1712.2.patch against trunk revision b44b2ee. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4827//console This message is automatically generated. Admission Control: plan follower Key: YARN-1712 URL: https://issues.apache.org/jira/browse/YARN-1712 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager Reporter: Carlo Curino Assignee: Carlo Curino Labels: reservations, scheduler Attachments: YARN-1712.1.patch, YARN-1712.2.patch, YARN-1712.patch This JIRA tracks a thread that continuously propagates the current state of an inventory subsystem to the scheduler. As the inventory subsystem store the plan of how the resources should be subdivided, the work we propose in this JIRA realizes such plan by dynamically instructing the CapacityScheduler to add/remove/resize queues to follow the plan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-611) Add an AM retry count reset window to YARN RM
[ https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122011#comment-14122011 ] Zhijie Shen commented on YARN-611: -- Almost good, just one minor thing: 1. You may want to mark this method \@Stable because the setter/getter is marked \@Stable. {code} + @Unstable + public static ApplicationSubmissionContext newInstance( {code} Add an AM retry count reset window to YARN RM - Key: YARN-611 URL: https://issues.apache.org/jira/browse/YARN-611 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.0.3-alpha Reporter: Chris Riccomini Assignee: Xuan Gong Attachments: YARN-611.1.patch, YARN-611.2.patch, YARN-611.3.patch, YARN-611.4.patch, YARN-611.4.rebase.patch, YARN-611.5.patch, YARN-611.6.patch, YARN-611.7.patch YARN currently has the following config: yarn.resourcemanager.am.max-retries This config defaults to 2, and defines how many times to retry a failed AM before failing the whole YARN job. YARN counts an AM as failed if the node that it was running on dies (the NM will timeout, which counts as a failure for the AM), or if the AM dies. This configuration is insufficient for long running (or infinitely running) YARN jobs, since the machine (or NM) that the AM is running on will eventually need to be restarted (or the machine/NM will fail). In such an event, the AM has not done anything wrong, but this is counted as a failure by the RM. Since the retry count for the AM is never reset, eventually, at some point, the number of machine/NM failures will result in the AM failure count going above the configured value for yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the job as failed, and shut it down. This behavior is not ideal. I propose that we add a second configuration: yarn.resourcemanager.am.retry-count-window-ms This configuration would define a window of time that would define when an AM is well behaved, and it's safe to reset its failure count back to zero. Every time an AM fails the RmAppImpl would check the last time that the AM failed. If the last failure was less than retry-count-window-ms ago, and the new failure count is max-retries, then the job should fail. If the AM has never failed, the retry count is max-retries, or if the last failure was OUTSIDE the retry-count-window-ms, then the job should be restarted. Additionally, if the last failure was outside the retry-count-window-ms, then the failure count should be set back to 0. This would give developers a way to have well-behaved AMs run forever, while still failing mis-behaving AMs after a short period of time. I think the work to be done here is to change the RmAppImpl to actually look at app.attempts, and see if there have been more than max-retries failures in the last retry-count-window-ms milliseconds. If there have, then the job should fail, if not, then the job should go forward. Additionally, we might also need to add an endTime in either RMAppAttemptImpl or RMAppFailedAttemptEvent, so that the RmAppImpl can check the time of the failure. Thoughts? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
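Editor's note: a minimal sketch of the retry-count reset window proposed in the description above. Method and parameter names are hypothetical, not the actual RMAppImpl API, and it assumes the intended rule is "fail the application once at least max-retries failures fall inside the window."
{code}
import java.util.List;

// Hypothetical sketch of the proposed retry-count reset window; not actual RM code.
class AmRetryWindowSketch {
    /**
     * @param attemptFailureTimes end times (ms) of previous failed AM attempts
     * @param maxRetries          yarn.resourcemanager.am.max-retries
     * @param windowMs            proposed yarn.resourcemanager.am.retry-count-window-ms
     * @param nowMs               time of the new failure
     * @return true if the application should be failed rather than restarted
     */
    static boolean shouldFailApp(List<Long> attemptFailureTimes,
                                 int maxRetries, long windowMs, long nowMs) {
        int failuresInWindow = 1; // count the failure that just happened
        for (long failureTime : attemptFailureTimes) {
            if (nowMs - failureTime < windowMs) {
                failuresInWindow++;
            }
        }
        return failuresInWindow >= maxRetries;
    }
}
{code}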
[jira] [Commented] (YARN-2511) Allow All Origins by default when Cross Origin Filter is enabled
[ https://issues.apache.org/jira/browse/YARN-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122036#comment-14122036 ] Zhijie Shen commented on YARN-2511: --- +1, will commit the patch. Allow All Origins by default when Cross Origin Filter is enabled Key: YARN-2511 URL: https://issues.apache.org/jira/browse/YARN-2511 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Jonathan Eagles Attachments: YARN-2511-v1.patch This is the default for jetty 7 cross origin filter -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-611) Add an AM retry count reset window to YARN RM
[ https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-611: --- Attachment: YARN-611.8.patch Add an AM retry count reset window to YARN RM - Key: YARN-611 URL: https://issues.apache.org/jira/browse/YARN-611 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.0.3-alpha Reporter: Chris Riccomini Assignee: Xuan Gong Attachments: YARN-611.1.patch, YARN-611.2.patch, YARN-611.3.patch, YARN-611.4.patch, YARN-611.4.rebase.patch, YARN-611.5.patch, YARN-611.6.patch, YARN-611.7.patch, YARN-611.8.patch YARN currently has the following config: yarn.resourcemanager.am.max-retries This config defaults to 2, and defines how many times to retry a failed AM before failing the whole YARN job. YARN counts an AM as failed if the node that it was running on dies (the NM will timeout, which counts as a failure for the AM), or if the AM dies. This configuration is insufficient for long running (or infinitely running) YARN jobs, since the machine (or NM) that the AM is running on will eventually need to be restarted (or the machine/NM will fail). In such an event, the AM has not done anything wrong, but this is counted as a failure by the RM. Since the retry count for the AM is never reset, eventually, at some point, the number of machine/NM failures will result in the AM failure count going above the configured value for yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the job as failed, and shut it down. This behavior is not ideal. I propose that we add a second configuration: yarn.resourcemanager.am.retry-count-window-ms This configuration would define a window of time that would define when an AM is well behaved, and it's safe to reset its failure count back to zero. Every time an AM fails the RmAppImpl would check the last time that the AM failed. If the last failure was less than retry-count-window-ms ago, and the new failure count is max-retries, then the job should fail. If the AM has never failed, the retry count is max-retries, or if the last failure was OUTSIDE the retry-count-window-ms, then the job should be restarted. Additionally, if the last failure was outside the retry-count-window-ms, then the failure count should be set back to 0. This would give developers a way to have well-behaved AMs run forever, while still failing mis-behaving AMs after a short period of time. I think the work to be done here is to change the RmAppImpl to actually look at app.attempts, and see if there have been more than max-retries failures in the last retry-count-window-ms milliseconds. If there have, then the job should fail, if not, then the job should go forward. Additionally, we might also need to add an endTime in either RMAppAttemptImpl or RMAppFailedAttemptEvent, so that the RmAppImpl can check the time of the failure. Thoughts? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-611) Add an AM retry count reset window to YARN RM
[ https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122047#comment-14122047 ] Xuan Gong commented on YARN-611: Thanks for the review. Uploaded a new patch to address the latest comment. [~vinodkv] Do you have any other comments ? Add an AM retry count reset window to YARN RM - Key: YARN-611 URL: https://issues.apache.org/jira/browse/YARN-611 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.0.3-alpha Reporter: Chris Riccomini Assignee: Xuan Gong Attachments: YARN-611.1.patch, YARN-611.2.patch, YARN-611.3.patch, YARN-611.4.patch, YARN-611.4.rebase.patch, YARN-611.5.patch, YARN-611.6.patch, YARN-611.7.patch, YARN-611.8.patch YARN currently has the following config: yarn.resourcemanager.am.max-retries This config defaults to 2, and defines how many times to retry a failed AM before failing the whole YARN job. YARN counts an AM as failed if the node that it was running on dies (the NM will timeout, which counts as a failure for the AM), or if the AM dies. This configuration is insufficient for long running (or infinitely running) YARN jobs, since the machine (or NM) that the AM is running on will eventually need to be restarted (or the machine/NM will fail). In such an event, the AM has not done anything wrong, but this is counted as a failure by the RM. Since the retry count for the AM is never reset, eventually, at some point, the number of machine/NM failures will result in the AM failure count going above the configured value for yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the job as failed, and shut it down. This behavior is not ideal. I propose that we add a second configuration: yarn.resourcemanager.am.retry-count-window-ms This configuration would define a window of time that would define when an AM is well behaved, and it's safe to reset its failure count back to zero. Every time an AM fails the RmAppImpl would check the last time that the AM failed. If the last failure was less than retry-count-window-ms ago, and the new failure count is max-retries, then the job should fail. If the AM has never failed, the retry count is max-retries, or if the last failure was OUTSIDE the retry-count-window-ms, then the job should be restarted. Additionally, if the last failure was outside the retry-count-window-ms, then the failure count should be set back to 0. This would give developers a way to have well-behaved AMs run forever, while still failing mis-behaving AMs after a short period of time. I think the work to be done here is to change the RmAppImpl to actually look at app.attempts, and see if there have been more than max-retries failures in the last retry-count-window-ms milliseconds. If there have, then the job should fail, if not, then the job should go forward. Additionally, we might also need to add an endTime in either RMAppAttemptImpl or RMAppFailedAttemptEvent, so that the RmAppImpl can check the time of the failure. Thoughts? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1514) Utility to benchmark ZKRMStateStore#loadState for ResourceManager-HA
[ https://issues.apache.org/jira/browse/YARN-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122073#comment-14122073 ] Tsuyoshi OZAWA commented on YARN-1514: -- Confirmed that the v5 patch can be applied to trunk. Utility to benchmark ZKRMStateStore#loadState for ResourceManager-HA Key: YARN-1514 URL: https://issues.apache.org/jira/browse/YARN-1514 Project: Hadoop YARN Issue Type: Sub-task Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.6.0 Attachments: YARN-1514.1.patch, YARN-1514.2.patch, YARN-1514.3.patch, YARN-1514.4.patch, YARN-1514.4.patch, YARN-1514.5.patch, YARN-1514.wip-2.patch, YARN-1514.wip.patch ZKRMStateStore is very sensitive to ZNode-related operations, as discussed in YARN-1307, YARN-1378 and so on. In particular, ZKRMStateStore#loadState is called when an RM-HA cluster fails over; therefore, its execution time impacts the failover time of RM-HA. We need a utility to benchmark the execution time of ZKRMStateStore#loadState as a development tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1198) Capacity Scheduler headroom calculation does not work as expected
[ https://issues.apache.org/jira/browse/YARN-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122078#comment-14122078 ] Jian He commented on YARN-1198: --- Craig, thanks for working on the issue. Took a look at the patch. Does it make sense to decouple headRoom calculation from user limit calculation? specifically, we may calculate the headRoom when the AM actually calls getHeadRoom. This should make sure that the headRoom is always up-to-date when AM gets the headRoom. Also, we may not need to loop all the users in assignContainers if doing this. Capacity Scheduler headroom calculation does not work as expected - Key: YARN-1198 URL: https://issues.apache.org/jira/browse/YARN-1198 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Craig Welch Attachments: YARN-1198.1.patch, YARN-1198.2.patch, YARN-1198.3.patch, YARN-1198.4.patch, YARN-1198.5.patch, YARN-1198.6.patch, YARN-1198.7.patch, YARN-1198.8.patch Today headroom calculation (for the app) takes place only when * New node is added/removed from the cluster * New container is getting assigned to the application. However there are potentially lot of situations which are not considered for this calculation * If a container finishes then headroom for that application will change and should be notified to the AM accordingly. * If a single user has submitted multiple applications (app1 and app2) to the same queue then ** If app1's container finishes then not only app1's but also app2's AM should be notified about the change in headroom. ** Similarly if a container is assigned to any applications app1/app2 then both AM should be notified about their headroom. ** To simplify the whole communication process it is ideal to keep headroom per User per LeafQueue so that everyone gets the same picture (apps belonging to same user and submitted in same queue). * If a new user submits an application to the queue then all applications submitted by all users in that queue should be notified of the headroom change. * Also today headroom is an absolute number ( I think it should be normalized but then this is going to be not backward compatible..) * Also when admin user refreshes queue headroom has to be updated. These all are the potential bugs in headroom calculations -- This message was sent by Atlassian JIRA (v6.3.4#6332)
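Editor's note: a simplified sketch of the lazy approach suggested above, i.e. computing headroom when the AM calls for it rather than caching it at allocation time. It uses plain numbers instead of the scheduler's Resource/ResourceCalculator types and an assumed min(user-limit remainder, queue remainder) formula; it is purely illustrative, not the actual CapacityScheduler code.
{code}
// Illustrative only: recompute headroom from current queue/user state on demand.
class HeadroomSketch {
    long userLimitMb;     // current user limit for this user in the leaf queue
    long userConsumedMb;  // resources the user currently holds
    long queueMaxMb;      // queue's maximum capacity
    long queueUsedMb;     // resources the whole queue currently holds

    long getHeadroomMb() {
        long byUserLimit = userLimitMb - userConsumedMb;
        long byQueueCap  = queueMaxMb - queueUsedMb;
        return Math.max(0, Math.min(byUserLimit, byQueueCap));
    }
}
{code}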
[jira] [Updated] (YARN-1707) Making the CapacityScheduler more dynamic
[ https://issues.apache.org/jira/browse/YARN-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subramaniam Krishnan updated YARN-1707: --- Attachment: YARN-1707.9.patch Uploading a new patch with a minor change. Renamed ReservationQueue#changeCapacity to ReservationQueue#setEntitlement for consistency. Making the CapacityScheduler more dynamic - Key: YARN-1707 URL: https://issues.apache.org/jira/browse/YARN-1707 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Carlo Curino Assignee: Carlo Curino Labels: capacity-scheduler Attachments: YARN-1707.2.patch, YARN-1707.3.patch, YARN-1707.4.patch, YARN-1707.5.patch, YARN-1707.6.patch, YARN-1707.7.patch, YARN-1707.8.patch, YARN-1707.9.patch, YARN-1707.patch The CapacityScheduler is rather static at the moment, and refreshqueue provides a rather heavy-handed way to reconfigure it. Moving towards long-running services (tracked in YARN-896) and to enable more advanced admission control and resource parcelling, we need to make the CapacityScheduler more dynamic. This is instrumental to the umbrella jira YARN-1051. Concretely this requires the following changes: * create queues dynamically * destroy queues dynamically * dynamically change queue parameters (e.g., capacity) * modify refreshqueue validation to enforce sum(child.getCapacity()) <= 100% instead of == 100% We limit this to LeafQueues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1708) Add a public API to reserve resources (part of YARN-1051)
[ https://issues.apache.org/jira/browse/YARN-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subramaniam Krishnan updated YARN-1708: --- Attachment: YARN-1708.patch Thanks [~vinodkv] for reviewing the patch. I am uploading a new patch that has the following fixes based on your comments: * All the newInstance methods and setters in the Reservation*Response objects should be marked as private. * Replaced hashCode with IDE generated one in ReservationId * Renamed ReservationRequests.{set|get}Type - {set|get}Interpretor, also in ReservationRequestsProto.type. * Renamed ReservationRequest.leaseDuration to be simply duration to make it consistent with ReservationRequestProto.duration Add a public API to reserve resources (part of YARN-1051) - Key: YARN-1708 URL: https://issues.apache.org/jira/browse/YARN-1708 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Carlo Curino Assignee: Subramaniam Krishnan Attachments: YARN-1708.patch, YARN-1708.patch, YARN-1708.patch This JIRA tracks the definition of a new public API for YARN, which allows users to reserve resources (think of time-bounded queues). This is part of the admission control enhancement proposed in YARN-1051. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
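For readers following the API shape, here is a tiny stand-in sketch of the renames described above ({set|get}Type becoming an "interpreter" accessor, and leaseDuration becoming duration). These are hypothetical placeholder classes and enum values used only for illustration; they are not the real org.apache.hadoop.yarn.api.records Reservation* classes or their exact signatures.
{code:java}
public class ReservationApiNamingExample {

  // Placeholder for the request "interpreter" (formerly "type"); the constant
  // names are illustrative, not necessarily the real enum values.
  enum Interpreter { R_ANY, R_ALL, R_ORDER, R_ORDER_NO_GAP }

  static class Request {
    private long duration;                     // formerly "leaseDuration"
    long getDuration() { return duration; }
    void setDuration(long d) { this.duration = d; }
  }

  static class Requests {
    private Interpreter interpreter;           // formerly "type"
    Interpreter getInterpreter() { return interpreter; }
    void setInterpreter(Interpreter i) { this.interpreter = i; }
  }

  public static void main(String[] args) {
    Request r = new Request();
    r.setDuration(3_600_000L);                 // e.g. reserve resources for one hour
    Requests rs = new Requests();
    rs.setInterpreter(Interpreter.R_ALL);
    System.out.println(r.getDuration() + " " + rs.getInterpreter());
  }
}
{code}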
[jira] [Commented] (YARN-1707) Making the CapacityScheduler more dynamic
[ https://issues.apache.org/jira/browse/YARN-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122127#comment-14122127 ] Hadoop QA commented on YARN-1707: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666598/YARN-1707.9.patch against trunk revision 51a4faf. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4829//console This message is automatically generated. Making the CapacityScheduler more dynamic - Key: YARN-1707 URL: https://issues.apache.org/jira/browse/YARN-1707 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Carlo Curino Assignee: Carlo Curino Labels: capacity-scheduler Attachments: YARN-1707.2.patch, YARN-1707.3.patch, YARN-1707.4.patch, YARN-1707.5.patch, YARN-1707.6.patch, YARN-1707.7.patch, YARN-1707.8.patch, YARN-1707.9.patch, YARN-1707.patch The CapacityScheduler is rather static at the moment, and refreshqueue provides a rather heavy-handed way to reconfigure it. Moving towards long-running services (tracked in YARN-896) and to enable more advanced admission control and resource parcelling, we need to make the CapacityScheduler more dynamic. This is instrumental to the umbrella JIRA YARN-1051. Concretely this requires the following changes: * create queues dynamically * destroy queues dynamically * dynamically change queue parameters (e.g., capacity) * modify the refreshqueue validation to enforce sum(child.getCapacity()) <= 100% instead of == 100% We limit this to LeafQueues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1708) Add a public API to reserve resources (part of YARN-1051)
[ https://issues.apache.org/jira/browse/YARN-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122128#comment-14122128 ] Hadoop QA commented on YARN-1708: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/1201/YARN-1708.patch against trunk revision 51a4faf. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4830//console This message is automatically generated. Add a public API to reserve resources (part of YARN-1051) - Key: YARN-1708 URL: https://issues.apache.org/jira/browse/YARN-1708 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Carlo Curino Assignee: Subramaniam Krishnan Attachments: YARN-1708.patch, YARN-1708.patch, YARN-1708.patch This JIRA tracks the definition of a new public API for YARN, which allows users to reserve resources (think of time-bounded queues). This is part of the admission control enhancement proposed in YARN-1051. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-611) Add an AM retry count reset window to YARN RM
[ https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122140#comment-14122140 ] Hadoop QA commented on YARN-611: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666595/YARN-611.8.patch against trunk revision 3fa5f72. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4828//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4828//console This message is automatically generated. Add an AM retry count reset window to YARN RM - Key: YARN-611 URL: https://issues.apache.org/jira/browse/YARN-611 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.0.3-alpha Reporter: Chris Riccomini Assignee: Xuan Gong Attachments: YARN-611.1.patch, YARN-611.2.patch, YARN-611.3.patch, YARN-611.4.patch, YARN-611.4.rebase.patch, YARN-611.5.patch, YARN-611.6.patch, YARN-611.7.patch, YARN-611.8.patch YARN currently has the following config: yarn.resourcemanager.am.max-retries This config defaults to 2, and defines how many times to retry a failed AM before failing the whole YARN job. YARN counts an AM as failed if the node that it was running on dies (the NM will timeout, which counts as a failure for the AM), or if the AM dies. This configuration is insufficient for long running (or infinitely running) YARN jobs, since the machine (or NM) that the AM is running on will eventually need to be restarted (or the machine/NM will fail). In such an event, the AM has not done anything wrong, but this is counted as a failure by the RM. Since the retry count for the AM is never reset, eventually, at some point, the number of machine/NM failures will result in the AM failure count going above the configured value for yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the job as failed, and shut it down. This behavior is not ideal. I propose that we add a second configuration: yarn.resourcemanager.am.retry-count-window-ms This configuration would define a window of time that would define when an AM is well behaved, and it's safe to reset its failure count back to zero. Every time an AM fails the RmAppImpl would check the last time that the AM failed. If the last failure was less than retry-count-window-ms ago, and the new failure count is max-retries, then the job should fail. 
If the AM has never failed, the retry count is below max-retries, or the last failure was OUTSIDE the retry-count-window-ms, then the job should be restarted. Additionally, if the last failure was outside the retry-count-window-ms, the failure count should be reset to 0. This would give developers a way to have well-behaved AMs run forever, while still failing misbehaving AMs after a short period of time. I think the work to be done here is to change RmAppImpl to actually look at app.attempts and see whether there have been more than max-retries failures in the last retry-count-window-ms milliseconds. If there have, the job should fail; if not, the job should go forward. Additionally, we might also need to add an endTime to either RMAppAttemptImpl or RMAppFailedAttemptEvent, so that RmAppImpl can check the time of the failure. Thoughts? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
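Here is a small, self-contained sketch of the windowed failure-count idea described above: failures older than the window are forgotten, so only recent failures count toward max-retries. This is a sliding-window simplification with hypothetical names, intended only to illustrate the proposal; it is not the RmAppImpl change itself.
{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

public class AmRetryWindowExample {
  private final int maxRetries;
  private final long windowMs;
  private final Deque<Long> failureTimes = new ArrayDeque<>();

  AmRetryWindowExample(int maxRetries, long windowMs) {
    this.maxRetries = maxRetries;
    this.windowMs = windowMs;
  }

  // Record an AM failure and decide whether the whole app should be failed:
  // only failures inside the trailing window count toward max-retries.
  boolean shouldFailApp(long failureTimeMs) {
    failureTimes.addLast(failureTimeMs);
    while (!failureTimes.isEmpty()
        && failureTimeMs - failureTimes.peekFirst() > windowMs) {
      failureTimes.removeFirst();   // failures outside the window are forgotten
    }
    return failureTimes.size() > maxRetries;
  }

  public static void main(String[] args) {
    AmRetryWindowExample app = new AmRetryWindowExample(2, 10_000L);
    System.out.println(app.shouldFailApp(0L));      // false: first failure
    System.out.println(app.shouldFailApp(1_000L));  // false: two failures, at the limit
    System.out.println(app.shouldFailApp(2_000L));  // true: third failure inside the window
    System.out.println(app.shouldFailApp(60_000L)); // false: old failures have aged out
  }
}
{code}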
[jira] [Commented] (YARN-1707) Making the CapacityScheduler more dynamic
[ https://issues.apache.org/jira/browse/YARN-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122185#comment-14122185 ] Jian He commented on YARN-1707: --- +1 for the latest patch, thanks [~subru] and [~curino]! Making the CapacityScheduler more dynamic - Key: YARN-1707 URL: https://issues.apache.org/jira/browse/YARN-1707 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Carlo Curino Assignee: Carlo Curino Labels: capacity-scheduler Attachments: YARN-1707.2.patch, YARN-1707.3.patch, YARN-1707.4.patch, YARN-1707.5.patch, YARN-1707.6.patch, YARN-1707.7.patch, YARN-1707.8.patch, YARN-1707.9.patch, YARN-1707.patch The CapacityScheduler is rather static at the moment, and refreshqueue provides a rather heavy-handed way to reconfigure it. Moving towards long-running services (tracked in YARN-896) and to enable more advanced admission control and resource parcelling, we need to make the CapacityScheduler more dynamic. This is instrumental to the umbrella JIRA YARN-1051. Concretely this requires the following changes: * create queues dynamically * destroy queues dynamically * dynamically change queue parameters (e.g., capacity) * modify the refreshqueue validation to enforce sum(child.getCapacity()) <= 100% instead of == 100% We limit this to LeafQueues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1198) Capacity Scheduler headroom calculation does not work as expected
[ https://issues.apache.org/jira/browse/YARN-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122188#comment-14122188 ] Craig Welch commented on YARN-1198: --- [~jianhe], have a look at patch 7; it takes that sort of approach. Capacity Scheduler headroom calculation does not work as expected - Key: YARN-1198 URL: https://issues.apache.org/jira/browse/YARN-1198 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Craig Welch Attachments: YARN-1198.1.patch, YARN-1198.2.patch, YARN-1198.3.patch, YARN-1198.4.patch, YARN-1198.5.patch, YARN-1198.6.patch, YARN-1198.7.patch, YARN-1198.8.patch Today the headroom calculation (for an app) takes place only when: * a new node is added to or removed from the cluster * a new container is assigned to the application. However, there are potentially many situations that this calculation does not consider: * If a container finishes, the headroom for that application changes and the AM should be notified accordingly. * If a single user has submitted multiple applications (app1 and app2) to the same queue, then: ** If one of app1's containers finishes, not only app1's but also app2's AM should be notified about the change in headroom. ** Similarly, if a container is assigned to either application, both AMs should be notified about their new headroom. ** To simplify the whole communication process, it is ideal to keep headroom per user per LeafQueue so that everyone gets the same picture (apps belonging to the same user and submitted to the same queue). * If a new user submits an application to the queue, all applications submitted by all users in that queue should be notified of the headroom change. * Also, today headroom is an absolute number (I think it should be normalized, but that would not be backward compatible). * Also, when an admin refreshes the queues, the headroom has to be updated. These are all potential bugs in the headroom calculation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1707) Making the CapacityScheduler more dynamic
[ https://issues.apache.org/jira/browse/YARN-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122234#comment-14122234 ] Subramaniam Krishnan commented on YARN-1707: Thanks [~jianhe] and [~leftnoteasy] for taking the time to do a thorough review. I am also proxying for [~curino], as he did most of the work on the patch. As discussed, we will commit this to the YARN-1051 branch once we have +1s on a few other sub-JIRAs. Making the CapacityScheduler more dynamic - Key: YARN-1707 URL: https://issues.apache.org/jira/browse/YARN-1707 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Carlo Curino Assignee: Carlo Curino Labels: capacity-scheduler Attachments: YARN-1707.2.patch, YARN-1707.3.patch, YARN-1707.4.patch, YARN-1707.5.patch, YARN-1707.6.patch, YARN-1707.7.patch, YARN-1707.8.patch, YARN-1707.9.patch, YARN-1707.patch The CapacityScheduler is rather static at the moment, and refreshqueue provides a rather heavy-handed way to reconfigure it. Moving towards long-running services (tracked in YARN-896) and to enable more advanced admission control and resource parcelling, we need to make the CapacityScheduler more dynamic. This is instrumental to the umbrella JIRA YARN-1051. Concretely this requires the following changes: * create queues dynamically * destroy queues dynamically * dynamically change queue parameters (e.g., capacity) * modify the refreshqueue validation to enforce sum(child.getCapacity()) <= 100% instead of == 100% We limit this to LeafQueues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1492) truly shared cache for jars (jobjar/libjar)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Trezzo updated YARN-1492: --- Attachment: YARN-1492-all-trunk-v5.patch Attached v5 to address final license and findbug issues. truly shared cache for jars (jobjar/libjar) --- Key: YARN-1492 URL: https://issues.apache.org/jira/browse/YARN-1492 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.0.4-alpha Reporter: Sangjin Lee Assignee: Chris Trezzo Attachments: YARN-1492-all-trunk-v1.patch, YARN-1492-all-trunk-v2.patch, YARN-1492-all-trunk-v3.patch, YARN-1492-all-trunk-v4.patch, YARN-1492-all-trunk-v5.patch, shared_cache_design.pdf, shared_cache_design_v2.pdf, shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, shared_cache_design_v5.pdf Currently there is the distributed cache that enables you to cache jars and files so that attempts from the same job can reuse them. However, sharing is limited with the distributed cache because it is normally on a per-job basis. On a large cluster, sometimes copying of jobjars and libjars becomes so prevalent that it consumes a large portion of the network bandwidth, not to speak of defeating the purpose of bringing compute to where data is. This is wasteful because in most cases code doesn't change much across many jobs. I'd like to propose and discuss feasibility of introducing a truly shared cache so that multiple jobs from multiple users can share and cache jars. This JIRA is to open the discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2515) Update ConverterUtils#toContainerId to parse epoch
Tsuyoshi OZAWA created YARN-2515: Summary: Update ConverterUtils#toContainerId to parse epoch Key: YARN-2515 URL: https://issues.apache.org/jira/browse/YARN-2515 Project: Hadoop YARN Issue Type: Sub-task Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA ContainerId#toString was updated in YARN-2182. We should also update ConverterUtils#toContainerId to parse the epoch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2515) Update ConverterUtils#toContainerId to parse epoch
[ https://issues.apache.org/jira/browse/YARN-2515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-2515: - Attachment: YARN-2515.1.patch Updated to parse the epoch if it exists. Update ConverterUtils#toContainerId to parse epoch -- Key: YARN-2515 URL: https://issues.apache.org/jira/browse/YARN-2515 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: YARN-2515.1.patch ContainerId#toString was updated in YARN-2182. We should also update ConverterUtils#toContainerId to parse the epoch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
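For context, here is a minimal, self-contained parsing sketch. It assumes the post-YARN-2182 string format is container_[e<epoch>_]<clusterTimestamp>_<appId>_<attemptId>_<containerId>, i.e. the epoch field appears (prefixed with "e") only when it is non-zero; that format and the example IDs are assumptions for illustration, and this is not the actual ConverterUtils implementation.
{code:java}
public class ContainerIdParseExample {

  // Extracts the epoch from a ContainerId string, treating the "e<digits>"
  // field as optional and defaulting to 0 when it is absent.
  static long parseEpoch(String containerIdStr) {
    String[] parts = containerIdStr.split("_");
    if (parts.length < 2 || !"container".equals(parts[0])) {
      throw new IllegalArgumentException("Invalid ContainerId: " + containerIdStr);
    }
    if (parts[1].startsWith("e")) {
      return Long.parseLong(parts[1].substring(1));
    }
    return 0L;
  }

  public static void main(String[] args) {
    System.out.println(parseEpoch("container_1410901177871_0001_01_000005"));     // 0
    System.out.println(parseEpoch("container_e17_1410901177871_0001_01_000005")); // 17
  }
}
{code}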
[jira] [Commented] (YARN-2515) Update ConverterUtils#toContainerId to parse epoch
[ https://issues.apache.org/jira/browse/YARN-2515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122365#comment-14122365 ] Hadoop QA commented on YARN-2515: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/1251/YARN-2515.1.patch against trunk revision 6104520. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4833//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4833//console This message is automatically generated. Update ConverterUtils#toContainerId to parse epoch -- Key: YARN-2515 URL: https://issues.apache.org/jira/browse/YARN-2515 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: YARN-2515.1.patch ContainerId#toString was updated in YARN-2182. We should also update ConverterUtils#toContainerId to parse the epoch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122373#comment-14122373 ] Hadoop QA commented on YARN-1492: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/1231/YARN-1492-all-trunk-v5.patch against trunk revision f7df24b. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The test build failed in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4832//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4832//console This message is automatically generated. truly shared cache for jars (jobjar/libjar) --- Key: YARN-1492 URL: https://issues.apache.org/jira/browse/YARN-1492 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.0.4-alpha Reporter: Sangjin Lee Assignee: Chris Trezzo Attachments: YARN-1492-all-trunk-v1.patch, YARN-1492-all-trunk-v2.patch, YARN-1492-all-trunk-v3.patch, YARN-1492-all-trunk-v4.patch, YARN-1492-all-trunk-v5.patch, shared_cache_design.pdf, shared_cache_design_v2.pdf, shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, shared_cache_design_v5.pdf Currently there is the distributed cache that enables you to cache jars and files so that attempts from the same job can reuse them. However, sharing is limited with the distributed cache because it is normally on a per-job basis. On a large cluster, sometimes copying of jobjars and libjars becomes so prevalent that it consumes a large portion of the network bandwidth, not to speak of defeating the purpose of bringing compute to where data is. This is wasteful because in most cases code doesn't change much across many jobs. I'd like to propose and discuss feasibility of introducing a truly shared cache so that multiple jobs from multiple users can share and cache jars. This JIRA is to open the discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122375#comment-14122375 ] Hadoop QA commented on YARN-1492: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/1231/YARN-1492-all-trunk-v5.patch against trunk revision f7df24b. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4831//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4831//console This message is automatically generated. truly shared cache for jars (jobjar/libjar) --- Key: YARN-1492 URL: https://issues.apache.org/jira/browse/YARN-1492 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.0.4-alpha Reporter: Sangjin Lee Assignee: Chris Trezzo Attachments: YARN-1492-all-trunk-v1.patch, YARN-1492-all-trunk-v2.patch, YARN-1492-all-trunk-v3.patch, YARN-1492-all-trunk-v4.patch, YARN-1492-all-trunk-v5.patch, shared_cache_design.pdf, shared_cache_design_v2.pdf, shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, shared_cache_design_v5.pdf Currently there is the distributed cache that enables you to cache jars and files so that attempts from the same job can reuse them. However, sharing is limited with the distributed cache because it is normally on a per-job basis. On a large cluster, sometimes copying of jobjars and libjars becomes so prevalent that it consumes a large portion of the network bandwidth, not to speak of defeating the purpose of bringing compute to where data is. This is wasteful because in most cases code doesn't change much across many jobs. I'd like to propose and discuss feasibility of introducing a truly shared cache so that multiple jobs from multiple users can share and cache jars. This JIRA is to open the discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2509) Enable Cross Origin Filter for timeline server only and not all Yarn servers
[ https://issues.apache.org/jira/browse/YARN-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122379#comment-14122379 ] Jonathan Eagles commented on YARN-2509: --- [~zjshen], I think this will come down to documenting this feature properly as part of YARN-2507. The current behavior is that CORS support will be added if the user does that. Let me know if you think it can be improved. Enable Cross Origin Filter for timeline server only and not all Yarn servers Key: YARN-2509 URL: https://issues.apache.org/jira/browse/YARN-2509 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Mit Desai Fix For: 2.6.0 Attachments: YARN-2509.patch, YARN-2509.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2460) Remove obsolete entries from yarn-default.xml
[ https://issues.apache.org/jira/browse/YARN-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-2460: - Attachment: YARN-2460-01.patch Contains the following modifications: 1) Removing the following properties: yarn.ipc.serializer.type yarn.ipc.exception.factory.class yarn.resourcemanager.amliveliness-monitor.interval-ms yarn.resourcemanager.nm.liveness-monitor.interval-ms yarn.nodemanager.resourcemanager.connect.wait.secs yarn.nodemanager.resourcemanager.connect.retry_interval.secs 2) Renamed yarn.resourcemanager.application-tokens.master-key-rolling-interval-secs to yarn.resourcemanager.am-rm-tokens.master-key-rolling-interval-secs Remove obsolete entries from yarn-default.xml - Key: YARN-2460 URL: https://issues.apache.org/jira/browse/YARN-2460 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.5.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Minor Labels: newbie Attachments: YARN-2460-01.patch The following properties are defined in yarn-default.xml, but do not exist in YarnConfiguration. mapreduce.job.hdfs-servers mapreduce.job.jar yarn.ipc.exception.factory.class yarn.ipc.serializer.type yarn.nodemanager.aux-services.mapreduce_shuffle.class yarn.nodemanager.hostname yarn.nodemanager.resourcemanager.connect.retry_interval.secs yarn.nodemanager.resourcemanager.connect.wait.secs yarn.resourcemanager.amliveliness-monitor.interval-ms yarn.resourcemanager.application-tokens.master-key-rolling-interval-secs yarn.resourcemanager.container.liveness-monitor.interval-ms yarn.resourcemanager.nm.liveness-monitor.interval-ms yarn.timeline-service.hostname yarn.timeline-service.http-authentication.simple.anonymous.allowed yarn.timeline-service.http-authentication.type Presumably, the mapreduce.* properties are okay. Similarly, the yarn.timeline-service.* properties are for the future TimelineService. However, the rest are likely fully deprecated. Submitting bug for comment/feedback about which other properties should be kept in yarn-default.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2460) Remove obsolete entries from yarn-default.xml
[ https://issues.apache.org/jira/browse/YARN-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14122458#comment-14122458 ] Hadoop QA commented on YARN-2460: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/1265/YARN-2460-01.patch against trunk revision 6104520. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4834//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4834//console This message is automatically generated. Remove obsolete entries from yarn-default.xml - Key: YARN-2460 URL: https://issues.apache.org/jira/browse/YARN-2460 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.5.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Minor Labels: newbie Attachments: YARN-2460-01.patch The following properties are defined in yarn-default.xml, but do not exist in YarnConfiguration. mapreduce.job.hdfs-servers mapreduce.job.jar yarn.ipc.exception.factory.class yarn.ipc.serializer.type yarn.nodemanager.aux-services.mapreduce_shuffle.class yarn.nodemanager.hostname yarn.nodemanager.resourcemanager.connect.retry_interval.secs yarn.nodemanager.resourcemanager.connect.wait.secs yarn.resourcemanager.amliveliness-monitor.interval-ms yarn.resourcemanager.application-tokens.master-key-rolling-interval-secs yarn.resourcemanager.container.liveness-monitor.interval-ms yarn.resourcemanager.nm.liveness-monitor.interval-ms yarn.timeline-service.hostname yarn.timeline-service.http-authentication.simple.anonymous.allowed yarn.timeline-service.http-authentication.type Presumably, the mapreduce.* properties are okay. Similarly, the yarn.timeline-service.* properties are for the future TimelineService. However, the rest are likely fully deprecated. Submitting bug for comment/feedback about which other properties should be kept in yarn-default.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)