[jira] [Commented] (YARN-4390) Do surgical preemption based on reserved container in CapacityScheduler

2016-12-01 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712021#comment-15712021
 ] 

Eric Payne commented on YARN-4390:
--

Hi [~leftnoteasy],

I am making an effort to backport to branch-2.8 all of the prerequisite JIRAs 
needed for YARN-4945. [In a comment from 
YARN-2009|https://issues.apache.org/jira/browse/YARN-2009?focusedCommentId=15633729&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15633729], 
we discussed this JIRA (YARN-4390) as being one of those prereqs.

The backport is not a clean one for this JIRA. If you have time, would you be 
able to provide a branch-2.8 patch?

Thanks.

> Do surgical preemption based on reserved container in CapacityScheduler
> ---
>
> Key: YARN-4390
> URL: https://issues.apache.org/jira/browse/YARN-4390
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha1
>Reporter: Eric Payne
>Assignee: Wangda Tan
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: QueueNotHittingMax.jpg, YARN-4390-design.1.pdf, 
> YARN-4390-test-results.pdf, YARN-4390.1.patch, YARN-4390.2.patch, 
> YARN-4390.3.branch-2.patch, YARN-4390.3.patch, YARN-4390.4.patch, 
> YARN-4390.5.patch, YARN-4390.6.patch, YARN-4390.7.patch, YARN-4390.8.patch
>
>
> There are multiple reasons why preemption could unnecessarily preempt 
> containers. One is that an app could be requesting a large container (say 
> 8-GB), and the preemption monitor could conceivably preempt multiple 
> containers (say 8, 1-GB containers) in order to fill the large container 
> request. These smaller containers would then be rejected by the requesting AM 
> and potentially given right back to the preempted app.
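
To illustrate the "surgical" alternative, here is a minimal sketch of 
node-local victim selection, assuming a simplified memory-only resource model; 
the class and method names are illustrative, not the actual CapacityScheduler 
code:
{code}
import java.util.ArrayList;
import java.util.List;

class SurgicalPreemptionSketch {
  static class Container {
    final long memMb;
    Container(long memMb) { this.memMb = memMb; }
  }

  // Pick victims only on the node holding the reservation, and only until
  // the reserved request fits; this avoids preempting many small containers
  // cluster-wide that the requesting AM would reject anyway.
  static List<Container> selectVictims(long reservedMb, long nodeFreeMb,
      List<Container> runningOnNode) {
    List<Container> victims = new ArrayList<>();
    long stillNeeded = reservedMb - nodeFreeMb;
    for (Container c : runningOnNode) {
      if (stillNeeded <= 0) {
        break;  // the reservation can now be satisfied on this node
      }
      victims.add(c);
      stillNeeded -= c.memMb;
    }
    return victims;
  }
}
{code}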



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5889) Improve user-limit calculation in capacity scheduler

2016-12-01 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712097#comment-15712097
 ] 

Eric Payne commented on YARN-5889:
--

bq. Any change in resource (allocation and release of container etc) for a 
given user could set a state variable. This will set off by the computation 
thread if next cycle falls immediate.
Just so we're on the same page, let me make what I'm suggesting clearer:

I'm suggesting that we remove the computation thread 
({{ComputeUserLimitAsyncThread}}), and when {{LeafQueue#assignContainers}} or 
{{LeafQueue#releaseResource}} is called, we can set the flag in 
{{UserToPartitionRecord}}. This flag should also be set when queues are 
refreshed. {{getComputedUserLimit}} would then be called as it is now, and 
would compute the user-limit resource if the flag is set.
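
A minimal sketch of this dirty-flag idea, with illustrative names only (the 
real classes in the patch differ):
{code}
import java.util.concurrent.atomic.AtomicBoolean;

class UserLimitSketch {
  private final AtomicBoolean dirty = new AtomicBoolean(true);
  private volatile long cachedUserLimit;

  // Called from the allocation/release paths, e.g. assignContainers,
  // releaseResource, or a queue refresh.
  void markResourceChanged() {
    dirty.set(true);
  }

  // Recomputes only when something changed since the last call; otherwise
  // the cached value is returned, so repeated heartbeats stay cheap.
  long getComputedUserLimit() {
    if (dirty.compareAndSet(true, false)) {
      cachedUserLimit = recompute();  // runs in the allocation thread
    }
    return cachedUserLimit;
  }

  private long recompute() {
    return 0;  // stands in for the expensive user-limit calculation
  }
}
{code}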

bq. Its not ideal to ask allocation thread to hold till computation. So by 
seeing this state variable, we might need to compute user-limit in same 
allocation thread.
I recognize that my suggested approach does the calculation in the allocation 
thread. My assertions would be that:
# this calculation is currently being done in the allocation thread anyway
# my suggested approach is an optimization that would decrease (sometimes 
greatly) the number of times the calculation happens
# my suggested approach would be more accurate than a separate computation 
thread.

> Improve user-limit calculation in capacity scheduler
> 
>
> Key: YARN-5889
> URL: https://issues.apache.org/jira/browse/YARN-5889
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5889.v0.patch, YARN-5889.v1.patch, 
> YARN-5889.v2.patch
>
>
> Currently the user-limit is computed during every heartbeat allocation cycle 
> while holding a write lock. To improve performance, this ticket focuses on 
> moving the user-limit calculation out of the heartbeat allocation flow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5956) Refactor ClientRMService

2016-12-01 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated YARN-5956:
-
Attachment: YARN-5956.01.patch

> Refactor ClientRMService
> 
>
> Key: YARN-5956
> URL: https://issues.apache.org/jira/browse/YARN-5956
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0-alpha2
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>Priority: Minor
> Attachments: YARN-5956.01.patch
>
>
> Some refactoring can be done in {{ClientRMService}}.
> - Remove redundant variable declarations
> - Fill in missing javadocs
> - Use proper variable access modifiers
> - Fix some typos in method names and exception messages



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5889) Improve user-limit calculation in capacity scheduler

2016-12-01 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712097#comment-15712097
 ] 

Eric Payne edited comment on YARN-5889 at 12/1/16 2:29 PM:
---

bq. Any change in resource (allocation and release of container etc) for a 
given user could set a state variable. This will set off by the computation 
thread if next cycle falls immediate.
Just so we're on the same page, let me make what I'm suggesting clearer:

I'm suggesting that we remove the computation thread 
({{ComputeUserLimitAsyncThread}}), and when {{LeafQueue#assignContainers}} or 
{{LeafQueue#releaseResource}} is called, we can set the flag in 
{{UserToPartitionRecord}}. This flag should also be set when queues are 
refreshed. {{getComputedUserLimit}} would then be called as it is in your 
patch, and would compute the user-limit resource if the flag is set.

bq. Its not ideal to ask allocation thread to hold till computation. So by 
seeing this state variable, we might need to compute user-limit in same 
allocation thread.
I recognize that my suggested approach does the calculation in the allocation 
thread. My assertions would be that:
# this calculation is currently being done in the allocation thread anyway
# my suggested approach is an optimization that would decrease (sometimes 
greatly) the number of times the calculation happens
# my suggested approach would be more accurate than a separate computation 
thread.


was (Author: eepayne):
bq. Any change in resource (allocation and release of container etc) for a 
given user could set a state variable. This will set off by the computation 
thread if next cycle falls immediate.
Just so we're on the same page and what I'm suggesting is more clear.

I'm suggesting that we remove the computation thread 
({{ComputeUserLimitAsyncThread}}), and when {{LeafQueue#assignContainers}} or 
{{LeafQueue#releaseResource}} is called, we can set the flag in 
{{UserToPartitionRecord}}. This flag should also be set within queues are 
refreshed. {{getComputedUserLimit}} would then be called as it is now, and 
compute user limit resource if the flag is set.

bq. Its not ideal to ask allocation thread to hold till computation. So by 
seeing this state variable, we might need to compute user-limit in same 
allocation thread.
I recognize that my suggested approach does the calculation during the 
allocation thread. My assertion would be that
# this calculation is being done currently in the allocation thread
# my suggested approach is an optimization that would decrease (sometimes 
greatly) the number of times the calculation happens
# my suggested approach would be more accurate than with a separate computation 
thread.

> Improve user-limit calculation in capacity scheduler
> 
>
> Key: YARN-5889
> URL: https://issues.apache.org/jira/browse/YARN-5889
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5889.v0.patch, YARN-5889.v1.patch, 
> YARN-5889.v2.patch
>
>
> Currently the user-limit is computed during every heartbeat allocation cycle 
> while holding a write lock. To improve performance, this ticket focuses on 
> moving the user-limit calculation out of the heartbeat allocation flow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5956) Refactor ClientRMService

2016-12-01 Thread Kai Sasaki (JIRA)
Kai Sasaki created YARN-5956:


 Summary: Refactor ClientRMService
 Key: YARN-5956
 URL: https://issues.apache.org/jira/browse/YARN-5956
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 3.0.0-alpha2
Reporter: Kai Sasaki
Assignee: Kai Sasaki
Priority: Minor


Some refactoring can be done in {{ClientRMService}}.
- Remove redundant variable declarations
- Fill in missing javadocs
- Use proper variable access modifiers
- Fix some typos in method names and exception messages



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5955) Use threadpool or multiple thread to recover app

2016-12-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15711953#comment-15711953
 ] 

Varun Saxena edited comment on YARN-5955 at 12/1/16 1:23 PM:
-

We have actually made quite a few changes internally to make recovery faster, 
including using multiple threads for recovery in addition to other changes.

We plan to contribute these to the community along with Ajith, and will upload 
a design doc and an initial patch in another JIRA soon. This one can probably 
be added as a sub-task there.


was (Author: varun_saxena):
We have actually made quite a few changes to make recovery faster internally.
And this includes using multiple threads for recovery in addition to other 
changes.

Plan to contribute along with Ajith to community. Will upload a design doc and 
initial patch soon.

> Use threadpool or multiple thread to recover app
> 
>
> Key: YARN-5955
> URL: https://issues.apache.org/jira/browse/YARN-5955
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: 5feixiang
>Assignee: Ajith S
> Fix For: 2.7.1
>
>
> Current app recovery happens one application at a time; using a thread pool 
> can make recovery faster.
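
As a rough illustration of the thread-pool idea (not the actual RM recovery 
code, and assuming per-app recovery tasks are independent):
{code}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelRecoverySketch {
  static void recoverAll(List<Runnable> perAppRecoveryTasks)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(
        Runtime.getRuntime().availableProcessors());
    for (Runnable task : perAppRecoveryTasks) {
      pool.submit(task);  // recover apps concurrently instead of one by one
    }
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.MINUTES);  // wait for all recoveries
  }
}
{code}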



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5956) Refactor ClientRMService

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712306#comment-15712306
 ] 

Hadoop QA commented on YARN-5956:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 21s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 2 new + 31 unchanged - 2 fixed = 33 total (was 33) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 20s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
|   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5956 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841295/YARN-5956.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux fb30686453d3 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / e0fa492 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14144/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/14144/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 

[jira] [Commented] (YARN-5734) OrgQueue for easy CapacityScheduler queue configuration management

2016-12-01 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712113#comment-15712113
 ] 

Kai Sasaki commented on YARN-5734:
--

I'm also interested in flexible queue configuration management, because 
xml-based configuration is often troublesome for us.

{quote}
We discussed an advanced feature of supporting multi-update transactions.
{quote}

I have sometimes faced an inconsistent queue state while updating the queue 
configuration with xml, because the update cannot be done transactionally. In 
that case the scheduler is left with an incomplete queue state.

{quote}
Target branch-2
{quote}

Obsoleting the xml file would be an incompatible change, so might it be better 
to target 3.x or later? Or does this mean only adding a new backend storage 
implementation?

Anyway, I'd like to work on this after the sub-tasks are arranged. Thanks!

> OrgQueue for easy CapacityScheduler queue configuration management
> --
>
> Key: YARN-5734
> URL: https://issues.apache.org/jira/browse/YARN-5734
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Min Shen
>Assignee: Min Shen
> Attachments: OrgQueue_API-Based_Config_Management_v1.pdf, 
> OrgQueue_Design_v0.pdf
>
>
> The current xml-based configuration mechanism in CapacityScheduler makes it 
> very inconvenient to apply any changes to the queue configurations. We saw 2 
> main drawbacks in the file-based configuration mechanism:
> # It makes it very inconvenient to automate queue configuration updates. For 
> example, in our cluster setup, we leverage the queue mapping feature from 
> YARN-2411 to route users to their dedicated organization queues. It could be 
> extremely cumbersome to keep updating the config file to manage the very 
> dynamic mapping between users and organizations.
> # Even if a user has admin permission on one specific queue, that user is 
> unable to make queue configuration changes to resize the subqueues, change 
> queue ACLs, or create new queues. All these operations need to be performed 
> in a centralized manner by the cluster administrators.
> With these limitations, we realized the need for a more flexible 
> configuration mechanism that allows queue configurations to be stored and 
> managed more dynamically. We developed this feature internally at LinkedIn; 
> it introduces the concept of MutableConfigurationProvider. What it 
> essentially does is provide a set of configuration mutation APIs that allow 
> queue configurations to be updated externally through REST. When performing 
> queue configuration changes, the queue ACLs are honored, which means only 
> queue administrators can make configuration changes to a given queue. 
> MutableConfigurationProvider is implemented as a pluggable interface, and we 
> have one implementation of this interface which is based on the Derby 
> embedded database.
> This feature has been deployed on LinkedIn's Hadoop cluster for a year now, 
> and has gone through several iterations of gathering feedback from users and 
> improving accordingly. With this feature, cluster administrators are able to 
> automate many of the queue configuration management tasks, such as setting 
> the queue capacities to adjust cluster resources between queues based on 
> established resource consumption patterns, or managing updates to the 
> user-to-queue mappings. We have attached our design documentation to this 
> ticket and would like to receive feedback from the community regarding how 
> to best integrate it with the latest version of YARN.
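
Based only on the description above, the pluggable provider might look roughly 
like the following; the method names are assumptions, not the interface from 
the attached design doc:
{code}
import java.util.Map;

interface MutableConfigurationProviderSketch {
  // Applies a set of key/value updates for one queue, enforcing queue ACLs
  // so that only that queue's administrators may change it.
  void mutateQueueConfig(String user, String queuePath,
      Map<String, String> updates) throws SecurityException;

  // Returns the current configuration, wherever it is stored (e.g. Derby).
  Map<String, String> getCurrentConfig();
}
{code}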



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5955) Use threadpool or multiple thread to recover app

2016-12-01 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712530#comment-15712530
 ] 

Daniel Templeton commented on YARN-5955:


There's also YARN-4494 that backgrounds the whole recovery process.

> Use threadpool or multiple thread to recover app
> 
>
> Key: YARN-5955
> URL: https://issues.apache.org/jira/browse/YARN-5955
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: 5feixiang
>Assignee: Ajith S
> Fix For: 2.7.1
>
>
> Current app recovery happens one application at a time; using a thread pool 
> can make recovery faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5957) NMWebAppFilter currently drops end of URL during container log redirection

2016-12-01 Thread Eric Badger (JIRA)
Eric Badger created YARN-5957:
-

 Summary: NMWebAppFilter currently drops end of URL during 
container log redirection
 Key: YARN-5957
 URL: https://issues.apache.org/jira/browse/YARN-5957
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Eric Badger
Assignee: Eric Badger


When the NM redirects URLs given to it via NMWebAppFilter container log 
redirection, it drops everything after the app owner. To make this more 
robust, we should append the remainder of the original URL to the redirection 
URL. This will allow the UIs to send back a more fine-grained link to the logs 
(e.g. directly to syslog/stderr) instead of going to the log aggregation page. 
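
A minimal sketch of the proposed fix, assuming the redirect prefix ends at the 
app owner; names are illustrative, not the actual NMWebAppFilter code:
{code}
class RedirectSketch {
  // Keep everything after the app owner (e.g. "/syslog?start=0") so the
  // redirect lands on the specific log file rather than the top-level page.
  // Assumes appOwner occurs in requestUri.
  static String buildRedirect(String logServerPrefix, String requestUri,
      String appOwner) {
    int ownerEnd = requestUri.indexOf(appOwner) + appOwner.length();
    String remainder = requestUri.substring(ownerEnd);
    return logServerPrefix + remainder;
  }
}
{code}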



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5922) Remove direct references of HBaseTimelineWriter/Reader in core ATS classes

2016-12-01 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712547#comment-15712547
 ] 

Haibo Chen commented on YARN-5922:
--

Thanks [~sjlee0] for your review! Does it make sense to add the new properties 
to yarn-default.xml given that they are the default values of two other 
properties (yarn.timeline-service.writer.class and 
yarn.timeline-service.reader.class) that are already in YarnConfiguration and 
yarn-default.xml? In TestYarnConfigurationFields, all timeline-service related 
configurations are skipped. I believe that's the reason why the test is not 
recognizing the new properties as default values of other existing properties.

> Remove direct references of HBaseTimelineWriter/Reader in core ATS classes
> --
>
> Key: YARN-5922
> URL: https://issues.apache.org/jira/browse/YARN-5922
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-5922-YARN-5355.01.patch, 
> YARN-5922-YARN-5355.02.patch, YARN-5922.01.patch, YARN-5922.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5956) Refactor ClientRMService

2016-12-01 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712603#comment-15712603
 ] 

Daniel Templeton commented on YARN-5956:


Thanks for the patch, [~lewuathe]!  Looks good except that the {{@param}} and 
{{@return}} messages should start with lower case letters.

> Refactor ClientRMService
> 
>
> Key: YARN-5956
> URL: https://issues.apache.org/jira/browse/YARN-5956
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0-alpha2
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>Priority: Minor
> Attachments: YARN-5956.01.patch
>
>
> Some refactoring can be done in {{ClientRMService}}.
> - Remove redundant variable declarations
> - Fill in missing javadocs
> - Use proper variable access modifiers
> - Fix some typos in method names and exception messages



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4675) Reorganize TimeClientImpl into TimeClientV1Impl and TimeClientV2Impl

2016-12-01 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712859#comment-15712859
 ] 

Vrushali C commented on YARN-4675:
--

As discussed in the call today, let's discuss whether we need to do this 
*before* the upcoming releases (the open hadoop release as well as internal 
ones). It will be better to make the client-side interface changes now rather 
than later. 

My thought is that having separate client classes, as the patch does, will 
make this cleaner. So I am +1 on reorganizing the TimeClientImpl into V1 and 
V2.

cc [~gtCarrera9] [~sjlee0] [~varun_saxena]

> Reorganize TimeClientImpl into TimeClientV1Impl and TimeClientV2Impl
> 
>
> Key: YARN-4675
> URL: https://issues.apache.org/jira/browse/YARN-4675
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: YARN-5355, oct16-medium
> Attachments: YARN-4675-YARN-2928.v1.001.patch
>
>
> We need to reorganize TimeClientImpl into TimeClientV1Impl, TimeClientV2Impl, 
> and if required a base class, so that it's clear which part of the code 
> belongs to which version, making it more maintainable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-4990) Re-direction of a particular log file within in a container in NM UI does not redirect properly to Log Server ( history ) on container completion

2016-12-01 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong reassigned YARN-4990:
---

Assignee: Xuan Gong

> Re-direction of a particular log file within in a container in NM UI does not 
> redirect properly to Log Server ( history ) on container completion
> -
>
> Key: YARN-4990
> URL: https://issues.apache.org/jira/browse/YARN-4990
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Xuan Gong
>
> The NM does the redirection to the history server correctly. However, if the 
> user is viewing or has a link to a specific file, the redirect ends up going 
> to the top-level page for the container instead of the specific file. 
> Additionally, the start param to show logs from offset 0 also goes missing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5957) NMWebAppFilter currently drops end of URL during container log redirection

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712955#comment-15712955
 ] 

Hadoop QA commented on YARN-5957:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
41s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5957 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841326/YARN-5957.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2fd63bb40b68 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 96c5749 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14145/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/14145/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> NMWebAppFilter currently drops end of URL during container log redirection
> --
>
> Key: YARN-5957
> URL: https://issues.apache.org/jira/browse/YARN-5957
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Eric Badger
>Assignee: Eric Badger
>

[jira] [Commented] (YARN-5925) Extract hbase-backend-exclusive utility methods from TimelineStorageUtil

2016-12-01 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712811#comment-15712811
 ] 

Haibo Chen commented on YARN-5925:
--

[~sjlee0] Can you link the jira that this one needs to wait for? Sorry, I 
forgot the jira number you mentioned in the meeting.

> Extract hbase-backend-exclusive utility methods from TimelineStorageUtil
> 
>
> Key: YARN-5925
> URL: https://issues.apache.org/jira/browse/YARN-5925
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-5925-YARN-5355.01.patch, 
> YARN-5925-YARN-5355.02.patch, YARN-5925.01.patch, YARN-5925.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5746) The state of the parentQueue and its childQueues should be synchronized.

2016-12-01 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712866#comment-15712866
 ] 

Xuan Gong commented on YARN-5746:
-

Fix the checkstyle issue

> The state of the parentQueue and its childQueues should be synchronized.
> 
>
> Key: YARN-5746
> URL: https://issues.apache.org/jira/browse/YARN-5746
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>  Labels: oct16-easy
> Attachments: YARN-5746.1.patch, YARN-5746.2.patch, YARN-5746.3.patch, 
> YARN-5746.4.patch, YARN-5746.5.patch, YARN-5746.6.patch
>
>
> The states of the parentQueue and its childQueues need to be synchronized. 
> * If the state of the parentQueue becomes STOPPED, the state of its 
> childQueues needs to become STOPPED as well. 
> * If we change the state of a queue to RUNNING, we should make sure the 
> state of all its ancestors is RUNNING. Otherwise, we need to fail the 
> operation.
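
A minimal sketch of these two rules, using simplified stand-ins for the 
CapacityScheduler queue classes:
{code}
import java.util.ArrayList;
import java.util.List;

class QueueStateSketch {
  enum State { RUNNING, STOPPED }

  State state = State.RUNNING;
  QueueStateSketch parent;
  List<QueueStateSketch> children = new ArrayList<>();

  // Rule 1: stopping a parent stops every child queue as well.
  void stop() {
    state = State.STOPPED;
    for (QueueStateSketch child : children) {
      child.stop();
    }
  }

  // Rule 2: a queue may only move to RUNNING if every ancestor is RUNNING.
  void activate() {
    for (QueueStateSketch q = parent; q != null; q = q.parent) {
      if (q.state != State.RUNNING) {
        throw new IllegalStateException("An ancestor queue is STOPPED");
      }
    }
    state = State.RUNNING;
  }
}
{code}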



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5746) The state of the parentQueue and its childQueues should be synchronized.

2016-12-01 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-5746:

Attachment: YARN-5746.6.patch

> The state of the parentQueue and its childQueues should be synchronized.
> 
>
> Key: YARN-5746
> URL: https://issues.apache.org/jira/browse/YARN-5746
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>  Labels: oct16-easy
> Attachments: YARN-5746.1.patch, YARN-5746.2.patch, YARN-5746.3.patch, 
> YARN-5746.4.patch, YARN-5746.5.patch, YARN-5746.6.patch
>
>
> The states of the parentQueue and its childQueues need to be synchronized. 
> * If the state of the parentQueue becomes STOPPED, the state of its 
> childQueues needs to become STOPPED as well. 
> * If we change the state of a queue to RUNNING, we should make sure the 
> state of all its ancestors is RUNNING. Otherwise, we need to fail the 
> operation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5925) Extract hbase-backend-exclusive utility methods from TimelineStorageUtil

2016-12-01 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712877#comment-15712877
 ] 

Sangjin Lee commented on YARN-5925:
---

It's YARN-5739.

> Extract hbase-backend-exclusive utility methods from TimelineStorageUtil
> 
>
> Key: YARN-5925
> URL: https://issues.apache.org/jira/browse/YARN-5925
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-5925-YARN-5355.01.patch, 
> YARN-5925-YARN-5355.02.patch, YARN-5925.01.patch, YARN-5925.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5956) Refactor ClientRMService

2016-12-01 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712775#comment-15712775
 ] 

Rohith Sharma K S commented on YARN-5956:
-

As part of this JIRA, we can remove the duplicate code that exists in most of 
the methods for checking the UGI and getting the application. 
Could we do something similar to what has been done in the updateApplication* 
methods, like below? This would remove the majority of the duplicated code!
{code}
UserGroupInformation callerUGI =
    getCallerUgi(applicationId, AuditConstants.UPDATE_APP_TIMEOUTS);
RMApp application = verifyUserAccessForRMApp(applicationId, callerUGI,
    AuditConstants.UPDATE_APP_TIMEOUTS);
{code}

> Refactor ClientRMService
> 
>
> Key: YARN-5956
> URL: https://issues.apache.org/jira/browse/YARN-5956
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0-alpha2
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>Priority: Minor
> Attachments: YARN-5956.01.patch
>
>
> Some refactoring can be done in {{ClientRMService}}.
> - Remove redundant variable declarations
> - Fill in missing javadocs
> - Use proper variable access modifiers
> - Fix some typos in method names and exception messages



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5957) NMWebAppFilter currently drops end of URL during container log redirection

2016-12-01 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-5957:
--
Attachment: YARN-5957.001.patch

Attaching a patch that appends the end of the input URL to the redirection URL 
instead of dropping it.

> NMWebAppFilter currently drops end of URL during container log redirection
> --
>
> Key: YARN-5957
> URL: https://issues.apache.org/jira/browse/YARN-5957
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: YARN-5957.001.patch
>
>
> When the NM redirects URLs given to it via NMWebAppFilter container log 
> redirection, it drops everything after the app owner. To make this more 
> robust, we should append the remainder of the original URL to the 
> redirection URL. This will allow the UIs to send back a more fine-grained 
> link to the logs (e.g. directly to syslog/stderr) instead of going to the 
> log aggregation page. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3409) Add constraint node labels

2016-12-01 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712803#comment-15712803
 ] 

Naganarasimha G R commented on YARN-3409:
-

Thanks for the support, [~devaraj.k] & [~kkaranasos].
It's encouraging to see that this feature would be useful for other scenarios 
too.
bq. Can NodeManagers have attribute names same as some label/partition name in 
the cluster? 
This is one of the reasons I want to separate these into two different sets 
(partitions and constraints), so that even with the same name they will not 
overlap, since partition and constraint expressions differ.
bq. Did you think about having one expression(existing) which handles node 
label expression and constraints expression without delimiter between label and 
constraints expressions, constraints expression support implementation can be 
added without any new configurations/interfaces.
Yes Deva, I had considered this option and have described it in the section 
{{Topics for discussion -> ResourceRequest modifications -> Option 3}}; I hope 
you have taken a look at it. If required, we can discuss it further.
bq. Can we have some details about how the NodeManager report these attributes 
to ResourceManager?
Here we are discussing two things on the NM side. One is supporting scripts or 
configs on the NM side, which is already supported as part of YARN-2495; for 
the latest documentation you can refer to 
[features|http://hadoop.apache.org/docs/r3.0.0-alpha1/hadoop-yarn/hadoop-yarn-site/NodeLabel.html#Features].

bq. I also (strongly ) suggest to use these ConstraintNodeLabels in the context 
of YARN-1042 for supporting (anti-)affinity constraints as well. I think it 
will greatly avoid duplicate effort and simplify the code.
Agreed. I presume it should be pluggable enough, either on the NM or the RM 
side, for these app-specific constraints on the nodes. Maybe when my initial 
prototype is ready we can discuss this further. 

bq. On a similar note, can these ContraintNodeLabels be added/removed 
dynamically? For example, when a container starts its execution, it might 
contain some attributes (to be added – I know such attributes cannot be 
specified at the moment). Those attributes will then be added to the node's 
labels, for the time the container is running. This can be useful for 
(anti-)affinity constraints.
Yeah, we were looking at some scenarios that use NM-side resource stats like 
load average, swap, etc. for dynamic constraint values. But what you described 
is also an interesting use case, which can easily fit into the proposed scheme 
of things (though we need to be careful about the security scenarios for such 
types of constraints).

bq. so we might consider using a name that denotes that 
Well, actually the labels (partitions) could have been named pools and the 
constraints labels, but anyway, I am fine with either {{ConstraintExpression}} 
or {{AttributeExpression}}, though I would prefer the former as it is already 
in use to an extent.

bq.  It might also be that the implementation of ConstraintNodeLabels will be 
easier at some places than that of NodeLabels/Partitions
Yes, agreed: since there are fewer modifications to the scheduler node it is 
relatively simple, but we might have to modify multiple places to make it 
usable, hence we proposed a branch. Once I upload the WIP patches I will 
create the branch and an umbrella JIRA. I thought of cloning this jira so that 
discussions continue here and jira tracking happens in the other.

bq. Can you please give an example of a cluster-level constraint?
A cluster-level constraint is similar to the existing {{ClusterNodeLabels}}: 
a superset of all the available node labels, which the RM will use for 
validating expressions and for scheduling. As we plan to support different 
types of constraints, we need to maintain this superset so that constraints 
reported for one node do not conflict with another's (if the types differ).

bq. Making sure I understand.. Why do we need this constraint? I think they are 
orthogonal, right? Unless you mean that if the user specifies a constraint, it 
has to be taken into account too, which I understand.
Agreed, it is orthogonal, but a partition is like a logical cluster, and when 
the user specifies a partition, the nodes under that partition which satisfy 
the constraint labels need to be picked, to stay compatible with the current 
partition feature (partition capacity planning, headroom, etc.). Currently we 
only support a partition expression with a single partition; I envisage 
something like partition1||partition2 with a constraint expression of (HAS_GPU 
&& !Windows), so that all the nodes of partition1 and partition2 satisfying 
the constraint expression will be picked for scheduling the container.

bq. We assumed that the ConstraintNodeLabels are following the hierarchy of the 
cluster. That is, a rack was inheriting the ConstraintNodeLabels of all its 
nodes. A detail here is 

[jira] [Commented] (YARN-4675) Reorganize TimeClientImpl into TimeClientV1Impl and TimeClientV2Impl

2016-12-01 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712881#comment-15712881
 ] 

Li Lu commented on YARN-4675:
-

Yes, I agree we need to decide on this issue soon. +1 for reorganizing 
TimelineClientImpl. Do we need to distinguish v2 APIs in the timeline clients 
as well? As of now we will have timeline APIs for v1, v1.5, and v2, so I think 
it may be helpful to distinguish at least the v2 APIs. 

> Reorganize TimeClientImpl into TimeClientV1Impl and TimeClientV2Impl
> 
>
> Key: YARN-4675
> URL: https://issues.apache.org/jira/browse/YARN-4675
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: YARN-5355, oct16-medium
> Attachments: YARN-4675-YARN-2928.v1.001.patch
>
>
> We need to reorganize TimeClientImpl into TimeClientV1Impl, TimeClientV2Impl, 
> and if required a base class, so that it's clear which part of the code 
> belongs to which version, making it more maintainable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3409) Add constraint node labels

2016-12-01 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712946#comment-15712946
 ] 

Wangda Tan commented on YARN-3409:
--

Thanks all for the discussion, and sorry for the delayed response; I'm still 
on vacation and should be able to participate in more discussions from next 
week.

[~Naganarasimha], I think a POC will be helpful for understanding the scope, 
but from my POV the biggest challenge of this task is to properly design the 
API and decide how to make it work with existing features (like 
locality/partition) and future features (like YARN-4902 / global scheduling, 
etc.).

Following are overall thoughts in my mind:

h3. 1. For the (old) ResourceRequest API, we have a couple of choices:

*Choice a.*
Use nodeLabelExpression to specify nodePartition and nodeConstraint altogether.
Pros:
- This has good semantics, since a "constraint" could be considered one 
special kind of "label".

Cons: 
- We have to add a new field for affinity/anti-affinity.
- Some existing implementations assume "nodeLabelExpression" equals the 
partition.

*Choice b.*
Add a new field for the constraint expression, and also for 
affinity/anti-affinity (as suggested by Kostas). This should have minimal 
impact on existing features. But after this, {{nodeLabelExpression}} becomes a 
little ambiguous; we may need to deprecate the existing nodeLabelExpression.

*My preference:*
*Personally I prefer b,* but it's better to rename nodeConstraintExpression to 
placementStrategy so we can have consistent naming and semantics after 
YARN-4902. Actually, in my POC patch for YARN-1042 
(https://issues.apache.org/jira/secure/attachment/12822186/YARN-1042-global-scheduling.poc.1.patch#file-0),
 I use {{placementStrategy}} as the name.

h3. 2. For the CLI / REST API to manage (add/remove) and get constraints, we 
also have a couple of choices:

*Choice a.*
Add a set of new APIs / CLI commands to manage node constraints; for example, 
we can have a REST API {{POST /node-constraints/add}} to add node constraints.

*Choice b.*
Extend the existing {{NodeLabel}} object to support node constraints; we only 
need two additional fields: 1) isNodeConstraint 2) value (for example, we can 
have a constraint named jdk-version, and the value could be 6/7/8).

*My preference:*
Personally I prefer b, since we can reuse most of the existing CLI / REST API 
implementations.
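
Under choice b, the extension might look roughly like this; the field names 
are illustrative, not a committed API:
{code}
class NodeLabelSketch {
  String name;               // e.g. "jdk-version"
  boolean isNodeConstraint;  // false => an ordinary partition label
  String value;              // e.g. "8"; null for plain partition labels
}
{code}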

I found there are some other discussions that need to be settled, such as 
supporting rack constraints and whether node constraints should be added to 
non-ANY requests. I suggest discussing them after we reach consensus on the 
above two points.

Please share your thoughts (clearly stating which option you prefer, with 
some explanation, would be best). 

+ [~naganarasimha...@apache.org], [~kkaranasos], [~devaraj.k], [~varun_saxena].

> Add constraint node labels
> --
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, capacityscheduler, client
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf
>
>
> Specifying only one label for each node (in other words, partitioning a 
> cluster) is a way to determine how the resources of a special set of nodes 
> can be shared by a group of entities (like teams, departments, etc.). 
> Partitions of a cluster have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACLs/priority can apply to a partition (only the market team has priority 
> to use the partition).
> - Percentages of capacity can apply to a partition (the market team has 40% 
> minimum capacity and the dev team has 60% minimum capacity of the partition).
> Constraints are orthogonal to partitions; they describe attributes of a 
> node's hardware/software, just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (windows, linux, etc.)
> With this, an application can ask for resources that have (glibc.version >= 
> 2.20 && JDK.version >= 8u20 && x86_64).
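
Purely as an illustration of the proposed semantics (no such API exists yet), 
a request combining a partition with a constraint expression might look like:
{code}
class ResourceRequestSketch {
  String partitionExpression;   // which logical sub-cluster to use
  String constraintExpression;  // node attributes the container requires

  static ResourceRequestSketch example() {
    ResourceRequestSketch r = new ResourceRequestSketch();
    r.partitionExpression = "partition1";
    r.constraintExpression =
        "glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64";
    return r;
  }
}
{code}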



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5958) Fix ASF license warnings for slider core module

2016-12-01 Thread Billie Rinaldi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-5958:
-
Attachment: YARN-5958-yarn-native-services.001.patch

> Fix ASF license warnings for slider core module
> ---
>
> Key: YARN-5958
> URL: https://issues.apache.org/jira/browse/YARN-5958
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
> Attachments: YARN-5958-yarn-native-services.001.patch
>
>
> The apache-rat-plugin is not configured properly for the 
> hadoop-yarn-slider-core module, which is resulting in erroneous license 
> warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5734) OrgQueue for easy CapacityScheduler queue configuration management

2016-12-01 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713186#comment-15713186
 ] 

Robert Kanter commented on YARN-5734:
-

Oozie has run into scalability problems with Derby, but I would imagine that 
Oozie does more frequent reads and writes to Derby than users will be doing 
with their Configurations, so it probably won't be a problem.  Just something 
to keep in mind.

> OrgQueue for easy CapacityScheduler queue configuration management
> --
>
> Key: YARN-5734
> URL: https://issues.apache.org/jira/browse/YARN-5734
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Min Shen
>Assignee: Min Shen
> Attachments: OrgQueue_API-Based_Config_Management_v1.pdf, 
> OrgQueue_Design_v0.pdf
>
>
> The current xml-based configuration mechanism in CapacityScheduler makes it 
> very inconvenient to apply any changes to the queue configurations. We saw 2 
> main drawbacks in the file-based configuration mechanism:
> # It makes it very inconvenient to automate queue configuration updates. For 
> example, in our cluster setup, we leverage the queue mapping feature from 
> YARN-2411 to route users to their dedicated organization queues. It could be 
> extremely cumbersome to keep updating the config file to manage the very 
> dynamic mapping between users and organizations.
> # Even if a user has admin permission on one specific queue, that user is 
> unable to make queue configuration changes to resize the subqueues, change 
> queue ACLs, or create new queues. All these operations need to be performed 
> in a centralized manner by the cluster administrators.
> With these limitations, we realized the need for a more flexible 
> configuration mechanism that allows queue configurations to be stored and 
> managed more dynamically. We developed this feature internally at LinkedIn; 
> it introduces the concept of MutableConfigurationProvider. What it 
> essentially does is provide a set of configuration mutation APIs that allow 
> queue configurations to be updated externally through REST. When performing 
> queue configuration changes, the queue ACLs are honored, which means only 
> queue administrators can make configuration changes to a given queue. 
> MutableConfigurationProvider is implemented as a pluggable interface, and we 
> have one implementation of this interface which is based on the Derby 
> embedded database.
> This feature has been deployed on LinkedIn's Hadoop cluster for a year now, 
> and has gone through several iterations of gathering feedback from users and 
> improving accordingly. With this feature, cluster administrators are able to 
> automate many of the queue configuration management tasks, such as setting 
> the queue capacities to adjust cluster resources between queues based on 
> established resource consumption patterns, or managing updates to the 
> user-to-queue mappings. We have attached our design documentation to this 
> ticket and would like to receive feedback from the community regarding how 
> to best integrate it with the latest version of YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5901) Fix race condition in TestGetGroups beforeclass setup()

2016-12-01 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713216#comment-15713216
 ] 

Haibo Chen commented on YARN-5901:
--

Will update the patch shortly.

> Fix race condition in TestGetGroups beforeclass setup()
> ---
>
> Key: YARN-5901
> URL: https://issues.apache.org/jira/browse/YARN-5901
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: unittest
> Attachments: YARN-5901.02.patch, yarn5901.001.patch
>
>
> In TestGetGroups, the class-level setup method spins up, in a child thread, 
> a resource manager that YARN clients can talk to. But it checks whether the 
> resource manager is fully started by checking 
> resourcemanager.getServiceState() == STATE.STARTED. This is not reliable, 
> since resourcemanager.start() first triggers the service state change in the 
> RM and then starts up all the services added to the RM. We need to wait for 
> the RM to fully start before YARN clients can send requests. Otherwise, the 
> tests can fail with a "connection refused" exception when the main thread 
> sends client requests to the RM before the RPC server has fired up in the 
> child thread.  
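
One way to make the setup reliable is to wait until the RM's client RPC port 
actually accepts connections; this is a self-contained polling sketch rather 
than the project's test utilities:
{code}
import java.io.IOException;
import java.net.Socket;
import java.util.concurrent.TimeoutException;

class WaitForRmSketch {
  static void waitForRmRpc(String host, int port, long timeoutMs)
      throws InterruptedException, TimeoutException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      try (Socket ignored = new Socket(host, port)) {
        return;  // the RPC server is up; clients can now send requests
      } catch (IOException e) {
        Thread.sleep(100);  // not listening yet, retry
      }
    }
    throw new TimeoutException("RM RPC server did not start in time");
  }
}
{code}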



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5901) Fix race condition in TestGetGroups beforeclass setup()

2016-12-01 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5901:
-
Attachment: YARN-5901.03.patch

> Fix race condition in TestGetGroups beforeclass setup()
> ---
>
> Key: YARN-5901
> URL: https://issues.apache.org/jira/browse/YARN-5901
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: unittest
> Attachments: YARN-5901.02.patch, YARN-5901.03.patch, 
> yarn5901.001.patch
>
>
> In TestGetGroups, the class-level setup method spins up, in a child thread, a 
> resource manager that YARN clients can talk to. But it checks whether the 
> resource manager is fully started by doing resourcemanager.getServiceState() 
> == STATE.STARTED. This is not reliable since resourcemanager.start() will 
> first trigger the service state change in the RM and then start up all the 
> services added to the RM. We need to wait for the RM to fully start before 
> YARN clients can send requests. Otherwise, the tests can fail with a 
> "connection refused" exception when the main thread sends client requests to 
> the RM before the RPC server has come up in the child thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5958) Fix ASF license warnings for slider core module

2016-12-01 Thread Billie Rinaldi (JIRA)
Billie Rinaldi created YARN-5958:


 Summary: Fix ASF license warnings for slider core module
 Key: YARN-5958
 URL: https://issues.apache.org/jira/browse/YARN-5958
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Billie Rinaldi
Assignee: Billie Rinaldi


The apache-rat-plugin is not configured properly for the 
hadoop-yarn-slider-core module, which is resulting in erroneous license 
warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application

2016-12-01 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-5739:

Attachment: YARN-5739-YARN-5355.006.patch

New patch to address review comments. 

> Provide timeline reader API to list available timeline entity types for one 
> application
> ---
>
> Key: YARN-5739
> URL: https://issues.apache.org/jira/browse/YARN-5739
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-5739-YARN-5355.001.patch, 
> YARN-5739-YARN-5355.002.patch, YARN-5739-YARN-5355.003.patch, 
> YARN-5739-YARN-5355.004.patch, YARN-5739-YARN-5355.005.patch, 
> YARN-5739-YARN-5355.006.patch
>
>
> Right now we only show part of the available timeline entity data in the new 
> YARN UI. However, some data (especially library-specific data) cannot be 
> queried through the web UI. It would be appealing for the UI to provide an 
> "entity browser" for each YARN application. Actually, simply dumping out the 
> available timeline entities (with proper pagination, of course) would be 
> pretty helpful for UI users. 
> On the timeline side, we're not far from this goal. Right now I believe the 
> only thing missing is a way to list all available entity types within one 
> application. The challenge here is that we're not storing this data per 
> application, but given that this kind of call is relatively rare (compared to 
> writes and updates) we can perform some scanning at read time. 
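
As an illustration of the read-time scan idea, a sketch using the HBase client API is below. The table name and the row-key layout (cluster!app!entityType!...) are assumptions made purely for the example; the real timeline service schema may differ.

{code}
// Sketch: collect the distinct entity types for one application by
// scanning row keys only. Table name and key layout are assumed.
Set<String> listEntityTypes(Configuration conf, String clusterId,
    String appId) throws IOException {
  Set<String> types = new TreeSet<>();
  try (Connection conn = ConnectionFactory.createConnection(conf);
       Table table =
           conn.getTable(TableName.valueOf("timelineservice.entity"))) {
    Scan scan = new Scan();
    scan.setRowPrefixFilter(Bytes.toBytes(clusterId + "!" + appId + "!"));
    scan.setFilter(new FirstKeyOnlyFilter());  // skip cell values
    try (ResultScanner results = table.getScanner(scan)) {
      for (Result r : results) {
        String[] parts = Bytes.toString(r.getRow()).split("!");
        if (parts.length > 2) {
          types.add(parts[2]);  // third segment assumed to be the type
        }
      }
    }
  }
  return types;
}
{code}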



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5746) The state of the parentQueue and its childQueues should be synchronized.

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713023#comment-15713023
 ] 

Hadoop QA commented on YARN-5746:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 38m 44s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 54m 13s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueState |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5746 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841332/YARN-5746.6.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 11cfb2c855bd 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 96c5749 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/14146/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14146/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/14146/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> The state of the parentQueue and its childQueues should be synchronized.
> 
>
> Key: YARN-5746
> URL: 

[jira] [Commented] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713294#comment-15713294
 ] 

Hadoop QA commented on YARN-5739:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
1s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
12s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
36s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
29s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} YARN-5355 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 2 new + 
29 unchanged - 1 fixed = 31 total (was 30) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
16s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
53s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  5m 
36s{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in 
the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | YARN-5739 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841356/YARN-5739-YARN-5355.006.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux c8e5bb395d32 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-5355 / f734977 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14147/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server.txt
 |
| javadoc | 

[jira] [Commented] (YARN-5756) Add state-machine implementation for queues

2016-12-01 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713290#comment-15713290
 ] 

Li Lu commented on YARN-5756:
-

It seems the second submission was ignored by Jenkins. Kicking it off for one 
more round of testing. 

> Add state-machine implementation for queues
> ---
>
> Key: YARN-5756
> URL: https://issues.apache.org/jira/browse/YARN-5756
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-5756.1.patch, YARN-5756.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5901) Fix race condition in TestGetGroups beforeclass setup()

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713312#comment-15713312
 ] 

Hadoop QA commented on YARN-5901:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: 
The patch generated 0 new + 1 unchanged - 1 fixed = 1 total (was 2) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 
14s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5901 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841357/YARN-5901.03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux d43aba5c0599 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 19f373a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14148/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/14148/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Fix race condition in TestGetGroups beforeclass setup()
> ---
>
> Key: YARN-5901
> URL: https://issues.apache.org/jira/browse/YARN-5901
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: unittest
> Attachments: YARN-5901.02.patch, 

[jira] [Updated] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application

2016-12-01 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-5739:

Attachment: YARN-5739-YARN-5355.006.patch

New 006 patch to make Jenkins happy. 

> Provide timeline reader API to list available timeline entity types for one 
> application
> ---
>
> Key: YARN-5739
> URL: https://issues.apache.org/jira/browse/YARN-5739
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-5739-YARN-5355.001.patch, 
> YARN-5739-YARN-5355.002.patch, YARN-5739-YARN-5355.003.patch, 
> YARN-5739-YARN-5355.004.patch, YARN-5739-YARN-5355.005.patch, 
> YARN-5739-YARN-5355.006.patch
>
>
> Right now we only show part of the available timeline entity data in the new 
> YARN UI. However, some data (especially library-specific data) cannot be 
> queried through the web UI. It would be appealing for the UI to provide an 
> "entity browser" for each YARN application. Actually, simply dumping out the 
> available timeline entities (with proper pagination, of course) would be 
> pretty helpful for UI users. 
> On the timeline side, we're not far from this goal. Right now I believe the 
> only thing missing is a way to list all available entity types within one 
> application. The challenge here is that we're not storing this data per 
> application, but given that this kind of call is relatively rare (compared to 
> writes and updates) we can perform some scanning at read time. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-2902:
-
Fix Version/s: 2.8.0

> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Fix For: 2.8.0, 2.7.2, 2.6.4, 3.0.0-alpha1
>
> Attachments: YARN-2902-branch-2.6.01.patch, YARN-2902.002.patch, 
> YARN-2902.03.patch, YARN-2902.04.patch, YARN-2902.05.patch, 
> YARN-2902.06.patch, YARN-2902.07.patch, YARN-2902.08.patch, 
> YARN-2902.09.patch, YARN-2902.10.patch, YARN-2902.11.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed, 
> resources are left in the DOWNLOADING state. If no other container comes 
> along and requests these resources, they linger around with no reference 
> counts but aren't cleaned up during normal cache cleanup scans, since the 
> scan will never delete resources in the DOWNLOADING state even if their 
> reference count is zero.
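
One possible direction, sketched below purely as an illustration (the collection and helper names are assumptions, not the attached patches), is to let the cleanup scan also evict DOWNLOADING resources once nothing references them:

{code}
// Illustrative sketch only; names are assumed. During a cache cleanup
// scan, a DOWNLOADING resource with no remaining references is treated
// like any other evictable entry instead of being skipped forever.
for (LocalizedResource rsrc : trackedResources) {
  boolean orphanedDownload =
      rsrc.getState() == ResourceState.DOWNLOADING
          && rsrc.getRefCount() == 0;
  if (orphanedDownload || isEvictable(rsrc)) {
    removeResource(rsrc);  // delete on-disk data and state-store entry
  }
}
{code}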



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5901) Fix race condition in TestGetGroups beforeclass setup()

2016-12-01 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5901:
-
Attachment: YARN-5901-branch-2.03.patch

Uploading a patch for branch-2.

> Fix race condition in TestGetGroups beforeclass setup()
> ---
>
> Key: YARN-5901
> URL: https://issues.apache.org/jira/browse/YARN-5901
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: unittest
> Attachments: YARN-5901-branch-2.03.patch, YARN-5901.02.patch, 
> YARN-5901.03.patch, yarn5901.001.patch
>
>
> In TestGetGroups, the class-level setup method spins up, in a child thread, a 
> resource manager that YARN clients can talk to. But it checks whether the 
> resource manager is fully started by doing resourcemanager.getServiceState() 
> == STATE.STARTED. This is not reliable since resourcemanager.start() will 
> first trigger the service state change in the RM and then start up all the 
> services added to the RM. We need to wait for the RM to fully start before 
> YARN clients can send requests. Otherwise, the tests can fail with a 
> "connection refused" exception when the main thread sends client requests to 
> the RM before the RPC server has come up in the child thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3727) For better error recovery, check if the directory exists before using it for localization.

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3727:
-
Fix Version/s: 2.8.0

> For better error recovery, check if the directory exists before using it for 
> localization.
> --
>
> Key: YARN-3727
> URL: https://issues.apache.org/jira/browse/YARN-3727
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.8.0, 2.7.2, 2.6.2, 3.0.0-alpha1
>
> Attachments: YARN-3727.000.patch, YARN-3727.001.patch
>
>
> For better error recovery, check if the directory exists before using it for 
> localization.
> We saw the following localization failure happen due to existing cache 
> directories.
> {code}
> 2015-05-11 18:59:59,756 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>  DEBUG: FAILED { hdfs:///X/libjars/1234.jar, 1431395961545, FILE, 
> null }, Rename cannot overwrite non empty destination directory 
> //8/yarn/nm/usercache//filecache/21637
> 2015-05-11 18:59:59,756 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>  Resource 
> hdfs:///X/libjars/1234.jar(->//8/yarn/nm/usercache//filecache/21637/1234.jar)
>  transitioned from DOWNLOADING to FAILED
> {code}
> The real cause of this failure may be a disk failure, a LevelDB operation 
> failure in {{startResourceLocalization}}/{{finishResourceLocalization}}, or 
> something else.
> I wonder whether we can add error recovery code to avoid the localization 
> failure by not using the existing cache directories for localization.
> The exception happened at {{files.rename(dst_work, destDirPath, 
> Rename.OVERWRITE)}} in FSDownload#call. Based on the following code, after 
> the exception, the existing cache directory used by {{LocalizedResource}} 
> will be deleted.
> {code}
> try {
>  .
>   files.rename(dst_work, destDirPath, Rename.OVERWRITE);
> } catch (Exception e) {
>   try {
> files.delete(destDirPath, true);
>   } catch (IOException ignore) {
>   }
>   throw e;
> } finally {
> {code}
> Since the conflicting local directory will be deleted after a localization 
> failure, I think it would be better to check whether the directory exists 
> before using it for localization, so the failure is avoided in the first 
> place.
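
The suggested check could look roughly like the sketch below, run before the download starts. The getPathForLocalization helper and the exact placement are assumptions; only the FileContext calls are real API:

{code}
// Sketch of the proposed recovery check (placement and helper assumed):
// if a stale cache directory survives from a previous failed run, delete
// it before reusing the path for localization.
Path destDirPath = getPathForLocalization(resource);
FileContext files = FileContext.getLocalFSFileContext();
if (files.util().exists(destDirPath)) {
  LOG.warn("Stale cache directory " + destDirPath
      + " already exists; deleting it before localization");
  files.delete(destDirPath, true);
}
{code}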



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3690) [JDK8] 'mvn site' fails

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3690:
-
Fix Version/s: 2.8.0

> [JDK8] 'mvn site' fails
> ---
>
> Key: YARN-3690
> URL: https://issues.apache.org/jira/browse/YARN-3690
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, site
> Environment: CentOS 7.0, Oracle JDK 8u45.
>Reporter: Akira Ajisaka
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: YARN-3690-002.patch, YARN-3690-003.patch, YARN-3690-patch
>
>
> 'mvn site' failed by the following error:
> {noformat}
> [ERROR] 
> /home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18:
>  error: package org.apache.hadoop.yarn.factories has already been annotated
> [ERROR] @InterfaceAudience.LimitedPrivate({ "MapReduce", "YARN" })
> [ERROR] ^
> [ERROR] java.lang.AssertionError
> [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
> [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
> [ERROR] at 
> com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
> [ERROR] at 
> com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
> [ERROR] at 
> com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
> [ERROR] at 
> com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
> [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
> [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
> [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
> [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
> [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
> [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
> [ERROR] at 
> com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
> [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
> [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
> [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
> [ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
> [ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
> [ERROR] javadoc: error - fatal error
> [ERROR] 
> [ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc 
> -J-Xmx1024m @options @packages
> [ERROR] 
> [ERROR] Refer to the generated Javadoc files in 
> '/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
> [ERROR] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> {noformat}
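
The error above means the package is annotated in more than one place, which JDK 8's stricter javadoc rejects. Assuming the fix consolidates the annotations, a single package-info.java per package would look like this sketch (the exact annotations kept may differ in the actual patch):

{code}
// Sketch: keep exactly one annotated package-info.java per package so
// JDK 8 javadoc never sees the package annotated twice.
@InterfaceAudience.LimitedPrivate({ "MapReduce", "YARN" })
@InterfaceStability.Unstable
package org.apache.hadoop.yarn.factories;

import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;
{code}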



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5958) Fix ASF license warnings for slider core module

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713299#comment-15713299
 ] 

Hadoop QA commented on YARN-5958:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 4s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} yarn-native-services passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-yarn-slider-core in yarn-native-services 
failed. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-yarn-slider-core in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
3s{color} | {color:green} hadoop-yarn-slider-core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5958 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841359/YARN-5958-yarn-native-services.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  |
| uname | Linux 11e9d726561f 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | yarn-native-services / 165e50b |
| Default Java | 1.8.0_111 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/14149/artifact/patchprocess/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-slider_hadoop-yarn-slider-core.txt
 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/14149/artifact/patchprocess/patch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-slider_hadoop-yarn-slider-core.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14149/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-slider/hadoop-yarn-slider-core
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-slider/hadoop-yarn-slider-core
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/14149/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Fix ASF license warnings for slider core module
> ---
>
> Key: YARN-5958
> URL: https://issues.apache.org/jira/browse/YARN-5958
>

[jira] [Updated] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application

2016-12-01 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-5739:

Attachment: (was: YARN-5739-YARN-5355.006.patch)

> Provide timeline reader API to list available timeline entity types for one 
> application
> ---
>
> Key: YARN-5739
> URL: https://issues.apache.org/jira/browse/YARN-5739
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-5739-YARN-5355.001.patch, 
> YARN-5739-YARN-5355.002.patch, YARN-5739-YARN-5355.003.patch, 
> YARN-5739-YARN-5355.004.patch, YARN-5739-YARN-5355.005.patch, 
> YARN-5739-YARN-5355.006.patch
>
>
> Right now we only show part of the available timeline entity data in the new 
> YARN UI. However, some data (especially library-specific data) cannot be 
> queried through the web UI. It would be appealing for the UI to provide an 
> "entity browser" for each YARN application. Actually, simply dumping out the 
> available timeline entities (with proper pagination, of course) would be 
> pretty helpful for UI users. 
> On the timeline side, we're not far from this goal. Right now I believe the 
> only thing missing is a way to list all available entity types within one 
> application. The challenge here is that we're not storing this data per 
> application, but given that this kind of call is relatively rare (compared to 
> writes and updates) we can perform some scanning at read time. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3362:
-
Fix Version/s: 2.8.0

> Add node label usage in RM CapacityScheduler web UI
> ---
>
> Key: YARN-3362
> URL: https://issues.apache.org/jira/browse/YARN-3362
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, resourcemanager, webapp
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1
>
> Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue 
> Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, 
> 2015.05.10_3362_Queue_Hierarchy.png, 2015.05.12_3362_Queue_Hierarchy.png, 
> AppInLabelXnoStatsInSchedPage.png, CSWithLabelsView.png, 
> No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 
> at 11.42.17 AM.png, YARN-3362-branch-2.7.002.patch, 
> YARN-3362-branch-2.7.003.patch, YARN-3362-branch-2.7.004.patch, 
> YARN-3362.20150428-3-modified.patch, YARN-3362.20150428-3.patch, 
> YARN-3362.20150506-1.patch, YARN-3362.20150507-1.patch, 
> YARN-3362.20150510-1.patch, YARN-3362.20150511-1.patch, 
> YARN-3362.20150512-1.patch, capacity-scheduler.xml
>
>
> We don't show node label usage in the RM CapacityScheduler web UI now; 
> without it, users will find it hard to understand what is happening on nodes 
> that have labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3535) Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3535:
-
Fix Version/s: 2.8.0

> Scheduler must re-request container resources when RMContainer transitions 
> from ALLOCATED to KILLED
> ---
>
> Key: YARN-3535
> URL: https://issues.apache.org/jira/browse/YARN-3535
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Peng Zhang
>Assignee: Peng Zhang
>Priority: Critical
> Fix For: 2.8.0, 2.7.2, 2.6.4, 3.0.0-alpha1
>
> Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, 
> 0005-YARN-3535.patch, 0006-YARN-3535.patch, YARN-3535-001.patch, 
> YARN-3535-002.patch, syslog.tgz, yarn-app.log
>
>
> During a rolling update of the NM, the AM's container start on the NM 
> failed, and the job then hung there.
> AM logs are attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3508) Prevent processing preemption events on the main RM dispatcher

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3508:
-
Fix Version/s: 2.8.0

> Prevent processing preemption events on the main RM dispatcher
> --
>
> Key: YARN-3508
> URL: https://issues.apache.org/jira/browse/YARN-3508
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, scheduler
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: YARN-3508-branch-2.7.01.patch, YARN-3508.002.patch, 
> YARN-3508.01.patch, YARN-3508.03.patch, YARN-3508.04.patch, 
> YARN-3508.05.patch, YARN-3508.06.patch
>
>
> We recently saw the RM for a large cluster lag far behind on the 
> AsyncDispacher event queue.  The AsyncDispatcher thread was consistently 
> blocked on the highly-contended CapacityScheduler lock trying to dispatch 
> preemption-related events for RMContainerPreemptEventDispatcher.  Preemption 
> processing should occur on the scheduler event dispatcher thread or a 
> separate thread to avoid delaying the processing of other events in the 
> primary dispatcher queue.
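
The separate-dispatcher idea can be sketched as follows; the event and handler types shown are illustrative assumptions rather than the committed change:

{code}
// Illustrative sketch: give preemption events their own AsyncDispatcher
// so the main RM dispatcher never blocks on the scheduler lock for them.
AsyncDispatcher preemptionDispatcher = new AsyncDispatcher();
preemptionDispatcher.register(SchedulerEventType.class,
    new EventHandler<SchedulerEvent>() {
      @Override
      public void handle(SchedulerEvent event) {
        scheduler.handle(event);  // contended work runs off the main thread
      }
    });
preemptionDispatcher.init(conf);
preemptionDispatcher.start();
{code}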



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5197) RM leaks containers if running container disappears from node update

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5197:
-
Fix Version/s: 2.8.0

> RM leaks containers if running container disappears from node update
> 
>
> Key: YARN-5197
> URL: https://issues.apache.org/jira/browse/YARN-5197
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.2, 2.6.4
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Fix For: 2.8.0, 2.6.5, 2.7.4
>
> Attachments: YARN-5197-branch-2.7.003.patch, 
> YARN-5197-branch-2.8.003.patch, YARN-5197.001.patch, YARN-5197.002.patch, 
> YARN-5197.003.patch
>
>
> Once a node reports a container running in a status update, the corresponding 
> RMNodeImpl will track the container in its launchedContainers map.  If the 
> node somehow misses sending the completed container status to the RM and the 
> container simply disappears from subsequent heartbeats, the container will 
> leak in launchedContainers forever and the container completion event will 
> not be sent to the scheduler.
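
One way to close the leak, sketched here with assumed names and placement, is to reconcile launchedContainers against every heartbeat and synthesize a completion for anything the node no longer reports:

{code}
// Illustrative sketch (names assumed): a container tracked in
// launchedContainers but absent from the node's report is treated as
// completed, so the scheduler gets its completion event and the entry
// cannot leak forever.
Set<ContainerId> reported = new HashSet<>();
for (ContainerStatus status : heartbeatContainerStatuses) {
  reported.add(status.getContainerId());
}
Iterator<ContainerId> it = launchedContainers.iterator();
while (it.hasNext()) {
  ContainerId id = it.next();
  if (!reported.contains(id)) {
    it.remove();
    completedContainers.add(ContainerStatus.newInstance(id,
        ContainerState.COMPLETE, "Container vanished from node report", -1));
  }
}
{code}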



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3170) YARN architecture document needs updating

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3170:
-
Fix Version/s: 2.8.0

> YARN architecture document needs updating
> -
>
> Key: YARN-3170
> URL: https://issues.apache.org/jira/browse/YARN-3170
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: YARN-3170-002.patch, YARN-3170-003.patch, 
> YARN-3170-004.patch, YARN-3170-005.patch, YARN-3170-006.patch, 
> YARN-3170-007.patch, YARN-3170-008.patch, YARN-3170-009.patch, 
> YARN-3170-010.patch, YARN-3170-011.patch, YARN-3170.patch
>
>
> The marketing paragraph at the top, "NextGen MapReduce", etc. are all 
> marketing rather than actual descriptions. It also needs some general 
> updates, especially given it reads as though 0.23 was just released 
> yesterday.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3136) getTransferredContainers can be a bottleneck during AM registration

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3136:
-
Fix Version/s: 2.8.0

> getTransferredContainers can be a bottleneck during AM registration
> ---
>
> Key: YARN-3136
> URL: https://issues.apache.org/jira/browse/YARN-3136
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Sunil G
>  Labels: 2.7.2-candidate
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: 0001-YARN-3136.patch, 00010-YARN-3136.patch, 
> 00011-YARN-3136.patch, 00012-YARN-3136.patch, 00013-YARN-3136.patch, 
> 0002-YARN-3136.patch, 0003-YARN-3136.patch, 0004-YARN-3136.patch, 
> 0005-YARN-3136.patch, 0006-YARN-3136.patch, 0007-YARN-3136.patch, 
> 0008-YARN-3136.patch, 0009-YARN-3136.patch, YARN-3136.branch-2.7.patch
>
>
> While examining RM stack traces on a busy cluster I noticed a pattern of AMs 
> stuck waiting for the scheduler lock trying to call getTransferredContainers. 
>  The scheduler lock is highly contended, especially on a large cluster with 
> many nodes heartbeating, and it would be nice if we could find a way to 
> eliminate the need to grab this lock during this call.  We've already done 
> similar work during AM allocate calls to make sure they don't needlessly grab 
> the scheduler lock, and it would be good to do so here as well, if possible.
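
The lock avoidance being suggested could take a shape like the sketch below, where the transferred-container bookkeeping lives in a concurrent map so AM registration reads it without the scheduler lock (all names are assumptions):

{code}
// Illustrative sketch (names assumed): keep containers to transfer in a
// ConcurrentHashMap so getTransferredContainers() never needs the
// highly contended scheduler lock.
private final ConcurrentHashMap<ApplicationId, List<Container>>
    transferredContainers = new ConcurrentHashMap<>();

public List<Container> getTransferredContainers(
    ApplicationAttemptId currentAttempt) {
  List<Container> list =
      transferredContainers.get(currentAttempt.getApplicationId());
  return list == null ? Collections.<Container>emptyList() : list;
}
{code}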



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2801) Add documentation for node labels feature

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-2801:
-
Fix Version/s: 2.8.0

> Add documentation for node labels feature
> -
>
> Key: YARN-2801
> URL: https://issues.apache.org/jira/browse/YARN-2801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gururaj Shetty
>Assignee: Wangda Tan
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: YARN-2801.1.patch, YARN-2801.2.patch, YARN-2801.3.patch, 
> YARN-2801.4.patch
>
>
> Documentation needs to be developed for the node label requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5756) Add state-machine implementation for queues

2016-12-01 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713505#comment-15713505
 ] 

Li Lu commented on YARN-5756:
-

Thanks [~xgong] for the patch! Generally fine, some comments:

QueueState.java
  - The STOP_RUNNING state is a little bit confusing; how about RUNNING, 
CLOSED (or DRAINING), and STOPPED? 
  - Javadoc inconsistency: at the very beginning of the enum we say there are 
only two possible states? 

QueueStateManager.java
  - Consistency issues with stop and activate queue? We're using fine-grained 
locking to change each queue's status. We need to make the process of stopping 
each queue and its subqueues atomic (as in concurrency, not in a DB). 
Otherwise, concurrent activateQueue calls may result in inconsistent results. 
If coarse-grained locking is fine with the current use case, we may want to 
make activateQueues and stopQueues synchronized? 
  - QueueStateManager only needs the queue mapping in SchedulerQueueManager, so 
we do not need to reference the whole SchedulerQueueManager here? I don't have 
a strong opinion here though...
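
To make the naming and locking suggestions concrete, a small sketch (all names illustrative, not the patch under review):

{code}
// Illustrative sketch of the review suggestions above.
enum QueueState { RUNNING, DRAINING, STOPPED }

class QueueStateManager {
  // Coarse-grained locking: stopping a queue and all of its subqueues is
  // atomic with respect to concurrent activations.
  public synchronized void stopQueue(CSQueue queue) {
    queue.setState(QueueState.DRAINING);
    List<CSQueue> children = queue.getChildQueues();
    if (children != null) {
      for (CSQueue child : children) {
        stopQueue(child);
      }
    }
  }

  public synchronized void activateQueue(CSQueue queue) {
    queue.setState(QueueState.RUNNING);
  }
}
{code}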

> Add state-machine implementation for queues
> ---
>
> Key: YARN-5756
> URL: https://issues.apache.org/jira/browse/YARN-5756
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-5756.1.patch, YARN-5756.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3697) FairScheduler: ContinuousSchedulingThread can fail to shutdown

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3697:
-
Fix Version/s: 2.8.0

> FairScheduler: ContinuousSchedulingThread can fail to shutdown
> --
>
> Key: YARN-3697
> URL: https://issues.apache.org/jira/browse/YARN-3697
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Critical
> Fix For: 2.8.0, 2.7.2, 2.6.4, 3.0.0-alpha1
>
> Attachments: YARN-3697.000.patch, YARN-3697.001.patch
>
>
> FairScheduler: the ContinuousSchedulingThread sometimes can't be shut down 
> after stop. 
> The reason is that the InterruptedException is swallowed in 
> continuousSchedulingAttempt:
> {code}
>   try {
> if (node != null && Resources.fitsIn(minimumAllocation,
> node.getAvailableResource())) {
>   attemptScheduling(node);
> }
>   } catch (Throwable ex) {
> LOG.error("Error while attempting scheduling for node " + node +
> ": " + ex.toString(), ex);
>   }
> {code}
> I saw the following exception after stop:
> {code}
> 2015-05-17 23:30:43,065 WARN  [FairSchedulerContinuousScheduling] 
> event.AsyncDispatcher (AsyncDispatcher.java:handle(247)) - AsyncDispatcher 
> thread interrupted
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
>   at 
> java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
>   at 
> java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:338)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:244)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$ContainerStartedTransition.transition(RMContainerImpl.java:467)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$ContainerStartedTransition.transition(RMContainerImpl.java:462)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:387)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:58)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.allocate(FSAppAttempt.java:357)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:516)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:649)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.assignContainer(FSAppAttempt.java:803)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:334)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:173)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1082)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1014)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:285)
> 2015-05-17 23:30:43,066 ERROR [FairSchedulerContinuousScheduling] 
> fair.FairScheduler (FairScheduler.java:continuousSchedulingAttempt(1017)) - 
> Error while attempting scheduling for node host: 127.0.0.2:2 #containers=1 
> available= used=: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.InterruptedException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.InterruptedException
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:249)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$ContainerStartedTransition.transition(RMContainerImpl.java:467)
>   at 
> 

[jira] [Updated] (YARN-4005) Completed container whose app is finished is not removed from NMStateStore

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4005:
-
Fix Version/s: 2.8.0

> Completed container whose app is finished is not removed from NMStateStore
> --
>
> Key: YARN-4005
> URL: https://issues.apache.org/jira/browse/YARN-4005
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jun Gong
>Assignee: Jun Gong
> Fix For: 2.8.0, 2.7.2, 2.6.2, 3.0.0-alpha1
>
> Attachments: YARN-4005.01.patch
>
>
> If a container is completed and its corresponding app is finished, the NM 
> only removes it from its context and does not add it to 
> 'recentlyStoppedContainers' when calling 'getContainerStatuses'. As a 
> result, the NM will never remove it from the NMStateStore.
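
A sketch of the direction a fix might take (the surrounding method and the helper are assumed): record every completed container in recentlyStoppedContainers regardless of whether its application is still running, so the state-store removal path still fires:

{code}
// Illustrative sketch (placement assumed): inside the status-gathering
// path, completed containers are always recorded for later removal from
// the NM state store, even when their application has already finished.
if (containerStatus.getState() == ContainerState.COMPLETE) {
  context.getContainers().remove(containerId);
  addCompletedContainer(containerId);  // feeds recentlyStoppedContainers
}
{code}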



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5915) ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every event write

2016-12-01 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713282#comment-15713282
 ] 

Jason Lowe commented on YARN-5915:
--

Sure, if the output stream can be left unbuffered then having the buffering at 
the JSON generator is fine, and that's what the patch does.

+1 for the patch, I'll commit tomorrow if no objections.
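
For reference, a minimal sketch of the buffering arrangement under discussion, using the Jackson 2 API; the stream wiring and variable names are assumptions, not the exact patch:

{code}
// Sketch: buffer at the JSON generator and flush once per batch instead
// of after every event write.
ObjectMapper mapper = new ObjectMapper();
mapper.disable(SerializationFeature.FLUSH_AFTER_WRITE_VALUE);
JsonGenerator gen = mapper.getFactory()
    .createGenerator(outputStream, JsonEncoding.UTF8);
for (TimelineEntity entity : entities) {
  mapper.writeValue(gen, entity);  // stays in the generator's buffer
}
gen.flush();                       // a single flush for the whole batch
{code}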

> ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every 
> event write
> 
>
> Key: YARN-5915
> URL: https://issues.apache.org/jira/browse/YARN-5915
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.0.0-alpha1
>Reporter: Atul Sikaria
>Assignee: Atul Sikaria
> Attachments: YARN-5915.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5734) OrgQueue for easy CapacityScheduler queue configuration management

2016-12-01 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713297#comment-15713297
 ] 

Jonathan Hung commented on YARN-5734:
-

Thanks [~lewuathe], right now we are working on initial patches, and we will 
have a better idea of how to split tasks once we have a skeleton of the 
implementation. Regarding the target branch, we will have an option to use the 
flat configuration file as it is now, so this shouldn't be incompatible.

[~rkanter], thanks for the note. As you mentioned, configuration changes 
shouldn't be too frequent so we don't anticipate this being an issue but we'll 
definitely keep it in mind.

> OrgQueue for easy CapacityScheduler queue configuration management
> --
>
> Key: YARN-5734
> URL: https://issues.apache.org/jira/browse/YARN-5734
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Min Shen
>Assignee: Min Shen
> Attachments: OrgQueue_API-Based_Config_Management_v1.pdf, 
> OrgQueue_Design_v0.pdf
>
>
> The current xml based configuration mechanism in CapacityScheduler makes it 
> very inconvenient to apply any changes to the queue configurations. We saw 2 
> main drawbacks in the file based configuration mechanism:
> # This makes it very inconvenient to automate queue configuration updates. 
> For example, in our cluster setup, we leverage the queue mapping feature from 
> YARN-2411 to route users to their dedicated organization queues. It could be 
> extremely cumbersome to keep updating the config file to manage the very 
> dynamic mapping between users to organizations.
> # Even if a user has admin permission on one specific queue, that user is 
> unable to make queue configuration changes to resize the subqueues, change 
> queue ACLs, or create new queues. All these operations need to be performed 
> in a centralized manner by the cluster administrators.
> With these current limitations, we realized the need for a more flexible 
> configuration mechanism that allows queue configurations to be stored and 
> managed more dynamically. We developed the feature internally at LinkedIn; 
> it introduces the concept of a MutableConfigurationProvider. What it 
> essentially does is provide a set of configuration mutation APIs that 
> allow queue configurations to be updated externally through REST APIs. 
> When performing queue configuration changes, the queue ACLs are honored, 
> which means only queue administrators can make configuration changes to a 
> given queue. MutableConfigurationProvider is implemented as a pluggable 
> interface, and we have one implementation of this interface based on the 
> Derby embedded database.
> This feature has been deployed on LinkedIn's Hadoop cluster for a year now 
> and has gone through several iterations of gathering feedback from users 
> and improving accordingly. With this feature, cluster administrators are 
> able to automate many of the queue configuration management tasks, such as 
> setting queue capacities to adjust cluster resources between queues based 
> on established resource consumption patterns, or updating the user-to-queue 
> mappings. We have attached our design documentation to this ticket and 
> would like to receive feedback from the community regarding how to best 
> integrate it with the latest version of YARN.
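To make the pluggable-interface idea above concrete, a hedged sketch of what such a contract might look like; the method and type signatures here are invented for illustration, and the attached design documents describe the real API:

{code}
import java.io.IOException;
import java.util.Map;
import org.apache.hadoop.security.AccessControlException;

/** Illustrative sketch, not the actual interface from the design doc. */
public interface MutableConfigurationProvider {
  /**
   * Apply key/value updates to the scheduler configuration on behalf of
   * {@code user}, enforcing that user's queue ACLs before persisting the
   * change (e.g. to an embedded Derby database).
   */
  void mutateConfiguration(String user, Map<String, String> updates)
      throws IOException, AccessControlException;
}
{code}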



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5915) ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every event write

2016-12-01 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713331#comment-15713331
 ] 

Junping Du commented on YARN-5915:
--

Cool. +1 as well.

> ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every 
> event write
> 
>
> Key: YARN-5915
> URL: https://issues.apache.org/jira/browse/YARN-5915
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.0.0-alpha1
>Reporter: Atul Sikaria
>Assignee: Atul Sikaria
> Attachments: YARN-5915.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens

2016-12-01 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713359#comment-15713359
 ] 

Jason Lowe commented on YARN-5910:
--

Pinging [~daryn] since I'm sure he has an opinion on this.

I'm not sure distributing the keytab is going to be considered a reasonable 
thing to do in some setups.  Part of the point of getting a token is to avoid 
needing to ship a keytab everywhere.  Once we have a keytab, is there a need to 
have a token?  There's also the problem of needing to renew the token while the 
AM is waiting to get scheduled if the cluster is really busy.  If the AM isn't 
running it can't renew the token.

My preference is to have the token be as self-descriptive as we can possibly 
get.  Doing the ApplicationSubmissionContext thing could work for the HA case, 
but I could see this being a potentially non-trivial payload the RM has to bear 
for each app (configs can get quite large).  I'd rather avoid adding that to 
the context for this purpose if we can do so, but if the token cannot be 
self-descriptive in all cases then we may not have much other choice that I can 
see.

> Support for multi-cluster delegation tokens
> ---
>
> Key: YARN-5910
> URL: https://issues.apache.org/jira/browse/YARN-5910
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: security
>Reporter: Clay B.
>Priority: Minor
>
> As an administrator running many secure (kerberized) clusters, some which 
> have peer clusters managed by other teams, I am looking for a way to run jobs 
> which may require services running on other clusters. Particular cases where 
> this rears itself are running something as core as a distcp between two 
> kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp 
> hdfs://LOCALCLUSTER/user/user292/test.out 
> hdfs://REMOTECLUSTER/user/user292/test.out.result}}).
> Thanks to YARN-3021, one can run for a while but if the delegation token for 
> the remote cluster needs renewal the job will fail[1]. One can pre-configure 
> their {{hdfs-site.xml}} loaded by the YARN RM to know of all possible HDFSes 
> available but that requires coordination that is not always feasible, 
> especially as a cluster's peers grow into the tens of clusters or across 
> management teams. Ideally, one could have core systems configured this way 
> but jobs could also specify their own handling of tokens and management when 
> needed?
> [1]: Example stack trace when the RM is unaware of a remote service:
> 
> {code}
> 2016-03-23 14:59:50,528 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  application_1458441356031_3317 found existing hdfs token Kind: 
> HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: 
> (HDFS_DELEGATION_TOKEN token
>  10927 for user292)
> 2016-03-23 14:59:50,557 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, 
> Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for 
> user292)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Unable to map logical nameservice URI 
> 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a 
> failover proxy provider configured.
> at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164)
> at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1128)
> at org.apache.hadoop.security.token.Token.renew(Token.java:377)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:513)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> 

[jira] [Commented] (YARN-5197) RM leaks containers if running container disappears from node update

2016-12-01 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713384#comment-15713384
 ] 

Junping Du commented on YARN-5197:
--

I think the patch goes to branch-2.8 as well. Adding 2.8 to the fix versions.

> RM leaks containers if running container disappears from node update
> 
>
> Key: YARN-5197
> URL: https://issues.apache.org/jira/browse/YARN-5197
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.2, 2.6.4
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Fix For: 2.6.5, 2.7.4
>
> Attachments: YARN-5197-branch-2.7.003.patch, 
> YARN-5197-branch-2.8.003.patch, YARN-5197.001.patch, YARN-5197.002.patch, 
> YARN-5197.003.patch
>
>
> Once a node reports a container running in a status update, the corresponding 
> RMNodeImpl will track the container in its launchedContainers map.  If the 
> node somehow misses sending the completed container status to the RM and the 
> container simply disappears from subsequent heartbeats, the container will 
> leak in launchedContainers forever and the container completion event will 
> not be sent to the scheduler.
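A hedged sketch of the reconciliation idea (names are illustrative, not the committed patch): any container the RM still tracks as launched but missing from a node status update is synthesized as completed, so the scheduler can release it instead of leaking it:

{code}
// Compare the node's reported containers against launchedContainers.
Set<ContainerId> reported = new HashSet<>();
for (ContainerStatus status : statusEvent.getContainers()) {
  reported.add(status.getContainerId());
}
Iterator<ContainerId> it = launchedContainers.iterator();
while (it.hasNext()) {
  ContainerId id = it.next();
  if (!reported.contains(id)) {
    it.remove(); // stop tracking the vanished container
    // createLostContainerStatus is a hypothetical helper that builds an
    // "abnormal completion" status so a completion event still reaches
    // the scheduler.
    completedContainers.add(createLostContainerStatus(id));
  }
}
{code}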



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713412#comment-15713412
 ] 

Hadoop QA commented on YARN-5739:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
0s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
58s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
46s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
37s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} YARN-5355 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 1 new + 
29 unchanged - 1 fixed = 30 total (was 30) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
45s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  6m 
31s{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in 
the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | YARN-5739 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841363/YARN-5739-YARN-5355.006.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 69e37c9e8ef1 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-5355 / f734977 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14151/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14151/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice
 

[jira] [Commented] (YARN-5901) Fix race condition in TestGetGroups beforeclass setup()

2016-12-01 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713440#comment-15713440
 ] 

Daniel Templeton commented on YARN-5901:


LGTM +1

> Fix race condition in TestGetGroups beforeclass setup()
> ---
>
> Key: YARN-5901
> URL: https://issues.apache.org/jira/browse/YARN-5901
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: unittest
> Attachments: YARN-5901.02.patch, YARN-5901.03.patch, 
> yarn5901.001.patch
>
>
> In TestGetGroups, the class-level setup method spins up, in a child thread, a 
> resource manager that YARN clients can talk to. But it checks whether the 
> resource manager is fully started by doing resourcemanager.getServiceState() 
> == STATE.STARTED. This is not reliable, since resourcemanager.start() first 
> triggers the service state change in RM and then starts up all the services 
> added to it. We need to wait for RM to fully start before YARN clients can 
> send requests. Otherwise, the tests can fail with a "connection refused" 
> exception when the main thread sends out client requests to RM before the 
> RPC server has fired up in the child thread.
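One way to make the readiness check reliable, sketched under the assumption of an already-configured YarnClient (an illustration of the approach, not necessarily the committed test code):

{code}
// Poll an actual client RPC instead of the RM service state; it only
// succeeds once ClientRMService's RPC server is accepting connections.
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    try {
      yarnClient.getYarnClusterMetrics(); // any cheap RPC works
      return true;
    } catch (Exception e) {
      return false; // server not up yet; keep polling
    }
  }
}, 100, 30000); // check every 100 ms, give up after 30 s
{code}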



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens

2016-12-01 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713444#comment-15713444
 ] 

Jian He commented on YARN-5910:
---

Thanks for your input, Jason.
bq. Once we have a keytab, is there a need to have a token?
The map/reduce tasks can continue to use the token.
bq. My preference is to have the token be as self-descriptive as we can 
possibly get. 
I agree this sounds like a better approach, but it requires a lot of work in 
HDFS.
bq. but I could see this being a potentially non-trivial payload the RM has to 
bear for each app
In this case, we can set the conf object to null once RM gets what it wants.

I'll talk with some HDFS folks to see whether this is doable on their side. 
Otherwise, I think passing a conf object and then voiding it might be a 
straightforward approach at this point. Waiting for [~daryn]'s input as well.
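For reference, the "pre-configure hdfs-site.xml" workaround mentioned in the quoted description below amounts to teaching the RM-side configuration about each remote HA nameservice, roughly as follows (hostnames are placeholders); the last property is exactly what the "failover proxy provider" error in the stack trace is missing:

{code}
<property>
  <name>dfs.nameservices</name>
  <value>LOCALCLUSTER,REMOTECLUSTER</value>
</property>
<property>
  <name>dfs.ha.namenodes.REMOTECLUSTER</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.REMOTECLUSTER.nn1</name>
  <value>remote-nn1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.REMOTECLUSTER.nn2</name>
  <value>remote-nn2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.REMOTECLUSTER</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
{code}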


> Support for multi-cluster delegation tokens
> ---
>
> Key: YARN-5910
> URL: https://issues.apache.org/jira/browse/YARN-5910
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: security
>Reporter: Clay B.
>Priority: Minor
>
> As an administrator running many secure (kerberized) clusters, some which 
> have peer clusters managed by other teams, I am looking for a way to run jobs 
> which may require services running on other clusters. Particular cases where 
> this rears itself are running something as core as a distcp between two 
> kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp 
> hdfs://LOCALCLUSTER/user/user292/test.out 
> hdfs://REMOTECLUSTER/user/user292/test.out.result}}).
> Thanks to YARN-3021, one can run for a while but if the delegation token for 
> the remote cluster needs renewal the job will fail[1]. One can pre-configure 
> their {{hdfs-site.xml}} loaded by the YARN RM to know of all possible HDFSes 
> available but that requires coordination that is not always feasible, 
> especially as a cluster's peers grow into the tens of clusters or across 
> management teams. Ideally, one could have core systems configured this way 
> but jobs could also specify their own handling of tokens and management when 
> needed?
> [1]: Example stack trace when the RM is unaware of a remote service:
> 
> {code}
> 2016-03-23 14:59:50,528 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  application_1458441356031_3317 found existing hdfs token Kind: 
> HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: 
> (HDFS_DELEGATION_TOKEN token
>  10927 for user292)
> 2016-03-23 14:59:50,557 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, 
> Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for 
> user292)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Unable to map logical nameservice URI 
> 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a 
> failover proxy provider configured.
> at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164)
> at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1128)
> at org.apache.hadoop.security.token.Token.renew(Token.java:377)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:513)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:511)
> at 
> 

[jira] [Commented] (YARN-5901) Fix race condition in TestGetGroups beforeclass setup()

2016-12-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713513#comment-15713513
 ] 

Hudson commented on YARN-5901:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10926 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10926/])
YARN-5901. Fix race condition in TestGetGroups beforeclass setup() (templedf: 
rev 2d77dc727d9b5e56009bbc36643d85500efcbca5)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestGetGroups.java


> Fix race condition in TestGetGroups beforeclass setup()
> ---
>
> Key: YARN-5901
> URL: https://issues.apache.org/jira/browse/YARN-5901
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: unittest
> Attachments: YARN-5901-branch-2.03.patch, YARN-5901.02.patch, 
> YARN-5901.03.patch, yarn5901.001.patch
>
>
> In TestGetGroups, the class-level setup method spins up, in a child thread, a 
> resource manager that YARN clients can talk to. But it checks whether the 
> resource manager is fully started by doing resourcemanager.getServiceState() 
> == STATE.STARTED. This is not reliable, since resourcemanager.start() first 
> triggers the service state change in RM and then starts up all the services 
> added to it. We need to wait for RM to fully start before YARN clients can 
> send requests. Otherwise, the tests can fail with a "connection refused" 
> exception when the main thread sends out client requests to RM before the 
> RPC server has fired up in the child thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4931) Preempted resources go back to the same application

2016-12-01 Thread Miles Crawford (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712538#comment-15712538
 ] 

Miles Crawford commented on YARN-4931:
--

```

drf
5
20

0.8
fair

```

We use Amazon's EMR, so the yarn-site and mapred-site.xml are a part of their 
managed setup.

> Preempted resources go back to the same application
> ---
>
> Key: YARN-4931
> URL: https://issues.apache.org/jira/browse/YARN-4931
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
> Attachments: resourcemanager.log
>
>
> Sometimes a queue that needs resources causes preemption - but the preempted 
> containers are just allocated right back to the application that just 
> released them!
> Here is a tiny application (0007) that wants resources, and a container is 
> preempted from application 0002 to satisfy it:
> {code}
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler 
> (FairSchedulerUpdateThread): Should preempt  res for 
> queue root.default: resDueToMinShare = , 
> resDueToFairShare = 
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler 
> (FairSchedulerUpdateThread): Preempting container (prio=1res= vCores:1>) from queue root.milesc
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics
>  (FairSchedulerUpdateThread): Non-AM container preempted, current 
> appAttemptId=appattempt_1460047303577_0002_01, 
> containerId=container_1460047303577_0002_01_001038, resource= vCores:1>
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (FairSchedulerUpdateThread): container_1460047303577_0002_01_001038 Container 
> Transitioned from RUNNING to KILLED
> {code}
> But then a moment later, application 2 gets the container right back:
> {code}
> 2016-04-07 21:08:13,844 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode 
> (ResourceManager Event Processor): Assigned container 
> container_1460047303577_0002_01_001039 of capacity  
> on host ip-10-12-40-63.us-west-2.compute.internal:8041, which has 13 
> containers,  used and  
> available after allocation
> 2016-04-07 21:08:14,555 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (IPC Server handler 59 on 8030): container_1460047303577_0002_01_001039 
> Container Transitioned from ALLOCATED to ACQUIRED
> 2016-04-07 21:08:14,845 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (ResourceManager Event Processor): container_1460047303577_0002_01_001039 
> Container Transitioned from ACQUIRED to RUNNING
> {code}
> This results in new applications being unable to even get an AM, and never 
> starting at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-4931) Preempted resources go back to the same application

2016-12-01 Thread Miles Crawford (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712538#comment-15712538
 ] 

Miles Crawford edited comment on YARN-4931 at 12/1/16 5:24 PM:
---

{code}


drf
5
20

0.8
fair

{code}
We use Amazon's EMR, so the yarn-site and mapred-site.xml are a part of their 
managed setup.


was (Author: milesc):
```

drf
5
20

0.8
fair

```

We use Amazon's EMR, so the yarn-site and mapred-site.xml are a part of their 
managed setup.

> Preempted resources go back to the same application
> ---
>
> Key: YARN-4931
> URL: https://issues.apache.org/jira/browse/YARN-4931
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
> Attachments: resourcemanager.log
>
>
> Sometimes a queue that needs resources causes preemption - but the preempted 
> containers are just allocated right back to the application that just 
> released them!
> Here is a tiny application (0007) that wants resources, and a container is 
> preempted from application 0002 to satisfy it:
> {code}
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler 
> (FairSchedulerUpdateThread): Should preempt  res for 
> queue root.default: resDueToMinShare = , 
> resDueToFairShare = 
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler 
> (FairSchedulerUpdateThread): Preempting container (prio=1res= vCores:1>) from queue root.milesc
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics
>  (FairSchedulerUpdateThread): Non-AM container preempted, current 
> appAttemptId=appattempt_1460047303577_0002_01, 
> containerId=container_1460047303577_0002_01_001038, resource= vCores:1>
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (FairSchedulerUpdateThread): container_1460047303577_0002_01_001038 Container 
> Transitioned from RUNNING to KILLED
> {code}
> But then a moment later, application 2 gets the container right back:
> {code}
> 2016-04-07 21:08:13,844 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode 
> (ResourceManager Event Processor): Assigned container 
> container_1460047303577_0002_01_001039 of capacity  
> on host ip-10-12-40-63.us-west-2.compute.internal:8041, which has 13 
> containers,  used and  
> available after allocation
> 2016-04-07 21:08:14,555 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (IPC Server handler 59 on 8030): container_1460047303577_0002_01_001039 
> Container Transitioned from ALLOCATED to ACQUIRED
> 2016-04-07 21:08:14,845 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (ResourceManager Event Processor): container_1460047303577_0002_01_001039 
> Container Transitioned from ACQUIRED to RUNNING
> {code}
> This results in new applications being unable to even get an AM, and never 
> starting at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5694) ZKRMStateStore can prevent the transition to standby in branch-2.7 if the ZK node is unreachable

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713847#comment-15713847
 ] 

Hadoop QA commented on YARN-5694:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m  
8s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
26s{color} | {color:green} branch-2.6 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} branch-2.6 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} branch-2.6 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} branch-2.6 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} branch-2.6 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} branch-2.6 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} branch-2.6 passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.6 
failed with JDK v1.8.0_111. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} branch-2.6 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4112 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  1m 
35s{color} | {color:red} The patch 74 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
15s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed 
with JDK v1.8.0_111. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 47m 27s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
31s{color} | {color:red} The patch generated 127 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}129m 36s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_111 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_121 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:44eef0e |
| 

[jira] [Commented] (YARN-5714) ContainerExecutor does not order environment map

2016-12-01 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713858#comment-15713858
 ] 

Miklos Szegedi commented on YARN-5714:
--

Thank you for the patch, [~rcatherinot]!
1. "%%A%%" should not match any variable on Windows, but the current patch 
returns "A"; a double %% is replaced with a single %.
2. This is a potential double increment of i without checking i < l:
{code}
994 if (c == '\\') {
995   i++;
996 }
997 if (c == '\'') {
998   break;
999 }
1000i++;
{code}
3. There is a typo in this sentence:
{code}
855   // 2 - we need a map implementation that support entries to be
{code}

4. You have an iteration nested 3 levels deep. This will run at every 
container launch, so it might cause perf issues. In particular, 
{code}copy.containsKey(envDep){code} will be called again and again (6 times?) 
until all dependencies are added, if a set is ordered like this:
{code}
D=$A $B $C
A=*
B=*
C=*
{code}

Have you considered an algorithm where you scan the dependencies only once? 
(A fuller Java sketch follows after this comment.) You may not even need the 
hash map in that case.
{code}
for (env : envs)
  stack.push(env)
while (!stack.empty)
  env = stack.pop()
  if (!env.marked)
    env.mark
    if (!env.dependencies.empty)
      stack.push(env)              // revisit env after its dependencies
      for (dep : env.dependencies)
        stack.push(dep)
    else
      ordered.putIfNotThere(env)
  else
    ordered.putIfNotThere(env)     // dependencies already emitted
{code}
5. It might be slower (needs to be tested) but have you considered regex for 
parsing? It would make the code shorter and easier to understand.
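To make point 4 concrete, a hedged Java sketch of a depth-first ordering that visits each variable once; the deps map (variable name to the names its value references) is assumed to be built by a hypothetical parsing helper, and none of this is from the patch itself:

{code}
import java.util.*;

public final class EnvOrderer {
  /** Returns env var names ordered so that dependencies come first. */
  static List<String> order(Map<String, String> env,
                            Map<String, Set<String>> deps) {
    List<String> ordered = new ArrayList<>();
    Set<String> emitted = new HashSet<>();
    Set<String> visiting = new HashSet<>();
    for (String var : env.keySet()) {
      visit(var, env, deps, emitted, visiting, ordered);
    }
    return ordered;
  }

  private static void visit(String var, Map<String, String> env,
      Map<String, Set<String>> deps, Set<String> emitted,
      Set<String> visiting, List<String> ordered) {
    // Skip if already emitted, defined outside the map, or part of a cycle.
    if (emitted.contains(var) || !env.containsKey(var)
        || !visiting.add(var)) {
      return;
    }
    for (String dep : deps.getOrDefault(var, Collections.emptySet())) {
      visit(dep, env, deps, emitted, visiting, ordered); // deps first
    }
    visiting.remove(var);
    emitted.add(var);
    ordered.add(var);
  }
}
{code}

Each variable passes the guard at most once, so the ordering is linear in the number of variables plus references, regardless of how the input set is ordered.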

> ContainerExecutor does not order environment map
> 
>
> Key: YARN-5714
> URL: https://issues.apache.org/jira/browse/YARN-5714
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.1, 2.5.2, 2.7.3, 2.6.4, 3.0.0-alpha1
> Environment: all (linux and windows alike)
>Reporter: Remi Catherinot
>Assignee: Remi Catherinot
>Priority: Trivial
>  Labels: oct16-medium
> Attachments: YARN-5714.001.patch, YARN-5714.002.patch, 
> YARN-5714.003.patch, YARN-5714.004.patch
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> When dumping the launch container script, environment variables are dumped 
> in the order internally used by the map implementation (hash based). It 
> does not take into consideration that some env variables may refer to each 
> other, so some env variables must be declared before those referencing 
> them.
> In my case, I ended up having LD_LIBRARY_PATH, which depends on 
> HADOOP_COMMON_HOME, dumped before HADOOP_COMMON_HOME. Thus it had a wrong 
> value, so native libraries weren't loaded; jobs were running but not at 
> their best efficiency. This is just one use case falling into that bug, but 
> I'm sure others may happen as well.
> I already have a patch running in my production environment; I estimate 5 
> days for packaging the patch in the right fashion for JIRA plus trying my 
> best to add tests.
> Note: the patch is OS-aware with a default empty implementation. I will 
> only implement the Unix version in a first release. I'm not used to Windows 
> env variable syntax, so it will take me more time/research for it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5938) Minor refactoring to OpportunisticContainerAllocatorAMService

2016-12-01 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-5938:
--
Attachment: YARN-5938-YARN-5085.002.patch

Updating patch: moved SchedulerRequestKey to a common location, so it can be 
used by the OpportunisticAllocator.

> Minor refactoring to OpportunisticContainerAllocatorAMService
> -
>
> Key: YARN-5938
> URL: https://issues.apache.org/jira/browse/YARN-5938
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5938-YARN-5085.001.patch, 
> YARN-5938-YARN-5085.002.patch, YARN-5938.001.patch
>
>
> Minor code re-organization to do the following:
> # The OpportunisticContainerAllocatorAMService currently allocates outside 
> the ApplicationAttempt lock maintained by the ApplicationMasterService. This 
> should happen inside the lock.
> # Refactored out some code to simplify the allocate() method.
> # Removed some unused fields inside the OpportunisticContainerAllocator.
> # Re-organized some of the code in the 
> OpportunisticContainerAllocatorAMService::allocate method to make it a bit 
> more readable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5959) Add support for ExecutionType change from OPPORTUNISTIC to GUARANTEED

2016-12-01 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh reassigned YARN-5959:
-

Assignee: Arun Suresh

> Add support for ExecutionType change from OPPORTUNISTIC to GUARANTEED
> -
>
> Key: YARN-5959
> URL: https://issues.apache.org/jira/browse/YARN-5959
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5959) Add support for ExecutionType change from OPPORTUNISTIC to GUARANTEED

2016-12-01 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5959:
-

 Summary: Add support for ExecutionType change from OPPORTUNISTIC 
to GUARANTEED
 Key: YARN-5959
 URL: https://issues.apache.org/jira/browse/YARN-5959
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5548) Use MockRMMemoryStateStore to reduce test failures

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713918#comment-15713918
 ] 

Hadoop QA commented on YARN-5548:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 23s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 4 new + 416 unchanged - 5 fixed = 420 total (was 421) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 39m 
18s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 54m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5548 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841399/YARN-5548.0009.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 84f5f80e5f13 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c87b3a4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14155/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14155/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/14155/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Use MockRMMemoryStateStore to reduce test failures
> --
>
> Key: YARN-5548
> 

[jira] [Updated] (YARN-5548) Use MockRMMemoryStateStore to reduce test failures

2016-12-01 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-5548:
---
Attachment: YARN-5548.0009.patch

> Use MockRMMemoryStateStore to reduce test failures
> --
>
> Key: YARN-5548
> URL: https://issues.apache.org/jira/browse/YARN-5548
> Project: Hadoop YARN
>  Issue Type: Test
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>  Labels: oct16-easy, test
> Attachments: YARN-5548.0001.patch, YARN-5548.0002.patch, 
> YARN-5548.0003.patch, YARN-5548.0004.patch, YARN-5548.0005.patch, 
> YARN-5548.0006.patch, YARN-5548.0007.patch, YARN-5548.0008.patch, 
> YARN-5548.0009.patch
>
>
> https://builds.apache.org/job/PreCommit-YARN-Build/12850/testReport/org.apache.hadoop.yarn.server.resourcemanager/TestRMRestart/testFinishedAppRemovalAfterRMRestart/
> {noformat}
> Error Message
> Stacktrace
> java.lang.AssertionError: expected null, but was: application_submission_context { application_id { id: 1 cluster_timestamp: 
> 1471885197388 } application_name: "" queue: "default" priority { priority: 0 
> } am_container_spec { } cancel_tokens_when_complete: true maxAppAttempts: 2 
> resource { memory: 1024 virtual_cores: 1 } applicationType: "YARN" 
> keep_containers_across_application_attempts: false 
> attempt_failures_validity_interval: 0 am_container_resource_request { 
> priority { priority: 0 } resource_name: "*" capability { memory: 1024 
> virtual_cores: 1 } num_containers: 0 relax_locality: true 
> node_label_expression: "" execution_type_request { execution_type: GUARANTEED 
> enforce_execution_type: false } } } user: "jenkins" start_time: 1471885197417 
> application_state: RMAPP_FINISHED finish_time: 1471885197478>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotNull(Assert.java:664)
>   at org.junit.Assert.assertNull(Assert.java:646)
>   at org.junit.Assert.assertNull(Assert.java:656)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testFinishedAppRemovalAfterRMRestart(TestRMRestart.java:1656)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5922) Remove direct references of HBaseTimelineWriter/Reader in core ATS classes

2016-12-01 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713815#comment-15713815
 ] 

Sangjin Lee commented on YARN-5922:
---

Sorry, I was assuming incorrectly that these properties were not already in 
{{yarn-default.xml}}.

The real reason the test is failing is that these default-value definitions 
in {{YarnConfiguration}} are mistakenly flagged as *property keys*. You'd 
need to skip them using {{configurationPropsToSkipCompare}}. This is how a 
similar issue is addressed in {{TestYarnConfigurationFields}}:
{code}
configurationPropsToSkipCompare
.add(YarnConfiguration.DEFAULT_IPC_RECORD_FACTORY_CLASS);
configurationPropsToSkipCompare
.add(YarnConfiguration.DEFAULT_IPC_CLIENT_FACTORY_CLASS);
configurationPropsToSkipCompare
.add(YarnConfiguration.DEFAULT_IPC_SERVER_FACTORY_CLASS);
{code}
We should add our 2 default fields to get around this.
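A sketch of that suggestion, assuming the two new constants are named DEFAULT_TIMELINE_SERVICE_WRITER_CLASS and DEFAULT_TIMELINE_SERVICE_READER_CLASS (adjust to whatever names the patch actually uses):

{code}
configurationPropsToSkipCompare
    .add(YarnConfiguration.DEFAULT_TIMELINE_SERVICE_WRITER_CLASS);
configurationPropsToSkipCompare
    .add(YarnConfiguration.DEFAULT_TIMELINE_SERVICE_READER_CLASS);
{code}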

> Remove direct references of HBaseTimelineWriter/Reader in core ATS classes
> --
>
> Key: YARN-5922
> URL: https://issues.apache.org/jira/browse/YARN-5922
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-5922-YARN-5355.01.patch, 
> YARN-5922-YARN-5355.02.patch, YARN-5922.01.patch, YARN-5922.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application

2016-12-01 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713829#comment-15713829
 ] 

Sangjin Lee commented on YARN-5739:
---

The latest patch LGTM. Thanks [~gtCarrera9] for the update!

I'll wait until [~varun_saxena] takes a look at it, and then commit it if there 
are no objections.

> Provide timeline reader API to list available timeline entity types for one 
> application
> ---
>
> Key: YARN-5739
> URL: https://issues.apache.org/jira/browse/YARN-5739
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-5739-YARN-5355.001.patch, 
> YARN-5739-YARN-5355.002.patch, YARN-5739-YARN-5355.003.patch, 
> YARN-5739-YARN-5355.004.patch, YARN-5739-YARN-5355.005.patch, 
> YARN-5739-YARN-5355.006.patch
>
>
> Right now we only show a part of the available timeline entity data in the 
> new YARN UI. However, some data (especially library-specific data) cannot 
> be queried through the web UI. It would be appealing for the UI to provide 
> an "entity browser" for each YARN application. Actually, simply dumping out 
> the available timeline entities (with proper pagination, of course) would 
> be pretty helpful for UI users. 
> On the timeline side, we're not far away from this goal. Right now I believe 
> the only thing missing is a way to list all available entity types within 
> one application. The challenge here is that we're not storing this data for 
> each application, but given that this kind of call is relatively rare 
> (compared to writes and updates) we can perform some scanning at read time. 
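A hedged sketch of the read-time scan idea from the description, using the HBase client API; rowKeyPrefix() and extractEntityType() are hypothetical helpers, not the patch's actual code:

{code}
// Collect the distinct entity types stored under one application's row
// prefix; only row keys are fetched, values are skipped.
Scan scan = new Scan();
scan.setRowPrefixFilter(rowKeyPrefix(clusterId, userId, flowName, appId));
scan.setFilter(new KeyOnlyFilter());
Set<String> entityTypes = new TreeSet<>();
try (ResultScanner scanner = entityTable.getScanner(scan)) {
  for (Result result : scanner) {
    entityTypes.add(extractEntityType(result.getRow()));
  }
}
{code}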



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5292) Support for PAUSED container state

2016-12-01 Thread Hitesh Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713883#comment-15713883
 ] 

Hitesh Sharma edited comment on YARN-5292 at 12/2/16 3:21 AM:
--

[~asuresh]], can you please look at the attached patch? Thanks!


was (Author: hrsharma):
[~arun suresh], can you please look at the attached patch? Thanks!

> Support for PAUSED container state
> --
>
> Key: YARN-5292
> URL: https://issues.apache.org/jira/browse/YARN-5292
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Hitesh Sharma
>Assignee: Hitesh Sharma
> Attachments: YARN-5292.001.patch, YARN-5292.002.patch, 
> YARN-5292.003.patch, YARN-5292.004.patch, yarn-5292.pdf
>
>
> YARN-2877 introduced OPPORTUNISTIC containers, and YARN-5216 proposes to add 
> capability to customize how OPPORTUNISTIC containers get preempted.
> In this JIRA we propose introducing a PAUSED container state.
> When a running container gets preempted, it enters the PAUSED state, where it 
> remains until resources get freed up on the node; then the preempted 
> container can resume to the running state.
>  
> One scenario where this capability is useful is work preservation. How 
> preemption is done, and whether the container supports it, is implementation 
> specific.
> For instance, if the container is a virtual machine, then preempt would pause 
> the VM and resume would restore it back to the running state.
> If the container doesn't support preemption, then preempt would default to 
> killing the container. 
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5292) Support for PAUSED container state

2016-12-01 Thread Hitesh Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713883#comment-15713883
 ] 

Hitesh Sharma commented on YARN-5292:
-

[~arun suresh], can you please look at the attached patch? Thanks!

> Support for PAUSED container state
> --
>
> Key: YARN-5292
> URL: https://issues.apache.org/jira/browse/YARN-5292
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Hitesh Sharma
>Assignee: Hitesh Sharma
> Attachments: YARN-5292.001.patch, YARN-5292.002.patch, 
> YARN-5292.003.patch, YARN-5292.004.patch, yarn-5292.pdf
>
>
> YARN-2877 introduced OPPORTUNISTIC containers, and YARN-5216 proposes to add 
> capability to customize how OPPORTUNISTIC containers get preempted.
> In this JIRA we propose introducing a PAUSED container state.
> When a running container gets preempted, it enters the PAUSED state, where it 
> remains until resources get freed up on the node; then the preempted 
> container can resume to the running state.
>  
> One scenario where this capability is useful is work preservation. How 
> preemption is done, and whether the container supports it, is implementation 
> specific.
> For instance, if the container is a virtual machine, then preempt would pause 
> the VM and resume would restore it back to the running state.
> If the container doesn't support preemption, then preempt would default to 
> killing the container. 
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5292) Support for PAUSED container state

2016-12-01 Thread Hitesh Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713883#comment-15713883
 ] 

Hitesh Sharma edited comment on YARN-5292 at 12/2/16 3:21 AM:
--

[~asuresh], can you please look at the attached patch? Thanks!


was (Author: hrsharma):
[~asuresh]], can you please look at the attached patch? Thanks!

> Support for PAUSED container state
> --
>
> Key: YARN-5292
> URL: https://issues.apache.org/jira/browse/YARN-5292
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Hitesh Sharma
>Assignee: Hitesh Sharma
> Attachments: YARN-5292.001.patch, YARN-5292.002.patch, 
> YARN-5292.003.patch, YARN-5292.004.patch, yarn-5292.pdf
>
>
> YARN-2877 introduced OPPORTUNISTIC containers, and YARN-5216 proposes to add 
> capability to customize how OPPORTUNISTIC containers get preempted.
> In this JIRA we propose introducing a PAUSED container state.
> When a running container gets preempted, it enters the PAUSED state, where it 
> remains until resources free up on the node; the preempted container can then 
> resume to the running state.
>  
> One scenario where this capability is useful is work preservation. How 
> preemption is done, and whether the container supports it, is implementation 
> specific.
> For instance, if the container is a virtual machine, then preempt would pause 
> the VM and resume would restore it back to the running state.
> If the container doesn't support preemption, then preempt would default to 
> killing the container. 
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5901) Fix race condition in TestGetGroups beforeclass setup()

2016-12-01 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated YARN-5901:
---
Fix Version/s: 3.0.0-alpha2

> Fix race condition in TestGetGroups beforeclass setup()
> ---
>
> Key: YARN-5901
> URL: https://issues.apache.org/jira/browse/YARN-5901
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: unittest
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5901-branch-2.03.patch, YARN-5901.02.patch, 
> YARN-5901.03.patch, yarn5901.001.patch
>
>
> In TestGetGroups, the class-level setup method spins up, in a child thread, a 
> resource manager that YARN clients can talk to. It checks whether the 
> resource manager has fully started by testing 
> resourcemanager.getServiceState() == STATE.STARTED. This is not reliable, 
> since resourcemanager.start() first triggers the service state change in the 
> RM and then starts up all the services added to the RM. We need to wait for 
> the RM to fully start before YARN clients send requests; otherwise, the tests 
> can fail with a "connection refused" exception when the main thread sends 
> client requests to the RM before the RPC server has started in the child 
> thread.
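> A hedged sketch of a more reliable wait: poll until the RM's client RPC port 
> actually accepts connections (names and port handling are illustrative, not 
> the real test code):
> {code}
> import java.io.IOException;
> import java.net.InetSocketAddress;
> import java.net.Socket;
> 
> final class WaitForRm {
>   static void waitForRpcServer(String host, int port, long timeoutMs)
>       throws IOException, InterruptedException {
>     long deadline = System.currentTimeMillis() + timeoutMs;
>     while (System.currentTimeMillis() < deadline) {
>       try (Socket s = new Socket()) {
>         s.connect(new InetSocketAddress(host, port), 1000);
>         return; // RPC server is up and accepting connections
>       } catch (IOException retry) {
>         Thread.sleep(100); // not up yet; back off and retry
>       }
>     }
>     throw new IOException("RM RPC server not up within " + timeoutMs + "ms");
>   }
> }
> {code}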



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5694) ZKRMStateStore can prevent the transition to standby in branch-2.7 if the ZK node is unreachable

2016-12-01 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated YARN-5694:
---
Attachment: YARN-5694.branch-2.6.002.patch

Uploading new branch-2.6 patch to fix the test.

> ZKRMStateStore can prevent the transition to standby in branch-2.7 if the ZK 
> node is unreachable
> 
>
> Key: YARN-5694
> URL: https://issues.apache.org/jira/browse/YARN-5694
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.3
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Critical
>  Labels: oct16-medium
> Attachments: YARN-5694.001.patch, YARN-5694.002.patch, 
> YARN-5694.003.patch, YARN-5694.004.patch, YARN-5694.004.patch, 
> YARN-5694.005.patch, YARN-5694.006.patch, YARN-5694.007.patch, 
> YARN-5694.008.patch, YARN-5694.branch-2.6.001.patch, 
> YARN-5694.branch-2.6.002.patch, YARN-5694.branch-2.7.001.patch, 
> YARN-5694.branch-2.7.002.patch, YARN-5694.branch-2.7.004.patch, 
> YARN-5694.branch-2.7.005.patch
>
>
> {{ZKRMStateStore.doStoreMultiWithRetries()}} holds the lock while trying to 
> talk to ZK.  If the connection fails, it will retry while still holding the 
> lock.  The retries are intended to be strictly time limited, but when the ZK 
> node is unreachable, the time limit is not enforced, resulting in the thread 
> holding the lock for over an hour.  Transitioning the RM to standby requires 
> that same lock, so in exactly the case where the RM should be transitioning 
> to standby, the {{VerifyActiveStatusThread}} blocks it from happening.
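> A schematic of the hazard and one way out, using a plain lock rather than 
> the real ZKRMStateStore code (hypothetical names):
> {code}
> import java.util.concurrent.locks.ReentrantLock;
> 
> class StoreSketch {
>   private final ReentrantLock lock = new ReentrantLock();
> 
>   // Shape of the reported bug: the lock is held across every retry, so a
>   // transition-to-standby needing the same lock blocks for the whole window.
>   void storeHoldingLock(Runnable zkOp, int retries)
>       throws InterruptedException {
>     lock.lock();
>     try {
>       for (int i = 0; i < retries; i++) {
>         try {
>           zkOp.run();         // one attempt to talk to ZK
>           return;
>         } catch (RuntimeException e) {
>           Thread.sleep(1000); // backs off while STILL holding the lock
>         }
>       }
>     } finally {
>       lock.unlock();
>     }
>   }
> 
>   // Safer shape: hold the lock only for the duration of a single attempt.
>   void storePerAttemptLock(Runnable zkOp, int retries)
>       throws InterruptedException {
>     for (int i = 0; i < retries; i++) {
>       lock.lock();
>       try {
>         zkOp.run();
>         return;
>       } catch (RuntimeException e) {
>         // attempt failed; the lock is released below before backing off
>       } finally {
>         lock.unlock();
>       }
>       Thread.sleep(1000);     // back off outside the critical section
>     }
>   }
> }
> {code}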



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4000) RM crashes with NPE if leaf queue becomes parent queue during restart

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4000:
-
Fix Version/s: 2.8.0

> RM crashes with NPE if leaf queue becomes parent queue during restart
> -
>
> Key: YARN-4000
> URL: https://issues.apache.org/jira/browse/YARN-4000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: YARN-4000-branch-2.7.01.patch, YARN-4000.01.patch, 
> YARN-4000.02.patch, YARN-4000.03.patch, YARN-4000.04.patch, 
> YARN-4000.05.patch, YARN-4000.06.patch
>
>
> This is a similar situation to YARN-2308.  If an application is active in 
> queue A and the RM then restarts with a changed capacity scheduler 
> configuration in which queue A becomes a parent queue to other subqueues, 
> the RM will crash with a NullPointerException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3975) WebAppProxyServlet should not redirect to RM page if AHS is enabled

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3975:
-
Fix Version/s: 2.8.0

> WebAppProxyServlet should not redirect to RM page if AHS is enabled
> ---
>
> Key: YARN-3975
> URL: https://issues.apache.org/jira/browse/YARN-3975
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Mit Desai
>Assignee: Mit Desai
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: YARN-3975.2.b2.patch, YARN-3975.3.patch, 
> YARN-3975.4.patch, YARN-3975.5.patch, YARN-3975.6.patch, YARN-3975.7.patch, 
> YARN-3975.8.patch, YARN-3975.9.b2.7.patch, YARN-3975.9.patch, 
> YARN-3975.9.patch
>
>
> WebAppProxyServlet should be updated to handle the case when the app report 
> doesn't have a tracking URL and the Application History Server is enabled.
> Since we would have already tried the RM and gotten the 
> ApplicationNotFoundException, we should not direct the user to the RM app 
> page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3969) Allow jobs to be submitted to reservation that is active but does not have any allocations

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3969:
-
Fix Version/s: 2.8.0

> Allow jobs to be submitted to reservation that is active but does not have 
> any allocations
> --
>
> Key: YARN-3969
> URL: https://issues.apache.org/jira/browse/YARN-3969
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: YARN-3969-v1.patch, YARN-3969-v2.patch
>
>
> YARN-1051 introduces the notion of reserving resources prior to job 
> submission. A reservation is active from its arrival time to its deadline, 
> but in the interim there can be periods when it does not have any resources 
> allocated. We currently reject jobs that are submitted while the reservation 
> allocation is zero. Instead we should accept and queue the jobs until 
> resources become available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5746) The state of the parentQueue and its childQueues should be synchronized.

2016-12-01 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-5746:

Attachment: YARN-5746.7.patch

Fix the unit test failure

> The state of the parentQueue and its childQueues should be synchronized.
> 
>
> Key: YARN-5746
> URL: https://issues.apache.org/jira/browse/YARN-5746
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>  Labels: oct16-easy
> Attachments: YARN-5746.1.patch, YARN-5746.2.patch, YARN-5746.3.patch, 
> YARN-5746.4.patch, YARN-5746.5.patch, YARN-5746.6.patch, YARN-5746.7.patch
>
>
> The state of the parentQueue and its childQueues needs to be synchronized. 
> * If the state of the parentQueue becomes STOPPED, the state of its 
> childQueues needs to become STOPPED as well. 
> * If we change the state of a queue to RUNNING, we should make sure the 
> states of all its ancestors are RUNNING. Otherwise, we need to fail this 
> operation.
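> A minimal sketch of this invariant, with hypothetical fields rather than the 
> actual CapacityScheduler queue classes:
> {code}
> import java.util.ArrayList;
> import java.util.List;
> 
> class Queue {
>   enum State { RUNNING, STOPPED }
>   State state = State.RUNNING;
>   Queue parent;
>   final List<Queue> children = new ArrayList<>();
> 
>   /** Stopping a parent queue stops the whole subtree. */
>   void stop() {
>     state = State.STOPPED;
>     for (Queue child : children) {
>       child.stop();
>     }
>   }
> 
>   /** Starting a queue is only legal if every ancestor is RUNNING. */
>   void start() {
>     for (Queue q = parent; q != null; q = q.parent) {
>       if (q.state == State.STOPPED) {
>         throw new IllegalStateException("Cannot start: ancestor is STOPPED");
>       }
>     }
>     state = State.RUNNING;
>   }
> }
> {code}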



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3967) Fetch the application report from the AHS if the RM does not know about it

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3967:
-
Fix Version/s: 2.8.0

> Fetch the application report from the AHS if the RM does not know about it
> --
>
> Key: YARN-3967
> URL: https://issues.apache.org/jira/browse/YARN-3967
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Mit Desai
>Assignee: Mit Desai
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: YARN-3967.1.patch, YARN-3967.2.patch, YARN-3967.3.patch
>
>
> If the application history service has been enabled and the RM has forgotten 
> about an application, try to fetch the app report from the AHS.
> On larger clusters, the RM can forget about applications in about 30 
> minutes. The proxy URL generated during job submission will try to fetch 
> the app report from the RM and will fail to get anything from there. If the 
> app is not found in the RM, we will need to get the application report from 
> the Application History Server (if it is enabled) to see if we can get any 
> information on that application before throwing an exception.
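> A hedged sketch of the RM-then-AHS lookup; ReportSource here is a 
> hypothetical stand-in for the RM and AHS clients, not the YARN client API:
> {code}
> interface ReportSource {
>   ApplicationReport getReport(String appId)
>       throws ApplicationNotFoundException;
> }
> class ApplicationReport { /* fields elided */ }
> class ApplicationNotFoundException extends Exception { }
> 
> class ReportFetcher {
>   private final ReportSource rm;
>   private final ReportSource ahs; // null when the AHS is disabled
> 
>   ReportFetcher(ReportSource rm, ReportSource ahs) {
>     this.rm = rm;
>     this.ahs = ahs;
>   }
> 
>   ApplicationReport fetch(String appId)
>       throws ApplicationNotFoundException {
>     try {
>       return rm.getReport(appId);  // try the RM first
>     } catch (ApplicationNotFoundException e) {
>       if (ahs == null) {
>         throw e;                   // no AHS: surface the original error
>       }
>       return ahs.getReport(appId); // fall back to the history server
>     }
>   }
> }
> {code}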



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5958) Fix ASF license warnings for slider core module

2016-12-01 Thread Gour Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713706#comment-15713706
 ] 

Gour Saha commented on YARN-5958:
-

Have been yearning to see the ASF license errors gone. Thanks [~billie.rinaldi] 
for taking care of this. +1 for the patch.

> Fix ASF license warnings for slider core module
> ---
>
> Key: YARN-5958
> URL: https://issues.apache.org/jira/browse/YARN-5958
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
> Attachments: YARN-5958-yarn-native-services.001.patch
>
>
> The apache-rat-plugin is not configured properly for the 
> hadoop-yarn-slider-core module, which is resulting in erroneous license 
> warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4009) CORS support for ResourceManager REST API

2016-12-01 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4009:
-
Fix Version/s: 2.8.0

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch, YARN-4009.005.patch, 
> YARN-4009.006.patch, YARN-4009.007.patch, YARN-4009.8.patch, 
> YARN-4009.LOGGING.patch, YARN-4009.LOGGING.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI running 
> in a browser cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST APIs to get the application and application attempt 
> information they expose. 
> It would be very useful if CORS were enabled for the REST APIs.
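> In essence, CORS support boils down to a servlet filter that adds the 
> cross-origin response headers; a generic sketch (not the filter YARN would 
> actually ship) looks like:
> {code}
> import java.io.IOException;
> import javax.servlet.Filter;
> import javax.servlet.FilterChain;
> import javax.servlet.FilterConfig;
> import javax.servlet.ServletException;
> import javax.servlet.ServletRequest;
> import javax.servlet.ServletResponse;
> import javax.servlet.http.HttpServletResponse;
> 
> public class SimpleCorsFilter implements Filter {
>   @Override
>   public void init(FilterConfig cfg) { }
> 
>   @Override
>   public void doFilter(ServletRequest req, ServletResponse res,
>       FilterChain chain) throws IOException, ServletException {
>     HttpServletResponse http = (HttpServletResponse) res;
>     // Let a browser-hosted UI (e.g. the Tez UI) call the REST APIs.
>     http.setHeader("Access-Control-Allow-Origin", "*");
>     http.setHeader("Access-Control-Allow-Methods", "GET, OPTIONS");
>     chain.doFilter(req, res);
>   }
> 
>   @Override
>   public void destroy() { }
> }
> {code}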



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5955) Use threadpool or multiple thread to recover app

2016-12-01 Thread Zhaofei Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713671#comment-15713671
 ] 

Zhaofei Meng commented on YARN-5955:


OK.

> Use threadpool or multiple thread to recover app
> 
>
> Key: YARN-5955
> URL: https://issues.apache.org/jira/browse/YARN-5955
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: Zhaofei Meng
>Assignee: Ajith S
> Fix For: 2.7.1
>
>
> Current app recovery happens one application at a time; using a thread pool 
> can make recovery faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5958) Fix ASF license warnings for slider core module

2016-12-01 Thread Gour Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha updated YARN-5958:

Fix Version/s: yarn-native-services

> Fix ASF license warnings for slider core module
> ---
>
> Key: YARN-5958
> URL: https://issues.apache.org/jira/browse/YARN-5958
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
> Fix For: yarn-native-services
>
> Attachments: YARN-5958-yarn-native-services.001.patch
>
>
> The apache-rat-plugin is not configured properly for the 
> hadoop-yarn-slider-core module, which is resulting in erroneous license 
> warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5944) Native services AM should remain up if RM is down

2016-12-01 Thread Gour Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha updated YARN-5944:

Fix Version/s: yarn-native-services

> Native services AM should remain up if RM is down
> -
>
> Key: YARN-5944
> URL: https://issues.apache.org/jira/browse/YARN-5944
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
> Fix For: yarn-native-services
>
> Attachments: YARN-5944-yarn-native-services.001.patch, 
> YARN-5944-yarn-native-services.002.patch
>
>
> If the RM is down, the native services AM will retry connecting to the RM 
> until yarn.resourcemanager.connect.max-wait.ms has been exceeded. At that 
> point, the AM will stop itself. For a long-running service, we would prefer 
> the AM to stay up while the RM is down.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5955) Use threadpool or multiple thread to recover app

2016-12-01 Thread 5feixiang (JIRA)
5feixiang created YARN-5955:
---

 Summary: Use threadpool or multiple thread to recover app
 Key: YARN-5955
 URL: https://issues.apache.org/jira/browse/YARN-5955
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.7.1
Reporter: 5feixiang
 Fix For: 2.7.1


Current app recovery happens one application at a time; using a thread pool 
can make recovery faster.
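
A hedged sketch of what thread-pool recovery could look like, with each app's 
recovery work wrapped as a Runnable (illustrative only, not the RM code; any 
ordering dependencies between apps would still need handling):
{code}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelRecovery {
  static void recoverAll(List<Runnable> appRecoveryTasks, int threads)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (Runnable task : appRecoveryTasks) {
      pool.submit(task); // recover apps concurrently instead of one by one
    }
    pool.shutdown(); // no new tasks
    pool.awaitTermination(10, TimeUnit.MINUTES); // wait for recovery to finish
  }
}
{code}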



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5932) Retrospect moveApplicationToQueue in align with YARN-5611

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711827#comment-15711827
 ] 

Hadoop QA commented on YARN-5932:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
28s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 56s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 28 new + 653 unchanged - 13 fixed = 681 total (was 666) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
30s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 1 new + 913 unchanged - 0 fixed = 914 total (was 913) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 37s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m  
7s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}101m 11s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5932 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841259/YARN-5932.0001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 85d6dedb6673 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 1f7613b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 

[jira] [Commented] (YARN-4931) Preempted resources go back to the same application

2016-12-01 Thread Yevhenii Semenov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711738#comment-15711738
 ] 

Yevhenii Semenov commented on YARN-4931:


Hi [~milesc], 
I have hit the same issue. After investigating, I noticed that if I use the 
minResources property in fair-scheduler.xml and it's greater than the queue's 
FairShare, it can affect the FairShare of other queues and cause this strange 
behavior after preemption. 

Could you please provide your fair-scheduler.xml, yarn-site.xml and 
mapred-site.xml if possible? It would be helpful for better understanding of 
the issue. Thanks. 

> Preempted resources go back to the same application
> ---
>
> Key: YARN-4931
> URL: https://issues.apache.org/jira/browse/YARN-4931
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
> Attachments: resourcemanager.log
>
>
> Sometimes a queue that needs resources causes preemption - but the preempted 
> containers are just allocated right back to the application that just 
> released them!
> Here is a tiny application (0007) that wants resources, and a container is 
> preempted from application 0002 to satisfy it:
> {code}
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler 
> (FairSchedulerUpdateThread): Should preempt  res for 
> queue root.default: resDueToMinShare = , 
> resDueToFairShare = 
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler 
> (FairSchedulerUpdateThread): Preempting container (prio=1res= vCores:1>) from queue root.milesc
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics
>  (FairSchedulerUpdateThread): Non-AM container preempted, current 
> appAttemptId=appattempt_1460047303577_0002_01, 
> containerId=container_1460047303577_0002_01_001038, resource= vCores:1>
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (FairSchedulerUpdateThread): container_1460047303577_0002_01_001038 Container 
> Transitioned from RUNNING to KILLED
> {code}
> But then a moment later, application 2 gets the container right back:
> {code}
> 2016-04-07 21:08:13,844 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode 
> (ResourceManager Event Processor): Assigned container 
> container_1460047303577_0002_01_001039 of capacity  
> on host ip-10-12-40-63.us-west-2.compute.internal:8041, which has 13 
> containers,  used and  
> available after allocation
> 2016-04-07 21:08:14,555 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (IPC Server handler 59 on 8030): container_1460047303577_0002_01_001039 
> Container Transitioned from ALLOCATED to ACQUIRED
> 2016-04-07 21:08:14,845 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (ResourceManager Event Processor): container_1460047303577_0002_01_001039 
> Container Transitioned from ACQUIRED to RUNNING
> {code}
> This results in new applications being unable to even get an AM, and never 
> starting at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5932) Retrospect moveApplicationToQueue in align with YARN-5611

2016-12-01 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-5932:
--
Attachment: YARN-5932.0001.patch

Thanks [~jianhe]

Yes, we are not checking this condition as of now in CS. Ideally we are moving 
resource from one queue to another, so post-move there could be some 
imbalances. Could such cases be handled in the next allocation/preemption 
cycle itself?

I had a few thoughts here about some more potential validations (app priority 
etc.), but they may make move stricter. Could that be a problem for the end 
user, since it's an old API for customers?

Addressed the other comments and am uploading a new patch.

> Retrospect moveApplicationToQueue in align with YARN-5611
> -
>
> Key: YARN-5932
> URL: https://issues.apache.org/jira/browse/YARN-5932
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5932.0001.patch, YARN-5932.v0.patch, 
> YARN-5932.v1.patch
>
>
> All dynamic APIs that change an application's state could follow a general 
> design approach. Currently priority and app timeouts follow this approach 
> for all corner cases.
> *Steps*
> - Do a pre-validation check to ensure that the changes are fine.
> - Update this information in the state-store.
> - Perform the real move operation and update in-memory data structures, as 
> sketched below.
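> A hypothetical sketch of this three-step flow (the interfaces below are 
> illustrative, not the real RM classes):
> {code}
> interface Validator {
>   void preValidate(String appId, String targetQueue) throws Exception;
> }
> interface StateStore {
>   void updateQueue(String appId, String targetQueue) throws Exception;
> }
> interface Scheduler {
>   void moveApplication(String appId, String targetQueue);
> }
> 
> class MoveApplicationFlow {
>   private final Validator validator;
>   private final StateStore store;
>   private final Scheduler scheduler;
> 
>   MoveApplicationFlow(Validator v, StateStore st, Scheduler sc) {
>     this.validator = v; this.store = st; this.scheduler = sc;
>   }
> 
>   void move(String appId, String targetQueue) throws Exception {
>     validator.preValidate(appId, targetQueue); // 1. fail fast, change nothing
>     store.updateQueue(appId, targetQueue);     // 2. persist to the state-store
>     scheduler.moveApplication(appId, targetQueue); // 3. update in-memory state
>   }
> }
> {code}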



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5889) Improve user-limit calculation in capacity scheduler

2016-12-01 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711647#comment-15711647
 ] 

Sunil G commented on YARN-5889:
---

Thanks [~eepayne]

Yes, for the scheduler even 1ms is small. It was a tradeoff between the 
performance gain and its impact. With an SLS test, I could see a good 
improvement in allocation speed.

Now, to bridge the gap, there are 2 cases:
- How do we make sure that every allocation gets a correct and accurate 
user-limit value, given that computation happens every 1ms?
- In an idle cluster, how can we save CPU cycles and prevent unnecessary 
computations?

Yes, the ideal way is as you suggested.
- Any change in resources (allocation and release of a container, etc.) for a 
given user could set a state variable. This will be cleared by the computation 
thread if the next cycle falls immediately after.
- It's not ideal to ask the allocation thread to wait for the computation. So, 
upon seeing this state variable, we might need to compute the user-limit in 
the same allocation thread. 

I was looking at the second case to see how much impact a slightly stale 
user-limit can cause. We may over-allocate or we may under-allocate. I think 
the under-allocate scenario is fine, as we will allocate more in the next 
millisecond. However, the over-allocate scenario may be a worry. Still, we 
have preemption/opportunistic ways to handle this.

Ideally we were looking to avoid user-limit computation in the same allocation 
thread. So after step 1), we can force the user-allocate thread to push for an 
immediate computation. There could still be some exceptionally rare cases 
where the user-limit thread is computing in response to release/allocate 
demand while another allocation thread (heartbeat) runs in the same time 
frame. If this is fine, I could update my patch to handle this case.

Thoughts?
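
To make the flag-based approach concrete, here is a rough, hypothetical sketch 
(the names do not match the actual LeafQueue/user-limit code):
{code}
import java.util.concurrent.atomic.AtomicBoolean;

class UserLimitCache {
  private final AtomicBoolean dirty = new AtomicBoolean(true);
  private volatile long cachedUserLimit; // simplified to a single scalar

  /** Called from allocate/release paths and on queue refresh. */
  void markDirty() {
    dirty.set(true);
  }

  /** Recompute only when something changed since the last call. */
  long getComputedUserLimit() {
    if (dirty.compareAndSet(true, false)) {
      // Runs in the allocation thread, but only when marked dirty.
      cachedUserLimit = computeUserLimit();
    }
    return cachedUserLimit;
  }

  private long computeUserLimit() {
    return 0L; // placeholder for the real calculation
  }
}
{code}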

> Improve user-limit calculation in capacity scheduler
> 
>
> Key: YARN-5889
> URL: https://issues.apache.org/jira/browse/YARN-5889
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5889.v0.patch, YARN-5889.v1.patch, 
> YARN-5889.v2.patch
>
>
> Currently user-limit is computed during every heartbeat allocation cycle with 
> a write lock. To improve performance, this ticket focuses on moving the 
> user-limit calculation out of the heartbeat allocation flow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5955) Use threadpool or multiple thread to recover app

2016-12-01 Thread Ajith S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S reassigned YARN-5955:
-

Assignee: Ajith S

> Use threadpool or multiple thread to recover app
> 
>
> Key: YARN-5955
> URL: https://issues.apache.org/jira/browse/YARN-5955
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: 5feixiang
>Assignee: Ajith S
> Fix For: 2.7.1
>
>
> Current app recovery happens one application at a time; using a thread pool 
> can make recovery faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5955) Use threadpool or multiple thread to recover app

2016-12-01 Thread Ajith S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711870#comment-15711870
 ] 

Ajith S commented on YARN-5955:
---

I would like to work on this if you have not yet started.

> Use threadpool or multiple thread to recover app
> 
>
> Key: YARN-5955
> URL: https://issues.apache.org/jira/browse/YARN-5955
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: 5feixiang
> Fix For: 2.7.1
>
>
> Current app recovery happens one application at a time; using a thread pool 
> can make recovery faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5927) BaseContainerManagerTest::waitForNMContainerState timeout accounting is not accurate

2016-12-01 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711892#comment-15711892
 ] 

Kai Sasaki commented on YARN-5927:
--

[~miklos.szeg...@cloudera.com] Thanks!

[~leftnoteasy] Sorry to bother you, but could you review this when you get a 
chance?

> BaseContainerManagerTest::waitForNMContainerState timeout accounting is not 
> accurate
> 
>
> Key: YARN-5927
> URL: https://issues.apache.org/jira/browse/YARN-5927
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Kai Sasaki
>Priority: Trivial
>  Labels: newbie
> Attachments: YARN-5917.01.patch, YARN-5917.02.patch
>
>
> See below: timeoutSecs is increased twice per iteration. We also sleep right 
> away, before even checking the observed value.
> {code}
> do {
>   Thread.sleep(2000);
>  ...
>   timeoutSecs += 2;
> } while (!finalStates.contains(currentState)
> && timeoutSecs++ < timeOutMax);
> {code}
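> A hedged sketch of one way to fix the accounting: check the state before 
> sleeping and advance the counter exactly once per wait (the Supplier stands 
> in for re-reading the NM container state):
> {code}
> import java.util.Set;
> import java.util.function.Supplier;
> 
> class WaitSketch {
>   static <T> boolean waitForState(Supplier<T> probe, Set<T> finalStates,
>       int timeOutMaxSecs) throws InterruptedException {
>     int timeoutSecs = 0;
>     while (!finalStates.contains(probe.get())
>         && timeoutSecs < timeOutMaxSecs) {
>       Thread.sleep(2000);
>       timeoutSecs += 2; // single, accurate increment per 2-second sleep
>     }
>     return finalStates.contains(probe.get());
>   }
> }
> {code}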



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5955) Use threadpool or multiple thread to recover app

2016-12-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711953#comment-15711953
 ] 

Varun Saxena commented on YARN-5955:


We have actually made quite a few changes internally to make recovery faster, 
and these include using multiple threads for recovery in addition to other 
changes.

Ajith and I plan to contribute these to the community. We will upload a design 
doc and an initial patch soon.

> Use threadpool or multiple thread to recover app
> 
>
> Key: YARN-5955
> URL: https://issues.apache.org/jira/browse/YARN-5955
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: 5feixiang
>Assignee: Ajith S
> Fix For: 2.7.1
>
>
> Current app recovery happens one application at a time; using a thread pool 
> can make recovery faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5938) Minor refactoring to OpportunisticContainerAllocatorAMService

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15714111#comment-15714111
 ] 

Hadoop QA commented on YARN-5938:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 12 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
55s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  
4s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
56s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
38s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
46s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
10s{color} | {color:green} YARN-5085 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  5s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 16 new + 1498 unchanged - 14 fixed = 1514 total (was 1512) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
31s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
25s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
34s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
40s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 35s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}117m 51s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5938 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12841405/YARN-5938-YARN-5085.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 68060417d15c 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64