[jira] [Commented] (YARN-6056) Yarn NM using LCE shows a failure when trying to delete a non-existing dir

2017-01-05 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800714#comment-15800714
 ] 

Varun Saxena commented on YARN-6056:


IIUC, the issue here is that the exit code returned from delete-as-user is not 
successful and hence indicates a failure, but we would still continue deleting 
the other directories in the list. Correct?

> Yarn NM using LCE shows a failure when trying to delete a non-existing dir
> --
>
> Key: YARN-6056
> URL: https://issues.apache.org/jira/browse/YARN-6056
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.6.5
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: YARN-6056-branch-2.6.1.patch
>
>
> As part of YARN-2902 the clean up of the local directories was changed to 
> ignore non existing directories and proceed with others in the list. This 
> part of the code change was not backported into branch-2.6, backporting just 
> that part now.






[jira] [Comment Edited] (YARN-6056) Yarn NM using LCE shows a failure when trying to delete a non-existing dir

2017-01-05 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800714#comment-15800714
 ] 

Varun Saxena edited comment on YARN-6056 at 1/5/17 8:17 AM:


[~wilfreds], IIUC, the issue here is that the exit code returned from 
delete-as-user is not successful and hence indicates a failure, but we would 
still continue deleting the other directories in the list. Correct?


was (Author: varun_saxena):
IIUC, the issue here is that the exit code returned from delete as user is not 
successful and hence indicates failure.
But we would still continue to delete other directories in the list. Correct ?

> Yarn NM using LCE shows a failure when trying to delete a non-existing dir
> --
>
> Key: YARN-6056
> URL: https://issues.apache.org/jira/browse/YARN-6056
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.6.5
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: YARN-6056-branch-2.6.1.patch
>
>
> As part of YARN-2902 the clean up of the local directories was changed to 
> ignore non existing directories and proceed with others in the list. This 
> part of the code change was not backported into branch-2.6, backporting just 
> that part now.






[jira] [Commented] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800721#comment-15800721
 ] 

Hadoop QA commented on YARN-5959:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 16 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
48s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  2m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
33s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 10m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m  7s{color} | {color:orange} root: The patch generated 37 new + 1694 
unchanged - 18 fixed = 1731 total (was 1712) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  2m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 0 new + 912 unchanged - 1 fixed = 912 total (was 913) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} hadoop-sls in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
40s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
41s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 41m 
23s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 
42s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {co

[jira] [Updated] (YARN-5585) [Atsv2] Reader side changes for entity prefix and support for pagination via additional filters

2017-01-05 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-5585:

Attachment: YARN-5585-YARN-5355.0006.patch

Updated the patch as per the finalized Javadoc.

> [Atsv2] Reader side changes for entity prefix and support for pagination via 
> additional filters
> ---
>
> Key: YARN-5585
> URL: https://issues.apache.org/jira/browse/YARN-5585
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
>  Labels: yarn-5355-merge-blocker
> Attachments: 0001-YARN-5585.patch, YARN-5585-YARN-5355.0001.patch, 
> YARN-5585-YARN-5355.0002.patch, YARN-5585-YARN-5355.0003.patch, 
> YARN-5585-YARN-5355.0004.patch, YARN-5585-YARN-5355.0005.patch, 
> YARN-5585-YARN-5355.0006.patch, YARN-5585-workaround.patch, YARN-5585.v0.patch
>
>
> The TimelineReader REST APIs provide a lot of filters to retrieve 
> applications. Along with those, it would be good to add a new filter, fromId, 
> so that entities can be retrieved after the fromId.
> Current behavior: the default limit is set to 100. If there are 1000 entities, 
> the REST call gives the first/last 100 entities. How do we retrieve the next 
> set of 100 entities, i.e. 101 to 200 or 900 to 801?
> Example: if applications app-1, app-2, ... app-10 are stored in the database, 
> *getApps?limit=5* gives app-1 to app-5, but there is no way to retrieve the 
> next 5 apps.
> So the proposal is to have fromId in the filter, e.g. 
> *getApps?limit=5&&fromId=app-5*, which gives the list of apps from app-6 to 
> app-10.
> Since ATS targets storage of a large number of entities, getting the next set 
> of entities using fromId rather than querying all the entities is a very 
> common use case. This is very useful for pagination in the web UI.
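
For illustration, a hedged sketch of how a client could page with such a filter 
(the getApps path is taken from the example above; the Fetcher interface is a 
placeholder, not the real TimelineReader client API):

{code}
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class PagedAppFetcher {
  // Placeholder for the actual REST call; returns the entity ids of one page.
  interface Fetcher {
    List<String> fetch(String query);
  }

  // Walks all pages by passing the last id of the previous page as fromId.
  static void fetchAllPages(Fetcher fetcher, int limit) {
    String fromId = null;
    while (true) {
      StringBuilder query = new StringBuilder("getApps?limit=" + limit);
      if (fromId != null) {
        query.append("&fromId=").append(URLEncoder.encode(fromId, StandardCharsets.UTF_8));
      }
      List<String> page = fetcher.fetch(query.toString());
      if (page.isEmpty()) {
        break; // no more entities
      }
      page.forEach(System.out::println);
      fromId = page.get(page.size() - 1); // continue from the last id seen
      if (page.size() < limit) {
        break; // short page means the end was reached
      }
    }
  }
}
{code}

Note that if fromId is kept inclusive (as a later comment in this thread says 
YARN-5585 did), the client would additionally skip the first entry of every 
page after the first.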






[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys

2017-01-05 Thread Ajith S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800932#comment-15800932
 ] 

Ajith S commented on YARN-5547:
---

Hi guys, sorry for the delay. [~jlowe], thanks for your comments. You are right, 
we can avoid storing the killed state for a container which will not be 
recovered. Also, for deleting the unknown keys, would it be OK to remove them in 
{{NMLeveldbStateStoreService.loadContainerState(ContainerId, LeveldbIterator, 
String)}}? As per the patch this would happen after the warning log about the 
unknown keys. This avoids any additional scanning of the store and hence avoids 
a performance penalty.

> NMLeveldbStateStore should be more tolerant of unknown keys
> ---
>
> Key: YARN-5547
> URL: https://issues.apache.org/jira/browse/YARN-5547
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Ajith S
> Attachments: YARN-5547.01.patch, YARN-5547.02.patch, 
> YARN-5547.03.patch
>
>
> Whenever new keys are added to the NM state store it will break rolling 
> downgrades because the code will throw if it encounters an unrecognized key.  
> If instead it skipped unrecognized keys it could be simpler to continue 
> supporting rolling downgrades.  We need to define the semantics of 
> unrecognized keys when containers and apps are cleaned up, e.g.: we may want 
> to delete all keys underneath an app or container directory when it is being 
> removed from the state store to prevent leaking unrecognized keys.






[jira] [Commented] (YARN-5585) [Atsv2] Reader side changes for entity prefix and support for pagination via additional filters

2017-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800956#comment-15800956
 ] 

Hadoop QA commented on YARN-5585:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
14s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
1s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
40s{color} | {color:green} YARN-5355 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
31s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} YARN-5355 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
26s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 37s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 17 new + 25 unchanged - 13 fixed = 42 total (was 38) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
26s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
45s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
55s{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in 
the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 41m 42s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | YARN-5585 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845753/YARN-5585-YARN-5355.0006.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvns

[jira] [Updated] (YARN-5554) MoveApplicationAcrossQueues does not check user permission on the target queue

2017-01-05 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-5554:

Attachment: YARN-5554.14.patch

Fixed one checkstyle issue that was introduced, and addressed the one remark from the review.

> MoveApplicationAcrossQueues does not check user permission on the target queue
> --
>
> Key: YARN-5554
> URL: https://issues.apache.org/jira/browse/YARN-5554
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.2
>Reporter: Haibo Chen
>Assignee: Wilfred Spiegelenburg
>  Labels: oct16-medium
> Attachments: YARN-5554.10.patch, YARN-5554.11.patch, 
> YARN-5554.12.patch, YARN-5554.13.patch, YARN-5554.14.patch, 
> YARN-5554.2.patch, YARN-5554.3.patch, YARN-5554.4.patch, YARN-5554.5.patch, 
> YARN-5554.6.patch, YARN-5554.7.patch, YARN-5554.8.patch, YARN-5554.9.patch
>
>
> moveApplicationAcrossQueues operation currently does not check user 
> permission on the target queue. This incorrectly allows one user to move 
> his/her own applications to a queue that the user has no access to
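
For illustration, a hedged sketch of the missing check (the QueueAccessChecker 
interface is a placeholder, not the real scheduler API): verify that the calling 
user may submit to the target queue before performing the move.

{code}
public class MoveToQueueSketch {
  // Placeholder for the scheduler-side ACL lookup.
  interface QueueAccessChecker {
    boolean canSubmitTo(String user, String queueName);
  }

  static void moveApplicationAcrossQueues(String user, String appId,
      String targetQueue, QueueAccessChecker acls) {
    // The reported bug: this check was missing, so a user could move their own
    // application into a queue they were not allowed to use.
    if (!acls.canSubmitTo(user, targetQueue)) {
      throw new SecurityException(
          "User " + user + " cannot submit applications to queue " + targetQueue);
    }
    // ... perform the actual move ...
    System.out.println("Moved " + appId + " to " + targetQueue);
  }
}
{code}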






[jira] [Updated] (YARN-6015) AsyncDispatcher thread name can be set to improved debugging

2017-01-05 Thread Ajith S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated YARN-6015:
--
Attachment: YARN-6015.02.patch

Thanks for the review. I have updated the patch based on your comments. Please 
review.

> AsyncDispatcher thread name can be set to improved debugging
> 
>
> Key: YARN-6015
> URL: https://issues.apache.org/jira/browse/YARN-6015
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Ajith S
>Assignee: Ajith S
> Attachments: YARN-6015.01.patch, YARN-6015.02.patch
>
>
> Currently all the running instances of AsyncDispatcher have the same thread 
> name. To improve debugging, we can have an option to set the thread name.
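
As a rough illustration of the idea (hypothetical class, not the actual 
AsyncDispatcher patch), the change amounts to letting callers give the 
event-handling thread a descriptive name:

{code}
// Minimal sketch: a dispatcher whose event-handling thread carries a custom
// name, so thread dumps and logs show e.g. "RM ApplicationEventDispatcher
// event handler" instead of a generic name shared by every dispatcher instance.
public class NamedDispatcher {
  private final Thread eventThread;

  public NamedDispatcher(String dispatcherName, Runnable eventLoop) {
    this.eventThread = new Thread(eventLoop);
    this.eventThread.setName(dispatcherName + " event handler");
  }

  public void start() {
    eventThread.start();
  }
}
{code}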






[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps

2017-01-05 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801085#comment-15801085
 ] 

Rohith Sharma K S commented on YARN-6027:
-

bq. We could consider the fromId as (user + flow), right?
The flow entityId has the format *yarn-cluster/148357440/rohithsharmaks@Sleep 
job*. So fromId could be the *id* itself. Thoughts?

> Support fromId for flows/flowrun apps
> -
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar 
> filter for flow/flowRun apps, and for flow runs and flows as well.
> Along with supporting fromId, this JIRA should also discuss the following points:
> * Should we throw an exception for entities/entity retrieval if duplicates are 
> found?
> * TimelineEntity:
> ** Should the equals method also check idPrefix?
> ** Is idPrefix part of the identifiers?






[jira] [Commented] (YARN-6015) AsyncDispatcher thread name can be set to improved debugging

2017-01-05 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801089#comment-15801089
 ] 

Naganarasimha G R commented on YARN-6015:
-

Thanks for the patch [~ajithshetty], overall modifications looks fine, will 
wait for the jenkins run and if no other comments will commit it.

> AsyncDispatcher thread name can be set to improved debugging
> 
>
> Key: YARN-6015
> URL: https://issues.apache.org/jira/browse/YARN-6015
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Ajith S
>Assignee: Ajith S
> Attachments: YARN-6015.01.patch, YARN-6015.02.patch
>
>
> Currently all the running instances of AsyncDispatcher have the same thread 
> name. To improve debugging, we can have an option to set the thread name.






[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps

2017-01-05 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801147#comment-15801147
 ] 

Varun Saxena commented on YARN-6027:


Well, we can potentially reuse the ID. However, the query is within the scope 
of a cluster and pagination will be within a specific day, so the first two 
parts are somewhat unnecessary.
Either way, we may have to consider escaping the delimiter ("/") if it appears 
in the string itself. The user will also have to be checked for "@"; however, 
"@" is not allowed in Linux usernames.

Assuming the cluster is known to both the reader and the client, and the 
timestamp will always be a number, both sides can work out how to parse the ID, 
but some logic will have to be added at both ends. We cannot directly replace 
the delimiters and construct the row key; this needs to be taken care of.
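
To make the parsing concern concrete, a naive sketch (hypothetical helper; the 
real reader/client code would have to agree on escaping for "/" and "@" inside 
user and flow names):

{code}
public class FlowFromIdParser {
  // Splits an id of the assumed form "cluster/dayTimestamp/user@flowName".
  // This naive version breaks as soon as "/" or "@" occur inside the user or
  // flow name, which is exactly why an escaping scheme is needed on both ends.
  static String[] parse(String fromId) {
    int firstSlash = fromId.indexOf('/');
    int secondSlash = fromId.indexOf('/', firstSlash + 1);
    int at = fromId.indexOf('@', secondSlash + 1);
    if (firstSlash < 0 || secondSlash < 0 || at < 0) {
      throw new IllegalArgumentException("Unexpected fromId format: " + fromId);
    }
    return new String[] {
        fromId.substring(0, firstSlash),               // cluster
        fromId.substring(firstSlash + 1, secondSlash), // day timestamp
        fromId.substring(secondSlash + 1, at),         // user
        fromId.substring(at + 1)                       // flow name
    };
  }

  public static void main(String[] args) {
    for (String part : parse("yarn-cluster/148357440/rohithsharmaks@Sleep job")) {
      System.out.println(part);
    }
  }
}
{code}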

> Support fromId for flows/flowrun apps
> -
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar 
> filter for flow/flowRun apps, and for flow runs and flows as well.
> Along with supporting fromId, this JIRA should also discuss the following points:
> * Should we throw an exception for entities/entity retrieval if duplicates are 
> found?
> * TimelineEntity:
> ** Should the equals method also check idPrefix?
> ** Is idPrefix part of the identifiers?






[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps

2017-01-05 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801151#comment-15801151
 ] 

Varun Saxena commented on YARN-6027:


However, if I am not wrong, other OSes do allow "@" in usernames.

> Support fromId for flows/flowrun apps
> -
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar 
> filter for flow/flowRun apps, and for flow runs and flows as well.
> Along with supporting fromId, this JIRA should also discuss the following points:
> * Should we throw an exception for entities/entity retrieval if duplicates are 
> found?
> * TimelineEntity:
> ** Should the equals method also check idPrefix?
> ** Is idPrefix part of the identifiers?






[jira] [Updated] (YARN-5889) Improve user-limit calculation in capacity scheduler

2017-01-05 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-5889:
--
Attachment: YARN-5889.0001.patch

Attaching an initial version of the patch as per the discussion.

> Improve user-limit calculation in capacity scheduler
> 
>
> Key: YARN-5889
> URL: https://issues.apache.org/jira/browse/YARN-5889
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5889.0001.patch, YARN-5889.v0.patch, 
> YARN-5889.v1.patch, YARN-5889.v2.patch
>
>
> Currently the user-limit is computed during every heartbeat allocation cycle 
> while holding a write lock. To improve performance, this ticket focuses on 
> moving the user-limit calculation out of the heartbeat allocation flow.






[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2

2017-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801234#comment-15801234
 ] 

Hadoop QA commented on YARN-6041:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 40 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
49s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
28s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
30s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
29s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  6m  
8s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  3m 
 8s{color} | {color:green} branch-2 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  9m 
54s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
54s{color} | {color:green} branch-2 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
23s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
27s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  6m 
27s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  6m 27s{color} 
| {color:red} root-jdk1.8.0_111 with JDK v1.8.0_111 generated 1 new + 860 
unchanged - 1 fixed = 861 total (was 861) {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
10s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
10s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 27s{color} | {color:orange} root: The patch generated 19 new + 3469 
unchanged - 43 fixed = 3488 total (was 3512) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  6m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  3m 
49s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
0s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 12m 
45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc

[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2

2017-01-05 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801509#comment-15801509
 ] 

Arun Suresh commented on YARN-6041:
---

As mentioned earlier, the javac, javadoc and whitespace errors are better left 
unfixed to retain the style of the existing files.
The test case failures are not related.
[~kasha] / [~leftnoteasy], do let me know if this is good for check-in.

> Opportunistic containers : Combined patch for branch-2 
> ---
>
> Key: YARN-6041
> URL: https://issues.apache.org/jira/browse/YARN-6041
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-6041-branch-2.001.patch, 
> YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch
>
>
> This is a combined patch targeting branch-2 of the following JIRAs which have 
> already been committed to trunk :
> YARN-5938. Refactoring OpportunisticContainerAllocator to use 
> SchedulerRequestKey instead of Priority and other misc fixes
> YARN-5646. Add documentation and update config parameter names for scheduling 
> of OPPORTUNISTIC containers.
> YARN-5982. Simplify opportunistic container parameters and metrics.
> YARN-5918. Handle Opportunistic scheduling allocate request failure when NM 
> is lost.
> YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager 
> container lifecycle.
> YARN-5823. Update NMTokens in case of requests with only opportunistic 
> containers.
> YARN-5377. Fix 
> TestQueuingContainerManager.testKillMultipleOpportunisticContainers.
> YARN-2995. Enhance UI to show cluster resource utilization of various 
> container Execution types.
> YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http 
> Address.
> YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method 
> to handle OPPORTUNISTIC container requests.






[jira] [Commented] (YARN-6056) Yarn NM using LCE shows a failure when trying to delete a non-existing dir

2017-01-05 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801512#comment-15801512
 ] 

Wilfred Spiegelenburg commented on YARN-6056:
-

Correct. If you pass in multiple directories, then a directory in that list 
which does not exist on the file system should not be fatal: we should not stop 
processing, but just continue with the next one in the list. In that way a 
directory that does not exist is not a failed delete. The end result is 
correct: the directory does not exist (any more) on the FS, so it should not be 
reported as a failure.

I am not sure what is going on with the build, but it looks like {{protoc}} 
failed, which caused a cascading failure.
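
To make the intended behaviour concrete, here is a minimal, hypothetical sketch 
of such a delete loop (plain Java, not the actual DeletionService or 
container-executor code): a missing directory is skipped as already deleted, and 
a failure on one entry does not stop the rest.

{code}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

public class BestEffortDeleter {
  // Deletes every directory in the list. A directory that no longer exists is
  // treated as already deleted, and a failure on one entry does not stop the
  // processing of the remaining entries.
  public static boolean deleteAll(List<Path> dirs) {
    boolean allOk = true;
    for (Path dir : dirs) {
      if (!Files.exists(dir)) {
        System.out.println("Skipping non-existing dir: " + dir);
        continue; // nothing left to delete, not a failure
      }
      try (Stream<Path> walk = Files.walk(dir)) {
        // delete children before their parent directories
        walk.sorted(Comparator.reverseOrder()).map(Path::toFile).forEach(File::delete);
      } catch (IOException e) {
        System.err.println("Failed to delete " + dir + ": " + e);
        allOk = false; // record the failure but continue with the next entry
      }
    }
    return allOk;
  }
}
{code}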


> Yarn NM using LCE shows a failure when trying to delete a non-existing dir
> --
>
> Key: YARN-6056
> URL: https://issues.apache.org/jira/browse/YARN-6056
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.6.5
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: YARN-6056-branch-2.6.1.patch
>
>
> As part of YARN-2902 the clean up of the local directories was changed to 
> ignore non existing directories and proceed with others in the list. This 
> part of the code change was not backported into branch-2.6, backporting just 
> that part now.






[jira] [Commented] (YARN-6015) AsyncDispatcher thread name can be set to improved debugging

2017-01-05 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801544#comment-15801544
 ] 

Daniel Templeton commented on YARN-6015:


+1 from me.  I'll go kick Jenkins since it seems to have gone deaf lately.

> AsyncDispatcher thread name can be set to improved debugging
> 
>
> Key: YARN-6015
> URL: https://issues.apache.org/jira/browse/YARN-6015
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Ajith S
>Assignee: Ajith S
> Attachments: YARN-6015.01.patch, YARN-6015.02.patch
>
>
> Currently all the running instances of AsyncDispatcher have the same thread 
> name. To improve debugging, we can have an option to set the thread name.






[jira] [Commented] (YARN-5554) MoveApplicationAcrossQueues does not check user permission on the target queue

2017-01-05 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801637#comment-15801637
 ] 

Daniel Templeton commented on YARN-5554:


In doing a last pass, I have two questions on the test code:
# In {{testMoveApplicationSubmitTargetQueue()}} and 
{{testMoveApplicationAdminTargetQueue()}}, would it make sense to test that the 
moves that are supposed to work do actually work?
# Why a {{ConcurrentHashMap}} in 
{{createClientRMServiceForMoveApplicationRequest()}} instead of 
{{Collections.singletonMap()}}?

> MoveApplicationAcrossQueues does not check user permission on the target queue
> --
>
> Key: YARN-5554
> URL: https://issues.apache.org/jira/browse/YARN-5554
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.2
>Reporter: Haibo Chen
>Assignee: Wilfred Spiegelenburg
>  Labels: oct16-medium
> Attachments: YARN-5554.10.patch, YARN-5554.11.patch, 
> YARN-5554.12.patch, YARN-5554.13.patch, YARN-5554.14.patch, 
> YARN-5554.2.patch, YARN-5554.3.patch, YARN-5554.4.patch, YARN-5554.5.patch, 
> YARN-5554.6.patch, YARN-5554.7.patch, YARN-5554.8.patch, YARN-5554.9.patch
>
>
> moveApplicationAcrossQueues operation currently does not check user 
> permission on the target queue. This incorrectly allows one user to move 
> his/her own applications to a queue that the user has no access to






[jira] [Commented] (YARN-5258) Document Use of Docker with LinuxContainerExecutor

2017-01-05 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801657#comment-15801657
 ] 

Daniel Templeton commented on YARN-5258:


[~sidharta-s] or [~vvasudev], any comments?  I'd love to get this into 2.8.0.

> Document Use of Docker with LinuxContainerExecutor
> --
>
> Key: YARN-5258
> URL: https://issues.apache.org/jira/browse/YARN-5258
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Critical
>  Labels: oct16-easy
> Attachments: YARN-5258.001.patch, YARN-5258.002.patch, 
> YARN-5258.003.patch, YARN-5258.004.patch
>
>
> There aren't currently any docs that explain how to configure Docker and all 
> of its various options aside from reading all of the JIRAs.  We need to 
> document the configuration, use, and troubleshooting, along with helpful 
> examples.






[jira] [Commented] (YARN-4990) Re-direction of a particular log file within in a container in NM UI does not redirect properly to Log Server ( history ) on container completion

2017-01-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801743#comment-15801743
 ] 

Jason Lowe commented on YARN-4990:
--

This would be a nice fix to get into 2.8 and seems to be low risk.  Any 
objections?

> Re-direction of a particular log file within in a container in NM UI does not 
> redirect properly to Log Server ( history ) on container completion
> -
>
> Key: YARN-4990
> URL: https://issues.apache.org/jira/browse/YARN-4990
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Xuan Gong
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-4990.1.patch, YARN-4990.2.patch
>
>
> The NM does the redirection to the history server correctly. However, if the 
> user is viewing or has a link to a particular file, the redirect ends up going 
> to the top-level page for the container instead of redirecting to the specific 
> file. Additionally, the start param that shows logs from offset 0 also goes 
> missing.






[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys

2017-01-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801788#comment-15801788
 ] 

Jason Lowe commented on YARN-5547:
--

bq. for deleting the unknown keys, would it be ok to remove unknown keys in 
NMLeveldbStateStoreService.loadContainerState(ContainerId, LeveldbIterator, 
String) .?

That should be OK as long as we record the container as killed before we remove 
the unknown keys.  When we eventually add the ability to ignore unknown keys 
without killing the container then it can be problematic.  For example:
# NM is on version V and is using key K, which is new in version V, that is not 
deemed critical to the recovery of a running container.
# NM is downgraded to version V-1
# On startup, NM with version V-1 deletes the unknown key K for the container 
but keeps it running because it was deemed safe to ignore in the (yet to be 
added) state store key descriptor table
# With the container still running, NM is upgraded to version V again 
# Now the container has lost key K yet was started on NM version V and 
continues to run on NM version V.

If we skip the unknown keys that are deemed "safe to ignore" then we can leak 
per the concern above if the container completes on version V-1.  One way to 
fix that case is to have the NM always try to delete the list of unknown keys 
in the (yet to be added) safe-to-ignore key descriptor table when the container 
completes.  Should be fine unless that table gets to be particularly large.  
But we don't have to implement that now, only when we add the ability to ignore 
unknown keys without killing a container.  For the purposes of this JIRA, we 
will always be killing containers that have unknown keys so it's simpler.
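
A minimal sketch of that ordering, with plain collections standing in for the 
leveldb store (illustrative only, not the real NMLeveldbStateStoreService; the 
"killed" key name is hypothetical): record the kill first, then delete the 
unknown keys, so a crash in between still leaves the container in a recoverable 
state.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class UnknownKeySketch {
  // store stands in for the NM state store; knownSuffixes is the set of
  // container key suffixes this NM version understands.
  static void handleUnknownKeys(String containerKeyPrefix,
      Map<String, byte[]> store, Set<String> knownSuffixes) {
    List<String> unknown = new ArrayList<>();
    for (String key : store.keySet()) {
      if (key.startsWith(containerKeyPrefix)
          && !knownSuffixes.contains(key.substring(containerKeyPrefix.length()))) {
        System.out.println("WARN: unknown container key " + key);
        unknown.add(key);
      }
    }
    if (!unknown.isEmpty()) {
      // 1) Mark the container as killed first, so recovery never resumes a
      //    container whose state this version cannot fully interpret.
      store.put(containerKeyPrefix + "killed", new byte[0]); // hypothetical key name
      // 2) Only then remove the unrecognized keys.
      unknown.forEach(store::remove);
    }
  }
}
{code}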


> NMLeveldbStateStore should be more tolerant of unknown keys
> ---
>
> Key: YARN-5547
> URL: https://issues.apache.org/jira/browse/YARN-5547
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Ajith S
> Attachments: YARN-5547.01.patch, YARN-5547.02.patch, 
> YARN-5547.03.patch
>
>
> Whenever new keys are added to the NM state store it will break rolling 
> downgrades because the code will throw if it encounters an unrecognized key.  
> If instead it skipped unrecognized keys it could be simpler to continue 
> supporting rolling downgrades.  We need to define the semantics of 
> unrecognized keys when containers and apps are cleaned up, e.g.: we may want 
> to delete all keys underneath an app or container directory when it is being 
> removed from the state store to prevent leaking unrecognized keys.






[jira] [Commented] (YARN-6015) AsyncDispatcher thread name can be set to improved debugging

2017-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801854#comment-15801854
 ] 

Hadoop QA commented on YARN-6015:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
1s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
17s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 45m 10s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}120m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6015 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845766/YARN-6015.02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 1e2c0f9dd246 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / a605ff3 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/14565/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-

[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-01-05 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801872#comment-15801872
 ] 

Robert Kanter commented on YARN-6050:
-

[~leftnoteasy], you're right.  I should have changed that when I made the 
protobuf changes.  I'll upload a new patch soon.

Though I think we should still throw an exception if there's no ANY request, 
because otherwise the client will be expecting a specific rack or node, the AM 
won't actually be placed there, and they'll be left wondering why. An exception 
with a clear error message makes it more obvious what's happening.
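
A hedged sketch of the validation being argued for (the method name and message 
are illustrative, not the actual RM code; ResourceRequest.ANY and 
getResourceName() are existing YARN API):

{code}
import java.util.List;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public class AmRequestValidator {
  // Rejects an AM container request list that has no ANY (off-switch) request,
  // so the client gets a clear error instead of silently losing its rack/node
  // constraint.
  static void validateAmResourceRequests(List<ResourceRequest> requests) {
    boolean hasAny = requests.stream()
        .anyMatch(r -> ResourceRequest.ANY.equals(r.getResourceName()));
    if (!hasAny) {
      throw new IllegalArgumentException(
          "AM resource requests must include a request with resource name "
              + ResourceRequest.ANY);
    }
  }
}
{code}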

> AMs can't be scheduled on racks or nodes
> 
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch, 
> YARN-6050.003.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there 
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, 
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or directly from 
> {{getAMContainerResourceRequest}}, depending on if 
> {{getAMContainerResourceRequest}} is null or not.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which takes a list of 
> {{ResourceRequest}} so that clients can specify the multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.






[jira] [Created] (YARN-6057) yarn.scheduler.minimum-allocation description update required

2017-01-05 Thread Bibin A Chundatt (JIRA)
Bibin A Chundatt created YARN-6057:
--

 Summary: yarn.scheduler.minimum-allocation description update required
 Key: YARN-6057
 URL: https://issues.apache.org/jira/browse/YARN-6057
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bibin A Chundatt
Priority: Minor


{code}
  <property>
    <description>The minimum allocation for every container request at the RM,
    in terms of virtual CPU cores. Requests lower than this will throw a
    InvalidResourceRequestException.</description>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
{code}

*Requests lower than this will throw a InvalidResourceRequestException.* An 
InvalidResourceRequestException is only thrown when the maximum allocation for 
vcores or memory is exceeded, not for requests below the minimum.






[jira] [Commented] (YARN-6057) yarn.scheduler.minimum-allocation description update required

2017-01-05 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801897#comment-15801897
 ] 

Bibin A Chundatt commented on YARN-6057:


IIUC, for the minimum allocation, requests get rounded up to the minimum value.

> yarn.scheduler.minimum-allocation description update required
> -
>
> Key: YARN-6057
> URL: https://issues.apache.org/jira/browse/YARN-6057
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Priority: Minor
>
> {code}
>   <property>
>     <description>The minimum allocation for every container request at the RM,
>     in terms of virtual CPU cores. Requests lower than this will throw a
>     InvalidResourceRequestException.</description>
>     <name>yarn.scheduler.minimum-allocation-vcores</name>
>     <value>1</value>
>   </property>
> {code}
> *Requests lower than this will throw a InvalidResourceRequestException.* An 
> InvalidResourceRequestException is only thrown when the maximum allocation for 
> vcores or memory is exceeded, not for requests below the minimum.






[jira] [Comment Edited] (YARN-6057) yarn.scheduler.minimum-allocation description update required

2017-01-05 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801897#comment-15801897
 ] 

Bibin A Chundatt edited comment on YARN-6057 at 1/5/17 5:15 PM:


IIUC, for the minimum allocation, requests get rounded up to the minimum value / 
resource increment value.


was (Author: bibinchundatt):
IIUC for minimum allocation the requests gets rounded up to minimum value

> yarn.scheduler.minimum-allocation description update required
> -
>
> Key: YARN-6057
> URL: https://issues.apache.org/jira/browse/YARN-6057
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Priority: Minor
>
> {code}
>   <property>
>     <description>The minimum allocation for every container request at the RM,
>     in terms of virtual CPU cores. Requests lower than this will throw a
>     InvalidResourceRequestException.</description>
>     <name>yarn.scheduler.minimum-allocation-vcores</name>
>     <value>1</value>
>   </property>
> {code}
> *Requests lower than this will throw a InvalidResourceRequestException.* An 
> InvalidResourceRequestException is only thrown when the maximum allocation for 
> vcores or memory is exceeded, not for requests below the minimum.






[jira] [Commented] (YARN-5585) [Atsv2] Reader side changes for entity prefix and support for pagination via additional filters

2017-01-05 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801911#comment-15801911
 ] 

Varun Saxena commented on YARN-5585:


Thanks [~rohithsharma] for the latest patch.
The patch LGTM.
Checkstyle issues are unrelated.

Will wait for a day before committing it.

> [Atsv2] Reader side changes for entity prefix and support for pagination via 
> additional filters
> ---
>
> Key: YARN-5585
> URL: https://issues.apache.org/jira/browse/YARN-5585
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
>  Labels: yarn-5355-merge-blocker
> Attachments: 0001-YARN-5585.patch, YARN-5585-YARN-5355.0001.patch, 
> YARN-5585-YARN-5355.0002.patch, YARN-5585-YARN-5355.0003.patch, 
> YARN-5585-YARN-5355.0004.patch, YARN-5585-YARN-5355.0005.patch, 
> YARN-5585-YARN-5355.0006.patch, YARN-5585-workaround.patch, YARN-5585.v0.patch
>
>
> TimelineReader REST API's provides lot of filters to retrieve the 
> applications. Along with those, it would be good to add new filter i.e fromId 
> so that entities can be retrieved after the fromId. 
> Current Behavior : Default limit is set to 100. If there are 1000 entities 
> then REST call gives first/last 100 entities. How to retrieve next set of 100 
> entities i.e 101 to 200 OR 900 to 801?
> Example : If applications are stored database, app-1 app-2 ... app-10.
> *getApps?limit=5* gives app-1 to app-5. But to retrieve next 5 apps, there is 
> no way to achieve this. 
> So proposal is to have fromId in the filter like 
> *getApps?limit=5&&fromId=app-5* which gives list of apps from app-6 to 
> app-10. 
> Since ATS is targeting large number of entities storage, it is very common 
> use case to get next set of entities using fromId rather than querying all 
> the entites. This is very useful for pagination in web UI.
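
A minimal sketch of the paging flow proposed above, assuming the getApps endpoint 
and filter names from the description (the exact inclusive/exclusive semantics of 
fromId were still being settled in this thread, so treat the URLs as illustrative):

{noformat}
# Page 1: first 5 apps
GET getApps?limit=5                 -> app-1 ... app-5
# Page 2: continue from the last id returned by the previous call
GET getApps?limit=5&fromId=app-5    -> app-6 ... app-10
{noformat}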



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps

2017-01-05 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801934#comment-15801934
 ] 

Varun Saxena commented on YARN-6027:


By the way, in YARN-5585 we have kept fromId and fromIdPrefix as inclusive. We 
can't keep fromId inclusive here, right? Can clients determine the next flowId, or 
should we let the reader side do it?

> Support fromId for flows/flowrun apps
> -
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
>
> In YARN-5585 , fromId is supported for retrieving entities. We need similar 
> filter for flows/flowRun apps and flow run and flow as well. 
> Along with supporting fromId, this JIRA should also discuss following points
> * Should we throw an exception for entities/entity retrieval if duplicates 
> found?
> * TimelieEntity :
> ** Should equals method also check for idPrefix?
> ** Does idPrefix is part of identifiers?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6057) yarn.scheduler.minimum-allocation describtion update required

2017-01-05 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801933#comment-15801933
 ] 

Daniel Templeton commented on YARN-6057:


In the case of both minimum and maximum, resource requests that are out of 
bounds are quietly adjusted to be in bounds.  (See 
{{DefaultResourceCalculator.normalize()}} and 
{{DominantResourceCalculator.normalize()}}.)  The minimum will also prevent NMs 
that have fewer vcores from starting.  (See 
{{ResourceTrackerService.registerNodeManager()}}.)
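
For reference, a minimal standalone sketch of that quiet round-up/clamp behavior 
(plain Java; the numbers and the helper below are illustrative assumptions, not 
the actual {{ResourceCalculator}} code):

{code}
// Illustrative only: mimics the "quietly adjust into bounds" behavior described
// above; it is not the Hadoop normalize() implementation.
public class NormalizeSketch {
  static long normalize(long requested, long min, long max, long step) {
    long atLeastMin = Math.max(requested, min);
    // round up to the nearest multiple of the increment
    long rounded = ((atLeastMin + step - 1) / step) * step;
    return Math.min(rounded, max);
  }

  public static void main(String[] args) {
    // e.g. minimum-allocation-mb=1024, maximum-allocation-mb=8192, increment=1024
    System.out.println(normalize(200, 1024, 8192, 1024));  // 1024: raised, no exception
    System.out.println(normalize(9000, 1024, 8192, 1024)); // 8192: capped, no exception
  }
}
{code}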

> yarn.scheduler.minimum-allocation describtion update required
> -
>
> Key: YARN-6057
> URL: https://issues.apache.org/jira/browse/YARN-6057
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Priority: Minor
>
> {code}
>   
> The minimum allocation for every container request at the RM,
> in terms of virtual CPU cores. Requests lower than this will throw a
> InvalidResourceRequestException.
> yarn.scheduler.minimum-allocation-vcores
> 1
>   
> {code}
> *Requests lower than this will throw a   InvalidResourceRequestException.* 
> Only incase of maximum allocation vcore and memory 
> InvalidResourceRequestException is thrown



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6057) yarn.scheduler.minimum-allocation describtion update required

2017-01-05 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton reassigned YARN-6057:
--

Assignee: Daniel Templeton

> yarn.scheduler.minimum-allocation describtion update required
> -
>
> Key: YARN-6057
> URL: https://issues.apache.org/jira/browse/YARN-6057
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Daniel Templeton
>Priority: Minor
>
> {code}
>   
> The minimum allocation for every container request at the RM,
> in terms of virtual CPU cores. Requests lower than this will throw a
> InvalidResourceRequestException.
> yarn.scheduler.minimum-allocation-vcores
> 1
>   
> {code}
> *Requests lower than this will throw a   InvalidResourceRequestException.* 
> Only incase of maximum allocation vcore and memory 
> InvalidResourceRequestException is thrown



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5585) [Atsv2] Reader side changes for entity prefix and support for pagination via additional filters

2017-01-05 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801940#comment-15801940
 ] 

Sangjin Lee commented on YARN-5585:
---

+1. Thanks [~rohithsharma]!

> [Atsv2] Reader side changes for entity prefix and support for pagination via 
> additional filters
> ---
>
> Key: YARN-5585
> URL: https://issues.apache.org/jira/browse/YARN-5585
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
>  Labels: yarn-5355-merge-blocker
> Attachments: 0001-YARN-5585.patch, YARN-5585-YARN-5355.0001.patch, 
> YARN-5585-YARN-5355.0002.patch, YARN-5585-YARN-5355.0003.patch, 
> YARN-5585-YARN-5355.0004.patch, YARN-5585-YARN-5355.0005.patch, 
> YARN-5585-YARN-5355.0006.patch, YARN-5585-workaround.patch, YARN-5585.v0.patch
>
>
> TimelineReader REST API's provides lot of filters to retrieve the 
> applications. Along with those, it would be good to add new filter i.e fromId 
> so that entities can be retrieved after the fromId. 
> Current Behavior : Default limit is set to 100. If there are 1000 entities 
> then REST call gives first/last 100 entities. How to retrieve next set of 100 
> entities i.e 101 to 200 OR 900 to 801?
> Example : If applications are stored database, app-1 app-2 ... app-10.
> *getApps?limit=5* gives app-1 to app-5. But to retrieve next 5 apps, there is 
> no way to achieve this. 
> So proposal is to have fromId in the filter like 
> *getApps?limit=5&&fromId=app-5* which gives list of apps from app-6 to 
> app-10. 
> Since ATS is targeting large number of entities storage, it is very common 
> use case to get next set of entities using fromId rather than querying all 
> the entites. This is very useful for pagination in web UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6057) yarn.scheduler.minimum-allocation-* and yarn.scheduler.maximum-allocation-* descriptions are incorrect about behavior when a request is out of bounds

2017-01-05 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated YARN-6057:
---
Summary: yarn.scheduler.minimum-allocation-* and 
yarn.scheduler.maximum-allocation-* descriptions are incorrect about behavior 
when a request is out of bounds  (was: yarn.scheduler.minimum-allocation 
describtion update required)

> yarn.scheduler.minimum-allocation-* and yarn.scheduler.maximum-allocation-* 
> descriptions are incorrect about behavior when a request is out of bounds
> -
>
> Key: YARN-6057
> URL: https://issues.apache.org/jira/browse/YARN-6057
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Daniel Templeton
>Priority: Minor
>
> {code}
>   
> The minimum allocation for every container request at the RM,
> in terms of virtual CPU cores. Requests lower than this will throw a
> InvalidResourceRequestException.
> yarn.scheduler.minimum-allocation-vcores
> 1
>   
> {code}
> *Requests lower than this will throw a   InvalidResourceRequestException.* 
> Only incase of maximum allocation vcore and memory 
> InvalidResourceRequestException is thrown



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3955) Support for priority ACLs in CapacityScheduler

2017-01-05 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-3955:
--
Attachment: YARN-3955.0008.patch

Thanks [~leftnoteasy] for the detailed comments. I have some more doubts here.

1) Common logic of checkAccess / getDefaultPriority can be merged further: both 
can get approvedPriority first.
>> Priority ACLs are stored in ascending order. So for checkAccess, we need to 
>> see whether the ACL matches and whether the submitted priority is lower than 
>> the configured priority. However, in case there are no priority ACL 
>> configurations, or ACLs are disabled, we still need to say the access check 
>> passed. For default priority, we loop through all configured priority ACLs 
>> and, if any ACL matches, we try to get the max-priority ACL group from which 
>> the default could be taken.
Do you mean that the code below can also be made common?

{noformat}
if (!isACLsEnable) {
  return true;
}
List acls = allAcls.get(queueName);
if (acls == null || acls.isEmpty()) {
  return true;
}
{noformat}

There is one issue here. If approvedPriorityACL comes back null, for checkAccess 
it means false. If we put the above code inside {{getPriorityPerUserACL}} as well, 
then we would expect it to return true when that returns null. Since the two 
conflict, I pulled it out. Maybe you could explain a bit further if I missed 
something.

2) As I commented above, are the changes to capacity-scheduler.xml related to the 
patch? I cannot find which module uses acl_access_priority in the configuration. If 
not, could you add the correct default value?
>> In {{CapacitySchedulerConfiguration.getAclKey(AccessType acl)}}, we try to 
>> get the priority ACL config from acl_access_priority, and that is used to 
>> parse and then populate the internal structures. By default I kept it *, but I 
>> have given an example below.
{noformat}
The ACL of who can submit applications with configured priority.
For e.g, [user={name} group={name} max_priority={priority} 
default_priority={priority}]
{noformat}
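
For illustration, a hypothetical capacity-scheduler.xml entry following that 
format (the property key is still under discussion in this thread, and the queue 
path, user, group and priorities below are assumptions):

{code}
<!-- Assumed key and values, for illustration only. -->
<property>
  <name>yarn.scheduler.capacity.root.default.acl_access_priority</name>
  <value>[user=alice group=analysts max_priority=4 default_priority=2]</value>
</property>
{code}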

3) CapacityScheduler:
* updateApplicationPriority should hold writeLock?
* similarly, checkAndGetApplicationPriority should hold readLock?
>> Done. Updated in the patch.

* checkAndGetApplicationPriority: when an app's priority is set to a negative 
value, I think we should use 0 instead of max. Thoughts?
{noformat}
  if (appPriority.compareTo(getMaxClusterLevelAppPriority()) < 0) {
appPriority = Priority
.newInstance(getMaxClusterLevelAppPriority().getPriority());
  }
{noformat}
This code will reset the priority to cluster-max only if the submitted priority is 
more than the cluster max. Since I used {{compareTo}}, it does not look very readable.
To your point, we have never worried much about negative priority as such, since we 
treat priority as an integer. Do you feel we need to make 0 the lowest priority?

4) AppPriorityACLsMgr:
* addPrioirityACLs, should we do "replace" instead of "add" to acl groups? If 
it is not intentional, could you add a test to make sure update of acls works? 
(like change from [1,2,3] to [1,3,4])
 >> Could I use a clear-and-add model instead? It may be easier. Thoughts? Updated 
 >> the patch as per this.

* getPriorityPerUserACL -> getMappedPriorityAclForUGI.
>> Done.
5) As I mentioned before, remove readlock of LQ#getPriorityAcls, final should 
be enough.
>> One doubt here. Since priorityAcls could also be updated in reinitialize, we 
>> can’t make it final, right? For example, in refreshQueue’s call flow.
6) YarnScheduler: why does the newly added method have SettableFuture in its 
parameters? It doesn't look very clean ...
>> I agree with you. But we are doing the state-store update within the scheduler, 
>> so we need to pass the future to see that an exception is thrown immediately. We 
>> had to add this while doing move-to-queue.


> Support for priority ACLs in CapacityScheduler
> --
>
> Key: YARN-3955
> URL: https://issues.apache.org/jira/browse/YARN-3955
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: ApplicationPriority-ACL.pdf, 
> ApplicationPriority-ACLs-v2.pdf, YARN-3955.0001.patch, YARN-3955.0002.patch, 
> YARN-3955.0003.patch, YARN-3955.0004.patch, YARN-3955.0005.patch, 
> YARN-3955.0006.patch, YARN-3955.0007.patch, YARN-3955.0008.patch, 
> YARN-3955.v0.patch, YARN-3955.v1.patch, YARN-3955.wip1.patch
>
>
> Support will be added for User-level access permission to use different 
> application-priorities. This is to avoid situations where all users try 
> running max priority in the cluster and thus degrading the value of 
> priorities.
> Access Control Lists can be set per priority level within each queue. Below 
> is an example configuration that can be added in capacity scheduler 
> configuration
> file for each Queue level.
> y

[jira] [Commented] (YARN-6056) Yarn NM using LCE shows a failure when trying to delete a non-existing dir

2017-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801998#comment-15801998
 ] 

Hadoop QA commented on YARN-6056:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  1m 
49s{color} | {color:red} root in branch-2.6.1 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed 
with JDK v1.8.0_111. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m  
6s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed 
with JDK v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed. 
{color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed. 
{color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m  
6s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m  
6s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with 
JDK v1.8.0_111. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  0m  6s{color} | 
{color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_111. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m  6s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_111. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m  
7s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with 
JDK v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  0m  7s{color} | 
{color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m  7s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  
8s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red}  0m  
8s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
2s{color} | {color:red} The patch has 765 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m 
19s{color} | {color:red} The patch 16 line(s) with tabs. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m  6s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
28s{color} | {color:red} The patch generated 47 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}  5m  2s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:date2017-01-05 |
| JIRA Issue | YARN-6056 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845723/YARN-6056-branch-2.6.1.patch
 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 71e6b70555b4 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2.6.1 / 41d19f4 |
| Default Java | 1.7.0_121 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_111 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_121 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/14567/artifact/patchprocess/branch-mvninstall-root.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/14567/artifact/patchprocess/branch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop

[jira] [Commented] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802113#comment-15802113
 ] 

Wangda Tan commented on YARN-5959:
--

Committing ...

> RM changes to support change of container ExecutionType
> ---
>
> Key: YARN-5959
> URL: https://issues.apache.org/jira/browse/YARN-5959
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5959-YARN-5085.001.patch, 
> YARN-5959-YARN-5085.002.patch, YARN-5959-YARN-5085.003.patch, 
> YARN-5959-YARN-5085.004.patch, YARN-5959-YARN-5085.005.patch, 
> YARN-5959.005.patch, YARN-5959.combined.001.patch, YARN-5959.wip.002.patch, 
> YARN-5959.wip.003.patch, YARN-5959.wip.patch
>
>
> RM side changes to allow an AM to ask for change of ExecutionType.
> Currently, there are two cases:
> # *Promotion* : OPPORTUNISTIC to GUARANTEED.
> # *Demotion* : GUARANTEED to OPPORTUNISTIC.
> This is similar in YARN-1197 which allows for change in Container resources. 
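
For context, a minimal sketch of what a promotion ask could look like from the AM 
side, assuming the UpdateContainerRequest/ContainerUpdateType API touched by this 
patch (the exact factory and client method signatures below are assumptions):

{code}
// Sketch only: promote a running OPPORTUNISTIC container to GUARANTEED.
UpdateContainerRequest promote = UpdateContainerRequest.newInstance(
    container.getVersion(),                       // container version known to the AM
    container.getId(),
    ContainerUpdateType.PROMOTE_EXECUTION_TYPE,
    null,                                         // no resource change requested
    ExecutionType.GUARANTEED);                    // target execution type
amRmClient.requestContainerUpdate(container, promote);
{code}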



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-01-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802134#comment-15802134
 ] 

Wangda Tan commented on YARN-6050:
--

[~rkanter],

bq. Though I think we should still throw an exception if there's no ANY 
request because otherwise, the client will be expecting a specific rack or 
node, and it won't be doing that, and they'll be left wondering why. An 
exception with a clear error message makes it more obvious what's happening.
I'm fine with either way, since the change you proposed could be treated as a 
bug fix instead of an incompatible behavior change.

> AMs can't be scheduled on racks or nodes
> 
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch, 
> YARN-6050.003.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there 
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, 
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or directly from 
> {{getAMContainerResourceRequest}}, depending on if 
> {{getAMContainerResourceRequest}} is null or not.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which takes a list of 
> {{ResourceRequest}} so that clients can specify the multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6058) Support for listing all applications i.e /apps

2017-01-05 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-6058:
---

 Summary: Support for listing all applications i.e /apps
 Key: YARN-6058
 URL: https://issues.apache.org/jira/browse/YARN-6058
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelinereader
Reporter: Rohith Sharma K S
Assignee: Rohith Sharma K S
Priority: Critical


The primary use case for /apps is that many execution engines run on top of YARN, 
for example Tez and MR. These engines have their own UIs which list the specific 
types of entities published by them, e.g. DAG entities. 
But these UIs are not aware of the userName, flowName, or applicationId of the 
applications submitted by these engines.
Currently, given that the user does not know the user, flowName, and applicationId, 
no entities can be retrieved. 

By supporting /apps with filters, a user can list applications with a given 
ApplicationType. These applications can then be used for retrieving engine-specific 
entities like DAGs. 
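
A hypothetical query shape for the proposed endpoint (the path and parameter names 
are assumptions following the existing reader REST conventions, not a finalized API):

{noformat}
# List apps of a given type without knowing user, flow name, or app id up front:
GET /ws/v2/timeline/clusters/{clusterid}/apps?applicationtype=TEZ&limit=50
# Then drill into engine-specific entities (e.g. DAGs) for one of the returned apps.
{noformat}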




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6040) Remove usage of ResourceRequest from AppSchedulerInfo / SchedulerApplicationAttempt

2017-01-05 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-6040:
-
Attachment: YARN-6040.006.patch

Attached 006 patch, rebased to latest trunk. [~asuresh] could you please 
review? 

> Remove usage of ResourceRequest from AppSchedulerInfo / 
> SchedulerApplicationAttempt
> ---
>
> Key: YARN-6040
> URL: https://issues.apache.org/jira/browse/YARN-6040
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-6040.001.patch, YARN-6040.002.patch, 
> YARN-6040.003.patch, YARN-6040.004.patch, YARN-6040.005.patch, 
> YARN-6040.006.patch
>
>
> As mentioned by YARN-5906, currently schedulers are using ResourceRequest 
> heavily so it will be very hard to adopt the new PowerfulResourceRequest 
> (YARN-4902).
> This JIRA is the 2nd step of refactoring, which remove usage of 
> ResourceRequest from AppSchedulingInfo / SchedulerApplicationAttempt. Instead 
> of returning ResourceRequest, it returns a lightweight and API-independent 
> object - {{PendingAsk}}.
> The only remained ResourceRequest API of AppSchedulingInfo will be used by 
> web service to get list of ResourceRequests.
> So after this patch, usage of ResourceRequest will be isolated inside 
> AppSchedulingInfo, so it will be more flexible to update internal data 
> structure and upgrade old ResourceRequest API to new.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-5234) ResourceManager REST API missing descriptions for what's returned when using Fair Scheduler

2017-01-05 Thread Grant Sohn (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Sohn resolved YARN-5234.
--
   Resolution: Fixed
Fix Version/s: 3.0.0-alpha1

Looked at latest docs and this has been addressed.

> ResourceManager REST API missing descriptions for what's returned when using 
> Fair Scheduler
> ---
>
> Key: YARN-5234
> URL: https://issues.apache.org/jira/browse/YARN-5234
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation, fairscheduler, resourcemanager
>Reporter: Grant Sohn
>Priority: Minor
> Fix For: 3.0.0-alpha1
>
>
> Cluster Scheduler API indicates support for Capacity and Fifo.  What's 
> missing is what would be returned if using Fair scheduling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2

2017-01-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802169#comment-15802169
 ] 

Wangda Tan commented on YARN-6041:
--

[~asuresh],

How do you plan to merge these changes to branch-2? I think it might be better to 
cherry-pick them one by one, and file a separate JIRA to address the new comments 
from [~kasha] and me. Committing this huge patch to branch-2 creates trouble for 
future maintenance.

Thoughts?

> Opportunistic containers : Combined patch for branch-2 
> ---
>
> Key: YARN-6041
> URL: https://issues.apache.org/jira/browse/YARN-6041
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-6041-branch-2.001.patch, 
> YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch
>
>
> This is a combined patch targeting branch-2 of the following JIRAs which have 
> already been committed to trunk :
> YARN-5938. Refactoring OpportunisticContainerAllocator to use 
> SchedulerRequestKey instead of Priority and other misc fixes
> YARN-5646. Add documentation and update config parameter names for scheduling 
> of OPPORTUNISTIC containers.
> YARN-5982. Simplify opportunistic container parameters and metrics.
> YARN-5918. Handle Opportunistic scheduling allocate request failure when NM 
> is lost.
> YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager 
> container lifecycle.
> YARN-5823. Update NMTokens in case of requests with only opportunistic 
> containers.
> YARN-5377. Fix 
> TestQueuingContainerManager.testKillMultipleOpportunisticContainers.
> YARN-2995. Enhance UI to show cluster resource utilization of various 
> container Execution types.
> YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http 
> Address.
> YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method 
> to handle OPPORTUNISTIC container requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5556) Support for deleting queues without requiring a RM restart

2017-01-05 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802176#comment-15802176
 ] 

Xuan Gong commented on YARN-5556:
-

[~Naganarasimha] Thanks for the comments

[~leftnoteasy] Please comment if you have any further suggestions.

bq. So user needs to delete a queue(say a2) then he needs to remove the queue 
from its parent's "yarn.scheduler.capacity..queues" config and 
also mention its state(yarn.scheduler.capacity..state) as DELETED 
right ?

You do not need to remove the queue from its parent's 
"yarn.scheduler.capacity..queues" config; just mark its 
state (yarn.scheduler.capacity..state) as DELETED. 

bq. How to delete intermediate queues? i presume we need NOT configure state 
for each of its children right ? or do we plan to support delete of only leaf 
queue?

We do not need to configure the state for each of its children; just mark the 
queue itself as deleted. 

bq. Do we need to consider the moving of queues(along with its apps) from one 
queue hiearchy to another ? IMO it complicates but not sure about the real 
world usecases.

We can consider this scenario later.

bq. In case of HA, i think it further complicates as if both the RM's are 
initialiased with old queue settings and then if new queue is updated then CS 
is aware of deleted queue else if the RM starts of with updated xml(with 
deleted queue) then deleted queue information is not available and if failover 
happens to this RM then apps running on the deleted queue cannot be recovered 
as the queue doesnt exist. so do we need to start maintaining the deleted queue 
in statestore or need handling of creating queue objects for the queues whose 
state has been marked as deleted (then we need to consider 2nd point) ?

Yes, this is the fundamental issue with the "configuration-based" approach. The 
API-based approach would solve this issue: 
https://issues.apache.org/jira/browse/YARN-5734. But for the "configuration-based" 
approach, in the RM HA case, we have to make sure the configuration file on every 
RM node is updated.

bq. do we need to consider showing of the deleted queues in the webui ? may be 
in another jira but the code needs to be updated.

Yes, we could file a separate jira, and do it later.

The basic workflow could be: before we can actually delete the queue, we should 
make sure the queue is in the STOPPED state, which means the queue cannot accept any 
new applications, and that all apps (including pending requests) have finished 
(for now, we could simply wait, or add a command/flag to force-kill later). 
Then we could delete the queue and redistribute its capacity, as sketched below.
Thanks

Xuan Gong

> Support for deleting queues without requiring a RM restart
> --
>
> Key: YARN-5556
> URL: https://issues.apache.org/jira/browse/YARN-5556
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Xuan Gong
>Assignee: Naganarasimha G R
> Attachments: YARN-5556.v1.001.patch, YARN-5556.v1.002.patch, 
> YARN-5556.v1.003.patch, YARN-5556.v1.004.patch
>
>
> Today, we could add or modify queues without restarting the RM, via a CS 
> refresh. But for deleting queue, we have to restart the ResourceManager. We 
> could support for deleting queues without requiring a RM restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4990) Re-direction of a particular log file within in a container in NM UI does not redirect properly to Log Server ( history ) on container completion

2017-01-05 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802185#comment-15802185
 ] 

Junping Du commented on YARN-4990:
--

I am fine with it.

> Re-direction of a particular log file within in a container in NM UI does not 
> redirect properly to Log Server ( history ) on container completion
> -
>
> Key: YARN-4990
> URL: https://issues.apache.org/jira/browse/YARN-4990
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Xuan Gong
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-4990.1.patch, YARN-4990.2.patch
>
>
> The NM does the redirection to the history server correctly. However if the 
> user is viewing or has a link to a particular specific file, the redirect 
> ends up going to the top level page for the container and not redirecting to 
> the specific file. Additionally, the start param to show logs from the offset 
> 0 also goes missing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2

2017-01-05 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802195#comment-15802195
 ] 

Arun Suresh commented on YARN-6041:
---

[~leftnoteasy], what I plan to do is, like you mentioned, cherry-pick the 
10 JIRAs specified in the description (I actually created the patch by doing 
just that and then doing a "git diff "). Then I will commit 
JUST the changes Karthik suggested as "YARN-6041: ..", which I will cherry-pick 
onto trunk as well.

> Opportunistic containers : Combined patch for branch-2 
> ---
>
> Key: YARN-6041
> URL: https://issues.apache.org/jira/browse/YARN-6041
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-6041-branch-2.001.patch, 
> YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch
>
>
> This is a combined patch targeting branch-2 of the following JIRAs which have 
> already been committed to trunk :
> YARN-5938. Refactoring OpportunisticContainerAllocator to use 
> SchedulerRequestKey instead of Priority and other misc fixes
> YARN-5646. Add documentation and update config parameter names for scheduling 
> of OPPORTUNISTIC containers.
> YARN-5982. Simplify opportunistic container parameters and metrics.
> YARN-5918. Handle Opportunistic scheduling allocate request failure when NM 
> is lost.
> YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager 
> container lifecycle.
> YARN-5823. Update NMTokens in case of requests with only opportunistic 
> containers.
> YARN-5377. Fix 
> TestQueuingContainerManager.testKillMultipleOpportunisticContainers.
> YARN-2995. Enhance UI to show cluster resource utilization of various 
> container Execution types.
> YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http 
> Address.
> YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method 
> to handle OPPORTUNISTIC container requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2

2017-01-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802212#comment-15802212
 ] 

Wangda Tan commented on YARN-6041:
--

[~asuresh], so it will generate 10 commits (plus one for suggestions from this 
JIRA), correct?

It will be better to create a separate JIRA to track kasha's suggestions and 
commit it separately (so we will have a JIRA number)

> Opportunistic containers : Combined patch for branch-2 
> ---
>
> Key: YARN-6041
> URL: https://issues.apache.org/jira/browse/YARN-6041
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-6041-branch-2.001.patch, 
> YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch
>
>
> This is a combined patch targeting branch-2 of the following JIRAs which have 
> already been committed to trunk :
> YARN-5938. Refactoring OpportunisticContainerAllocator to use 
> SchedulerRequestKey instead of Priority and other misc fixes
> YARN-5646. Add documentation and update config parameter names for scheduling 
> of OPPORTUNISTIC containers.
> YARN-5982. Simplify opportunistic container parameters and metrics.
> YARN-5918. Handle Opportunistic scheduling allocate request failure when NM 
> is lost.
> YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager 
> container lifecycle.
> YARN-5823. Update NMTokens in case of requests with only opportunistic 
> containers.
> YARN-5377. Fix 
> TestQueuingContainerManager.testKillMultipleOpportunisticContainers.
> YARN-2995. Enhance UI to show cluster resource utilization of various 
> container Execution types.
> YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http 
> Address.
> YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method 
> to handle OPPORTUNISTIC container requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802237#comment-15802237
 ] 

Hudson commented on YARN-5959:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11075 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11075/])
YARN-5959. RM changes to support change of container ExecutionType. (wangda: 
rev 0a55bd841ec0f2eb89a0383f4c589526e8b138d4)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/UpdateContainerRequest.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/OpportunisticContainerAllocatorAMService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoAppAttempt.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClientOnRMRestart.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/TestRMContainerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java
* (edit) 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSCapacityScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* (edit) 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/scheduler/OpportunisticContainerContext.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerUpdateType.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 

[jira] [Updated] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-01-05 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-6050:

Attachment: YARN-6050.004.patch

The 004 patch
- Removes the {{getAMContainerResourceRequest}} check that [~leftnoteasy] 
pointed out
- Adds a check that there is at least one {{ResourceRequest}} or a {{Resource}} 
set, plus a test.
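
For illustration, the kind of client-side usage the new API is meant to enable, 
assuming the {{getAMContainerResourceRequests}} accessor pair proposed in the 
description (the setter name below follows the proposal and is not a released API; 
PRIORITY and CAPABILITY are the placeholders used in the description):

{code}
// Sketch: restrict the AM to rackA by pairing an ANY request with
// relaxLocality=false and a rack-level request with relaxLocality=true.
List<ResourceRequest> amRequests = new ArrayList<>();
amRequests.add(ResourceRequest.newInstance(
    PRIORITY, ResourceRequest.ANY, CAPABILITY, 1, false));
amRequests.add(ResourceRequest.newInstance(
    PRIORITY, "rackA", CAPABILITY, 1, true));
appContext.setAMContainerResourceRequests(amRequests);   // proposed setter
{code}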

> AMs can't be scheduled on racks or nodes
> 
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch, 
> YARN-6050.003.patch, YARN-6050.004.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there 
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, 
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or directly from 
> {{getAMContainerResourceRequest}}, depending on if 
> {{getAMContainerResourceRequest}} is null or not.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which takes a list of 
> {{ResourceRequest}} so that clients can specify the multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5556) Support for deleting queues without requiring a RM restart

2017-01-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802255#comment-15802255
 ] 

Wangda Tan commented on YARN-5556:
--

Just discussed offline with [~xgong]. 

I think we don't need the additional DELETED state. First, it generates some 
maintenance overhead; for example, we need to maintain state transitions to/from 
the DELETED state. And since, by design, a queue can be deleted only if the queue 
is stopped and no app is running, the impact of a typo should be minimal. Our 
preference is to simply remove the queue from the config.

And for re-distribution of a stopped/deleted queue's capacity: for a deleted queue 
it should be obvious: since the queue is gone, the sum of its siblings should be 
100. For a stopped queue, our expectation is that it will be reactivated at some 
point, so it will be better to keep its capacity as-is, and the admin can update 
the max-capacity of its siblings to make sure the queue capacity can be utilized. 

I think we need to update design doc to make it up-to-date.

Thoughts? 

> Support for deleting queues without requiring a RM restart
> --
>
> Key: YARN-5556
> URL: https://issues.apache.org/jira/browse/YARN-5556
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Xuan Gong
>Assignee: Naganarasimha G R
> Attachments: YARN-5556.v1.001.patch, YARN-5556.v1.002.patch, 
> YARN-5556.v1.003.patch, YARN-5556.v1.004.patch
>
>
> Today, we could add or modify queues without restarting the RM, via a CS 
> refresh. But for deleting queue, we have to restart the ResourceManager. We 
> could support for deleting queues without requiring a RM restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5280) Allow YARN containers to run with Java Security Manager

2017-01-05 Thread Greg Phillips (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Phillips updated YARN-5280:

Attachment: YARN-5280.006.patch

> Allow YARN containers to run with Java Security Manager
> ---
>
> Key: YARN-5280
> URL: https://issues.apache.org/jira/browse/YARN-5280
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Affects Versions: 2.6.4
>Reporter: Greg Phillips
>Assignee: Greg Phillips
>Priority: Minor
>  Labels: oct16-medium
> Attachments: YARN-5280.001.patch, YARN-5280.002.patch, 
> YARN-5280.003.patch, YARN-5280.004.patch, YARN-5280.005.patch, 
> YARN-5280.006.patch, YARN-5280.patch, YARNContainerSandbox.pdf
>
>
> YARN applications have the ability to perform privileged actions which have 
> the potential to add instability into the cluster. The Java Security Manager 
> can be used to prevent users from running privileged actions while still 
> allowing their core data processing use cases. 
> Introduce a YARN flag which will allow a Hadoop administrator to enable the 
> Java Security Manager for user code, while still providing complete 
> permissions to core Hadoop libraries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3955) Support for priority ACLs in CapacityScheduler

2017-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802298#comment-15802298
 ] 

Hadoop QA commented on YARN-3955:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
37s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 42s{color} | {color:orange} root: The patch generated 13 new + 484 unchanged 
- 2 fixed = 497 total (was 486) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
31s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 1 new + 913 unchanged - 0 fixed = 914 total (was 913) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
30s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 39m 
37s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
5s{color} | {color:green} hadoop-sls in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}124m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-3955 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845827/YARN-3955.0008.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 25481fef2165 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / a605ff3 |
| Default Java | 1.8.0_111

[jira] [Updated] (YARN-5964) Lower the granularity of locks in FairScheduler

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5964:
-
Fix Version/s: (was: 2.7.1)

> Lower the granularity of locks in FairScheduler
> ---
>
> Key: YARN-5964
> URL: https://issues.apache.org/jira/browse/YARN-5964
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.7.1
> Environment: CentOS-7.1
>Reporter: zhengchenyu
>Priority: Critical
>   Original Estimate: 2m
>  Remaining Estimate: 2m
>
> When too many applications are running, we found that clients couldn't submit 
> applications and the call queue length on port 8032 was high. I captured a jstack 
> of the ResourceManager when the call queue length was high and found that the 
> "IPC Server handler xxx on 8032" threads were waiting for the FairScheduler 
> object lock, while nodeUpdate held it. The long processing time is probably what 
> prevents clients from submitting applications. 
> Here I don't consider the client-submission problem itself, only the performance 
> of the FairScheduler. Too many functions need the object lock, and the 
> granularity of the lock is too coarse. For example, nodeUpdate and getAppWeight 
> want to hold the same object lock, which is unreasonable and inefficient. I 
> recommend replacing the current lock with finer-grained locks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6032) SharedCacheManager cleaner task should rm InMemorySCMStore some cachedResources which does not exists in hdfs fs

2017-01-05 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802329#comment-15802329
 ] 

Junping Du commented on YARN-6032:
--

Remove fix version as the jira haven't be resolved.

>  SharedCacheManager cleaner task should rm InMemorySCMStore some 
> cachedResources which does not exists in hdfs fs
> -
>
> Key: YARN-6032
> URL: https://issues.apache.org/jira/browse/YARN-6032
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Zhaofei Meng
>
> If cached resources exist in the SCM but no longer exist in HDFS, they will not 
> be removed from the SCM until the SCM is restarted. So we should add a check to 
> the cleaner task that removes cached resources whose files no longer exist in 
> HDFS, as sketched below.
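A minimal sketch of such a check, assuming a hypothetical cleaner class and a plain
map standing in for the real InMemorySCMStore; only FileSystem.exists is a real
Hadoop API here:

{code}
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical cleaner step; the map-based "store" is a placeholder,
// not the real InMemorySCMStore API.
public class StaleResourceCleaner {
  private final FileSystem fs;

  public StaleResourceCleaner(FileSystem fs) {
    this.fs = fs;
  }

  /** Drop cached entries whose backing file no longer exists in HDFS. */
  public void removeStaleEntries(Map<String, Path> cachedResources)
      throws IOException {
    Iterator<Map.Entry<String, Path>> it =
        cachedResources.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Path> entry = it.next();
      if (!fs.exists(entry.getValue())) {
        it.remove(); // the real cleaner would go through the SCM store API
      }
    }
  }
}
{code}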



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-3955) Support for priority ACLs in CapacityScheduler

2017-01-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802330#comment-15802330
 ] 

Wangda Tan edited comment on YARN-3955 at 1/5/17 7:45 PM:
--

1)
bq. There is one issue here. If approvedPriorityACL comes are null, for 
checkAccess it means false.
Ok gotcha, my bad, we cannot merge the two. 

2)
Got it, not related to your patch. The previous design of "acl-key" is bad, it 
will be hard to find which code path uses it...
In addition, I didn't see a test case that parses raw priority ACLs (string) 
into a List of PriorityACLGroup. Could you point me to any existing test cases?

Few renaming suggestions: 
- PriorityACLConfiguration \-> AppPriorityACLConfigurationParser (I was trying 
to find where's the parser code, and since we're adding queue priority 
YARN-5864, so it will be better to add an App\- to distinguish that) 
- Similarly, PriorityAclConfig -> AppPriorityACLOwnerType (or any better 
name?)
- PriorityACLGroup -> AppPriorityACLGroup
- Do you think it is better to rename acl_access_priority to 
acl_app_max_priority? 

3)
bq. This code will reset to cluster-max priority only if submitted priority is 
more than cluster max. Since I used compareTo, it not looks very readable.
Yeah, since we're using Priority in different ways, sometimes lower is more 
important and sometimes higher is more important. Could you use ">" to do the 
comparison?

bq. checkAndGetApplicationPriority: when an app's priority set to negative, I 
think we should use 0 instead of max. Thoughts?
Negative value looks fine, since app can set lower priority if needed.

4)
bq. Could I add a clear model. It may be more easy. Thoughts? Updated patch as 
per this. 
Not quite sure what you meant. From my understanding, the existing logic reads 
ACLs from the configs during refreshQueues, and what we need to do is replace all 
ACLs instead of appending to the previous ACL list, correct? 

bq. One doubt here. Since priorityAcls could also be updated in reinitialize, 
we can’t make it as final rt. refreshQueue’s call flow for eg.
Since the returned list can be modified by another thread, the readLock alone 
cannot provide enough protection. The better way might be readLock + copying the list.

bq. But we are doing statestore update within scheduler. Hence we need to pass 
future to see exception is thrown immediately. Hence we had to add this while 
doing move to queue.
Makes sense.


was (Author: leftnoteasy):
1)
bq. There is one issue here. If approvedPriorityACL comes are null, for 
checkAccess it means false.
Ok gotcha, my bad, we cannot merge the two. 

2)
Got it, not related to your patch. The previous design of "acl-key" is bad, it 
will be hard to find which code path uses it...
In addition, I didn't see a test case that parses raw priority ACLs (string) 
into a List of PriorityACLGroup. Could you point me to any existing test cases?

Few renaming suggestions: 
- PriorityACLConfiguration -> AppPriorityACLConfigurationParser (I was trying 
to find where's the parser code, and since we're adding queue priority 
YARN-5864, so it will be better to add an App- to distinguish that) 
- Similarly, PriorityAclConfig -> AppPriorityACLOwnerType (or any better 
name?)
- PriorityACLGroup -> AppPriorityACLGroup
- Do you think it is better to rename acl_access_priority to 
acl_app_max_priority? 

3)
bq. This code will reset to cluster-max priority only if submitted priority is 
more than cluster max. Since I used compareTo, it not looks very readable.
Yeah, since we're using Priority in different ways, sometimes lower is more 
important and sometimes higher is more important. Could you use ">" to do the 
comparison?

bq. checkAndGetApplicationPriority: when an app's priority set to negative, I 
think we should use 0 instead of max. Thoughts?
Negative value looks fine, since app can set lower priority if needed.

4)
bq. Could I add a clear model. It may be more easy. Thoughts? Updated patch as 
per this. 
Not quite sure what you meant. From my understanding, the existing logic reads 
ACLs from the configs during refreshQueues, and what we need to do is replace all 
ACLs instead of appending to the previous ACL list, correct? 

bq. One doubt here. Since priorityAcls could also be updated in reinitialize, 
we can’t make it as final rt. refreshQueue’s call flow for eg.
Since the returned list can be modified by another thread, the readLock alone 
cannot provide enough protection. The better way might be readLock + copying the list.

bq. But we are doing statestore update within scheduler. Hence we need to pass 
future to see exception is thrown immediately. Hence we had to add this while 
doing move to queue.
Makes sense.

> Support for priority ACLs in CapacityScheduler
> --
>
> Key: YARN-3955
> URL: https://issues.apache.org/jira/browse/YARN-3955
> Pro

[jira] [Updated] (YARN-6032) SharedCacheManager cleaner task should rm InMemorySCMStore some cachedResources which does not exists in hdfs fs

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-6032:
-
Fix Version/s: (was: 2.7.1)

>  SharedCacheManager cleaner task should rm InMemorySCMStore some 
> cachedResources which does not exists in hdfs fs
> -
>
> Key: YARN-6032
> URL: https://issues.apache.org/jira/browse/YARN-6032
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Zhaofei Meng
>
> If cacheresources exist in scm but not exist in hdfs,the cacheresources  
> whill not rm from scm until restart scm.So we shoult add check funcion in 
> cleaner task that  rm the cachedResources which does not exists in hdfs fs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3955) Support for priority ACLs in CapacityScheduler

2017-01-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802330#comment-15802330
 ] 

Wangda Tan commented on YARN-3955:
--

1)
bq. There is one issue here. If approvedPriorityACL comes are null, for 
checkAccess it means false.
Ok gotcha, my bad, we cannot merge the two. 

2)
Got it, not related to your patch. The previous design of "acl-key" is bad, it 
will be hard to find which code path uses it...
In addition, I didn't see a test case that parses raw priority ACLs (string) 
into a List of PriorityACLGroup. Could you point me to any existing test cases?

Few renaming suggestions: 
- PriorityACLConfiguration -> AppPriorityACLConfigurationParser (I was trying 
to find where's the parser code, and since we're adding queue priority 
YARN-5864, so it will be better to add an App- to distinguish that) 
- Similarly, PriorityAclConfig -> AppPriorityACLOwnerType (or any better 
name?)
- PriorityACLGroup -> AppPriorityACLGroup
- Do you think it is better to rename acl_access_priority to 
acl_app_max_priority? 

3)
bq. This code will reset to cluster-max priority only if submitted priority is 
more than cluster max. Since I used compareTo, it not looks very readable.
Yeah, since we're using Priority in different ways, sometimes lower is more 
important and sometimes higher is more important. Could you use ">" to do the 
comparison?

bq. checkAndGetApplicationPriority: when an app's priority set to negative, I 
think we should use 0 instead of max. Thoughts?
Negative value looks fine, since app can set lower priority if needed.

4)
bq. Could I add a clear model. It may be more easy. Thoughts? Updated patch as 
per this. 
Not quite sure what you meant. From my understanding, the existing logic reads 
ACLs from the configs during refreshQueues, and what we need to do is replace all 
ACLs instead of appending to the previous ACL list, correct? 

bq. One doubt here. Since priorityAcls could also be updated in reinitialize, 
we can’t make it as final rt. refreshQueue’s call flow for eg.
Since the returned list can be modified by another thread, the readLock alone 
cannot provide enough protection. The better way might be readLock + copying the 
list (see the sketch below).

bq. But we are doing statestore update within scheduler. Hence we need to pass 
future to see exception is thrown immediately. Hence we had to add this while 
doing move to queue.
Makes sense.
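A minimal sketch of the read-lock-plus-copy pattern mentioned above, using a
hypothetical holder class rather than the actual CapacityScheduler code:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical holder; names are placeholders, not the actual scheduler code.
public class PriorityAclHolder {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private List<String> priorityAcls = new ArrayList<>();

  // Reinitialize/refreshQueues path: replace the whole list under the write lock.
  public void setPriorityAcls(List<String> newAcls) {
    lock.writeLock().lock();
    try {
      priorityAcls = new ArrayList<>(newAcls);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Read path: return a copy so callers never see (or mutate) the live list.
  public List<String> getPriorityAcls() {
    lock.readLock().lock();
    try {
      return new ArrayList<>(priorityAcls);
    } finally {
      lock.readLock().unlock();
    }
  }
}
{code}

Callers then work on their own snapshot, so a concurrent refreshQueues or
reinitialize cannot invalidate the list they are iterating over.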

> Support for priority ACLs in CapacityScheduler
> --
>
> Key: YARN-3955
> URL: https://issues.apache.org/jira/browse/YARN-3955
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: ApplicationPriority-ACL.pdf, 
> ApplicationPriority-ACLs-v2.pdf, YARN-3955.0001.patch, YARN-3955.0002.patch, 
> YARN-3955.0003.patch, YARN-3955.0004.patch, YARN-3955.0005.patch, 
> YARN-3955.0006.patch, YARN-3955.0007.patch, YARN-3955.0008.patch, 
> YARN-3955.v0.patch, YARN-3955.v1.patch, YARN-3955.wip1.patch
>
>
> Support will be added for user-level access permissions to use different 
> application priorities. This is to avoid situations where all users try to run 
> at the maximum priority in the cluster and thus degrade the value of 
> priorities.
> Access Control Lists can be set per priority level within each queue. Below 
> is an example configuration that can be added to the capacity scheduler 
> configuration file for each queue level.
> yarn.scheduler.capacity.root...acl=user1,user2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5831) Propagate allowPreemptionFrom flag all the way down to the app

2017-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802328#comment-15802328
 ] 

Hadoop QA commented on YARN-5831:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 21s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 46 unchanged - 0 fixed = 47 total (was 46) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
21s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 1 new + 913 unchanged - 0 fixed = 914 total (was 913) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 41m 34s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 18s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5831 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845682/YARN-5831.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 32d87da1364c 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 0a55bd8 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14568/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/14568/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/14568/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https

[jira] [Commented] (YARN-5955) Use threadpool or multiple thread to recover app

2017-01-05 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802336#comment-15802336
 ] 

Junping Du commented on YARN-5955:
--

Remove fix version as JIRA hasn't been resolved.

> Use threadpool or multiple thread to recover app
> 
>
> Key: YARN-5955
> URL: https://issues.apache.org/jira/browse/YARN-5955
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: Zhaofei Meng
>Assignee: Ajith S
>
> Currently, app recovery happens one application at a time; using a thread pool 
> can make recovery faster, as sketched below.
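A minimal sketch of the idea, assuming a hypothetical recovery class; the fixed
pool size and recoverApplication() are placeholders, not the real RM recovery API:

{code}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; recoverApplication() and the app-state type are
// placeholders, not the actual ResourceManager recovery code.
public class ParallelAppRecovery {
  public void recoverAll(List<String> appStates) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(10);
    for (final String appState : appStates) {
      pool.submit(new Runnable() {
        @Override
        public void run() {
          recoverApplication(appState); // recover one app independently
        }
      });
    }
    pool.shutdown();                              // no new tasks
    pool.awaitTermination(10, TimeUnit.MINUTES);  // wait for all recoveries
  }

  private void recoverApplication(String appState) {
    // placeholder for per-application recovery work
  }
}
{code}

The pool size and error handling would need tuning against the state store's own
limits; this only shows the shape of the change.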



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5936) when cpu strict mode is closed, yarn couldn't assure scheduling fairness between containers

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5936:
-
Fix Version/s: (was: 2.7.1)

> when cpu strict mode is closed, yarn couldn't assure scheduling fairness 
> between containers
> ---
>
> Key: YARN-5936
> URL: https://issues.apache.org/jira/browse/YARN-5936
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.1
> Environment: CentOS7.1
>Reporter: zhengchenyu
>Priority: Critical
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> When using LinuxContainer, setting 
> "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage" to 
> true can assure scheduling fairness via the cgroup CPU bandwidth controller. But 
> in our experience the cgroup CPU bandwidth limit leads to bad performance. 
> Without the cgroup CPU bandwidth limit, cgroup cpu.shares is our only way to 
> assure scheduling fairness, but it is not completely effective. For example, 
> if two containers have the same vcores (and therefore the same cpu.shares), one 
> single-threaded and the other multi-threaded, the multi-threaded container 
> will get more CPU time. That is unreasonable!
> Here is my test case: I submit two distributedshell applications with the two 
> commands below:
> {code}
> hadoop jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar 
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar 
> -shell_script ./run.sh  -shell_args 10 -num_containers 1 -container_memory 
> 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
> hadoop jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar 
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar 
> -shell_script ./run.sh  -shell_args 1  -num_containers 1 -container_memory 
> 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
> {code}
> Here are the CPU times of the two containers:
> {code}
>   PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND
> 15448 yarn  20   0 9059592  28336   9180 S 998.7  0.1  24:09.30 java
> 15026 yarn  20   0 9050340  27480   9188 S 100.0  0.1   3:33.97 java
> 13767 yarn  20   0 1799816 381208  18528 S   4.6  1.2   0:30.55 java
>77 root  rt   0   0  0  0 S   0.3  0.0   0:00.74 
> migration/1   
> {code}
> We find that the CPU time of the multi-threaded container is about ten times 
> that of the single-threaded container, even though the two containers have the 
> same cpu.shares.
> notes:
> run.sh
> {code} 
>   java -cp /home/yarn/loop.jar:$CLASSPATH loop.loop $1
> {code} 
> loop.java
> {code} 
> package loop;
> public class loop {
>   public static void main(String[] args) {
>   // TODO Auto-generated method stub
>   int loop = 1;
>   if(args.length>=1) {
>   System.out.println(args[0]);
>   loop = Integer.parseInt(args[0]);
>   }
>   for(int i=0;i<loop;i++) {
>   System.out.println("start thread " + i);
>   new Thread(new Runnable() {
>   @Override
>   public void run() {
>   // TODO Auto-generated method stub
>   int j=0;
>   while(true){j++;}
>   }
>   }).start();
>   }
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5955) Use threadpool or multiple thread to recover app

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5955:
-
Fix Version/s: (was: 2.7.1)

> Use threadpool or multiple thread to recover app
> 
>
> Key: YARN-5955
> URL: https://issues.apache.org/jira/browse/YARN-5955
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: Zhaofei Meng
>Assignee: Ajith S
>
> Currently, app recovery happens one application at a time; using a thread pool 
> can make recovery faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5846) Improve the fairscheduler attemptScheduler

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5846:
-
Fix Version/s: (was: 2.7.1)

> Improve the fairscheduler attemptScheduler 
> ---
>
> Key: YARN-5846
> URL: https://issues.apache.org/jira/browse/YARN-5846
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.7.1
> Environment: CentOS-7.1
>Reporter: zhengchenyu
>Priority: Critical
>  Labels: fairscheduler
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> When assigning a container, we must consider two factors:
> (1) sort the queues and applications and select the proper request; 
> (2) then ensure the request's host is exactly this node (data locality), 
> or skip this loop.
> This algorithm treats sorting the queues and applications as the primary factor. 
> When YARN considers data locality, for example with 
> yarn.scheduler.fair.locality.threshold.node=1 and 
> yarn.scheduler.fair.locality.threshold.rack=1 (or 
> yarn.scheduler.fair.locality-delay-rack-ms and 
> yarn.scheduler.fair.locality-delay-node-ms set very large) and lots of 
> applications are running, the process of assigning containers becomes very slow.
> I think data locality is more important than the order of the queues and 
> applications. 
> I would like a new algorithm like this (see the sketch after this list):
>   (1) when the ResourceManager accepts a new request, notify the RMNodeImpl 
> and record the association between the RMNode and the request;
>   (2) when assigning containers for a node, assign containers directly from 
> the RMNodeImpl's association between the RMNode and the requests;
>   (3) then consider the priority of the queue and application: within one 
> RMNodeImpl object, sort the associated requests;
>   (4) the sorting in the current algorithm is expensive, especially when 
> lots of applications are running and sorting is called many times, so I think 
> we should sort the queues and applications in a daemon thread, since a small 
> error in the queues' ordering is tolerable.
>   
>   
>   
>   
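A minimal sketch of the proposed node-to-request association, with plain String
keys and values standing in for the real RMNode and request types:

{code}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical data structure only; String keys and values stand in for the
// real RMNode and request types.
public class NodeRequestIndex {
  private final Map<String, List<String>> requestsByNode =
      new ConcurrentHashMap<>();

  // (1) When a new request arrives, record it against each node it names.
  public void addRequest(String nodeId, String request) {
    requestsByNode
        .computeIfAbsent(nodeId, k -> new CopyOnWriteArrayList<>())
        .add(request);
  }

  // (2) When a node heartbeats, look up its pending requests directly instead
  // of re-sorting every queue and application on each assignment.
  public List<String> getPendingRequests(String nodeId) {
    return requestsByNode.getOrDefault(nodeId, new CopyOnWriteArrayList<>());
  }
}
{code}

A real scheduler would also need to prune entries when requests are satisfied or
cancelled, and to order each node's list by queue and application priority.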



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5846) Improve the fairscheduler attemptScheduler

2017-01-05 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802340#comment-15802340
 ] 

Junping Du commented on YARN-5846:
--

Hi, we shouldn't set the fix version here unless the commit gets checked in.

> Improve the fairscheduler attemptScheduler 
> ---
>
> Key: YARN-5846
> URL: https://issues.apache.org/jira/browse/YARN-5846
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.7.1
> Environment: CentOS-7.1
>Reporter: zhengchenyu
>Priority: Critical
>  Labels: fairscheduler
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> When assigning a container, we must consider two factors:
> (1) sort the queues and applications and select the proper request; 
> (2) then ensure the request's host is exactly this node (data locality), 
> or skip this loop.
> This algorithm treats sorting the queues and applications as the primary factor. 
> When YARN considers data locality, for example with 
> yarn.scheduler.fair.locality.threshold.node=1 and 
> yarn.scheduler.fair.locality.threshold.rack=1 (or 
> yarn.scheduler.fair.locality-delay-rack-ms and 
> yarn.scheduler.fair.locality-delay-node-ms set very large) and lots of 
> applications are running, the process of assigning containers becomes very slow.
> I think data locality is more important than the order of the queues and 
> applications. 
> I would like a new algorithm like this:
>   (1) when the ResourceManager accepts a new request, notify the RMNodeImpl 
> and record the association between the RMNode and the request;
>   (2) when assigning containers for a node, assign containers directly from 
> the RMNodeImpl's association between the RMNode and the requests;
>   (3) then consider the priority of the queue and application: within one 
> RMNodeImpl object, sort the associated requests;
>   (4) the sorting in the current algorithm is expensive, especially when 
> lots of applications are running and sorting is called many times, so I think 
> we should sort the queues and applications in a daemon thread, since a small 
> error in the queues' ordering is tolerable.
>   
>   
>   
>   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3795) ZKRMStateStore crashes due to IOException: Broken pipe

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3795:
-
Fix Version/s: (was: 2.7.1)

> ZKRMStateStore crashes due to IOException: Broken pipe
> --
>
> Key: YARN-3795
> URL: https://issues.apache.org/jira/browse/YARN-3795
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.5.0
>Reporter: lachisis
>Priority: Critical
>
> 2015-06-05 06:06:54,848 INFO org.apache.zookeeper.ClientCnxn: Socket 
> connection established to dap88/134.41.33.88:2181, initiating session
> 2015-06-05 06:06:54,876 INFO org.apache.zookeeper.ClientCnxn: Session 
> establishment complete on server dap88/134.41.33.88:2181, sessionid = 
> 0x34db2f72ac50c86, negotiated timeout = 1
> 2015-06-05 06:06:54,881 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Watcher event type: None with state:SyncConnected for path:null for Service 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-06-05 06:06:54,881 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-05 06:06:54,881 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-05 06:06:54,881 WARN org.apache.zookeeper.ClientCnxn: Session 
> 0x34db2f72ac50c86 for server dap88/134.41.33.88:2181, unexpected error, 
> closing socket connection and attempting reconnect
> java.io.IOException: Broken pipe
>   at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>   at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>   at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
>   at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>   at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
> 2015-06-05 06:06:54,986 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Watcher event type: None with state:Disconnected for path:null for Service 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-06-05 06:06:54,986 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session disconnected
> 2015-06-05 06:06:55,278 INFO org.apache.zookeeper.ClientCnxn: Opening socket 
> connection to server dap87/134.41.33.87:2181. Will not attempt to 
> authenticate using SASL (unknown error)
> 2015-06-05 06:06:55,278 INFO org.apache.zookeeper.ClientCnxn: Socket 
> connection established to dap87/134.41.33.87:2181, initiating session
> 2015-06-05 06:06:55,330 INFO org.apache.zookeeper.ClientCnxn: Session 
> establishment complete on server dap87/134.41.33.87:2181, sessionid = 
> 0x34db2f72ac50c86, negotiated timeout = 1
> 2015-06-05 06:06:55,343 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Watcher event type: None with state:SyncConnected for path:null for Service 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-06-05 06:06:55,343 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-05 06:06:55,344 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-05 06:06:55,345 WARN org.apache.zookeeper.ClientCnxn: Session 
> 0x34db2f72ac50c86 for server dap87/134.41.33.87:2181, unexpected error, 
> closing socket connection and attempting reconnect
> java.io.IOException: Broken pipe
>   at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>   at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>   at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
>   at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>   at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-6040) Remove usage of ResourceRequest from AppSchedulerInfo / SchedulerApplicationAttempt

2017-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802354#comment-15802354
 ] 

Hadoop QA commented on YARN-6040:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 33s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 26 new + 954 unchanged - 19 fixed = 980 total (was 973) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 40m 42s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 65m 41s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6040 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845837/YARN-6040.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 9c22d01c90a5 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 0a55bd8 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14569/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/14569/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14569/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-B

[jira] [Updated] (YARN-3614) FileSystemRMStateStore throw exception when failed to remove application, that cause resourcemanager to crash

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3614:
-
Fix Version/s: (was: 2.7.1)

> FileSystemRMStateStore throw exception when failed to remove application, 
> that cause resourcemanager to crash
> -
>
> Key: YARN-3614
> URL: https://issues.apache.org/jira/browse/YARN-3614
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.5.0, 2.7.0
>Reporter: lachisis
>Priority: Critical
>  Labels: patch
> Attachments: YARN-3614-1.patch
>
>
> FileSystemRMStateStore is only an auxiliary plug-in of the RM state store. 
> When it fails to remove an application, I think a warning is enough (see the 
> sketch after the log below), but currently the ResourceManager crashes.
> Recently, I configured 
> "yarn.resourcemanager.state-store.max-completed-applications" to limit the 
> number of applications kept in the state store. When the number of applications 
> exceeds the limit, some old applications are removed. If the removal fails, the 
> ResourceManager crashes.
> The following is the log: 
> 2015-05-11 06:58:43,815 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Removing 
> info for app: application_1430994493305_0053
> 2015-05-11 06:58:43,815 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore:
>  Removing info for app: application_1430994493305_0053 at: 
> /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053
> 2015-05-11 06:58:43,816 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error 
> removing app: application_1430994493305_0053
> java.lang.Exception: Failed to delete 
> /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:572)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:471)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:185)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:171)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:806)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:879)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:874)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
> at java.lang.Thread.run(Thread.java:745)
> 2015-05-11 06:58:43,819 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a 
> org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type 
> STATE_STORE_OP_FAILED. Cause:
> java.lang.Exception: Failed to delete 
> /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:572)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:471)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:185)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:171)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateM
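A minimal sketch of the "warn instead of crash" behavior described above, using the
public Hadoop FileSystem API; the class and its wiring are hypothetical, not the
actual FileSystemRMStateStore code path:

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch; not the actual FileSystemRMStateStore implementation.
public class TolerantAppRemover {
  private static final Logger LOG =
      LoggerFactory.getLogger(TolerantAppRemover.class);

  private final FileSystem fs;

  public TolerantAppRemover(FileSystem fs) {
    this.fs = fs;
  }

  public void removeApplicationDir(Path appDir) {
    try {
      if (!fs.delete(appDir, true) && fs.exists(appDir)) {
        // Log and move on rather than surfacing a fatal state-store event.
        LOG.warn("Failed to delete app state dir " + appDir + ", continuing");
      }
    } catch (IOException e) {
      LOG.warn("Error deleting app state dir " + appDir + ", continuing", e);
    }
  }
}
{code}
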

[jira] [Updated] (YARN-3550) Improve YARN RM REST API error messages

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3550:
-
Fix Version/s: (was: 2.7.1)

> Improve YARN RM REST API error messages
> ---
>
> Key: YARN-3550
> URL: https://issues.apache.org/jira/browse/YARN-3550
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.6.0
>Reporter: Rajesh Kartha
>Priority: Minor
>
> The error message returned by an invalid REST call to the YARN RM REST service 
> is not useful.
> Here is a simple example of using GET instead of POST to get a new 
> application id:
> $ curl -X GET  http://myhost:8088/ws/v1/cluster/apps/new-application
>  standalone="yes"?>WebApplicationExceptionjavax.ws.rs.WebApplicationException
> and the RM log has this:
> 2015-04-27 11:18:27,783 WARN  webapp.GenericExceptionHandler 
> (GenericExceptionHandler.java:toResponse(98)) - INTERNAL_SERVER_ERROR
> javax.ws.rs.WebApplicationException
> at 
> com.sun.jersey.server.impl.uri.rules.TerminatingRule.accept(TerminatingRule.java:66)
> at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
> at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
> at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
> at 
> com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:886)
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:84)
> It would be useful to return a meaningful error message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3193) When visit standby RM webui, it will redirect to the active RM webui slowly.

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3193:
-
Fix Version/s: (was: 2.7.1)

> When visit standby RM webui, it will redirect to the active RM webui slowly.
> 
>
> Key: YARN-3193
> URL: https://issues.apache.org/jira/browse/YARN-3193
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Reporter: Japs_123
>Assignee: Steve Loughran
>Priority: Minor
>
> When visiting the standby RM web UI, it redirects to the active RM web UI, but 
> this redirect is very slow, which gives the client a bad experience. I have 
> tried visiting the standby NameNode, and it shows the page to the client 
> quickly. So, can we improve this experience in YARN as HDFS does? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6040) Remove usage of ResourceRequest from AppSchedulerInfo / SchedulerApplicationAttempt

2017-01-05 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802425#comment-15802425
 ] 

Arun Suresh commented on YARN-6040:
---

Thanks for updating the patch [~leftnoteasy],

# Looks like SchedulingPlacementSet still exposes getPendingAllocationNumber(); 
can we change that to match PendingAsk? Thinking further, I wonder if you need 
that method at all: you should be doing getPendingAsk(resourceName).getCount(), 
right?
# Similarly, it looks like you might not need 
SchedulerApplicationAttempt::getPendingAllocationNumber() either.

Everything else looks ok to me..
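A simplified, hypothetical illustration of the accessor pattern being discussed;
the real PendingAsk and AppSchedulingInfo APIs in the patch may differ:

{code}
// Hypothetical, simplified illustration only; not the real YARN classes.
public class PendingAskExample {

  /** Lightweight, API-independent view of an outstanding ask. */
  public static final class PendingAsk {
    private final int count;

    public PendingAsk(int count) {
      this.count = count;
    }

    public int getCount() {
      return count;
    }
  }

  // Callers ask "how many allocations are still pending at this resourceName?"
  // through the PendingAsk view, instead of a separate
  // getPendingAllocationNumber() method.
  public PendingAsk getPendingAsk(String resourceName) {
    return new PendingAsk(3); // placeholder value
  }

  public static void main(String[] args) {
    PendingAskExample info = new PendingAskExample();
    int pending = info.getPendingAsk("*").getCount();
    System.out.println("pending allocations at *: " + pending);
  }
}
{code}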

> Remove usage of ResourceRequest from AppSchedulerInfo / 
> SchedulerApplicationAttempt
> ---
>
> Key: YARN-6040
> URL: https://issues.apache.org/jira/browse/YARN-6040
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-6040.001.patch, YARN-6040.002.patch, 
> YARN-6040.003.patch, YARN-6040.004.patch, YARN-6040.005.patch, 
> YARN-6040.006.patch
>
>
> As mentioned by YARN-5906, currently schedulers are using ResourceRequest 
> heavily so it will be very hard to adopt the new PowerfulResourceRequest 
> (YARN-4902).
> This JIRA is the 2nd step of the refactoring, which removes usage of 
> ResourceRequest from AppSchedulingInfo / SchedulerApplicationAttempt. Instead 
> of returning ResourceRequest, it returns a lightweight and API-independent 
> object - {{PendingAsk}}.
> The only remaining ResourceRequest API of AppSchedulingInfo will be used by the 
> web service to get the list of ResourceRequests.
> So after this patch, usage of ResourceRequest will be isolated inside 
> AppSchedulingInfo, making it more flexible to update the internal data structure 
> and upgrade the old ResourceRequest API to the new one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6059) Update paused container state in the state store

2017-01-05 Thread Hitesh Sharma (JIRA)
Hitesh Sharma created YARN-6059:
---

 Summary: Update paused container state in the state store
 Key: YARN-6059
 URL: https://issues.apache.org/jira/browse/YARN-6059
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Hitesh Sharma
Assignee: Hitesh Sharma






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5246) NMWebAppFilter web redirects drop query parameters

2017-01-05 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-5246:
-
Fix Version/s: (was: 2.9.0)
   2.8.0

Thanks, Varun!  I committed this to branch-2.8 as well.

> NMWebAppFilter web redirects drop query parameters
> --
>
> Key: YARN-5246
> URL: https://issues.apache.org/jira/browse/YARN-5246
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: YARN-5246.001.patch, YARN-5246.002.patch
>
>
> The NMWebAppFilter drops query parameters when it carries out a redirect to 
> the log server. This leads to problems when users have simple web 
> authentication setup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4990) Re-direction of a particular log file within in a container in NM UI does not redirect properly to Log Server ( history ) on container completion

2017-01-05 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-4990:
-
Fix Version/s: (was: 2.9.0)
   2.8.0

Thanks, Xuan!  I committed this to branch-2.8 as well.

> Re-direction of a particular log file within in a container in NM UI does not 
> redirect properly to Log Server ( history ) on container completion
> -
>
> Key: YARN-4990
> URL: https://issues.apache.org/jira/browse/YARN-4990
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Xuan Gong
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: YARN-4990.1.patch, YARN-4990.2.patch
>
>
> The NM does the redirection to the history server correctly. However if the 
> user is viewing or has a link to a particular file, the redirect 
> ends up going to the top level page for the container and not redirecting to 
> the specific file. Additionally, the start param to show logs from the offset 
> 0 also goes missing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5222) DockerContainerExecutor dosn't set work directory

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5222:
-
Fix Version/s: (was: 2.7.2)

> DockerContainerExecutor dosn't set work directory
> -
>
> Key: YARN-5222
> URL: https://issues.apache.org/jira/browse/YARN-5222
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.1, 2.7.2
> Environment: centos
>Reporter: zhengchenyu
>Priority: Critical
>  Labels: patch
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When I submit a Spark task in a Docker container, a NoClassDefFoundError 
> happens, but MapReduce tasks don't have this problem. This is because, when 
> launching the Docker container, Docker doesn't set the working directory for 
> the command, so the program can't find spark-assembly-1.6.1-hadoop2.7.1.jar. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5222) DockerContainerExecutor dosn't set work directory

2017-01-05 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802485#comment-15802485
 ] 

Junping Du commented on YARN-5222:
--

Also, please don't set the fix version field since no patch has been committed 
yet.

> DockerContainerExecutor dosn't set work directory
> -
>
> Key: YARN-5222
> URL: https://issues.apache.org/jira/browse/YARN-5222
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.1, 2.7.2
> Environment: centos
>Reporter: zhengchenyu
>Priority: Critical
>  Labels: patch
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When I submit a Spark task in a Docker container, a NoClassDefFoundError 
> happens, but MapReduce tasks don't have this problem. This is because, when 
> launching the Docker container, Docker doesn't set the working directory for 
> the command, so the program can't find spark-assembly-1.6.1-hadoop2.7.1.jar. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4348) ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding blocking ZK's event thread

2017-01-05 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802493#comment-15802493
 ] 

Junping Du commented on YARN-4348:
--

Hi [~jianhe] and [~ozawa], Does this fix need to go to 
trunk/branch-2/branch-2.8?

> ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding 
> blocking ZK's event thread
> --
>
> Key: YARN-4348
> URL: https://issues.apache.org/jira/browse/YARN-4348
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2, 2.6.2
>Reporter: Tsuyoshi Ozawa
>Assignee: Tsuyoshi Ozawa
>Priority: Blocker
> Fix For: 2.7.2, 2.6.3
>
> Attachments: YARN-4348-branch-2.7.002.patch, 
> YARN-4348-branch-2.7.003.patch, YARN-4348-branch-2.7.004.patch, 
> YARN-4348.001.patch, YARN-4348.001.patch, log.txt
>
>
> Jian mentioned that the current internal ZK configuration of ZKRMStateStore 
> can cause the following situation:
> 1. syncInternal times out, 
> 2. but the sync succeeds later on.
> We should use zkResyncWaitTime as the timeout value.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4348) ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding blocking ZK's event thread

2017-01-05 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802518#comment-15802518
 ] 

Jian He commented on YARN-4348:
---

No, it doesn't need to. The ZK store implementation has been changed to use 
Curator from 2.8 onwards.

> ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding 
> blocking ZK's event thread
> --
>
> Key: YARN-4348
> URL: https://issues.apache.org/jira/browse/YARN-4348
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2, 2.6.2
>Reporter: Tsuyoshi Ozawa
>Assignee: Tsuyoshi Ozawa
>Priority: Blocker
> Fix For: 2.7.2, 2.6.3
>
> Attachments: YARN-4348-branch-2.7.002.patch, 
> YARN-4348-branch-2.7.003.patch, YARN-4348-branch-2.7.004.patch, 
> YARN-4348.001.patch, YARN-4348.001.patch, log.txt
>
>
> Jian mentioned that the current internal ZK configuration of ZKRMStateStore 
> can cause the following situation:
> 1. syncInternal times out, 
> 2. but the sync succeeds later on.
> We should use zkResyncWaitTime as the timeout value.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5280) Allow YARN containers to run with Java Security Manager

2017-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802560#comment-15802560
 ] 

Hadoop QA commented on YARN-5280:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 45s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 5 new + 267 unchanged - 2 fixed = 272 total (was 269) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
31s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
23s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m  
0s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m  2s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5280 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845844/YARN-5280.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux dcaeed50ee29 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 
10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 0a55bd8 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14571/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
 |
|  Te

[jira] [Commented] (YARN-4348) ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding blocking ZK's event thread

2017-01-05 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802604#comment-15802604
 ] 

Junping Du commented on YARN-4348:
--

Got it. Thanks for confirmation here, Jian!

> ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding 
> blocking ZK's event thread
> --
>
> Key: YARN-4348
> URL: https://issues.apache.org/jira/browse/YARN-4348
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2, 2.6.2
>Reporter: Tsuyoshi Ozawa
>Assignee: Tsuyoshi Ozawa
>Priority: Blocker
> Fix For: 2.7.2, 2.6.3
>
> Attachments: YARN-4348-branch-2.7.002.patch, 
> YARN-4348-branch-2.7.003.patch, YARN-4348-branch-2.7.004.patch, 
> YARN-4348.001.patch, YARN-4348.001.patch, log.txt
>
>
> Jian mentioned that the current internal ZK configuration of ZKRMStateStore 
> can cause the following situation:
> 1. syncInternal times out, 
> 2. but the sync succeeds later on.
> We should use zkResyncWaitTime as the timeout value.






[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2

2017-01-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802732#comment-15802732
 ] 

Wangda Tan commented on YARN-6041:
--

+1 to the latest patch. [~asuresh], please wait another day before 
cherry-picking it to see if there are any other comments.

> Opportunistic containers : Combined patch for branch-2 
> ---
>
> Key: YARN-6041
> URL: https://issues.apache.org/jira/browse/YARN-6041
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-6041-branch-2.001.patch, 
> YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch
>
>
> This is a combined patch targeting branch-2 of the following JIRAs which have 
> already been committed to trunk :
> YARN-5938. Refactoring OpportunisticContainerAllocator to use 
> SchedulerRequestKey instead of Priority and other misc fixes
> YARN-5646. Add documentation and update config parameter names for scheduling 
> of OPPORTUNISTIC containers.
> YARN-5982. Simplify opportunistic container parameters and metrics.
> YARN-5918. Handle Opportunistic scheduling allocate request failure when NM 
> is lost.
> YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager 
> container lifecycle.
> YARN-5823. Update NMTokens in case of requests with only opportunistic 
> containers.
> YARN-5377. Fix 
> TestQueuingContainerManager.testKillMultipleOpportunisticContainers.
> YARN-2995. Enhance UI to show cluster resource utilization of various 
> container Execution types.
> YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http 
> Address.
> YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method 
> to handle OPPORTUNISTIC container requests.






[jira] [Commented] (YARN-4675) Reorganize TimeClientImpl into TimeClientV1Impl and TimeClientV2Impl

2017-01-05 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802756#comment-15802756
 ] 

Sangjin Lee commented on YARN-4675:
---

One implication of refactoring the interface is that code that uses the 
timeline client would need to be updated along with the API changes. MR and DS 
are not a problem, but other off-Hadoop clients such as Tez would need to make 
this change once it lands on trunk. I assume it is not a major problem, but we 
should be aware of it.

cc [~rohithsharma]

> Reorganize TimeClientImpl into TimeClientV1Impl and TimeClientV2Impl
> 
>
> Key: YARN-4675
> URL: https://issues.apache.org/jira/browse/YARN-4675
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: YARN-5355, yarn-5355-merge-blocker
> Attachments: YARN-4675-YARN-2928.v1.001.patch
>
>
> We need to reorganize TimeClientImpl into TimeClientV1Impl ,  
> TimeClientV2Impl and if required a base class, so that its clear which part 
> of the code belongs to which version and thus better maintainable.






[jira] [Updated] (YARN-3544) AM logs link missing in the RM UI for a completed app

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3544:
-
Fix Version/s: 2.8.0

> AM logs link missing in the RM UI for a completed app 
> --
>
> Key: YARN-3544
> URL: https://issues.apache.org/jira/browse/YARN-3544
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.7.0
>Reporter: Hitesh Shah
>Assignee: Xuan Gong
>Priority: Blocker
>  Labels: 2.6.1-candidate
> Fix For: 2.6.1, 2.8.0, 2.7.1, 3.0.0-alpha1
>
> Attachments: Screen Shot 2015-04-27 at 6.24.05 PM.png, 
> YARN-3544-branch-2.6.1.txt, YARN-3544-branch-2.7-1.2.patch, 
> YARN-3544-branch-2.7-1.patch, YARN-3544.1.patch
>
>
> AM log links should always be present (for both running and completed apps).
> Likewise, the node info is also empty. This is usually quite crucial when 
> trying to debug where an AM was launched and which NM's logs to look at if 
> the AM failed to launch. 






[jira] [Updated] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3681:
-
Fix Version/s: 2.8.0

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: windows, yarn-client
> Fix For: 2.8.0, 2.7.1, 3.0.0-alpha1
>
> Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
> YARN-3681.1.patch, YARN-3681.branch-2.0.patch, yarncmd.png
>
>
> Attached is a screenshot of the command prompt in Windows running the yarn 
> queue command.






[jira] [Updated] (YARN-3725) App submission via REST API is broken in secure mode due to Timeline DT service address is empty

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3725:
-
Fix Version/s: 2.8.0

> App submission via REST API is broken in secure mode due to Timeline DT 
> service address is empty
> 
>
> Key: YARN-3725
> URL: https://issues.apache.org/jira/browse/YARN-3725
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, timelineserver
>Affects Versions: 2.7.0
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>Priority: Blocker
>  Labels: 2.6.1-candidate
> Fix For: 2.6.1, 2.8.0, 2.7.1, 3.0.0-alpha1
>
> Attachments: YARN-3725-branch-2.6.1.txt, YARN-3725.1.patch
>
>
> YARN-2971 changed TimelineClient to use the service address from the Timeline 
> DT to renew the DT instead of the configured address. This breaks the 
> procedure of submitting a YARN app via the REST API in secure mode.
> The problem is that the service address is set by the client, not the server, 
> in the Java code path. The REST API response is an encoded token String, so 
> it is inconvenient to deserialize it, set the service address, and serialize 
> it again. 
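
To illustrate how inconvenient the empty service address is for a REST client, 
the sketch below shows what a client is forced to do today: decode the token 
string, set the service address itself, and re-encode it. The Token APIs used 
here exist in Hadoop, but the helper class and the address value are 
placeholders; this is not part of the attached patch.

{code}
import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

// Hedged sketch: what a REST client has to do when the Timeline DT comes back
// with an empty service address. The helper class name is hypothetical.
public class TimelineTokenFixup {
  public static String setServiceAddress(String encodedToken, String serviceAddr)
      throws IOException {
    Token<TokenIdentifier> token = new Token<TokenIdentifier>();
    token.decodeFromUrlString(encodedToken);   // deserialize the URL-safe string
    token.setService(new Text(serviceAddr));   // fill in the missing service address
    return token.encodeToUrlString();          // serialize it back for the REST call
  }
}
{code}

Here serviceAddr would be something like "timelinehost:8190", a placeholder for 
whatever the timeline server address is in a given cluster.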






[jira] [Updated] (YARN-3733) Fix DominantRC#compare() does not work as expected if cluster resource is empty

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3733:
-
Fix Version/s: 2.8.0

> Fix DominantRC#compare() does not work as expected if cluster resource is 
> empty
> ---
>
> Key: YARN-3733
> URL: https://issues.apache.org/jira/browse/YARN-3733
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Suse 11 Sp3 , 2 NM , 2 RM
> one NM - 3 GB 6 v core
>Reporter: Bibin A Chundatt
>Assignee: Rohith Sharma K S
>Priority: Blocker
>  Labels: 2.6.1-candidate
> Fix For: 2.6.1, 2.8.0, 2.7.1, 3.0.0-alpha1
>
> Attachments: 0001-YARN-3733.patch, 0002-YARN-3733.patch, 
> 0002-YARN-3733.patch, YARN-3733.patch
>
>
> Steps to reproduce
> =
> 1. Install HA with 2 RM, 2 NM (3072 MB * 2 total cluster)
> 2. Configure map and reduce size to 512 MB after changing the scheduler 
> minimum size to 512 MB
> 3. Configure the capacity scheduler and AM limit to .5 
> (DominantResourceCalculator is configured)
> 4. Submit 30 concurrent tasks 
> 5. Switch RM
> Actual
> =
> For 12 jobs an AM gets allocated and all 12 start running
> No other YARN child is initiated, *all 12 jobs stay in RUNNING state forever*
> Expected
> ===
> Only 6 should be running at a time since the max AM allocation is .5 (3072 MB)
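
As a toy illustration of why a dominant-resource comparison degenerates when 
the cluster resource is empty (for example right after an RM switch, before 
nodes re-register), the sketch below uses simplified arithmetic rather than the 
real DominantResourceCalculator: every request ends up with the same share, so 
nothing can be ordered and limits are computed incorrectly.

{code}
// Toy model of a dominant-share comparison; not the real Resources /
// DominantResourceCalculator implementation.
public class DominantShareDemo {
  static double dominantShare(long mem, long vcores, long clusterMem, long clusterVcores) {
    // Without a guard for an empty cluster resource, both shares degenerate
    // and every comparison comes out "equal".
    double memShare = clusterMem == 0 ? 0.0 : (double) mem / clusterMem;
    double cpuShare = clusterVcores == 0 ? 0.0 : (double) vcores / clusterVcores;
    return Math.max(memShare, cpuShare);
  }

  public static void main(String[] args) {
    // With an empty cluster resource, a 512 MB request and a 3072 MB request
    // look identical to the comparator:
    System.out.println(dominantShare(512, 1, 0, 0) == dominantShare(3072, 6, 0, 0)); // true
  }
}
{code}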






[jira] [Updated] (YARN-3701) Isolating the error of generating a single app report when getting all apps from generic history service

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3701:
-
Fix Version/s: 2.8.0

> Isolating the error of generating a single app report when getting all apps 
> from generic history service
> 
>
> Key: YARN-3701
> URL: https://issues.apache.org/jira/browse/YARN-3701
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>Priority: Blocker
> Fix For: 2.8.0, 2.7.1, 3.0.0-alpha1
>
> Attachments: YARN-3701.1.patch
>
>
> Nowadays, if there is an error generating a single app report when getting 
> the application list from the generic history service, it throws an 
> exception. Therefore, even if just 1 out of 100 apps has something wrong, the 
> whole app list is broken. The worst impact is that the default page (app 
> list) of the GHS web UI crashes, while the REST API /applicationhistory/apps 
> also breaks.






[jira] [Updated] (YARN-5309) Fix SSLFactory truststore reloader thread leak in TimelineClientImpl

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5309:
-
Fix Version/s: 2.8.0

> Fix SSLFactory truststore reloader thread leak in TimelineClientImpl
> 
>
> Key: YARN-5309
> URL: https://issues.apache.org/jira/browse/YARN-5309
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver, yarn
>Affects Versions: 2.7.1
>Reporter: Thomas Friedrich
>Assignee: Weiwei Yang
>Priority: Blocker
> Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1
>
> Attachments: YARN-5309.001.patch, YARN-5309.002.patch, 
> YARN-5309.003.patch, YARN-5309.004.patch, YARN-5309.005.patch, 
> YARN-5309.branch-2.7.3.001.patch, YARN-5309.branch-2.8.001.patch
>
>
> We found an issue similar to HADOOP-11368 in TimelineClientImpl. The class 
> creates an instance of SSLFactory in newSslConnConfigurator and subsequently 
> creates the ReloadingX509TrustManager instance which in turn starts a trust 
> store reloader thread. 
> However, the SSLFactory is never destroyed and hence the trust store reloader 
> threads are not killed.
> This problem was observed by a customer who had SSL enabled in Hadoop and 
> submitted many queries against the HiveServer2. After a few days, the HS2 
> instance crashed and from the Java dump we could see many (over 13000) 
> threads like this:
> "Truststore reloader thread" #126 daemon prio=5 os_prio=0 
> tid=0x7f680d2e3000 nid=0x98fd waiting on 
> condition [0x7f67e482c000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run
> (ReloadingX509TrustManager.java:225)
> at java.lang.Thread.run(Thread.java:745)
> HiveServer2 uses the JobClient to submit a job:
> Thread [HiveServer2-Background-Pool: Thread-188] (Suspended (breakpoint at 
> line 89 in 
> ReloadingX509TrustManager))   
>   owns: Object  (id=464)  
>   owns: Object  (id=465)  
>   owns: Object  (id=466)  
>   owns: ServiceLoader  (id=210)
>   ReloadingX509TrustManager.<init>(String, String, String, long) line: 89 
>   FileBasedKeyStoresFactory.init(SSLFactory$Mode) line: 209   
>   SSLFactory.init() line: 131 
>   TimelineClientImpl.newSslConnConfigurator(int, Configuration) line: 532 
>   TimelineClientImpl.newConnConfigurator(Configuration) line: 507 
>   TimelineClientImpl.serviceInit(Configuration) line: 269 
>   TimelineClientImpl(AbstractService).init(Configuration) line: 163   
>   YarnClientImpl.serviceInit(Configuration) line: 169 
>   YarnClientImpl(AbstractService).init(Configuration) line: 163   
>   ResourceMgrDelegate.serviceInit(Configuration) line: 102
>   ResourceMgrDelegate(AbstractService).init(Configuration) line: 163  
>   ResourceMgrDelegate.<init>(YarnConfiguration) line: 96  
>   YARNRunner.<init>(Configuration) line: 112  
>   YarnClientProtocolProvider.create(Configuration) line: 34   
>   Cluster.initialize(InetSocketAddress, Configuration) line: 95   
>   Cluster.<init>(InetSocketAddress, Configuration) line: 82   
>   Cluster.<init>(Configuration) line: 75  
>   JobClient.init(JobConf) line: 475   
>   JobClient.<init>(JobConf) line: 454 
>   MapRedTask(ExecDriver).execute(DriverContext) line: 401 
>   MapRedTask.execute(DriverContext) line: 137 
>   MapRedTask(Task).executeTask() line: 160 
>   TaskRunner.runSequential() line: 88 
>   Driver.launchTask(Task, String, boolean, String, int, 
> DriverContext) line: 1653   
>   Driver.execute() line: 1412 
> For every job, a new instance of JobClient/YarnClientImpl/TimelineClientImpl 
> is created. But because the HS2 process stays up for days, the previous trust 
> store reloader threads are still hanging around in the HS2 process and 
> eventually use all the resources available. 
> It seems like a similar fix as HADOOP-11368 is needed in TimelineClientImpl 
> but it doesn't have a destroy method to begin with. 
> One option to avoid this problem is to disable the yarn timeline service 
> (yarn.timeline-service.enabled=false).
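
The general shape of a HADOOP-11368-style fix is to keep a reference to the 
SSLFactory and destroy it when the client is closed, which stops the truststore 
reloader thread. The sketch below shows only that pattern, using a hypothetical 
wrapper class; it is not the attached TimelineClientImpl patch.

{code}
import java.io.Closeable;
import java.io.IOException;
import java.security.GeneralSecurityException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.ssl.SSLFactory;

// Pattern sketch: own the SSLFactory and destroy it on close so that the
// "Truststore reloader thread" it starts does not leak.
public class SslAwareClient implements Closeable {
  private SSLFactory sslFactory;

  public void start(Configuration conf) throws IOException, GeneralSecurityException {
    sslFactory = new SSLFactory(SSLFactory.Mode.CLIENT, conf);
    sslFactory.init();            // starts the truststore reloader thread
  }

  @Override
  public void close() {
    if (sslFactory != null) {
      sslFactory.destroy();       // stops the reloader thread
      sslFactory = null;
    }
  }
}
{code}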






[jira] [Updated] (YARN-3426) Add jdiff support to YARN

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3426:
-
Fix Version/s: 2.8.0

> Add jdiff support to YARN
> -
>
> Key: YARN-3426
> URL: https://issues.apache.org/jira/browse/YARN-3426
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Li Lu
>Assignee: Li Lu
>Priority: Blocker
> Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1
>
> Attachments: YARN-3426-040615-1.patch, YARN-3426-040615.patch, 
> YARN-3426-040715.patch, YARN-3426-040815.patch, YARN-3426-05-12-2016.txt, 
> YARN-3426-06-09-2016.txt, YARN-3426-branch-2.005.patch, 
> YARN-3426-branch-2.8.005.patch
>
>
> Maybe we'd like to extend our current jdiff tool for hadoop-common and hdfs 
> to YARN as well. 






[jira] [Updated] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3850:
-
Fix Version/s: 2.8.0

> NM fails to read files from full disks which can lead to container logs being 
> lost and other issues
> ---
>
> Key: YARN-3850
> URL: https://issues.apache.org/jira/browse/YARN-3850
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation, nodemanager
>Affects Versions: 2.7.0
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: 2.6.1-candidate
> Fix For: 2.6.1, 2.8.0, 2.7.1, 3.0.0-alpha1
>
> Attachments: YARN-3850.01.patch, YARN-3850.02.patch
>
>
> *Container logs* can be lost if a disk has become full (~90% full).
> When an application finishes, we upload logs after aggregation by calling 
> {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turn 
> checks the eligible directories via 
> {{LocalDirsHandlerService#getLogDirs}}, which in the full-disk case returns 
> nothing. So none of the container logs are aggregated and uploaded.
> But on application finish, we also call 
> {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the 
> application directory which contains container logs. This is because it calls 
> {{LocalDirsHandlerService#getLogDirsForCleanup}} which returns the full disks 
> as well.
> So we are left with neither aggregated logs for the app nor the individual 
> container logs for the app.
> In addition to this, there are 2 more issues :
> # {{ContainerLogsUtil#getContainerLogDirs}} does not consider full disks so 
> NM will fail to serve up logs from full disks from its web interfaces.
> # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full 
> disks so it is possible that on container recovery, PID file is not found.
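
The asymmetry described above can be modelled with a small hypothetical 
example: the list of usable log dirs filters out full disks, while the list of 
dirs eligible for cleanup does not, so the app directory (and the container 
logs in it) can be deleted even though nothing was aggregated. The class and 
method names below are illustrative only, not the real LocalDirsHandlerService 
API.

{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the two views of the NM local log directories.
public class LogDirViews {
  static class LogDir {
    final String path;
    final boolean full;   // e.g. utilization above the ~90% threshold
    LogDir(String path, boolean full) { this.path = path; this.full = full; }
  }

  /** Dirs considered usable: full disks are filtered out, so nothing is aggregated from them. */
  static List<String> getLogDirs(List<LogDir> dirs) {
    List<String> result = new ArrayList<>();
    for (LogDir d : dirs) {
      if (!d.full) {
        result.add(d.path);
      }
    }
    return result;
  }

  /** Dirs considered for cleanup: full disks are included, so logs there get deleted. */
  static List<String> getLogDirsForCleanup(List<LogDir> dirs) {
    List<String> result = new ArrayList<>();
    for (LogDir d : dirs) {
      result.add(d.path);
    }
    return result;
  }
}
{code}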






[jira] [Commented] (YARN-6022) Revert changes of AbstractResourceRequest

2017-01-05 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802770#comment-15802770
 ] 

Daniel Templeton commented on YARN-6022:


Sorry, [~leftnoteasy], there's a conflict now.  Mind rebasing?

> Revert changes of AbstractResourceRequest
> -
>
> Key: YARN-6022
> URL: https://issues.apache.org/jira/browse/YARN-6022
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-6022.001.patch, YARN-6022.002.patch, 
> YARN-6022.003.patch
>
>
> YARN-5774 added AbstractResourceRequest to make internal scheduler changes 
> easier, but this is not a correct approach: for example, with this change, we 
> need to make AbstractResourceRequest public/stable. And end users can 
> use it like:
> {code}
> AbstractResourceRequest request = ...
> request.setCapability(...)
> {code}
> But AbstractResourceRequest should not be visible to applications at all. 
> We need to revert it from branch-2.8 / branch-2 / trunk. 






[jira] [Updated] (YARN-3764) CapacityScheduler should forbid moving LeafQueue from one parent to another

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3764:
-
Fix Version/s: 2.8.0

> CapacityScheduler should forbid moving LeafQueue from one parent to another
> ---
>
> Key: YARN-3764
> URL: https://issues.apache.org/jira/browse/YARN-3764
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Fix For: 2.8.0, 2.7.1, 3.0.0-alpha1
>
> Attachments: YARN-3764.1.patch
>
>
> Currently CapacityScheduler doesn't handle this case well. For example, given 
> a queue structure:
> {code}
>          root
>           |
>         a (100)
>         /     \
>     x (50)   y (50)
> {code}
> And reinitializing using the following structure:
> {code}
>          root
>         /    \
>     a (50)  x (50)
>       |
>     y (100)
> {code}
> The actual queue structure after reinitialization is:
> {code}
>          root
>         /    \
>     a (50)  x (50)
>     /    \
>  x (50)  y (100)
> {code}
> We should forbid admins from doing that.
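
One way to express the guard being asked for is a validation step that compares 
each existing queue's parent path with its parent path in the refreshed 
configuration and rejects the refresh when they differ. The sketch below is 
hypothetical and works on plain maps rather than the real CapacityScheduler 
classes.

{code}
import java.io.IOException;
import java.util.Map;

// Hypothetical validation: reject a refresh that re-parents an existing queue.
public class QueueMoveValidator {
  /**
   * @param oldParents queue name to parent path before the refresh, e.g. "x" -> "root.a"
   * @param newParents queue name to parent path in the refreshed configuration
   */
  static void validateNoQueueMoved(Map<String, String> oldParents,
                                   Map<String, String> newParents) throws IOException {
    for (Map.Entry<String, String> e : oldParents.entrySet()) {
      String newParent = newParents.get(e.getKey());
      if (newParent != null && !newParent.equals(e.getValue())) {
        throw new IOException("Queue " + e.getKey() + " moved from " + e.getValue()
            + " to " + newParent + "; moving queues is not allowed");
      }
    }
  }
}
{code}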






[jira] [Updated] (YARN-3522) DistributedShell uses the wrong user to put timeline data

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3522:
-
Fix Version/s: 2.8.0

> DistributedShell uses the wrong user to put timeline data
> -
>
> Key: YARN-3522
> URL: https://issues.apache.org/jira/browse/YARN-3522
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>Priority: Blocker
> Fix For: 2.8.0, 2.7.1, 3.0.0-alpha1
>
> Attachments: YARN-3522.1.patch, YARN-3522.2.patch, YARN-3522.3.patch
>
>
> YARN-3287 breaks the timeline access control of distributed shell. In 
> distributed shell AM:
> {code}
> if (conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
>   YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
>   // Creating the Timeline Client
>   timelineClient = TimelineClient.createTimelineClient();
>   timelineClient.init(conf);
>   timelineClient.start();
> } else {
>   timelineClient = null;
>   LOG.warn("Timeline service is not enabled");
> }
> {code}
> {code}
>   ugi.doAs(new PrivilegedExceptionAction<TimelinePutResponse>() {
> @Override
> public TimelinePutResponse run() throws Exception {
>   return timelineClient.putEntities(entity);
> }
>   });
> {code}
> YARN-3287 changed the timeline client to get the right ugi at serviceInit, 
> but the DS AM still doesn't use the submitter ugi to init the timeline 
> client; instead it uses the ugi for each put entity call. This results in 
> the wrong user for the put request.
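
The direction implied above is to create and initialize the timeline client 
inside the submitter's UGI so that the right user is captured once, at 
serviceInit time, rather than per putEntities call. The following is a hedged 
sketch of that pattern using public YARN client APIs; it is not the committed 
patch.

{code}
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.client.api.TimelineClient;

// Sketch: initialize the TimelineClient as the app submitter so the client
// captures the right UGI once, at serviceInit time.
public class SubmitterTimelineClientFactory {
  static TimelineClient create(final Configuration conf, UserGroupInformation submitterUgi)
      throws Exception {
    return submitterUgi.doAs(new PrivilegedExceptionAction<TimelineClient>() {
      @Override
      public TimelineClient run() {
        TimelineClient client = TimelineClient.createTimelineClient();
        client.init(conf);
        client.start();
        return client;
      }
    });
  }
}
{code}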






[jira] [Commented] (YARN-6022) Revert changes of AbstractResourceRequest

2017-01-05 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802779#comment-15802779
 ] 

Daniel Templeton commented on YARN-6022:


Bah, nevermind.  The conflict is trivial.  I'll take care of it.

> Revert changes of AbstractResourceRequest
> -
>
> Key: YARN-6022
> URL: https://issues.apache.org/jira/browse/YARN-6022
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-6022.001.patch, YARN-6022.002.patch, 
> YARN-6022.003.patch
>
>
> YARN-5774 added AbstractResourceRequest to make internal scheduler changes 
> easier, but this is not a correct approach: for example, with this change, we 
> need to make AbstractResourceRequest public/stable. And end users can 
> use it like:
> {code}
> AbstractResourceRequest request = ...
> request.setCapability(...)
> {code}
> But AbstractResourceRequest should not be visible to applications at all. 
> We need to revert it from branch-2.8 / branch-2 / trunk. 






[jira] [Updated] (YARN-4424) Fix deadlock in RMAppImpl

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4424:
-
Fix Version/s: 2.8.0

> Fix deadlock in RMAppImpl
> -
>
> Key: YARN-4424
> URL: https://issues.apache.org/jira/browse/YARN-4424
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Jian He
>Priority: Blocker
> Fix For: 2.8.0, 2.7.2, 2.6.3, 3.0.0-alpha1
>
> Attachments: YARN-4424.1.patch
>
>
> {code}
> yarn@XXX:/mnt/hadoopqe$ /usr/hdp/current/hadoop-yarn-client/bin/yarn 
> application -list -appStates NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING
> 15/12/04 21:59:54 INFO impl.TimelineClientImpl: Timeline service address: 
> http://XXX:8188/ws/v1/timeline/
> 15/12/04 21:59:54 INFO client.RMProxy: Connecting to ResourceManager at 
> XXX/0.0.0.0:8050
> 15/12/04 21:59:55 INFO client.AHSProxy: Connecting to Application History 
> server at XXX/0.0.0.0:10200
> {code}
> {code:title=RM log}
> 2015-12-04 21:59:19,744 INFO  event.AsyncDispatcher 
> (AsyncDispatcher.java:handle(243)) - Size of event-queue is 237000
> 2015-12-04 22:00:50,945 INFO  event.AsyncDispatcher 
> (AsyncDispatcher.java:handle(243)) - Size of event-queue is 238000
> 2015-12-04 22:02:22,416 INFO  event.AsyncDispatcher 
> (AsyncDispatcher.java:handle(243)) - Size of event-queue is 239000
> 2015-12-04 22:03:53,593 INFO  event.AsyncDispatcher 
> (AsyncDispatcher.java:handle(243)) - Size of event-queue is 240000
> 2015-12-04 22:05:24,856 INFO  event.AsyncDispatcher 
> (AsyncDispatcher.java:handle(243)) - Size of event-queue is 241000
> 2015-12-04 22:06:56,235 INFO  event.AsyncDispatcher 
> (AsyncDispatcher.java:handle(243)) - Size of event-queue is 242000
> 2015-12-04 22:08:27,510 INFO  event.AsyncDispatcher 
> (AsyncDispatcher.java:handle(243)) - Size of event-queue is 243000
> 2015-12-04 22:09:58,786 INFO  event.AsyncDispatcher 
> (AsyncDispatcher.java:handle(243)) - Size of event-queue is 244000
> {code}
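
The growing AsyncDispatcher queue above is the typical symptom of a dispatcher 
thread stuck behind a lock. Independent of this particular patch, a JVM-level 
deadlock can be confirmed with the standard ThreadMXBean API, for example:

{code}
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Generic diagnostic: report the threads involved in a Java-level deadlock.
public class DeadlockProbe {
  public static void main(String[] args) {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    long[] ids = mx.findDeadlockedThreads();   // null if no deadlock is present
    if (ids == null) {
      System.out.println("No deadlocked threads found");
      return;
    }
    for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
      System.out.println(info);                // owner, lock and stack for each thread
    }
  }
}
{code}

In practice the same information is available from a jstack dump of the RM 
process; the code form is only shown because it is easy to drop into a test.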






[jira] [Updated] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4610:
-
Fix Version/s: 2.8.0

> Reservations continue looking for one app causes other apps to starve
> -
>
> Key: YARN-4610
> URL: https://issues.apache.org/jira/browse/YARN-4610
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.7.1
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Blocker
> Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1
>
> Attachments: YARN-4610-branch-2.7.002.patch, YARN-4610.001.patch, 
> YARN-4610.branch-2.7.001.patch
>
>
> CapacityScheduler's LeafQueue has "reservations continue looking" logic that 
> allows an application to unreserve elsewhere to fulfil a container request on 
> a node that has available space.  However in 2.7 that logic seems to break 
> allocations for subsequent apps in the queue.  Once a user hits its user 
> limit, subsequent apps in the queue for other users receive containers at a 
> significantly reduced rate.






[jira] [Created] (YARN-6060) Linux container executor fails to run container on directories mounted as noexec

2017-01-05 Thread Miklos Szegedi (JIRA)
Miklos Szegedi created YARN-6060:


 Summary: Linux container executor fails to run container on 
directories mounted as noexec
 Key: YARN-6060
 URL: https://issues.apache.org/jira/browse/YARN-6060
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager, yarn
Reporter: Miklos Szegedi
Assignee: Miklos Szegedi


If node manager directories are mounted as noexec, LCE fails with the following 
error:
Launching container...
Couldn't execute the container launch file 
/tmp/hadoop-/nm-local-dir/usercache//appcache/application_1483656052575_0001/container_1483656052575_0001_02_01/launch_container.sh
 - Permission denied
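
The failure can be reproduced outside of YARN by probing whether a directory 
allows execution: write a tiny script into it, mark it executable, and try to 
run it. The probe below is a standalone illustration with a placeholder path; 
it is not the attached patch.

{code}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Standalone probe: does this directory allow executing files, or is it
// mounted noexec? (Typical symptom: "Permission denied" despite the +x bit.)
public class NoexecProbe {
  static boolean canExecuteIn(File dir) throws IOException, InterruptedException {
    File script = File.createTempFile("probe", ".sh", dir);
    try {
      Files.write(script.toPath(), "#!/bin/sh\nexit 0\n".getBytes("UTF-8"));
      if (!script.setExecutable(true)) {
        return false;
      }
      Process p = new ProcessBuilder(script.getAbsolutePath()).start();
      return p.waitFor() == 0;
    } catch (IOException e) {
      return false;        // "Permission denied" on a noexec mount lands here
    } finally {
      script.delete();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(canExecuteIn(new File("/tmp")));   // placeholder directory
  }
}
{code}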






[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2

2017-01-05 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802796#comment-15802796
 ] 

Karthik Kambatla commented on YARN-6041:


The changes to config names and methods look good to me.

> Opportunistic containers : Combined patch for branch-2 
> ---
>
> Key: YARN-6041
> URL: https://issues.apache.org/jira/browse/YARN-6041
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-6041-branch-2.001.patch, 
> YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch
>
>
> This is a combined patch targeting branch-2 of the following JIRAs which have 
> already been committed to trunk :
> YARN-5938. Refactoring OpportunisticContainerAllocator to use 
> SchedulerRequestKey instead of Priority and other misc fixes
> YARN-5646. Add documentation and update config parameter names for scheduling 
> of OPPORTUNISTIC containers.
> YARN-5982. Simplify opportunistic container parameters and metrics.
> YARN-5918. Handle Opportunistic scheduling allocate request failure when NM 
> is lost.
> YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager 
> container lifecycle.
> YARN-5823. Update NMTokens in case of requests with only opportunistic 
> containers.
> YARN-5377. Fix 
> TestQueuingContainerManager.testKillMultipleOpportunisticContainers.
> YARN-2995. Enhance UI to show cluster resource utilization of various 
> container Execution types.
> YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http 
> Address.
> YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method 
> to handle OPPORTUNISTIC container requests.






[jira] [Updated] (YARN-6049) Graceful Decommission web link is broken, gives 404 Not Found

2017-01-05 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-6049:
--
Fix Version/s: (was: 3.0.0-alpha2)

> Graceful Decommission web link is broken, gives 404 Not Found
> -
>
> Key: YARN-6049
> URL: https://issues.apache.org/jira/browse/YARN-6049
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0-alpha1
>Reporter: Emre Sevinç
>Priority: Minor
>  Labels: documentation, easyfix
>
> Graceful Decommission page, 
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html,
>  is broken. It gives "Not Found  The requested URL 
> /docs/current/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html was not 
> found on this server." error.
> There are links to this problematic web page from all of the HTML pages in 
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/






[jira] [Updated] (YARN-6049) Graceful Decommission web link is broken, gives 404 Not Found

2017-01-05 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-6049:
--
Target Version/s: 3.0.0-alpha2  (was: 3.0.0-alpha1)

> Graceful Decommission web link is broken, gives 404 Not Found
> -
>
> Key: YARN-6049
> URL: https://issues.apache.org/jira/browse/YARN-6049
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0-alpha1
>Reporter: Emre Sevinç
>Priority: Minor
>  Labels: documentation, easyfix
>
> Graceful Decommission page, 
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html,
>  is broken. It gives "Not Found  The requested URL 
> /docs/current/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html was not 
> found on this server." error.
> There are links to this problematic web page from all of the HTML pages in 
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/






[jira] [Commented] (YARN-5556) Support for deleting queues without requiring a RM restart

2017-01-05 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802851#comment-15802851
 ] 

Xuan Gong commented on YARN-5556:
-

Sounds good. I will update the design doc.

> Support for deleting queues without requiring a RM restart
> --
>
> Key: YARN-5556
> URL: https://issues.apache.org/jira/browse/YARN-5556
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Xuan Gong
>Assignee: Naganarasimha G R
> Attachments: YARN-5556.v1.001.patch, YARN-5556.v1.002.patch, 
> YARN-5556.v1.003.patch, YARN-5556.v1.004.patch
>
>
> Today, we could add or modify queues without restarting the RM, via a CS 
> refresh. But for deleting queue, we have to restart the ResourceManager. We 
> could support for deleting queues without requiring a RM restart






[jira] [Updated] (YARN-6060) Linux container executor fails to run container on directories mounted as noexec

2017-01-05 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-6060:
-
Attachment: YARN-6060.000.patch

> Linux container executor fails to run container on directories mounted as 
> noexec
> 
>
> Key: YARN-6060
> URL: https://issues.apache.org/jira/browse/YARN-6060
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, yarn
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-6060.000.patch
>
>
> If node manager directories are mounted as noexec, LCE fails with the 
> following error:
> Launching container...
> Couldn't execute the container launch file 
> /tmp/hadoop-/nm-local-dir/usercache//appcache/application_1483656052575_0001/container_1483656052575_0001_02_01/launch_container.sh
>  - Permission denied






[jira] [Commented] (YARN-5724) [Umbrella] Better Queue Management in YARN

2017-01-05 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802882#comment-15802882
 ] 

Xuan Gong commented on YARN-5724:
-

Updated the design doc based on the discussion on YARN-5556.

> [Umbrella] Better Queue Management in YARN
> --
>
> Key: YARN-5724
> URL: https://issues.apache.org/jira/browse/YARN-5724
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: capacity scheduler
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: 
> Designdocv1-Configuration-basedQueueManagementinYARN.pdf, 
> Designdocv2-Configuration-basedQueueManagementinYARN.pdf
>
>
> This serves as an umbrella ticket for tasks related to better queue 
> management in YARN.
> Today, the only way to manage queues is for admins to edit configuration 
> files and then issue a refresh command. This brings many inconveniences. For 
> example, users cannot create/delete/modify their own queues without talking 
> to site-level admins.
> Even in today's configuration-based approach, we still have several places 
> that need improvement: 
> *  It is possible today to add or modify queues without restarting the RM, 
> via a CS refresh, but to delete a queue we have to restart the 
> ResourceManager.
> * When a queue is STOPPED, resources allocated to the queue could be handled 
> better. Currently, they'll only be used if the other queues are set up to go 
> over their capacity.
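
For context, configuration-based management today means editing 
capacity-scheduler properties and then refreshing, as the hedged sketch below 
shows using the well-known property names; there is no comparable path for 
deleting a queue, which currently needs an RM restart. The helper class itself 
is hypothetical.

{code}
import org.apache.hadoop.conf.Configuration;

// Hypothetical helper showing the capacity-scheduler properties an admin edits
// today to add a child queue under root (followed by "yarn rmadmin -refreshQueues").
public class AddQueueExample {
  static void addQueueUnderRoot(Configuration conf, String queue, float capacityPercent) {
    // Existing children of root plus the new queue.
    String children = conf.get("yarn.scheduler.capacity.root.queues", "default");
    conf.set("yarn.scheduler.capacity.root.queues", children + "," + queue);
    conf.setFloat("yarn.scheduler.capacity.root." + queue + ".capacity", capacityPercent);
    // There is no equivalent "remove" step today; deleting a queue currently
    // requires an RM restart, which is part of what this umbrella addresses.
  }
}
{code}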






[jira] [Updated] (YARN-5724) [Umbrella] Better Queue Management in YARN

2017-01-05 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-5724:

Attachment: Designdocv2-Configuration-basedQueueManagementinYARN.pdf

> [Umbrella] Better Queue Management in YARN
> --
>
> Key: YARN-5724
> URL: https://issues.apache.org/jira/browse/YARN-5724
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: capacity scheduler
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: 
> Designdocv1-Configuration-basedQueueManagementinYARN.pdf, 
> Designdocv2-Configuration-basedQueueManagementinYARN.pdf
>
>
> This serves as an umbrella ticket for tasks related to better queue 
> management in YARN.
> Today, the only way to manage queues is for admins to edit configuration 
> files and then issue a refresh command. This brings many inconveniences. For 
> example, users cannot create/delete/modify their own queues without talking 
> to site-level admins.
> Even in today's configuration-based approach, we still have several places 
> that need improvement: 
> *  It is possible today to add or modify queues without restarting the RM, 
> via a CS refresh, but to delete a queue we have to restart the 
> ResourceManager.
> * When a queue is STOPPED, resources allocated to the queue could be handled 
> better. Currently, they'll only be used if the other queues are set up to go 
> over their capacity.






[jira] [Updated] (YARN-6054) TimelineServer fails to start when some LevelDb state files are missing.

2017-01-05 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-6054:
---
Attachment: YARN-6054.01.patch

Here's a patch along with a unit test.

> TimelineServer fails to start when some LevelDb state files are missing.
> 
>
> Key: YARN-6054
> URL: https://issues.apache.org/jira/browse/YARN-6054
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2
>Reporter: Ravi Prakash
> Attachments: YARN-6054.01.patch
>
>
> We encountered an issue recently where the TimelineServer failed to start 
> because some state files went missing.
> {code}
> 2016-11-21 20:46:43,134 INFO org.apache.hadoop.service.AbstractService: 
> Service 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
>  failed in state INITED
> ; cause: org.apache.hadoop.service.ServiceStateException: 
> org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 
> missing files; e.g.: /timelineserver/leveldb-timeline-store.ldb/127897.sst
> org.apache.hadoop.service.ServiceStateException: 
> org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 
> missing files; e.g.: /timelineserver/leveldb-timeline-store.ldb/127897.sst
> 2016-11-21 20:46:43,135 FATAL 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer:
>  Error starting ApplicationHistoryServer
> org.apache.hadoop.service.ServiceStateException: 
> org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 
> missing files; e.g.: 
> /timelineserver/leveldb-timeline-store.ldb/127897.sst
> at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:104)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:172)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:182)
> Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: 
> Corruption: 9 missing files; e.g.: 
> /timelineserver/leveldb-timeline-store.ldb/127897.sst
> at 
> org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
> at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
> at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
> at 
> org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore.serviceInit(LeveldbTimelineStore.java:229)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> ... 5 more
> 2016-11-21 20:46:43,136 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status -1
> {code}
> Ideally we shouldn't have any missing state files. However I'd posit that the 
> TimelineServer should have graceful degradation instead of failing to start 
> at all.
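
One possible form of graceful degradation is to attempt LevelDB's built-in 
repair and retry the open instead of aborting startup. The attached patch may 
take a different approach, so treat the following purely as a hedged sketch; 
the path used in main() is a placeholder.

{code}
import java.io.File;
import java.io.IOException;

import org.iq80.leveldb.DB;
import org.iq80.leveldb.Options;

import static org.fusesource.leveldbjni.JniDBFactory.factory;

// Hedged sketch: try to open the timeline LevelDB; on corruption, attempt a
// repair and retry instead of failing ApplicationHistoryServer startup outright.
public class LeveldbOpenWithRepair {
  static DB openOrRepair(File dbPath, Options options) throws IOException {
    try {
      return factory.open(dbPath, options);
    } catch (IOException e) {
      // e.g. "Corruption: 9 missing files"; try LevelDB's built-in repair.
      factory.repair(dbPath, options);
      return factory.open(dbPath, options);
    }
  }

  public static void main(String[] args) throws IOException {
    Options options = new Options().createIfMissing(true);
    DB db = openOrRepair(new File("/tmp/leveldb-timeline-store.ldb"), options); // placeholder path
    db.close();
  }
}
{code}

Note that a repair can itself lose data, so logging loudly and keeping a backup 
of the corrupted directory would be a sensible part of any real fix.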





