[jira] [Commented] (YARN-6056) Yarn NM using LCE shows a failure when trying to delete a non-existing dir

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800557#comment-15800557
 ] 

Hadoop QA commented on YARN-6056:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 
31s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  1m 
53s{color} | {color:red} root in branch-2.6.1 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed 
with JDK v1.8.0_111. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m  
8s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed 
with JDK v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
12s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed. 
{color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red}  0m 
13s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed. 
{color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m  
6s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with 
JDK v1.8.0_111. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  0m  6s{color} | 
{color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_111. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m  6s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_111. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m  
8s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with 
JDK v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  0m  8s{color} | 
{color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m  8s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
1s{color} | {color:red} The patch has 444 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m 
16s{color} | {color:red} The patch 16 line(s) with tabs. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m  9s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_121. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
39s{color} | {color:red} The patch generated 47 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:date2017-01-05 |
| JIRA Issue | YARN-6056 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845723/YARN-6056-branch-2.6.1.patch
 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 8b46aa325857 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2.6.1 / 41d19f4 |
| Default Java | 1.7.0_121 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_111 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_121 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/14561/artifact/patchprocess/branch-mvninstall-root.txt
 |
| compile | 

[jira] [Updated] (YARN-6056) Yarn NM using LCE shows a failure when trying to delete a non-existing dir

2017-01-04 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-6056:

Attachment: YARN-6056-branch-2.6.1.patch

Patch just for branch-2.6.

> Yarn NM using LCE shows a failure when trying to delete a non-existing dir
> --
>
> Key: YARN-6056
> URL: https://issues.apache.org/jira/browse/YARN-6056
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.6.5
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: YARN-6056-branch-2.6.1.patch
>
>
> As part of YARN-2902 the clean-up of the local directories was changed to 
> ignore non-existing directories and proceed with the others in the list. This 
> part of the code change was not backported into branch-2.6; backporting just 
> that part now.






[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2017-01-04 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800488#comment-15800488
 ] 

Wilfred Spiegelenburg commented on YARN-2902:
-

Oops, yes, branch-2.6 was the broken one; it has a separate patch. I will port 
only the ENOENT change to branch-2.6.
Thank you for confirming this.

Opened YARN-6056 and will provide a patch there.

> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Fix For: 2.8.0, 2.7.2, 2.6.4, 3.0.0-alpha1
>
> Attachments: YARN-2902-branch-2.6.01.patch, YARN-2902.002.patch, 
> YARN-2902.03.patch, YARN-2902.04.patch, YARN-2902.05.patch, 
> YARN-2902.06.patch, YARN-2902.07.patch, YARN-2902.08.patch, 
> YARN-2902.09.patch, YARN-2902.10.patch, YARN-2902.11.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed, then 
> resources are left in the DOWNLOADING state.  If no other container comes 
> along and requests these resources, they linger around with no reference 
> counts but aren't cleaned up during normal cache cleanup scans, since the scan 
> will never delete resources in the DOWNLOADING state even if their reference 
> count is zero.
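To make the leak concrete, a small self-contained sketch (invented names, not the actual LocalResourcesTracker code) of a cleanup scan that only ever removes LOCALIZED, unreferenced entries, so a resource stuck in DOWNLOADING is never reclaimed:

{code}
import java.util.ArrayList;
import java.util.List;

public class CacheCleanupSketch {
  enum ResourceState { DOWNLOADING, LOCALIZED, FAILED }

  static class CachedResource {
    final String path;
    final ResourceState state;
    final int refCount;
    CachedResource(String path, ResourceState state, int refCount) {
      this.path = path;
      this.state = state;
      this.refCount = refCount;
    }
  }

  /** Returns what a periodic cache-cleanup scan would remove. */
  static List<CachedResource> selectForRemoval(List<CachedResource> cache) {
    List<CachedResource> removable = new ArrayList<>();
    for (CachedResource r : cache) {
      // Only fully LOCALIZED, unreferenced resources are ever removed.
      // A resource stuck in DOWNLOADING (its container was killed mid-localization)
      // is never selected, even with refCount == 0 -- the leak described above.
      if (r.state == ResourceState.LOCALIZED && r.refCount == 0) {
        removable.add(r);
      }
    }
    return removable;
  }

  public static void main(String[] args) {
    List<CachedResource> cache = new ArrayList<>();
    cache.add(new CachedResource("/cache/orphan.jar", ResourceState.DOWNLOADING, 0));
    cache.add(new CachedResource("/cache/unused.jar", ResourceState.LOCALIZED, 0));
    System.out.println(selectForRemoval(cache).size() + " resource(s) removable"); // 1, the orphan stays
  }
}
{code}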






[jira] [Created] (YARN-6056) Yarn NM using LCE shows a failure when trying to delete a non-existing dir

2017-01-04 Thread Wilfred Spiegelenburg (JIRA)
Wilfred Spiegelenburg created YARN-6056:
---

 Summary: Yarn NM using LCE shows a failure when trying to delete a 
non-existing dir
 Key: YARN-6056
 URL: https://issues.apache.org/jira/browse/YARN-6056
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.6.5
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg


As part of YARN-2902 the clean-up of the local directories was changed to 
ignore non-existing directories and proceed with the others in the list. This part 
of the code change was not backported into branch-2.6; backporting just that 
part now.






[jira] [Commented] (YARN-5585) [Atsv2] Reader side changes for entity prefix and support for pagination via additional filters

2017-01-04 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800448#comment-15800448
 ] 

Sangjin Lee commented on YARN-5585:
---

At least Varun's proposal for fromIdPrefix contains the following language:
{noformat}
If fromIdPrefix is same for all entities of a given entity type, then the user 
must provide fromId as a filter to denote the start entity from which further 
entities will be fetched.
{noformat}

This is a bit confusing, and it's not clear that fromIdPrefix is mandatory even 
in this case. We can add your sentence right after this (in fromIdPrefix).

The fromId javadoc reads OK IMO.

> [Atsv2] Reader side changes for entity prefix and support for pagination via 
> additional filters
> ---
>
> Key: YARN-5585
> URL: https://issues.apache.org/jira/browse/YARN-5585
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
>  Labels: yarn-5355-merge-blocker
> Attachments: 0001-YARN-5585.patch, YARN-5585-YARN-5355.0001.patch, 
> YARN-5585-YARN-5355.0002.patch, YARN-5585-YARN-5355.0003.patch, 
> YARN-5585-YARN-5355.0004.patch, YARN-5585-YARN-5355.0005.patch, 
> YARN-5585-workaround.patch, YARN-5585.v0.patch
>
>
> TimelineReader REST APIs provide a lot of filters to retrieve applications. 
> Along with those, it would be good to add a new filter, i.e. fromId, so that 
> entities can be retrieved after the fromId. 
> Current behavior: the default limit is set to 100. If there are 1000 entities, 
> the REST call gives the first/last 100 entities. How can the next set of 100 
> entities, i.e. 101 to 200 or 900 to 801, be retrieved?
> Example: applications app-1, app-2 ... app-10 are stored in the database.
> *getApps?limit=5* gives app-1 to app-5, but there is no way to retrieve the 
> next 5 apps. 
> So the proposal is to have fromId in the filter, like 
> *getApps?limit=5&fromId=app-5*, which gives the list of apps from app-6 to 
> app-10. 
> Since ATS targets storing a large number of entities, it is a very common 
> use case to get the next set of entities using fromId rather than querying 
> all the entities. This is very useful for pagination in the web UI.
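A minimal usage sketch of the proposed paging pattern, assuming the fromId semantics described above (the last id of the previous page is passed as fromId; everything else here is illustrative, not the actual TimelineReader client code):

{code}
import java.util.Arrays;
import java.util.List;

public class FromIdPagingSketch {
  /** Build one page's query string: the first page has no fromId, later pages resume after the last id seen. */
  static String pageQuery(int limit, String lastSeenId) {
    String q = "getApps?limit=" + limit;
    return lastSeenId == null ? q : q + "&fromId=" + lastSeenId;
  }

  public static void main(String[] args) {
    // Simulated ids in the store: app-1 ... app-10, page size 5.
    List<String> ids = Arrays.asList("app-1", "app-2", "app-3", "app-4", "app-5",
        "app-6", "app-7", "app-8", "app-9", "app-10");
    String lastSeen = null;
    for (int page = 0; page < 2; page++) {
      System.out.println(pageQuery(5, lastSeen)); // page 1: limit=5, page 2: limit=5&fromId=app-5
      lastSeen = ids.get((page + 1) * 5 - 1);     // remember the last id of this page
    }
  }
}
{code}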






[jira] [Commented] (YARN-5585) [Atsv2] Reader side changes for entity prefix and support for pagination via additional filters

2017-01-04 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800380#comment-15800380
 ] 

Rohith Sharma K S commented on YARN-5585:
-

I am confused, [~sjlee0], would you clarify it please?

> [Atsv2] Reader side changes for entity prefix and support for pagination via 
> additional filters
> ---
>
> Key: YARN-5585
> URL: https://issues.apache.org/jira/browse/YARN-5585
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
>  Labels: yarn-5355-merge-blocker
> Attachments: 0001-YARN-5585.patch, YARN-5585-YARN-5355.0001.patch, 
> YARN-5585-YARN-5355.0002.patch, YARN-5585-YARN-5355.0003.patch, 
> YARN-5585-YARN-5355.0004.patch, YARN-5585-YARN-5355.0005.patch, 
> YARN-5585-workaround.patch, YARN-5585.v0.patch
>
>
> TimelineReader REST APIs provide a lot of filters to retrieve applications. 
> Along with those, it would be good to add a new filter, i.e. fromId, so that 
> entities can be retrieved after the fromId. 
> Current behavior: the default limit is set to 100. If there are 1000 entities, 
> the REST call gives the first/last 100 entities. How can the next set of 100 
> entities, i.e. 101 to 200 or 900 to 801, be retrieved?
> Example: applications app-1, app-2 ... app-10 are stored in the database.
> *getApps?limit=5* gives app-1 to app-5, but there is no way to retrieve the 
> next 5 apps. 
> So the proposal is to have fromId in the filter, like 
> *getApps?limit=5&fromId=app-5*, which gives the list of apps from app-6 to 
> app-10. 
> Since ATS targets storing a large number of entities, it is a very common 
> use case to get the next set of entities using fromId rather than querying 
> all the entities. This is very useful for pagination in the web UI.






[jira] [Updated] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-04 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-5959:
--
Attachment: YARN-5959.005.patch

Attaching the trunk patch.
Thanks for the review, [~leftnoteasy].

> RM changes to support change of container ExecutionType
> ---
>
> Key: YARN-5959
> URL: https://issues.apache.org/jira/browse/YARN-5959
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5959-YARN-5085.001.patch, 
> YARN-5959-YARN-5085.002.patch, YARN-5959-YARN-5085.003.patch, 
> YARN-5959-YARN-5085.004.patch, YARN-5959-YARN-5085.005.patch, 
> YARN-5959.005.patch, YARN-5959.combined.001.patch, YARN-5959.wip.002.patch, 
> YARN-5959.wip.003.patch, YARN-5959.wip.patch
>
>
> RM side changes to allow an AM to ask for change of ExecutionType.
> Currently, there are two cases:
> # *Promotion* : OPPORTUNISTIC to GUARANTEED.
> # *Demotion* : GUARANTEED to OPPORTUNISTIC.
> This is similar to YARN-1197, which allows for a change in Container resources. 
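For readers following along, a tiny sketch of what an execution-type update boils down to; the enum and class names below are stand-ins, not the actual YARN API introduced by this patch:

{code}
public class ExecutionTypeUpdateSketch {
  enum ExecutionType { GUARANTEED, OPPORTUNISTIC }
  enum UpdateType { PROMOTE_EXECUTION_TYPE, DEMOTE_EXECUTION_TYPE }

  /** What an AM-side execution-type update for one container boils down to. */
  static class ContainerUpdate {
    final String containerId;
    final UpdateType updateType;
    final ExecutionType targetExecutionType;

    ContainerUpdate(String containerId, UpdateType updateType, ExecutionType target) {
      this.containerId = containerId;
      this.updateType = updateType;
      this.targetExecutionType = target;
    }

    @Override
    public String toString() {
      return containerId + " -> " + updateType + " (" + targetExecutionType + ")";
    }
  }

  public static void main(String[] args) {
    // Promotion: an OPPORTUNISTIC container is asked to become GUARANTEED.
    System.out.println(new ContainerUpdate("container_0001",
        UpdateType.PROMOTE_EXECUTION_TYPE, ExecutionType.GUARANTEED));
    // Demotion: the reverse direction.
    System.out.println(new ContainerUpdate("container_0002",
        UpdateType.DEMOTE_EXECUTION_TYPE, ExecutionType.OPPORTUNISTIC));
  }
}
{code}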






[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys

2017-01-04 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800306#comment-15800306
 ] 

Varun Vasudev commented on YARN-5547:
-

[~ajithshetty] - any updates on this?

> NMLeveldbStateStore should be more tolerant of unknown keys
> ---
>
> Key: YARN-5547
> URL: https://issues.apache.org/jira/browse/YARN-5547
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Ajith S
> Attachments: YARN-5547.01.patch, YARN-5547.02.patch, 
> YARN-5547.03.patch
>
>
> Whenever new keys are added to the NM state store, rolling downgrades will break 
> because the code throws if it encounters an unrecognized key.  If it instead 
> skipped unrecognized keys, it would be simpler to continue supporting rolling 
> downgrades.  We need to define the semantics of unrecognized keys when 
> containers and apps are cleaned up, e.g. we may want to delete all keys 
> underneath an app or container directory when it is being removed from the 
> state store to prevent leaking unrecognized keys.
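A minimal sketch of the tolerant-read idea using a plain in-memory map in place of leveldb (key names and structure are invented, not the NMLeveldbStateStore schema): unknown keys are skipped on load, and removal deletes everything under the container prefix so skipped keys cannot leak.

{code}
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

public class TolerantStateStoreSketch {
  private final TreeMap<String, byte[]> db = new TreeMap<>();

  /** Load container state, skipping keys this NM version does not understand. */
  void loadContainerState(String containerPrefix) {
    for (Map.Entry<String, byte[]> e : db.tailMap(containerPrefix).entrySet()) {
      if (!e.getKey().startsWith(containerPrefix)) {
        break; // past this container's key range
      }
      String suffix = e.getKey().substring(containerPrefix.length());
      switch (suffix) {
        case "/diagnostics": /* parse value */ break;
        case "/exitcode":    /* parse value */ break;
        default:
          // Unknown key written by a newer NM: log and skip instead of throwing,
          // so a rolling downgrade can still recover this container.
          System.out.println("Skipping unknown state-store key " + e.getKey());
      }
    }
  }

  /** Remove every key under the container prefix, known or not, so skipped keys cannot leak. */
  void removeContainerState(String containerPrefix) {
    Iterator<String> it = db.tailMap(containerPrefix).keySet().iterator();
    while (it.hasNext()) {
      String key = it.next();
      if (!key.startsWith(containerPrefix)) {
        break;
      }
      it.remove();
    }
  }

  public static void main(String[] args) {
    TolerantStateStoreSketch store = new TolerantStateStoreSketch();
    store.db.put("container_01/exitcode", new byte[0]);
    store.db.put("container_01/someKeyFromANewerRelease", new byte[0]);
    store.loadContainerState("container_01");
    store.removeContainerState("container_01");
    System.out.println("keys left: " + store.db.size()); // 0
  }
}
{code}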






[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800282#comment-15800282
 ] 

Hadoop QA commented on YARN-6050:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 15 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  9m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 57s{color} | {color:orange} root: The patch generated 4 new + 1652 unchanged 
- 5 fixed = 1656 total (was 1657) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
29s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 11s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}102m 
24s{color} | {color:green} hadoop-mapreduce-client-jobclient in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}228m 19s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6050 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845663/YARN-6050.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 96646c6e234d 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 
10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / a0a2761 |
| 

[jira] [Commented] (YARN-6009) RM fails to start during an upgrade - Failed to load/recover state (YarnException: Invalid application timeout, value=0 for type=LIFETIME)

2017-01-04 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800249#comment-15800249
 ] 

Rohith Sharma K S commented on YARN-6009:
-

bq. What is the impact of allowing an app to be recovered with a 0 timeout 
value?
No impact; it just bypasses the validation so that the service can come up. 

bq. Is that going to cause the recovered app to immediately expire?
Nothing happens to the app. The app will NOT be monitored for timeout. Moreover, an 
app with a 0 timeout was not monitored before the RM restart either.

bq. Also, you may as well do the validation before the queue mapping.
Both the queue mapping and the timeout check are pre-validation steps. The order 
should not matter.
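A minimal sketch of the bypass being discussed, with invented names (not the actual RMAppManager/RMServerUtils code): the timeout validation is skipped on recovery, and a recovered app with a 0 timeout is simply not registered with the timeout monitor.

{code}
import java.util.HashMap;
import java.util.Map;

public class TimeoutRecoverySketch {
  /**
   * Validate lifetime timeouts only for new submissions. On recovery an
   * already-accepted value (e.g. 0 written by an older release) is let
   * through; the recovered app simply is not registered with the timeout
   * monitor, matching how it behaved before the restart.
   */
  static void validateTimeouts(Map<String, Long> timeouts, boolean isRecovery) {
    if (isRecovery) {
      return;
    }
    for (Map.Entry<String, Long> e : timeouts.entrySet()) {
      if (e.getValue() <= 0) {
        throw new IllegalArgumentException("Invalid application timeout, value="
            + e.getValue() + " for type=" + e.getKey());
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Long> timeouts = new HashMap<>();
    timeouts.put("LIFETIME", 0L);
    validateTimeouts(timeouts, true);   // recovery path: accepted
    validateTimeouts(timeouts, false);  // new submission: throws
  }
}
{code}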

> RM fails to start during an upgrade - Failed to load/recover state 
> (YarnException: Invalid application timeout, value=0 for type=LIFETIME)
> --
>
> Key: YARN-6009
> URL: https://issues.apache.org/jira/browse/YARN-6009
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Gour Saha
>Assignee: Rohith Sharma K S
>Priority: Critical
> Attachments: YARN-6009.01.patch
>
>
> ResourceManager fails to start during an upgrade with the following 
> exceptions - 
> Exception 1:
> {color:red}
> {code}
> 2016-12-09 14:57:23,508 INFO  capacity.CapacityScheduler 
> (CapacityScheduler.java:initScheduler(328)) - Initialized CapacityScheduler 
> with calculator=class 
> org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, 
> minimumAllocation=<>, 
> maximumAllocation=<>, asynchronousScheduling=false, 
> asyncScheduleInterval=5ms
> 2016-12-09 14:57:23,509 WARN  ha.ActiveStandbyElector 
> (ActiveStandbyElector.java:becomeActive(863)) - Exception handling the 
> winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:129)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:859)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:463)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:611)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when 
> transitioning to Active mode
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:318)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:127)
> ... 4 more
> Caused by: org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.yarn.exceptions.YarnException: Invalid application timeout, 
> value=0 for type=LIFETIME
> at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:991)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1032)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1028)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1028)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:313)
> ... 5 more
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Invalid 
> application timeout, value=0 for type=LIFETIME
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.validateApplicationTimeouts(RMServerUtils.java:305)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:365)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:330)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:463)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1184)
> at 
> 

[jira] [Commented] (YARN-4779) Fix AM container allocation logic in SLS

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800225#comment-15800225
 ] 

Hadoop QA commented on YARN-4779:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 11s{color} | {color:orange} hadoop-tools/hadoop-sls: The patch generated 13 
new + 74 unchanged - 9 fixed = 87 total (was 83) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
34s{color} | {color:red} hadoop-tools/hadoop-sls generated 1 new + 0 unchanged 
- 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} hadoop-tools_hadoop-sls generated 0 new + 20 
unchanged - 3 fixed = 20 total (was 23) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
52s{color} | {color:green} hadoop-sls in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-tools/hadoop-sls |
|  |  Null passed for non-null parameter of 
AMSimulator.notifyAMContainerLaunched(Container) in 
org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.notifyAMContainerLaunched(Container)
  Method invoked at MRAMSimulator.java:of 
AMSimulator.notifyAMContainerLaunched(Container) in 
org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.notifyAMContainerLaunched(Container)
  Method invoked at MRAMSimulator.java:[line 152] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-4779 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12833225/YARN-4779.4.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 9e2f8f449804 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 5ed63e3 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14559/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-sls.txt
 |
| findbugs | 

[jira] [Commented] (YARN-4899) Queue metrics of SLS capacity scheduler only activated after app submit to the queue

2017-01-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800216#comment-15800216
 ] 

Hudson commented on YARN-4899:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11073 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11073/])
YARN-4899. Queue metrics of SLS capacity scheduler only activated after 
(wangda: rev 5ed63e3e9d9937cf7441b7ceb5feafbf486f3387)
* (edit) 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSCapacityScheduler.java


> Queue metrics of SLS capacity scheduler only activated after app submit to 
> the queue
> 
>
> Key: YARN-4899
> URL: https://issues.apache.org/jira/browse/YARN-4899
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Jonathan Hung
>  Labels: oct16-medium
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-4899.1.patch, YARN-4899.2.patch
>
>
> We should start recording queue metrics since cluster start.






[jira] [Commented] (YARN-4779) Fix AM container allocation logic in SLS

2017-01-04 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800173#comment-15800173
 ] 

Sunil G commented on YARN-4779:
---

Patch looks fine to me. I'll help to rebase it.

> Fix AM container allocation logic in SLS
> 
>
> Key: YARN-4779
> URL: https://issues.apache.org/jira/browse/YARN-4779
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>  Labels: oct16-medium
> Attachments: YARN-4779.1.patch, YARN-4779.2.patch, YARN-4779.3.patch, 
> YARN-4779.4.patch
>
>
> Currently, SLS uses an unmanaged AM for simulated map-reduce applications, and 
> the first allocated container for each app is considered to be the master 
> container.
> This could be problematic when preemption happens: CapacityScheduler preempts 
> AM containers at the lowest priority, but the simulated AM container isn't 
> recognized by the scheduler -- it is a normal container from the scheduler's 
> perspective.
> This JIRA tries to fix this logic: do a real AM allocation instead of using an 
> unmanaged AM.






[jira] [Commented] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800162#comment-15800162
 ] 

Hadoop QA commented on YARN-5959:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 16 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
 6s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m  
4s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
55s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
59s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  2m 
 2s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
48s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
22s{color} | {color:green} YARN-5085 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  9m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
17s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 59s{color} | {color:orange} root: The patch generated 37 new + 1695 
unchanged - 18 fixed = 1732 total (was 1713) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
31s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 1 new + 912 unchanged - 1 fixed = 913 total (was 913) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
37s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
38s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 39m  
0s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 
14s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
44s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
5s{color} | {color:green} hadoop-sls in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}159m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5959 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845664/YARN-5959-YARN-5085.005.patch
 |
| Optional 

[jira] [Commented] (YARN-4899) Queue metrics of SLS capacity scheduler only activated after app submit to the queue

2017-01-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800148#comment-15800148
 ] 

Wangda Tan commented on YARN-4899:
--

Committing...

> Queue metrics of SLS capacity scheduler only activated after app submit to 
> the queue
> 
>
> Key: YARN-4899
> URL: https://issues.apache.org/jira/browse/YARN-4899
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Jonathan Hung
>  Labels: oct16-medium
> Attachments: YARN-4899.1.patch, YARN-4899.2.patch
>
>
> We should start recording queue metrics since cluster start.






[jira] [Updated] (YARN-4899) Queue metrics of SLS capacity scheduler only activated after app submit to the queue

2017-01-04 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4899:
-
Assignee: Jonathan Hung  (was: Wangda Tan)

> Queue metrics of SLS capacity scheduler only activated after app submit to 
> the queue
> 
>
> Key: YARN-4899
> URL: https://issues.apache.org/jira/browse/YARN-4899
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Jonathan Hung
>  Labels: oct16-medium
> Attachments: YARN-4899.1.patch, YARN-4899.2.patch
>
>
> We should start recording queue metrics since cluster start.






[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-01-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800128#comment-15800128
 ] 

Wangda Tan commented on YARN-6050:
--

Thanks [~rkanter] for updating the patch. A couple of comments/questions:

Not sure if the code below works well:

{code}
442   List<ResourceRequest> amReqs =
443       submissionContext.getAMContainerResourceRequests();
444   if (amReqs == null || amReqs.isEmpty()) {
445     @SuppressWarnings("deprecation") ResourceRequest amReq =
446         submissionContext.getAMContainerResourceRequest();
{code}

Since {{getAMContainerResourceRequests}} and {{getAMContainerResourceRequest}} 
now share the same data field, if the list checked at line 444 is null, the request 
obtained at line 445 should be null as well, correct? 

So I think we should not use {{getAMContainerResourceRequest}} at all. To keep 
backward-compatible behavior, if the AM resource request list has length == 1, we can 
change its resourceName to ANY instead of throwing an exception, correct? 

And when the length is > 1, we can check for the existence of an ANY request and 
throw an exception otherwise.

Thoughts?
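A small sketch of the normalization being suggested above (invented names, plain Java, not the patch itself): a single AM request is rewritten to ANY for backward compatibility, while a multi-request list must already contain an ANY request.

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AmRequestNormalizationSketch {
  static final String ANY = "*";

  static class ResourceRequest {
    String resourceName;
    ResourceRequest(String resourceName) {
      this.resourceName = resourceName;
    }
  }

  /** Normalize/validate the list of AM resource requests as suggested above. */
  static void normalize(List<ResourceRequest> amReqs) {
    if (amReqs == null || amReqs.isEmpty()) {
      throw new IllegalArgumentException("No AM resource request supplied");
    }
    if (amReqs.size() == 1) {
      // Backward-compatible path: a single request is treated as the ANY (off-switch) request.
      amReqs.get(0).resourceName = ANY;
      return;
    }
    // Multiple requests: one of them must be the ANY request.
    boolean hasAny = false;
    for (ResourceRequest r : amReqs) {
      hasAny |= ANY.equals(r.resourceName);
    }
    if (!hasAny) {
      throw new IllegalArgumentException(
          "AM resource requests must include a request with resourceName " + ANY);
    }
  }

  public static void main(String[] args) {
    List<ResourceRequest> single = new ArrayList<>(Arrays.asList(new ResourceRequest("rackA")));
    normalize(single);
    System.out.println(single.get(0).resourceName); // "*": the lone request was rewritten to ANY
  }
}
{code}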

> AMs can't be scheduled on racks or nodes
> 
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch, 
> YARN-6050.003.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there 
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, 
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or directly from 
> {{getAMContainerResourceRequest}}, depending on whether 
> {{getAMContainerResourceRequest}} is null or not.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which takes a list of 
> {{ResourceRequest}}s so that clients can specify multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.






[jira] [Commented] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800110#comment-15800110
 ] 

Wangda Tan commented on YARN-5959:
--

+1, pending Jenkins; will wait one day before committing.

[~asuresh], could you update a patch for trunk?

> RM changes to support change of container ExecutionType
> ---
>
> Key: YARN-5959
> URL: https://issues.apache.org/jira/browse/YARN-5959
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5959-YARN-5085.001.patch, 
> YARN-5959-YARN-5085.002.patch, YARN-5959-YARN-5085.003.patch, 
> YARN-5959-YARN-5085.004.patch, YARN-5959-YARN-5085.005.patch, 
> YARN-5959.combined.001.patch, YARN-5959.wip.002.patch, 
> YARN-5959.wip.003.patch, YARN-5959.wip.patch
>
>
> RM side changes to allow an AM to ask for change of ExecutionType.
> Currently, there are two cases:
> # *Promotion* : OPPORTUNISTIC to GUARANTEED.
> # *Demotion* : GUARANTEED to OPPORTUNISTIC.
> This is similar to YARN-1197, which allows for a change in Container resources. 






[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps

2017-01-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800030#comment-15800030
 ] 

Varun Saxena commented on YARN-6027:


bq. We could consider the fromId as (user + flow), right?
Yes, we can do that. Pagination would require specifying both flow and user 
anyway. However, the current row-key order means that the same flow executed by 
different users may not come in sequence from the backend. In the UI, though, would 
a user prefer the same flow name appearing together when it is run by different 
users (when we are accessing the UI as an admin user)? Frankly, I do not have a 
strong case for changing the order other than this point. I guess once it becomes 
clear that a flow is differentiated by user as well, even this should be fine.
I do acknowledge that pagination can still be achieved with the current row-key 
structure.

By the way, an additional point is that we will have to either introduce a 
delimiter or provide two filters, à la fromIdPrefix and fromId, to achieve it. And 
if we use a delimiter, we will have to encode or escape it as well if it is part 
of the user name (as both user and flow are variable-length strings).
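A minimal sketch of the delimiter-escaping concern, with an arbitrarily chosen delimiter and escape character (the real ATSv2 row-key/UID encoding is not reproduced here):

{code}
public class CompositeFromIdSketch {
  private static final char DELIM = '!';
  private static final char ESCAPE = '\\';

  /** Escape the delimiter (and the escape character itself) inside one component. */
  static String escape(String component) {
    StringBuilder sb = new StringBuilder();
    for (char c : component.toCharArray()) {
      if (c == DELIM || c == ESCAPE) {
        sb.append(ESCAPE);
      }
      sb.append(c);
    }
    return sb.toString();
  }

  /** Join user and flow name into a single fromId value. */
  static String compositeFromId(String user, String flowName) {
    return escape(user) + DELIM + escape(flowName);
  }

  public static void main(String[] args) {
    // The user name happens to contain the delimiter, so it has to be escaped.
    System.out.println(compositeFromId("data!team", "daily-ingest")); // data\!team!daily-ingest
  }
}
{code}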

> Support fromId for flows/flowrun apps
> -
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar 
> filter for flows/flowRun apps, and for flow run and flow as well. 
> Along with supporting fromId, this JIRA should also discuss the following points:
> * Should we throw an exception for entities/entity retrieval if duplicates are 
> found?
> * TimelineEntity:
> ** Should the equals method also check for idPrefix?
> ** Is idPrefix part of the identifiers?






[jira] [Commented] (YARN-5831) Propagate allowPreemptionFrom flag all the way down to the app

2017-01-04 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799978#comment-15799978
 ] 

Yufei Gu commented on YARN-5831:


Thanks [~kasha] for the review. I uploaded a new patch addressing your comments.

> Propagate allowPreemptionFrom flag all the way down to the app
> --
>
> Key: YARN-5831
> URL: https://issues.apache.org/jira/browse/YARN-5831
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
> Attachments: YARN-5831.001.patch, YARN-5831.002.patch, 
> YARN-5831.003.patch
>
>
> FairScheduler allows disallowing preemption from a queue. When checking if 
> preemption for an application is allowed, the new preemption code recurses 
> all the way to the root queue to check this flag. 
> Propagating this information all the way to the app will be more efficient. 
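A small sketch contrasting the two approaches described above (invented names, not the FairScheduler code): walking the queue hierarchy on every check versus caching the effective flag on the app.

{code}
public class PreemptionFlagSketch {
  static class Queue {
    final Queue parent;
    final boolean allowPreemptionFrom;
    Queue(Queue parent, boolean allowPreemptionFrom) {
      this.parent = parent;
      this.allowPreemptionFrom = allowPreemptionFrom;
    }
  }

  /** Current behaviour described above: walk up to the root queue on every check. */
  static boolean isPreemptableByWalking(Queue leaf) {
    for (Queue q = leaf; q != null; q = q.parent) {
      if (!q.allowPreemptionFrom) {
        return false;
      }
    }
    return true;
  }

  static class App {
    // Proposed: compute the effective flag once when the app is placed in a queue
    // and store it on the app, so the preemption path reads a field instead of recursing.
    final boolean preemptable;
    App(Queue leaf) {
      this.preemptable = isPreemptableByWalking(leaf);
    }
  }

  public static void main(String[] args) {
    Queue root = new Queue(null, true);
    Queue protectedQueue = new Queue(root, false);
    System.out.println(new App(protectedQueue).preemptable); // false, read from the cached field
  }
}
{code}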






[jira] [Updated] (YARN-5831) Propagate allowPreemptionFrom flag all the way down to the app

2017-01-04 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-5831:
---
Attachment: YARN-5831.003.patch

> Propagate allowPreemptionFrom flag all the way down to the app
> --
>
> Key: YARN-5831
> URL: https://issues.apache.org/jira/browse/YARN-5831
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
> Attachments: YARN-5831.001.patch, YARN-5831.002.patch, 
> YARN-5831.003.patch
>
>
> FairScheduler allows disallowing preemption from a queue. When checking if 
> preemption for an application is allowed, the new preemption code recurses 
> all the way to the root queue to check this flag. 
> Propagating this information all the way to the app will be more efficient. 






[jira] [Commented] (YARN-5554) MoveApplicationAcrossQueues does not check user permission on the target queue

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799926#comment-15799926
 ] 

Hadoop QA commented on YARN-5554:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 19s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 77 unchanged - 4 fixed = 78 total (was 81) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 58s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5554 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845659/YARN-5554.13.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux d32e68812e1d 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 
10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / a0a2761 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14554/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/14554/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14554/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 

[jira] [Updated] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-04 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-5959:
--
Attachment: YARN-5959-YARN-5085.005.patch

Updating patch v005.
Some minor updates: moved ContainerUpdateRequests to its own class and removed 
updateErrors from it.

> RM changes to support change of container ExecutionType
> ---
>
> Key: YARN-5959
> URL: https://issues.apache.org/jira/browse/YARN-5959
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5959-YARN-5085.001.patch, 
> YARN-5959-YARN-5085.002.patch, YARN-5959-YARN-5085.003.patch, 
> YARN-5959-YARN-5085.004.patch, YARN-5959-YARN-5085.005.patch, 
> YARN-5959.combined.001.patch, YARN-5959.wip.002.patch, 
> YARN-5959.wip.003.patch, YARN-5959.wip.patch
>
>
> RM side changes to allow an AM to ask for change of ExecutionType.
> Currently, there are two cases:
> # *Promotion* : OPPORTUNISTIC to GUARANTEED.
> # *Demotion* : GUARANTEED to OPPORTUNISTIC.
> This is similar to YARN-1197, which allows for a change in Container resources. 






[jira] [Updated] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-01-04 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-6050:

Attachment: YARN-6050.003.patch

The 003 patch addresses the latest comments from [~leftnoteasy] and [~templedf]:
- Fixes javac warnings and practical checkstyle warnings
- Adds a check that you have exactly 1 ANY (offswitch) request
- Forces non-offswitch requests to have the same capability, execution type, 
num containers, and priority as the offswitch request
- Added tests for the above conditions to {{TestAppManager}}
- Updated previous tests in {{TestAppManager}} to use clones of the 
{{ResourceRequest}} objects, because the code now modifies them; without clones, 
the equality checks against the originals would always trivially pass.

[~leftnoteasy], {{RMAppAttemptImpl}} already hardcodes the priority and number 
of containers to the correct values for an AM, so I didn't do that.
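To illustrate the checks listed above, here is a minimal sketch (hypothetical class name and exception choice, not the attached patch) of normalizing an AM's {{ResourceRequest}} list so that exactly one ANY (offswitch) request exists and the rack/node requests mirror it:

{code}
import java.util.List;

import org.apache.hadoop.yarn.api.records.ResourceRequest;

// Illustrative helper only; the real validation lives in the RM app-submission path.
public final class AMRequestNormalizer {

  private AMRequestNormalizer() {
  }

  public static void normalize(List<ResourceRequest> amRequests) {
    ResourceRequest anyRequest = null;
    // Require exactly one ANY (offswitch) request.
    for (ResourceRequest rr : amRequests) {
      if (ResourceRequest.ANY.equals(rr.getResourceName())) {
        if (anyRequest != null) {
          throw new IllegalArgumentException(
              "Only one ANY ResourceRequest is allowed for the AM");
        }
        anyRequest = rr;
      }
    }
    if (anyRequest == null) {
      throw new IllegalArgumentException(
          "An ANY ResourceRequest is required for the AM");
    }
    // Force rack/node requests to mirror the offswitch request's capability,
    // execution type, number of containers, and priority.
    for (ResourceRequest rr : amRequests) {
      if (rr != anyRequest) {
        rr.setCapability(anyRequest.getCapability());
        rr.setExecutionTypeRequest(anyRequest.getExecutionTypeRequest());
        rr.setNumContainers(anyRequest.getNumContainers());
        rr.setPriority(anyRequest.getPriority());
      }
    }
  }
}
{code}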

> AMs can't be scheduled on racks or nodes
> 
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch, 
> YARN-6050.003.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there 
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, 
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or directly from 
> {{getAMContainerResourceRequest}}, depending on if 
> {{getAMContainerResourceRequest}} is null or not.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which takes a list of 
> {{ResourceRequest}} so that clients can specify the multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-1279) Expose a client API to allow clients to figure if log aggregation is complete

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-1279:
-
Fix Version/s: (was: 2.8.0)

> Expose a client API to allow clients to figure if log aggregation is complete
> -
>
> Key: YARN-1279
> URL: https://issues.apache.org/jira/browse/YARN-1279
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.2.0
>Reporter: Arun C Murthy
>Assignee: Xuan Gong
> Fix For: 3.0.0-alpha1
>
> Attachments: YARN-1279.1.patch, YARN-1279.11.patch, 
> YARN-1279.2.patch, YARN-1279.2.patch, YARN-1279.3.patch, YARN-1279.3.patch, 
> YARN-1279.4.patch, YARN-1279.4.patch, YARN-1279.5.patch, YARN-1279.6.patch, 
> YARN-1279.7.patch, YARN-1279.8.patch, YARN-1279.8.patch, YARN-1279.9.patch
>
>
> Expose a client API to allow clients to figure if log aggregation is complete



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-04 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-5959:
--
Attachment: YARN-5959-YARN-5085.004.patch

Updated patch. Do give it another review [~leftnoteasy]

bq.  Can we make scheduler.allocate to have a single ContainerUpdateRequests, 
which include 4 lists (increase/decrease/promotion/demotion) instead of making 
all of them become top-level parameter, similarly for Allocation.
I've created a holder class ContainerUpdateContext.ContainerUpdateRequests 
which is created in the RMServerUtils#validate.. method and passed around.
I don't think we need one for the Allocation object, since technically, it 
itself is a holder for all the lists.

Agreed with all other suggestions; they are incorporated in the patch.
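To sketch the shape of that holder (illustrative names only; the actual class in the patch is ContainerUpdateContext.ContainerUpdateRequests and may differ):

{code}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.yarn.api.records.UpdateContainerRequest;

// Illustrative holder grouping the four kinds of container update asks.
public class ContainerUpdates {
  private final List<UpdateContainerRequest> increaseRequests = new ArrayList<>();
  private final List<UpdateContainerRequest> decreaseRequests = new ArrayList<>();
  private final List<UpdateContainerRequest> promotionRequests = new ArrayList<>();
  private final List<UpdateContainerRequest> demotionRequests = new ArrayList<>();

  public List<UpdateContainerRequest> getIncreaseRequests() {
    return increaseRequests;
  }

  public List<UpdateContainerRequest> getDecreaseRequests() {
    return decreaseRequests;
  }

  public List<UpdateContainerRequest> getPromotionRequests() {
    return promotionRequests;
  }

  public List<UpdateContainerRequest> getDemotionRequests() {
    return demotionRequests;
  }
}
{code}

Passing one such object through scheduler.allocate keeps the signature stable if more update kinds are added later.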

> RM changes to support change of container ExecutionType
> ---
>
> Key: YARN-5959
> URL: https://issues.apache.org/jira/browse/YARN-5959
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5959-YARN-5085.001.patch, 
> YARN-5959-YARN-5085.002.patch, YARN-5959-YARN-5085.003.patch, 
> YARN-5959-YARN-5085.004.patch, YARN-5959.combined.001.patch, 
> YARN-5959.wip.002.patch, YARN-5959.wip.003.patch, YARN-5959.wip.patch
>
>
> RM side changes to allow an AM to ask for change of ExecutionType.
> Currently, there are two cases:
> # *Promotion* : OPPORTUNISTIC to GUARANTEED.
> # *Demotion* : GUARANTEED to OPPORTUNISTIC.
> This is similar to YARN-1197, which allows for a change in Container resources. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-04 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799772#comment-15799772
 ] 

Arun Suresh edited comment on YARN-5959 at 1/5/17 12:06 AM:


Updated patch. Do give it another review [~leftnoteasy]

bq.  Can we make scheduler.allocate to have a single ContainerUpdateRequests, 
which include 4 lists (increase/decrease/promotion/demotion) instead of making 
all of them become top-level parameter, similarly for Allocation.
I've created a holder class ContainerUpdateContext.ContainerUpdateRequests 
which is created in the RMServerUtils#validate.. method and passed around.
I don't think we need one for the Allocation object, since technically, it 
itself is a holder for all the lists of allocated/updated containers.

Agreed with all other suggestions; they are incorporated in the patch.


was (Author: asuresh):
Updated patch. Do give it another review [~leftnoteasy]

bq.  Can we make scheduler.allocate to have a single ContainerUpdateRequests, 
which include 4 lists (increase/decrease/promotion/demotion) instead of making 
all of them become top-level parameter, similarly for Allocation.
I've created a holder class ContainerUpdateContext.ContainerUpdateRequests 
which is created in the RMServerUtils#validate.. method and passed around.
I don't think we need one for the Allocation object, since technically, it 
itself is a holder for all the lists.

Agreed with all other suggestions; they are incorporated in the patch.

> RM changes to support change of container ExecutionType
> ---
>
> Key: YARN-5959
> URL: https://issues.apache.org/jira/browse/YARN-5959
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5959-YARN-5085.001.patch, 
> YARN-5959-YARN-5085.002.patch, YARN-5959-YARN-5085.003.patch, 
> YARN-5959-YARN-5085.004.patch, YARN-5959.combined.001.patch, 
> YARN-5959.wip.002.patch, YARN-5959.wip.003.patch, YARN-5959.wip.patch
>
>
> RM side changes to allow an AM to ask for change of ExecutionType.
> Currently, there are two cases:
> # *Promotion* : OPPORTUNISTIC to GUARANTEED.
> # *Demotion* : GUARANTEED to OPPORTUNISTIC.
> This is similar to YARN-1197, which allows for a change in Container resources. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5554) MoveApplicationAcrossQueues does not check user permission on the target queue

2017-01-04 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799765#comment-15799765
 ] 

Daniel Templeton commented on YARN-5554:


One more detail: in the new comment on the old {{checkAccess()}}, "allow users 
access and view the apps" should be "allow users to access and view the apps."

I feel like we're asymptotically approaching the finish line. :)  Thanks for 
hanging in there.

> MoveApplicationAcrossQueues does not check user permission on the target queue
> --
>
> Key: YARN-5554
> URL: https://issues.apache.org/jira/browse/YARN-5554
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.2
>Reporter: Haibo Chen
>Assignee: Wilfred Spiegelenburg
>  Labels: oct16-medium
> Attachments: YARN-5554.10.patch, YARN-5554.11.patch, 
> YARN-5554.12.patch, YARN-5554.13.patch, YARN-5554.2.patch, YARN-5554.3.patch, 
> YARN-5554.4.patch, YARN-5554.5.patch, YARN-5554.6.patch, YARN-5554.7.patch, 
> YARN-5554.8.patch, YARN-5554.9.patch
>
>
> The moveApplicationAcrossQueues operation currently does not check the user's 
> permission on the target queue. This incorrectly allows a user to move 
> his/her own applications to a queue that the user has no access to.
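For context, a minimal sketch (illustrative only, not the attached patch) of the kind of ACL check the move path needs before accepting the target queue:

{code}
import java.io.IOException;

import org.apache.hadoop.security.AccessControlException;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.api.records.QueueACL;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.YarnScheduler;

// Illustrative check only; the real fix lives in the RM move-application path.
public final class MoveQueueAclCheck {

  private MoveQueueAclCheck() {
  }

  public static void checkTargetQueueAccess(YarnScheduler scheduler,
      UserGroupInformation callerUGI, String targetQueue) throws IOException {
    // The caller must be allowed to submit to (or administer) the target queue
    // before the application may be moved into it.
    if (!scheduler.checkAccess(callerUGI, QueueACL.SUBMIT_APPLICATIONS, targetQueue)
        && !scheduler.checkAccess(callerUGI, QueueACL.ADMINISTER_QUEUE, targetQueue)) {
      throw new AccessControlException("User " + callerUGI.getShortUserName()
          + " cannot move applications to queue " + targetQueue);
    }
  }
}
{code}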



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5980) Update documentation for single node hbase deploy

2017-01-04 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799761#comment-15799761
 ] 

Sangjin Lee commented on YARN-5980:
---

Thanks for the patch [~vrushalic]!

One small nit: let's properly capitalize HBase ("hbase" -> "HBase").

Also, +1 on Varun's comment on the HBase version.

> Update documentation for single node hbase deploy
> -
>
> Key: YARN-5980
> URL: https://issues.apache.org/jira/browse/YARN-5980
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Vrushali C
>Assignee: Vrushali C
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-5980.001.patch
>
>
> Per HBASE-17272, a single node hbase deployment (single jvm running daemons + 
> hdfs writes) will be added to hbase shortly. 
> We should update the timeline service documentation in the setup/deployment 
> context accordingly; this will help users who are a bit wary of hbase 
> deployments get started with the timeline service more easily.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5554) MoveApplicationAcrossQueues does not check user permission on the target queue

2017-01-04 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-5554:

Attachment: YARN-5554.13.patch

Fixed up the comments as per the feedback

> MoveApplicationAcrossQueues does not check user permission on the target queue
> --
>
> Key: YARN-5554
> URL: https://issues.apache.org/jira/browse/YARN-5554
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.2
>Reporter: Haibo Chen
>Assignee: Wilfred Spiegelenburg
>  Labels: oct16-medium
> Attachments: YARN-5554.10.patch, YARN-5554.11.patch, 
> YARN-5554.12.patch, YARN-5554.13.patch, YARN-5554.2.patch, YARN-5554.3.patch, 
> YARN-5554.4.patch, YARN-5554.5.patch, YARN-5554.6.patch, YARN-5554.7.patch, 
> YARN-5554.8.patch, YARN-5554.9.patch
>
>
> The moveApplicationAcrossQueues operation currently does not check the user's 
> permission on the target queue. This incorrectly allows a user to move 
> his/her own applications to a queue that the user has no access to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5830) Avoid preempting AM containers

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799749#comment-15799749
 ] 

Hadoop QA commented on YARN-5830:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 22s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 4 new + 1 unchanged - 1 fixed = 5 total (was 2) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
32s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 46s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 14s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  Should 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread$PreemptableContainers
 be a _static_ inner class?  At FSPreemptionThread.java:inner class?  At 
FSPreemptionThread.java:[lines 259-262] |
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5830 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845633/YARN-5830.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2a77407d1073 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / a0a2761 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14553/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| findbugs | 

[jira] [Commented] (YARN-4164) Retrospect update ApplicationPriority API return type

2017-01-04 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799750#comment-15799750
 ] 

Jian He commented on YARN-4164:
---

merged the patch to 2.8 too


> Retrospect update ApplicationPriority API return type
> -
>
> Key: YARN-4164
> URL: https://issues.apache.org/jira/browse/YARN-4164
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1
>
> Attachments: 0001-YARN-4164.patch, 0002-YARN-4164.patch, 
> 0003-YARN-4164.patch, 0004-YARN-4164.patch
>
>
> Currently the {{ApplicationClientProtocol#updateApplicationPriority()}} API 
> returns an empty UpdateApplicationPriorityResponse.
> But the RM updates the priority to cluster.max-priority if the given priority is 
> greater than cluster.max-priority. In this scenario, the updated priority needs 
> to be communicated back to the client rather than staying quiet while the client 
> assumes that the given priority itself was taken.
> The same scenario can also happen during application submission, but I feel that 
> when invoked explicitly via ApplicationClientProtocol#updateApplicationPriority(), 
> the response should include the updated priority. 
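For illustration, the clamping behaviour being discussed, as a minimal sketch (hypothetical helper, not the patch itself); the point is that the effective value should travel back in the {{UpdateApplicationPriorityResponse}}:

{code}
import org.apache.hadoop.yarn.api.records.Priority;

// Illustrative only: the RM clamps the requested priority to cluster.max-priority;
// the response should carry this effective value back to the client.
public final class PriorityClamp {

  private PriorityClamp() {
  }

  public static Priority effectivePriority(Priority requested, Priority clusterMax) {
    if (requested.getPriority() > clusterMax.getPriority()) {
      return clusterMax;
    }
    return requested;
  }
}
{code}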



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4164) Retrospect update ApplicationPriority API return type

2017-01-04 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-4164:
--
Fix Version/s: 2.8.0

> Retrospect update ApplicationPriority API return type
> -
>
> Key: YARN-4164
> URL: https://issues.apache.org/jira/browse/YARN-4164
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1
>
> Attachments: 0001-YARN-4164.patch, 0002-YARN-4164.patch, 
> 0003-YARN-4164.patch, 0004-YARN-4164.patch
>
>
> Currently the {{ApplicationClientProtocol#updateApplicationPriority()}} API 
> returns an empty UpdateApplicationPriorityResponse.
> But the RM updates the priority to cluster.max-priority if the given priority is 
> greater than cluster.max-priority. In this scenario, the updated priority needs 
> to be communicated back to the client rather than staying quiet while the client 
> assumes that the given priority itself was taken.
> The same scenario can also happen during application submission, but I feel that 
> when invoked explicitly via ApplicationClientProtocol#updateApplicationPriority(), 
> the response should include the updated priority. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3225) New parameter or CLI for decommissioning node gracefully in RMAdmin CLI

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799690#comment-15799690
 ] 

Junping Du commented on YARN-3225:
--

Sorry for missing this. I feel the same pain now in releasing 2.8.0. :(

> New parameter or CLI for decommissioning node gracefully in RMAdmin CLI
> ---
>
> Key: YARN-3225
> URL: https://issues.apache.org/jira/browse/YARN-3225
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful
>Reporter: Junping Du
>Assignee: Devaraj K
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: YARN-3225-1.patch, YARN-3225-2.patch, YARN-3225-3.patch, 
> YARN-3225-4.patch, YARN-3225-5.patch, YARN-3225.patch, YARN-914.patch
>
>
> The new CLI (or an existing CLI with parameters) should put each node on the 
> decommission list into decommissioning status and track a timeout to terminate 
> the nodes that haven't finished.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3667) Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3667:
-
Fix Version/s: (was: 2.8.0)

> Fix findbugs warning Inconsistent synchronization of 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
> -
>
> Key: YARN-3667
> URL: https://issues.apache.org/jira/browse/YARN-3667
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Minor
> Attachments: YARN-3667.000.patch
>
>
> Fix findbugs warning Inconsistent synchronization of 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
> This  findbugs warning is reported at 
> https://builds.apache.org/job/PreCommit-YARN-Build/7956/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3737) Add support for recovery of reserved apps (running under dynamic queues) to Fair Scheduler

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799686#comment-15799686
 ] 

Junping Du commented on YARN-3737:
--

Please mark the fix version only when the patch gets committed. Thanks!

> Add support for recovery of reserved apps (running under dynamic queues) to 
> Fair Scheduler
> --
>
> Key: YARN-3737
> URL: https://issues.apache.org/jira/browse/YARN-3737
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>
> YARN-3736 persists the current state of the Plan to the RMStateStore. This 
> JIRA covers recovery of the Plan, i.e. dynamic reservation queues with 
> associated apps, as part of the Fair Scheduler failover mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3737) Add support for recovery of reserved apps (running under dynamic queues) to Fair Scheduler

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3737:
-
Fix Version/s: (was: 2.8.0)

> Add support for recovery of reserved apps (running under dynamic queues) to 
> Fair Scheduler
> --
>
> Key: YARN-3737
> URL: https://issues.apache.org/jira/browse/YARN-3737
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>
> YARN-3736 persists the current state of the Plan to the RMStateStore. This 
> JIRA covers recovery of the Plan, i.e. dynamic reservation queues with 
> associated apps, as part of the Fair Scheduler failover mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3752) TestRMFailover fails due to intermittent UnknownHostException

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3752:
-
Fix Version/s: (was: 2.8.0)

> TestRMFailover fails due to intermittent UnknownHostException
> -
>
> Key: YARN-3752
> URL: https://issues.apache.org/jira/browse/YARN-3752
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
>
> Client fails to create connection due to UnknownHostException while client 
> retries to connect to next RM after failover in unit test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3843) Fair Scheduler should not accept apps with space keys as queue name

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3843:
-
Fix Version/s: (was: 2.8.0)

> Fair Scheduler should not accept apps with space keys as queue name
> ---
>
> Key: YARN-3843
> URL: https://issues.apache.org/jira/browse/YARN-3843
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.0, 2.5.0
>Reporter: Dongwook Kwon
>Priority: Minor
> Attachments: YARN-3843.01.patch
>
>
> As in YARN-461, since an empty string is not a valid queue name, queue names made 
> of space characters such as " " or "   " should not be accepted either, nor should 
> spaces be accepted as a prefix or postfix. 
> e.g.) "root.test.queuename  ", or "root.test. queuename"
> I have 2 specific cases that kill the RM with these space characters as part of 
> the queue name.
> 1) Without a placement policy (hadoop 2.4.0 and above): 
> when a job is submitted with " " (space) as the queue name,
> e.g.) mapreduce.job.queuename=" "
> 2) With a placement policy (hadoop 2.5.0 and above): 
> once a job is submitted without a space in the queue name, and another job is 
> submitted with a space in it,
> e.g.) 1st time: mapreduce.job.queuename="root.test.user1" 
> 2nd time: mapreduce.job.queuename="root.test.user1 "
> {code}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.974 sec <<< 
> FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
> testQueueNameWithSpace(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler)
>   Time elapsed: 0.724 sec  <<< ERROR!
> org.apache.hadoop.metrics2.MetricsException: Metrics source 
> QueueMetrics,q0=root,q1=adhoc,q2=birvine already exists!
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:135)
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:112)
>   at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:218)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueueMetrics.forQueue(FSQueueMetrics.java:96)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.(FSQueue.java:56)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.(FSLeafQueue.java:66)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.createQueue(QueueManager.java:169)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:120)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:88)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:660)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:569)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1127)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testQueueNameWithSpace(TestFairScheduler.java:627)
> {code}
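A minimal sketch of the kind of queue-name validation being asked for (illustrative names only; the actual change is in the attached patch):

{code}
// Illustrative queue-name validation; rejects blank names and components with
// leading/trailing spaces, e.g. "root.test.queuename  " or "root.test. queuename".
public final class QueueNameValidator {

  private QueueNameValidator() {
  }

  public static void validate(String queueName) {
    if (queueName == null || queueName.trim().isEmpty()) {
      throw new IllegalArgumentException("Queue name must not be empty or blank");
    }
    for (String part : queueName.split("\\.")) {
      if (part.isEmpty() || !part.equals(part.trim())) {
        throw new IllegalArgumentException(
            "Queue name component '" + part + "' must not be blank or have "
                + "leading/trailing spaces");
      }
    }
  }
}
{code}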



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4164) Retrospect update ApplicationPriority API return type

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799668#comment-15799668
 ] 

Junping Du commented on YARN-4164:
--

This patch goes to branch-2 only instead of branch-2.8; setting 2.9 as the fix version.

> Retrospect update ApplicationPriority API return type
> -
>
> Key: YARN-4164
> URL: https://issues.apache.org/jira/browse/YARN-4164
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: 0001-YARN-4164.patch, 0002-YARN-4164.patch, 
> 0003-YARN-4164.patch, 0004-YARN-4164.patch
>
>
> Currently the {{ApplicationClientProtocol#updateApplicationPriority()}} API 
> returns an empty UpdateApplicationPriorityResponse.
> But the RM updates the priority to cluster.max-priority if the given priority is 
> greater than cluster.max-priority. In this scenario, the updated priority needs 
> to be communicated back to the client rather than staying quiet while the client 
> assumes that the given priority itself was taken.
> The same scenario can also happen during application submission, but I feel that 
> when invoked explicitly via ApplicationClientProtocol#updateApplicationPriority(), 
> the response should include the updated priority. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4164) Retrospect update ApplicationPriority API return type

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4164:
-
Fix Version/s: (was: 2.8.0)
   2.9.0

> Retrospect update ApplicationPriority API return type
> -
>
> Key: YARN-4164
> URL: https://issues.apache.org/jira/browse/YARN-4164
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: 0001-YARN-4164.patch, 0002-YARN-4164.patch, 
> 0003-YARN-4164.patch, 0004-YARN-4164.patch
>
>
> Currently the {{ApplicationClientProtocol#updateApplicationPriority()}} API 
> returns an empty UpdateApplicationPriorityResponse.
> But the RM updates the priority to cluster.max-priority if the given priority is 
> greater than cluster.max-priority. In this scenario, the updated priority needs 
> to be communicated back to the client rather than staying quiet while the client 
> assumes that the given priority itself was taken.
> The same scenario can also happen during application submission, but I feel that 
> when invoked explicitly via ApplicationClientProtocol#updateApplicationPriority(), 
> the response should include the updated priority. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4329) Allow fetching exact reason as to why a submitted app is in ACCEPTED state in Fair Scheduler

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799661#comment-15799661
 ] 

Junping Du commented on YARN-4329:
--

This patch goes to branch-2 only instead of branch-2.8; setting 2.9 as the fix version.

> Allow fetching exact reason as to why a submitted app is in ACCEPTED state in 
> Fair Scheduler
> 
>
> Key: YARN-4329
> URL: https://issues.apache.org/jira/browse/YARN-4329
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Yufei Gu
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: Screen Shot 2016-10-18 at 3.13.59 PM.png, 
> YARN-4329.001.patch, YARN-4329.002.patch, YARN-4329.003.patch, 
> YARN-4329.004.patch, YARN-4329.005.patch
>
>
> Similar to YARN-3946, it would be useful to capture the possible reason why the 
> Application is in the ACCEPTED state in FairScheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4329) Allow fetching exact reason as to why a submitted app is in ACCEPTED state in Fair Scheduler

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4329:
-
Fix Version/s: (was: 2.8.0)
   2.9.0

> Allow fetching exact reason as to why a submitted app is in ACCEPTED state in 
> Fair Scheduler
> 
>
> Key: YARN-4329
> URL: https://issues.apache.org/jira/browse/YARN-4329
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Yufei Gu
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: Screen Shot 2016-10-18 at 3.13.59 PM.png, 
> YARN-4329.001.patch, YARN-4329.002.patch, YARN-4329.003.patch, 
> YARN-4329.004.patch, YARN-4329.005.patch
>
>
> Similar to YARN-3946, it would be useful to capture the possible reason why the 
> Application is in the ACCEPTED state in FairScheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4832) NM side resource value should get updated if change applied in RM side

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799650#comment-15799650
 ] 

Junping Du commented on YARN-4832:
--

This patch got missed in branch-2.8 and I have just backported it. [~raviprak], thanks 
for the comments here. Looks like we missed checking this part - just filed 
YARN-6055 to address that.

> NM side resource value should get updated if change applied in RM side
> --
>
> Key: YARN-4832
> URL: https://issues.apache.org/jira/browse/YARN-4832
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: YARN-4832-addendum.patch, YARN-4832-branch-2.patch, 
> YARN-4832-demo.patch, YARN-4832-v2.patch, YARN-4832-v3.patch, YARN-4832.patch
>
>
> Now, if we execute the CLI to update node (single or multiple) resources on the 
> RM side, the NM will not receive any notification. It doesn't affect resource 
> scheduling but makes the resource usage metrics reported by the NM a bit weird. 
> We should sync up the new resource between the RM and the NM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6055) ContainersMonitorImpl need be adjusted when NM resource changed.

2017-01-04 Thread Junping Du (JIRA)
Junping Du created YARN-6055:


 Summary: ContainersMonitorImpl need be adjusted when NM resource 
changed.
 Key: YARN-6055
 URL: https://issues.apache.org/jira/browse/YARN-6055
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Junping Du
Assignee: Junping Du


Per Ravi's comments in YARN-4832, we need to check some limits in 
ContainersMonitorImpl to make sure they also get updated when the NM resource is updated.
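As a rough sketch of what "adjusted" could mean here (hypothetical field and method names, not ContainersMonitorImpl internals):

{code}
// Illustrative only: recompute NM-level monitoring ceilings when the node's
// resource is updated, so the containers monitor follows the new value.
public class NodeResourceLimits {
  private long maxPmemAllottedForContainersBytes;
  private long maxVmemAllottedForContainersBytes;

  public synchronized void onNodeResourceUpdate(long newPmemBytes, float vmemPmemRatio) {
    this.maxPmemAllottedForContainersBytes = newPmemBytes;
    this.maxVmemAllottedForContainersBytes = (long) (vmemPmemRatio * newPmemBytes);
  }

  public synchronized long getMaxPmemBytes() {
    return maxPmemAllottedForContainersBytes;
  }

  public synchronized long getMaxVmemBytes() {
    return maxVmemAllottedForContainersBytes;
  }
}
{code}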



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5864) YARN Capacity Scheduler - Queue Priorities

2017-01-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799628#comment-15799628
 ] 

Wangda Tan commented on YARN-5864:
--

Thanks [~eepayne] for reviewing the design doc.

bq. My understanding is that containers on under-utilized queues won't be 
preempted unless a higher priority queue is asking.
That is true, but it is not all that phases I/II cover. 

Here's an example of phases I/II: 

Queues A/B/C/D/E have priorities A > B = C > D > E.

Assume A is under-utilized and has a pending ask, B/C are over-utilized, and D/E 
are under-utilized without a pending ask.

To satisfy the request of A:
- For phase I, we will first try to preempt from B/C since they're over-utilized 
(even though they have higher priority compared to D/E). If the allocated 
resources of B/C are not enough, or the locality doesn't match ..
- In phase II, we will continue preempting resources from D/E.

Hope this answers your question. 
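To make the two phases concrete, here is a minimal sketch of the candidate ordering (the QueueInfo type and the sort keys within each phase are assumptions for illustration, not scheduler code):

{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch of the two-phase candidate ordering described above.
public final class TwoPhasePreemptionOrder {

  public static final class QueueInfo {
    final String name;
    final int priority;        // higher value = higher priority
    final double usedCapacity; // relative to guaranteed capacity, 1.0 = at guarantee

    public QueueInfo(String name, int priority, double usedCapacity) {
      this.name = name;
      this.priority = priority;
      this.usedCapacity = usedCapacity;
    }
  }

  private TwoPhasePreemptionOrder() {
  }

  // Phase I candidates: over-utilized queues, most over-utilized first.
  public static List<QueueInfo> phaseOne(List<QueueInfo> queues) {
    List<QueueInfo> result = new ArrayList<>();
    for (QueueInfo q : queues) {
      if (q.usedCapacity > 1.0) {
        result.add(q);
      }
    }
    result.sort(Comparator.comparingDouble((QueueInfo q) -> q.usedCapacity).reversed());
    return result;
  }

  // Phase II candidates: under-utilized queues, lowest priority first.
  public static List<QueueInfo> phaseTwo(List<QueueInfo> queues) {
    List<QueueInfo> result = new ArrayList<>();
    for (QueueInfo q : queues) {
      if (q.usedCapacity <= 1.0) {
        result.add(q);
      }
    }
    result.sort(Comparator.comparingInt((QueueInfo q) -> q.priority));
    return result;
  }
}
{code}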

> YARN Capacity Scheduler - Queue Priorities
> --
>
> Key: YARN-5864
> URL: https://issues.apache.org/jira/browse/YARN-5864
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-5864.poc-0.patch, 
> YARN-CapacityScheduler-Queue-Priorities-design-v1.pdf
>
>
> Currently, Capacity Scheduler at every parent-queue level uses relative 
> used-capacities of the child-queues to decide which queue can get the next 
> available resource first.
> For example,
> - Q1 & Q2 are child queues under queueA
> - Q1 has 20% of configured capacity, 5% of used-capacity and
> - Q2 has 80% of configured capacity, 8% of used-capacity.
> In the situation, the relative used-capacities are calculated as below
> - Relative used-capacity of Q1 is 5/20 = 0.25
> - Relative used-capacity of Q2 is 8/80 = 0.10
> In the above example, per today’s Capacity Scheduler’s algorithm, Q2 is 
> selected by the scheduler first to receive next available resource.
> Simply ordering queues according to relative used-capacities sometimes causes 
> a few problems because scarce resources could be assigned to less-important 
> apps first.
> # Latency sensitivity: This can be a problem with latency sensitive 
> applications where waiting till the ‘other’ queue gets full is not going to 
> cut it. The delay in scheduling directly reflects in the response times of 
> these applications.
> # Resource fragmentation for large-container apps: Today’s algorithm also 
> causes issues with applications that need very large containers. It is 
> possible that existing queues are all within their resource guarantees but 
> their current allocation distribution on each node may be such that an 
> application which needs large container simply cannot fit on those nodes.
> Services:
> # The above problem (2) gets worse with long running applications. With short 
> running apps, previous containers may eventually finish and make enough space 
> for the apps with large containers. But with long running services in the 
> cluster, the large containers’ application may never get resources on any 
> nodes even if its demands are not yet met.
> # Long running services are sometimes more picky w.r.t placement than normal 
> batch apps. For example, for a long running service in a separate queue (say 
> queue=service), during peak hours it may want to launch instances on 50% of 
> the cluster nodes. On each node, it may want to launch a large container, say 
> 200G memory per container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5864) YARN Capacity Scheduler - Queue Priorities

2017-01-04 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799610#comment-15799610
 ] 

Eric Payne commented on YARN-5864:
--

Thanks [~leftnoteasy] for the design doc. It makes sense, and I just have one 
comment:
{quote}
- Phase II: If the first phase doesn't yield enough resources, we proceed to the 
second phase where we look at under-utilized queues too.
-- Sort queues similar to the first phase, and continue reclamation from 
under-utilized queues.
{quote}
My understanding is that containers on under-utilized queues won't be 
preempted unless a higher priority queue is asking. Can you please clarify that 
in this section?


> YARN Capacity Scheduler - Queue Priorities
> --
>
> Key: YARN-5864
> URL: https://issues.apache.org/jira/browse/YARN-5864
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-5864.poc-0.patch, 
> YARN-CapacityScheduler-Queue-Priorities-design-v1.pdf
>
>
> Currently, Capacity Scheduler at every parent-queue level uses relative 
> used-capacities of the child-queues to decide which queue can get the next 
> available resource first.
> For example,
> - Q1 & Q2 are child queues under queueA
> - Q1 has 20% of configured capacity, 5% of used-capacity and
> - Q2 has 80% of configured capacity, 8% of used-capacity.
> In the situation, the relative used-capacities are calculated as below
> - Relative used-capacity of Q1 is 5/20 = 0.25
> - Relative used-capacity of Q2 is 8/80 = 0.10
> In the above example, per today’s Capacity Scheduler’s algorithm, Q2 is 
> selected by the scheduler first to receive next available resource.
> Simply ordering queues according to relative used-capacities sometimes causes 
> a few problems because scarce resources could be assigned to less-important 
> apps first.
> # Latency sensitivity: This can be a problem with latency sensitive 
> applications where waiting till the ‘other’ queue gets full is not going to 
> cut it. The delay in scheduling directly reflects in the response times of 
> these applications.
> # Resource fragmentation for large-container apps: Today’s algorithm also 
> causes issues with applications that need very large containers. It is 
> possible that existing queues are all within their resource guarantees but 
> their current allocation distribution on each node may be such that an 
> application which needs large container simply cannot fit on those nodes.
> Services:
> # The above problem (2) gets worse with long running applications. With short 
> running apps, previous containers may eventually finish and make enough space 
> for the apps with large containers. But with long running services in the 
> cluster, the large containers’ application may never get resources on any 
> nodes even if its demands are not yet met.
> # Long running services are sometimes more picky w.r.t placement than normal 
> batch apps. For example, for a long running service in a separate queue (say 
> queue=service), during peak hours it may want to launch instances on 50% of 
> the cluster nodes. On each node, it may want to launch a large container, say 
> 200G memory per container.
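As a worked example of the relative used-capacity ordering described above (a sketch only, not CapacityScheduler code):

{code}
// Relative used-capacity = used-capacity / configured-capacity; the queue with
// the lower value is offered the next available resource first.
public final class RelativeUsedCapacityExample {

  private RelativeUsedCapacityExample() {
  }

  static double relativeUsedCapacity(double usedPercent, double configuredPercent) {
    return usedPercent / configuredPercent;
  }

  public static void main(String[] args) {
    double q1 = relativeUsedCapacity(5, 20);  // 0.25
    double q2 = relativeUsedCapacity(8, 80);  // 0.10
    System.out.println("Q1=" + q1 + " Q2=" + q2 + " -> "
        + (q2 < q1 ? "Q2 is scheduled first" : "Q1 is scheduled first"));
  }
}
{code}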



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6040) Remove usage of ResourceRequest from AppSchedulerInfo / SchedulerApplicationAttempt

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799601#comment-15799601
 ] 

Hadoop QA commented on YARN-6040:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 26 new + 947 unchanged - 20 fixed = 973 total (was 967) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 38m 52s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 45s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6040 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845628/YARN-6040.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux c908ef19e28d 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / a0a2761 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14552/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/14552/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14552/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| 

[jira] [Updated] (YARN-3866) AM-RM protocol changes to support container resizing

2017-01-04 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3866:
-
Target Version/s: 2.9.0  (was: 2.8.0)

> AM-RM protocol changes to support container resizing
> 
>
> Key: YARN-3866
> URL: https://issues.apache.org/jira/browse/YARN-3866
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api
>Reporter: MENG DING
>Assignee: MENG DING
>Priority: Blocker
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: YARN-3866-YARN-1197.4.patch, 
> YARN-3866-branch-2.8.addendum.1.patch, YARN-3866-branch-2.8.addendum.1.patch, 
> YARN-3866-branch-2.8.addendum.2.patch, YARN-3866-branch-2.8.addendum.3.patch, 
> YARN-3866-branch-2.8.addendum.4.patch, YARN-3866-branch-2.addendum.4.patch, 
> YARN-3866.1.patch, YARN-3866.2.patch, YARN-3866.3.patch
>
>
> YARN-1447 and YARN-1448 are outdated. 
> This ticket deals with AM-RM Protocol changes to support container resize 
> according to the latest design in YARN-1197.
> 1) Add increase/decrease requests in AllocateRequest
> 2) Get approved increase/decrease requests from RM in AllocateResponse
> 3) Add relevant test cases



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-01-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799582#comment-15799582
 ] 

Wangda Tan commented on YARN-6050:
--

[~templedf], it is not only for the AM; all existing ResourceRequest handling treats 
requests with resourceName=* separately. Please file a ticket to track the 
documentation / javadoc changes.

I think we need some additional checks / logic:
- First, for the AM request list, we need to make sure it has a valid offswitch 
request.
- Second, as mentioned by Daniel, it's better to check the asks' resources and 
node-label-expression; if a non-offswitch request has a different 
resourceName/node-label-expression, we should set it to the same as the offswitch 
request.
- Third, I think we still need to force the priority and #asks of the AM request; 
there could be some existing logic that assumes the AM request has a special priority.

It's better to add a test to make sure the check/validation/normalization in 
RMAppManager works as expected.

Thoughts? 

> AMs can't be scheduled on racks or nodes
> 
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there 
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, 
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or directly from 
> {{getAMContainerResourceRequest}}, depending on if 
> {{getAMContainerResourceRequest}} is null or not.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which takes a list of 
> {{ResourceRequest}} so that clients can specify the multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6022) Revert changes of AbstractResourceRequest

2017-01-04 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799573#comment-15799573
 ] 

Daniel Templeton commented on YARN-6022:


Fair point.  Fine, let's just get this in.  +1  I'll commit tomorrow if no one 
else pops up with any concerns.

> Revert changes of AbstractResourceRequest
> -
>
> Key: YARN-6022
> URL: https://issues.apache.org/jira/browse/YARN-6022
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-6022.001.patch, YARN-6022.002.patch, 
> YARN-6022.003.patch
>
>
> YARN-5774 added AbstractResourceRequest to make internal scheduler changes 
> easier; this is not a correct approach: for example, with this change, we 
> need to make AbstractResourceRequest public/stable, and end users can 
> use it like:
> {code}
> AbstractResourceRequest request = ...
> request.setCapability(...)
> {code}
> But AbstractResourceRequest should not be visible to applications at all. 
> We need to revert it from branch-2.8 / branch-2 / trunk. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5136) Error in handling event type APP_ATTEMPT_REMOVED to the scheduler

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799558#comment-15799558
 ] 

Junping Du commented on YARN-5136:
--

This patch goes to branch-2 only instead of branch-2.8; setting 2.9 as the fix version.

> Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
> -
>
> Key: YARN-5136
> URL: https://issues.apache.org/jira/browse/YARN-5136
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: tangshangwen
>Assignee: Wilfred Spiegelenburg
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5136.1.patch, YARN-5136.2.patch
>
>
> Moving an app causes the RM to exit.
> {noformat}
> 2016-05-24 23:20:47,202 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type APP_ATTEMPT_REMOVED to the scheduler
> java.lang.IllegalStateException: Given app to remove 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt@ea94c3b
>  does not exist in queue [root.bdp_xx.bdp_mart_xx_formal, 
> demand=, running= vCores:13422>, share=, w= weight=1.0>]
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.removeApp(FSLeafQueue.java:119)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:779)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1231)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:114)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:680)
> at java.lang.Thread.run(Thread.java:745)
> 2016-05-24 23:20:47,202 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e04_1464073905025_15410_01_001759 Container Transitioned from 
> ACQUIRED to RELEASED
> 2016-05-24 23:20:47,202 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5136) Error in handling event type APP_ATTEMPT_REMOVED to the scheduler

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5136:
-
Fix Version/s: (was: 2.8.0)
   2.9.0

> Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
> -
>
> Key: YARN-5136
> URL: https://issues.apache.org/jira/browse/YARN-5136
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: tangshangwen
>Assignee: Wilfred Spiegelenburg
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5136.1.patch, YARN-5136.2.patch
>
>
> Moving an app causes the RM to exit
> {noformat}
> 2016-05-24 23:20:47,202 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type APP_ATTEMPT_REMOVED to the scheduler
> java.lang.IllegalStateException: Given app to remove 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt@ea94c3b
>  does not exist in queue [root.bdp_xx.bdp_mart_xx_formal, 
> demand=, running= vCores:13422>, share=, w= weight=1.0>]
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.removeApp(FSLeafQueue.java:119)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:779)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1231)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:114)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:680)
> at java.lang.Thread.run(Thread.java:745)
> 2016-05-24 23:20:47,202 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e04_1464073905025_15410_01_001759 Container Transitioned from 
> ACQUIRED to RELEASED
> 2016-05-24 23:20:47,202 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5176) More test cases for queuing of containers at the NM

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799555#comment-15799555
 ] 

Junping Du commented on YARN-5176:
--

This patch goes to branch-2 only instead of branch-2.8, set 2.9 as fix version.

> More test cases for queuing of containers at the NM
> ---
>
> Key: YARN-5176
> URL: https://issues.apache.org/jira/browse/YARN-5176
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: YARN-5176.001.patch, YARN-5176.002.patch, 
> YARN-5176.003.patch
>
>
> Extending {{TestQueuingContainerManagerImpl}} to include more test cases for 
> the queuing of containers at the NM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5176) More test cases for queuing of containers at the NM

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5176:
-
Fix Version/s: (was: 2.8.0)
   2.9.0

> More test cases for queuing of containers at the NM
> ---
>
> Key: YARN-5176
> URL: https://issues.apache.org/jira/browse/YARN-5176
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: YARN-5176.001.patch, YARN-5176.002.patch, 
> YARN-5176.003.patch
>
>
> Extending {{TestQueuingContainerManagerImpl}} to include more test cases for 
> the queuing of containers at the NM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5337) Fix OOM issue in DistributedShell. AM failed with "java.lang.OutOfMemoryError: GC overhead limit exceeded"

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5337:
-
Summary: Fix OOM issue in DistributedShell. AM failed with 
"java.lang.OutOfMemoryError: GC overhead limit exceeded"  (was: Dshell AM 
failed with "java.lang.OutOfMemoryError: GC overhead limit exceeded")

> Fix OOM issue in DistributedShell. AM failed with 
> "java.lang.OutOfMemoryError: GC overhead limit exceeded"
> --
>
> Key: YARN-5337
> URL: https://issues.apache.org/jira/browse/YARN-5337
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sumana Sathish
>Assignee: Jian He
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: YARN-5337.1.patch
>
>
> Please find AM logs with the exception
> {code} 
> INFO distributedshell.ApplicationMaster: Container completed successfully., 
> containerId=container_e49_1467633982200_0001_01_04
> Exception in thread "AMRM Callback Handler Thread" 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>   at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:312)
> Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
>   at java.lang.Object.clone(Native Method)
>   at java.lang.reflect.Method.getParameterTypes(Method.java:264)
>   at 
> java.lang.reflect.Executable.getGenericParameterTypes(Executable.java:285)
>   at java.lang.reflect.Method.getGenericParameterTypes(Method.java:283)
>   at 
> org.codehaus.jackson.map.introspect.AnnotatedMethod.getParameterTypes(AnnotatedMethod.java:143)
>   at 
> org.codehaus.jackson.map.introspect.AnnotatedMethod.getParameterCount(AnnotatedMethod.java:139)
>   at 
> org.codehaus.jackson.map.introspect.POJOPropertiesCollector._addMethods(POJOPropertiesCollector.java:427)
>   at 
> org.codehaus.jackson.map.introspect.POJOPropertiesCollector.collect(POJOPropertiesCollector.java:219)
>   at 
> org.codehaus.jackson.map.introspect.BasicClassIntrospector.collectProperties(BasicClassIntrospector.java:160)
>   at 
> org.codehaus.jackson.map.introspect.BasicClassIntrospector.forSerialization(BasicClassIntrospector.java:96)
>   at 
> org.codehaus.jackson.map.introspect.BasicClassIntrospector.forSerialization(BasicClassIntrospector.java:16)
>   at 
> org.codehaus.jackson.map.SerializationConfig.introspect(SerializationConfig.java:973)
>   at 
> org.codehaus.jackson.map.ser.BeanSerializerFactory.createSerializer(BeanSerializerFactory.java:251)
>   at 
> org.codehaus.jackson.map.ser.StdSerializerProvider._createUntypedSerializer(StdSerializerProvider.java:782)
>   at 
> org.codehaus.jackson.map.ser.StdSerializerProvider._createAndCacheUntypedSerializer(StdSerializerProvider.java:735)
>   at 
> org.codehaus.jackson.map.ser.StdSerializerProvider.findValueSerializer(StdSerializerProvider.java:344)
>   at 
> org.codehaus.jackson.map.ser.impl.PropertySerializerMap.findAndAddSerializer(PropertySerializerMap.java:39)
>   at 
> org.codehaus.jackson.map.ser.std.MapSerializer._findAndAddDynamic(MapSerializer.java:403)
>   at 
> org.codehaus.jackson.map.ser.std.MapSerializer.serializeFields(MapSerializer.java:257)
>   at 
> org.codehaus.jackson.map.ser.std.MapSerializer.serialize(MapSerializer.java:186)
>   at 
> org.codehaus.jackson.map.ser.std.MapSerializer.serialize(MapSerializer.java:23)
>   at 
> org.codehaus.jackson.map.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:446)
>   at 
> org.codehaus.jackson.map.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:150)
>   at 
> org.codehaus.jackson.map.ser.BeanSerializer.serialize(BeanSerializer.java:112)
>   at 
> org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:122)
>   at 
> org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:71)
>   at 
> org.codehaus.jackson.map.ser.std.AsArraySerializerBase.serialize(AsArraySerializerBase.java:86)
>   at 
> org.codehaus.jackson.map.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:446)
>   at 
> org.codehaus.jackson.map.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:150)
>   at 
> org.codehaus.jackson.map.ser.BeanSerializer.serialize(BeanSerializer.java:112)
>   at 
> org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:122)
>   at 
> 

[jira] [Updated] (YARN-5840) Yarn queues not being tracked correctly by Yarn Timeline

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5840:
-
Fix Version/s: (was: 3.0.0-alpha1)
   (was: 2.8.0)

>  Yarn queues not being tracked correctly by Yarn Timeline
> -
>
> Key: YARN-5840
> URL: https://issues.apache.org/jira/browse/YARN-5840
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: ramtin
>Assignee: Weiwei Yang
>
> When Yarn sub-queues are created and users/groups are mapped to them, the 
> Yarn client captures the correct queue for a Job when it runs, but if you go 
> to the Yarn Timeline Server to see these jobs, they all get tagged to the 
> "default" queue.
> This makes it hard to easily map cluster consumption to the different 
> departments that the users belong to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5851) TestContainerManagerSecurity testContainerManager[1] failed

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799540#comment-15799540
 ] 

Junping Du commented on YARN-5851:
--

This patch goes to branch-2 only instead of branch-2.8, set 2.9 as fix version.

> TestContainerManagerSecurity testContainerManager[1] failed 
> 
>
> Key: YARN-5851
> URL: https://issues.apache.org/jira/browse/YARN-5851
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: unittest
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: yarn5851.001.patch
>
>
> ---
> Test set: org.apache.hadoop.yarn.server.TestContainerManagerSecurity
> ---
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 21.727 sec 
> <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity
> testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity)
>   Time elapsed: 0.005 sec  <<< ERROR!
> java.lang.IllegalArgumentException: Can't get Kerberos realm
>   at sun.security.krb5.Config.getDefaultRealm(Config.java:1029)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.security.authentication.util.KerberosUtil.getDefaultRealm(KerberosUtil.java:88)
>   at 
> org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:63)
>   at 
> org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:291)
>   at 
> org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:337)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.(TestContainerManagerSecurity.java:151)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5851) TestContainerManagerSecurity testContainerManager[1] failed

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5851:
-
Fix Version/s: (was: 2.8.0)
   2.9.0

> TestContainerManagerSecurity testContainerManager[1] failed 
> 
>
> Key: YARN-5851
> URL: https://issues.apache.org/jira/browse/YARN-5851
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: unittest
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: yarn5851.001.patch
>
>
> ---
> Test set: org.apache.hadoop.yarn.server.TestContainerManagerSecurity
> ---
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 21.727 sec 
> <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity
> testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity)
>   Time elapsed: 0.005 sec  <<< ERROR!
> java.lang.IllegalArgumentException: Can't get Kerberos realm
>   at sun.security.krb5.Config.getDefaultRealm(Config.java:1029)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.security.authentication.util.KerberosUtil.getDefaultRealm(KerberosUtil.java:88)
>   at 
> org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:63)
>   at 
> org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:291)
>   at 
> org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:337)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.(TestContainerManagerSecurity.java:151)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6022) Revert changes of AbstractResourceRequest

2017-01-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799539#comment-15799539
 ] 

Wangda Tan commented on YARN-6022:
--

[~templedf], I'm afraid not, since ResourceRequest doesn't have access to the 
scheduler Java project. Another approach is to use the Resource.set(...) method 
to update fields of Resource in place; however, when the Resource is a 
read-only instance (like Resources.NONE), it throws a runtime exception and 
brings down the RM. So I think we may need to keep it like this.
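To illustrate the hazard, a purely self-contained sketch (not the actual Hadoop classes): a shared, read-only resource instance that rejects mutation means any code path that updates a Resource in place can fail at runtime if it is ever handed that instance.

{code}
public class ReadOnlyResourceSketch {

  static class SimpleResource {
    private long memory;
    private int vcores;

    SimpleResource(long memory, int vcores) {
      this.memory = memory;
      this.vcores = vcores;
    }

    void setMemory(long memory) { this.memory = memory; }

    @Override
    public String toString() { return "<memory:" + memory + ", vCores:" + vcores + ">"; }
  }

  /** Shared "NONE" instance that rejects in-place updates, like a read-only Resource. */
  static final SimpleResource NONE = new SimpleResource(0, 0) {
    @Override
    void setMemory(long memory) {
      throw new UnsupportedOperationException("read-only resource");
    }
  };

  public static void main(String[] args) {
    SimpleResource pending = NONE;   // e.g. an empty/placeholder request
    pending.setMemory(2048);         // throws at runtime -> would bring down the caller
  }
}
{code}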

> Revert changes of AbstractResourceRequest
> -
>
> Key: YARN-6022
> URL: https://issues.apache.org/jira/browse/YARN-6022
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-6022.001.patch, YARN-6022.002.patch, 
> YARN-6022.003.patch
>
>
> YARN-5774 added AbstractResourceRequest to make internal scheduler changes 
> easier, but this is not a correct approach: for example, with this change, we 
> need to make AbstractResourceRequest public/stable, and end users could use 
> it like:
> {code}
> AbstractResourceRequest request = ...
> request.setCapability(...)
> {code}
> But AbstractResourceRequest should not be visible to applications at all. 
> We need to revert it from branch-2.8 / branch-2 / trunk. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5890) FairScheduler should log information about AM-resource-usage and max-AM-share for queues

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799530#comment-15799530
 ] 

Junping Du commented on YARN-5890:
--

This patch goes to branch-2 only instead of 2.8, set 2.9 as fix version.

> FairScheduler should log information about AM-resource-usage and max-AM-share 
> for queues
> 
>
> Key: YARN-5890
> URL: https://issues.apache.org/jira/browse/YARN-5890
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5890.001.patch, YARN-5890.002.patch, 
> YARN-5890.003.patch, YARN-5890.004.patch, YARN-5890.005.patch, 
> YARN-5890.branch-2.001.patch
>
>
> There are several cases where jobs in a queue are stuck, likely because of 
> maxAMShare. It is hard to debug these issues without any information.
> At the very least, we need to log both AM-resource-usage and max-AM-share for 
> queues. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5890) FairScheduler should log information about AM-resource-usage and max-AM-share for queues

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5890:
-
Fix Version/s: (was: 2.8.0)
   2.9.0

> FairScheduler should log information about AM-resource-usage and max-AM-share 
> for queues
> 
>
> Key: YARN-5890
> URL: https://issues.apache.org/jira/browse/YARN-5890
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5890.001.patch, YARN-5890.002.patch, 
> YARN-5890.003.patch, YARN-5890.004.patch, YARN-5890.005.patch, 
> YARN-5890.branch-2.001.patch
>
>
> There are several cases where jobs in a queue are stuck, likely because of 
> maxAMShare. It is hard to debug these issues without any information.
> At the very least, we need to log both AM-resource-usage and max-AM-share for 
> queues. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5901) Fix race condition in TestGetGroups beforeclass setup()

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5901:
-
Fix Version/s: (was: 2.8.0)
   2.9.0

> Fix race condition in TestGetGroups beforeclass setup()
> ---
>
> Key: YARN-5901
> URL: https://issues.apache.org/jira/browse/YARN-5901
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: unittest
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5901-branch-2.03.patch, YARN-5901.02.patch, 
> YARN-5901.03.patch, yarn5901.001.patch
>
>
> In TestGetGroups, the class-level setup method spins up, in a child thread, a 
> resource manager that Yarn clients can talk to. But it checks whether the 
> resource manager is fully started by doing resourcemanager.getServiceState() 
> == STATE.STARTED. This is not reliable since resourcemanager.start() will 
> first trigger the service state change in the RM, and then start up all the 
> services added to the RM. We need to wait for the RM to fully start before 
> YARN clients can send requests. Otherwise, the tests can fail with a 
> "connection refused" exception if the main thread sends out client requests 
> to the RM before the RPC server has fired up in the child thread.
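A minimal sketch of the kind of wait the description calls for, written against plain JDK sockets rather than any Hadoop test utility: poll until the RM's client RPC port actually accepts connections (or a deadline passes) before sending client requests. The host and port below are placeholders.

{code}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public final class WaitForRpcServer {

  /** Returns once a TCP connect to host:port succeeds, or throws after timeoutMs. */
  static void waitForServer(String host, int port, long timeoutMs)
      throws InterruptedException, IOException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
      try (Socket socket = new Socket()) {
        socket.connect(new InetSocketAddress(host, port), 1000);
        return;                       // server is up and accepting connections
      } catch (IOException e) {
        if (System.currentTimeMillis() > deadline) {
          throw new IOException("server did not come up within " + timeoutMs + " ms", e);
        }
        Thread.sleep(100);            // back off briefly and retry
      }
    }
  }

  public static void main(String[] args) throws Exception {
    waitForServer("localhost", 8032, 30_000);   // placeholder host/port
  }
}
{code}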



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5929) Missing scheduling policy in the FS queue metric.

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799525#comment-15799525
 ] 

Junping Du commented on YARN-5929:
--

This patch goes to branch-2 only instead of 2.8, set 2.9 as fix version.

> Missing scheduling policy in the FS queue metric. 
> --
>
> Key: YARN-5929
> URL: https://issues.apache.org/jira/browse/YARN-5929
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5929.001.patch, YARN-5929.002.patch, 
> YARN-5929.003.patch, YARN-5929.004.patch, YARN-5929.005.patch, 
> YARN-5929.006.patch
>
>
> It should have been there since YARN-4878, but it isn't. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5901) Fix race condition in TestGetGroups beforeclass setup()

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799528#comment-15799528
 ] 

Junping Du commented on YARN-5901:
--

This patch goes to branch-2 only instead of 2.8, set 2.9 as fix version.

> Fix race condition in TestGetGroups beforeclass setup()
> ---
>
> Key: YARN-5901
> URL: https://issues.apache.org/jira/browse/YARN-5901
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: unittest
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: YARN-5901-branch-2.03.patch, YARN-5901.02.patch, 
> YARN-5901.03.patch, yarn5901.001.patch
>
>
> In TestGetGroups, the class-level setup method spins up, in a child thread, a 
> resource manager that Yarn clients can talk to. But it checks whether the 
> resource manager is fully started by doing resourcemanager.getServiceState() 
> == STATE.STARTED. This is not reliable since resourcemanager.start() will 
> first trigger the service state change in the RM, and then start up all the 
> services added to the RM. We need to wait for the RM to fully start before 
> YARN clients can send requests. Otherwise, the tests can fail with a 
> "connection refused" exception if the main thread sends out client requests 
> to the RM before the RPC server has fired up in the child thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5929) Missing scheduling policy in the FS queue metric.

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5929:
-
Fix Version/s: (was: 2.8.0)
   2.9.0

> Missing scheduling policy in the FS queue metric. 
> --
>
> Key: YARN-5929
> URL: https://issues.apache.org/jira/browse/YARN-5929
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5929.001.patch, YARN-5929.002.patch, 
> YARN-5929.003.patch, YARN-5929.004.patch, YARN-5929.005.patch, 
> YARN-5929.006.patch
>
>
> It should have been there since YARN-4878, but it isn't. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5942) "Overridden" is misspelled as "overriden" in FairScheduler.md

2017-01-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799523#comment-15799523
 ] 

Junping Du commented on YARN-5942:
--

This patch goes to branch-2 only instead of 2.8, set 2.9 as fix version.

> "Overridden" is misspelled as "overriden" in FairScheduler.md
> -
>
> Key: YARN-5942
> URL: https://issues.apache.org/jira/browse/YARN-5942
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: site
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Heather Sutherland
>Priority: Trivial
>  Labels: newbie
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: FairScheduler.md, YARN-5942.001.patch
>
>
> {noformat}
> % grep -i overriden 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/FairScheduler.md
> * **A queueMaxAppsDefault element**: which sets the default running app limit 
> for queues; overriden by maxRunningApps element in each queue.
> * **A queueMaxResourcesDefault element**: which sets the default max resource 
> limit for queue; overriden by maxResources element in each queue.
> * **A queueMaxAMShareDefault element**: which sets the default AM resource 
> limit for queue; overriden by maxAMShare element in each queue.
> * **A defaultQueueSchedulingPolicy element**: which sets the default 
> scheduling policy for queues; overriden by the schedulingPolicy element in 
> each queue if specified. Defaults to "fair".
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5942) "Overridden" is misspelled as "overriden" in FairScheduler.md

2017-01-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5942:
-
Fix Version/s: (was: 2.8.0)
   2.9.0

> "Overridden" is misspelled as "overriden" in FairScheduler.md
> -
>
> Key: YARN-5942
> URL: https://issues.apache.org/jira/browse/YARN-5942
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: site
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Heather Sutherland
>Priority: Trivial
>  Labels: newbie
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: FairScheduler.md, YARN-5942.001.patch
>
>
> {noformat}
> % grep -i overriden 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/FairScheduler.md
> * **A queueMaxAppsDefault element**: which sets the default running app limit 
> for queues; overriden by maxRunningApps element in each queue.
> * **A queueMaxResourcesDefault element**: which sets the default max resource 
> limit for queue; overriden by maxResources element in each queue.
> * **A queueMaxAMShareDefault element**: which sets the default AM resource 
> limit for queue; overriden by maxAMShare element in each queue.
> * **A defaultQueueSchedulingPolicy element**: which sets the default 
> scheduling policy for queues; overriden by the schedulingPolicy element in 
> each queue if specified. Defaults to "fair".
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799512#comment-15799512
 ] 

Wangda Tan commented on YARN-5959:
--

Thanks [~asuresh] for updating the patch; it generally looks good. Some 
additional minor suggestions:

1) Can we make scheduler.allocate take a single ContainerUpdateRequests object, 
which includes the 4 lists (increase/decrease/promotion/demotion), instead of 
making all of them top-level parameters? Similarly for Allocation (see the 
sketch after this list).

2) AMS: can we move the 4 invocations of {{addToUpdatedContainers}} and the 1 
{{addToUpdateContainerErrors}} into a separate method, considering how long the 
method is?

3) Remove unused imports of OCAAMS

4) There's a duplicated method {{createRmContainer}} in AbstractYarnScheduler 
and OCAAMS, could we:
- Move it to SchedulerUtils? 
- In addition, it will only allocate an OpportunisticContainer, so could you 
rename it to createOpportunisticRMContainer? I'm asking because it cannot 
guarantee atomicity across app and node, and I want to make sure no guaranteed 
container allocation uses this method.

5) RMContainerImpl: 
- Use volatile to avoid the synchronized lock for set/getContainer? All the 
other methods use the R/W lock. 
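A rough sketch of what suggestions 1) and 5) might look like; all class and member names here are hypothetical, not existing YARN APIs.

{code}
import java.util.Collections;
import java.util.List;

public class ContainerUpdateSketch {

  /** 1) One value object grouping the four kinds of container update requests,
      instead of four separate top-level parameters on scheduler.allocate. */
  static final class ContainerUpdateRequests<T> {
    private final List<T> increaseRequests;
    private final List<T> decreaseRequests;
    private final List<T> promotionRequests;
    private final List<T> demotionRequests;

    ContainerUpdateRequests(List<T> increase, List<T> decrease,
                            List<T> promotion, List<T> demotion) {
      this.increaseRequests = increase;
      this.decreaseRequests = decrease;
      this.promotionRequests = promotion;
      this.demotionRequests = demotion;
    }

    List<T> getIncreaseRequests() { return Collections.unmodifiableList(increaseRequests); }
    List<T> getDecreaseRequests() { return Collections.unmodifiableList(decreaseRequests); }
    List<T> getPromotionRequests() { return Collections.unmodifiableList(promotionRequests); }
    List<T> getDemotionRequests() { return Collections.unmodifiableList(demotionRequests); }
  }

  /** 5) A volatile reference is enough for a plain get/set pair: it gives safe
      publication without taking the lock the other methods already hold. */
  static final class ContainerHolder<C> {
    private volatile C container;

    C getContainer() { return container; }

    void setContainer(C container) { this.container = container; }
  }
}
{code}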

> RM changes to support change of container ExecutionType
> ---
>
> Key: YARN-5959
> URL: https://issues.apache.org/jira/browse/YARN-5959
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5959-YARN-5085.001.patch, 
> YARN-5959-YARN-5085.002.patch, YARN-5959-YARN-5085.003.patch, 
> YARN-5959.combined.001.patch, YARN-5959.wip.002.patch, 
> YARN-5959.wip.003.patch, YARN-5959.wip.patch
>
>
> RM side changes to allow an AM to ask for change of ExecutionType.
> Currently, there are two cases:
> # *Promotion* : OPPORTUNISTIC to GUARANTEED.
> # *Demotion* : GUARANTEED to OPPORTUNISTIC.
> This is similar to YARN-1197, which allows for changes in Container resources. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5830) Avoid preempting AM containers

2017-01-04 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799497#comment-15799497
 ] 

Yufei Gu commented on YARN-5830:


Thanks [~kasha] for the review and suggestion. I've uploaded the new patch for 
your comments.

> Avoid preempting AM containers
> --
>
> Key: YARN-5830
> URL: https://issues.apache.org/jira/browse/YARN-5830
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
> Attachments: YARN-5830.001.patch, YARN-5830.002.patch, 
> YARN-5830.003.patch
>
>
> While considering containers for preemption, avoid AM containers unless 
> absolutely necessary. 
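Purely to illustrate the idea in the description (this is not FairScheduler code): when picking preemption victims, order the candidates so that AM containers are only considered after every non-AM container.

{code}
import java.util.Comparator;
import java.util.List;

public class PreemptionOrderSketch {

  /** Hypothetical view of a preemption candidate; only what the sketch needs. */
  interface Candidate {
    boolean isAMContainer();
  }

  /** Sorts candidates in place so non-AM containers come first (false sorts before true). */
  static void preferNonAmVictims(List<? extends Candidate> candidates) {
    candidates.sort(Comparator.comparing(Candidate::isAMContainer));
  }
}
{code}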



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5830) Avoid preempting AM containers

2017-01-04 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-5830:
---
Attachment: YARN-5830.003.patch

> Avoid preempting AM containers
> --
>
> Key: YARN-5830
> URL: https://issues.apache.org/jira/browse/YARN-5830
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
> Attachments: YARN-5830.001.patch, YARN-5830.002.patch, 
> YARN-5830.003.patch
>
>
> While considering containers for preemption, avoid AM containers unless 
> absolutely necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5554) MoveApplicationAcrossQueues does not check user permission on the target queue

2017-01-04 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799480#comment-15799480
 ] 

Daniel Templeton commented on YARN-5554:


Thanks, [~wilfreds].  Can you move that {{checkAccess()}} comment into the 
{{checkAccess()}} method, right before the _if_ it's talking about?  While 
you're at it, could you fix the language/grammar of the comment text you 
reformatted in the pre-existing {{checkAccess()}} method?

> MoveApplicationAcrossQueues does not check user permission on the target queue
> --
>
> Key: YARN-5554
> URL: https://issues.apache.org/jira/browse/YARN-5554
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.2
>Reporter: Haibo Chen
>Assignee: Wilfred Spiegelenburg
>  Labels: oct16-medium
> Attachments: YARN-5554.10.patch, YARN-5554.11.patch, 
> YARN-5554.12.patch, YARN-5554.2.patch, YARN-5554.3.patch, YARN-5554.4.patch, 
> YARN-5554.5.patch, YARN-5554.6.patch, YARN-5554.7.patch, YARN-5554.8.patch, 
> YARN-5554.9.patch
>
>
> moveApplicationAcrossQueues operation currently does not check user 
> permission on the target queue. This incorrectly allows one user to move 
> his/her own applications to a queue that the user has no access to



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-01-04 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799469#comment-15799469
 ] 

Daniel Templeton commented on YARN-6050:


I don't love that we're only taking the node label expression and capacity from 
the first resource request.  Sounds like a recipe for confusion if the user 
doesn't make all the requests the same.  Can we enforce that the requests all 
agree with each other?  Or maybe provide a utility to create a family of 
consistent requests?  We should at least document that the requests must all 
agree or that we ignore all but the first one.
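One possible shape for the utility suggested above, as a hedged sketch: a helper (the name and signature are invented here) that builds a family of AM ResourceRequests sharing the same priority, capability and node label expression, so the individual requests cannot disagree. ResourceRequest and its setters are existing YARN APIs; everything else is illustrative.

{code}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public final class ConsistentAmRequests {

  /** Creates one request per resource name (ResourceRequest.ANY plus racks/nodes)
      with identical priority, capability and node label. For a strict placement,
      relaxLocality is false at the ANY level and true at the specific levels,
      matching the pattern in the issue description. Node label handling is
      deliberately simplified here. */
  static List<ResourceRequest> newFamily(Priority priority, Resource capability,
      String nodeLabel, boolean strict, List<String> resourceNames) {
    List<ResourceRequest> requests = new ArrayList<>();
    for (String name : resourceNames) {
      boolean relaxLocality = !strict || !ResourceRequest.ANY.equals(name);
      ResourceRequest rr =
          ResourceRequest.newInstance(priority, name, capability, 1, relaxLocality);
      rr.setNodeLabelExpression(nodeLabel);
      requests.add(rr);
    }
    return requests;
  }
}
{code}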

> AMs can't be scheduled on racks or nodes
> 
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there 
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, 
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or directly from 
> {{getAMContainerResourceRequest}}, depending on if 
> {{getAMContainerResourceRequest}} is null or not.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which takes a list of 
> {{ResourceRequest}} so that clients can specify the multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6040) Remove usage of ResourceRequest from AppSchedulerInfo / SchedulerApplicationAttempt

2017-01-04 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-6040:
-
Attachment: YARN-6040.005.patch

Attached ver.5 patch, which addresses the comments.

Now added an API to get the resourceName so that the behavior in 
FSPreemptionThread is unchanged.

bq. From your comment, it seems you are planning on moving canDelayTo() method 
to a separate policy ? I was thinking the PlacementSet itself would encapsulate 
the delay scheduling policy, but we can discuss that later.
Yeah, I was thinking of making it part of SchedulingPlacementSet, but the 
implementation of the delay policy could be independent and pluggable.
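For what it's worth, a purely illustrative sketch of what a pluggable delay policy could look like; none of these names are existing YARN interfaces, and the canDelayTo() shape is only borrowed from the discussion above.

{code}
public interface DelaySchedulingPolicy {

  enum LocalityLevel { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

  /**
   * Decides whether an allocation at the given locality level is allowed,
   * given how many scheduling opportunities the request has already skipped.
   */
  boolean canDelayTo(LocalityLevel level, int missedOpportunities);
}
{code}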

[~asuresh], mind reviewing again?

> Remove usage of ResourceRequest from AppSchedulerInfo / 
> SchedulerApplicationAttempt
> ---
>
> Key: YARN-6040
> URL: https://issues.apache.org/jira/browse/YARN-6040
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-6040.001.patch, YARN-6040.002.patch, 
> YARN-6040.003.patch, YARN-6040.004.patch, YARN-6040.005.patch
>
>
> As mentioned by YARN-5906, the schedulers currently use ResourceRequest 
> heavily, so it will be very hard to adopt the new PowerfulResourceRequest 
> (YARN-4902).
> This JIRA is the 2nd step of the refactoring, which removes usage of 
> ResourceRequest from AppSchedulingInfo / SchedulerApplicationAttempt. Instead 
> of returning ResourceRequest, it returns a lightweight and API-independent 
> object - {{PendingAsk}}.
> The only remaining ResourceRequest API of AppSchedulingInfo will be used by 
> the web service to get the list of ResourceRequests.
> So after this patch, usage of ResourceRequest will be isolated inside 
> AppSchedulingInfo, making it more flexible to update the internal data 
> structure and upgrade the old ResourceRequest API to the new one.
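Illustrative shape only (not the actual {{PendingAsk}} class): a lightweight, API-independent value object carrying just what the scheduler needs from a pending request - the per-allocation capability and how many allocations are pending.

{code}
public final class PendingAskSketch {
  private final long memoryMb;   // per-allocation memory
  private final int vcores;      // per-allocation vcores
  private final int count;       // number of pending allocations

  public PendingAskSketch(long memoryMb, int vcores, int count) {
    this.memoryMb = memoryMb;
    this.vcores = vcores;
    this.count = count;
  }

  public long getMemoryMb() { return memoryMb; }
  public int getVcores() { return vcores; }
  public int getCount() { return count; }

  @Override
  public String toString() {
    return "<memory:" + memoryMb + ", vCores:" + vcores + "> x " + count;
  }
}
{code}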



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5724) [Umbrella] Better Queue Management in YARN

2017-01-04 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799431#comment-15799431
 ] 

Naganarasimha G R commented on YARN-5724:
-

I have added some 
[comments|https://issues.apache.org/jira/browse/YARN-5556?focusedCommentId=15799398=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15799398]
 for the doc in YARN-5556 related to deletion of the queue; we can discuss more 
there.

> [Umbrella] Better Queue Management in YARN
> --
>
> Key: YARN-5724
> URL: https://issues.apache.org/jira/browse/YARN-5724
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: capacity scheduler
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: Designdocv1-Configuration-basedQueueManagementinYARN.pdf
>
>
> This serves as an umbrella ticket for tasks related to better queue 
> management in YARN.
> Today the only way to manage queues is for admins to edit configuration 
> files and then issue a refresh command. This brings many 
> inconveniences. For example, users cannot create / delete / modify their 
> own queues without talking to site-level admins.
> Even in today's (configuration-based) approach, we still have several places 
> that need improvement: 
> *  It is possible today to add or modify queues without restarting the RM, 
> via a CS refresh. But to delete a queue, we have to restart the 
> ResourceManager.
> * When a queue is STOPPED, the resources allocated to the queue could be 
> handled better. Currently, they'll only be used if the other queues are set 
> up to go over their capacity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6009) RM fails to start during an upgrade - Failed to load/recover state (YarnException: Invalid application timeout, value=0 for type=LIFETIME)

2017-01-04 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799419#comment-15799419
 ] 

Daniel Templeton commented on YARN-6009:


What is the impact of allowing an app to be recovered with a 0 timeout value?  
Looks like YARN-5611 changes the timeout to be absolute rather than relative.  
Is that going to cause the recovered app to immediately expire?  Also, you may 
as well do the validation before the queue mapping.

> RM fails to start during an upgrade - Failed to load/recover state 
> (YarnException: Invalid application timeout, value=0 for type=LIFETIME)
> --
>
> Key: YARN-6009
> URL: https://issues.apache.org/jira/browse/YARN-6009
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Gour Saha
>Assignee: Rohith Sharma K S
>Priority: Critical
> Attachments: YARN-6009.01.patch
>
>
> ResourceManager fails to start during an upgrade with the following 
> exceptions - 
> Exception 1:
> {color:red}
> {code}
> 2016-12-09 14:57:23,508 INFO  capacity.CapacityScheduler 
> (CapacityScheduler.java:initScheduler(328)) - Initialized CapacityScheduler 
> with calculator=class 
> org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, 
> minimumAllocation=<>, 
> maximumAllocation=<>, asynchronousScheduling=false, 
> asyncScheduleInterval=5ms
> 2016-12-09 14:57:23,509 WARN  ha.ActiveStandbyElector 
> (ActiveStandbyElector.java:becomeActive(863)) - Exception handling the 
> winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:129)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:859)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:463)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:611)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when 
> transitioning to Active mode
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:318)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:127)
> ... 4 more
> Caused by: org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.yarn.exceptions.YarnException: Invalid application timeout, 
> value=0 for type=LIFETIME
> at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:991)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1032)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1028)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1028)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:313)
> ... 5 more
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Invalid 
> application timeout, value=0 for type=LIFETIME
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.validateApplicationTimeouts(RMServerUtils.java:305)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:365)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:330)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:463)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1184)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:594)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> ... 13 more
> {code}
> {color}
> 

[jira] [Commented] (YARN-5556) Support for deleting queues without requiring a RM restart

2017-01-04 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799398#comment-15799398
 ] 

Naganarasimha G R commented on YARN-5556:
-

Hi [~xgong] and [~wangda],
I have a few queries based on the design document attached in YARN-5724:
# So if a user needs to delete a queue (say a2), they need to remove the queue 
from its parent's "yarn.scheduler.capacity..queues" config and 
also set its state (yarn.scheduler.capacity..state) to 
{{DELETED}}, right?
# How do we delete intermediate queues? I presume we need *NOT* configure the 
state for each of its children, right? Or do we plan to support deletion of 
leaf queues only?
# Do we need to consider moving queues (along with their apps) from one queue 
hierarchy to another? IMO it complicates things, but I am not sure about the 
real-world use cases.
# In the case of HA, I think it gets further complicated: if both RMs are 
initialized with the old queue settings and the config is then updated, the CS 
is aware of the deleted queue; but if an RM starts off with the updated xml 
(with the queue deleted), the deleted queue's information is not available, and 
if failover happens to that RM, then apps running on the deleted queue cannot 
be recovered because the queue doesn't exist. So do we need to start 
maintaining the deleted queue in the state store, or handle creating queue 
objects for the queues whose state has been marked as deleted (in which case we 
need to consider the 2nd point)?
# More of a test scenario: I have a queue with apps running and I delete the 
queue, which makes it go into a drain state (or a new Deleted state, but the 
queue is not deleted until all apps under it are finished). The apps take some 
time to finish, and the xml is then updated again with a new queue that has the 
same name and path as the one deleted earlier. Do we need to support adding 
this new queue, or disallow it while the earlier queue is still being deleted?
# Do we need to consider showing the deleted queues in the web UI? Maybe in 
another jira, but the code needs to be updated. 
# For the below comment in the doc:
bq. "For the resources of the deleted/stopped queue, users should explicitly 
distribute them away to its siblings."
* While we allow running apps to complete, do we allow pending container 
requests to be catered for these apps? If so, is the deleted queue's capacity 
considered to be 0%, with the max cap as per its config?
* If the deleted queue's capacity is not explicitly redistributed to its 
siblings, I presume we need to throw an exception?
* Do we need to consider preemption of resources from these deleted queues when 
there is a shortage of resources?
* What should happen to the pending (not activated) apps in the queue: kill 
them, or give them a chance to complete like running apps?

I can port/rebase my original patch (which just goes ahead and deletes the 
queue if no apps are present), but I presume the scope has changed; once these 
points are clarified we can decide the scope of this jira and then work on it.

> Support for deleting queues without requiring a RM restart
> --
>
> Key: YARN-5556
> URL: https://issues.apache.org/jira/browse/YARN-5556
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Xuan Gong
>Assignee: Naganarasimha G R
> Attachments: YARN-5556.v1.001.patch, YARN-5556.v1.002.patch, 
> YARN-5556.v1.003.patch, YARN-5556.v1.004.patch
>
>
> Today, we can add or modify queues without restarting the RM, via a CS 
> refresh. But to delete a queue, we have to restart the ResourceManager. We 
> could support deleting queues without requiring an RM restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps

2017-01-04 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799386#comment-15799386
 ] 

Sangjin Lee commented on YARN-6027:
---

bq. At least pagination should be supported for single day flow activities.

OK, I do see your point there. A single day activity having a large number of 
flows active is not something that can be ignored, especially on a large/busy 
cluster. I'm +1 on having pagination support on a single-day result.

bq. And also I see that flow entities contains all the flow run details. Do we 
really need to embed flowruns details in flow entities? Does not it become 
heavy?

As Varun pointed out, it is not likely there will be a large number of flow 
runs for a given flow. Most likely we're dealing with one or a few runs at 
most. We could add an option of not returning flow runs if desired, but I think 
this is not as critical. Having pagination support may also reduce the need for 
this, right?

bq. Sangjin Lee, you remember why the user was kept before flow name in row 
key? To achieve user level offline aggregation?

If my memory serves right, it was mostly for consistency. A flow is uniquely 
identified as a user + flow really. In most tables (the application table might 
be an exception) we retain the order of user and flow name. Does that pose a 
challenge in implementing pagination? We could consider the fromId as (user + 
flow), right?
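A minimal sketch of how a fromId could be composed from (user + flow), purely illustrative: the separator, the no-separator-in-user assumption, and the class name are invented here, and this does not reproduce the actual timeline service row-key encoding.

{code}
public final class FlowFromId {
  private static final char SEP = '!';

  /** Joins user and flow name into a single opaque pagination id. */
  static String encode(String user, String flowName) {
    return user + SEP + flowName;
  }

  /** Returns {user, flowName}; assumes the user part contains no separator. */
  static String[] decode(String fromId) {
    int idx = fromId.indexOf(SEP);
    if (idx < 0) {
      throw new IllegalArgumentException("not a user!flow id: " + fromId);
    }
    return new String[] { fromId.substring(0, idx), fromId.substring(idx + 1) };
  }
}
{code}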

> Support fromId for flows/flowrun apps
> -
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
>
> In YARN-5585 , fromId is supported for retrieving entities. We need similar 
> filter for flows/flowRun apps and flow run and flow as well. 
> Along with supporting fromId, this JIRA should also discuss following points
> * Should we throw an exception for entities/entity retrieval if duplicates 
> found?
> * TimelieEntity :
> ** Should equals method also check for idPrefix?
> ** Does idPrefix is part of identifiers?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6054) TimelineServer fails to start when some LevelDb state files are missing.

2017-01-04 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799324#comment-15799324
 ] 

Ravi Prakash commented on YARN-6054:


Thanks to Jason's pointer 
[here|https://issues.apache.org/jira/browse/YARN-2873?focusedCommentId=14216259=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14216259]
 about repairing the LevelDb: when we tried to "repair" the LevelDb, the TS 
came up just fine.
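For reference, a minimal sketch of that offline repair step, assuming the leveldbjni factory's repair(File, Options) call and using the store path from the log below as a placeholder. The Timeline Server should be stopped first, and the directory backed up, since repair may drop unrecoverable entries.

{code}
import java.io.File;

import org.fusesource.leveldbjni.JniDBFactory;
import org.iq80.leveldb.Options;

public class RepairTimelineStore {
  public static void main(String[] args) throws Exception {
    // Placeholder path; point this at the actual leveldb-timeline-store.ldb directory.
    File storeDir = new File("/timelineserver/leveldb-timeline-store.ldb");
    JniDBFactory.factory.repair(storeDir, new Options());
    System.out.println("repair finished for " + storeDir);
  }
}
{code}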

> TimelineServer fails to start when some LevelDb state files are missing.
> 
>
> Key: YARN-6054
> URL: https://issues.apache.org/jira/browse/YARN-6054
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2
>Reporter: Ravi Prakash
>
> We encountered an issue recently where the TimelineServer failed to start 
> because some state files went missing.
> {code}
> 2016-11-21 20:46:43,134 INFO org.apache.hadoop.service.AbstractService: 
> Service 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
>  failed in state INITED
> ; cause: org.apache.hadoop.service.ServiceStateException: 
> org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 
> missing files; e.g.: /timelines
> erver/leveldb-timeline-store.ldb/127897.sst
> org.apache.hadoop.service.ServiceStateException: 
> org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 
> missing files; e.g.: /timelineserver/lev
> eldb-timeline-store.ldb/127897.sst
> 2016-11-21 20:46:43,135 FATAL 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer:
>  Error starting ApplicationHistoryServer
> org.apache.hadoop.service.ServiceStateException: 
> org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 
> missing files; e.g.: 
> /timelineserver/leveldb-timeline-store.ldb/127897.sst
> at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:104)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:172)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:182)
> Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: 
> Corruption: 9 missing files; e.g.: 
> /timelineserver/leveldb-timeline-store.ldb/127897.sst
> at 
> org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
> at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
> at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
> at 
> org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore.serviceInit(LeveldbTimelineStore.java:229)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> ... 5 more
> 2016-11-21 20:46:43,136 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status -1
> {code}
> Ideally we shouldn't have any missing state files. However I'd posit that the 
> TimelineServer should have graceful degradation instead of failing to start 
> at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6054) TimelineServer fails to start when some LevelDb state files are missing.

2017-01-04 Thread Ravi Prakash (JIRA)
Ravi Prakash created YARN-6054:
--

 Summary: TimelineServer fails to start when some LevelDb state 
files are missing.
 Key: YARN-6054
 URL: https://issues.apache.org/jira/browse/YARN-6054
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0-alpha2
Reporter: Ravi Prakash


We encountered an issue recently where the TimelineServer failed to start 
because some state files went missing.

{code}
2016-11-21 20:46:43,134 INFO org.apache.hadoop.service.AbstractService: Service 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
 failed in state INITED
; cause: org.apache.hadoop.service.ServiceStateException: 
org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 missing 
files; e.g.: /timelineserver/leveldb-timeline-store.ldb/127897.sst
org.apache.hadoop.service.ServiceStateException: 
org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 missing 
files; e.g.: /timelineserver/leveldb-timeline-store.ldb/127897.sst

2016-11-21 20:46:43,135 FATAL 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer:
 Error starting ApplicationHistoryServer
org.apache.hadoop.service.ServiceStateException: 
org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 missing 
files; e.g.: 
/timelineserver/leveldb-timeline-store.ldb/127897.sst
at 
org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
at 
org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:104)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:172)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:182)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 
9 missing files; e.g.: 
/timelineserver/leveldb-timeline-store.ldb/127897.sst
at 
org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
at 
org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore.serviceInit(LeveldbTimelineStore.java:229)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
... 5 more
2016-11-21 20:46:43,136 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
status -1
{code}
Ideally we shouldn't have any missing state files. However, I'd posit that the 
TimelineServer should degrade gracefully instead of failing to start at all.
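
A minimal sketch of what such graceful degradation could look like, assuming 
the standard leveldbjni {{DBFactory}} calls (open / repair / destroy); the 
class name and the fallback policy below are illustrative only, not the actual 
LeveldbTimelineStore code:
{code}
// Illustrative sketch only: try to open the store, attempt a LevelDB repair on
// corruption, and fall back to a fresh (empty) store rather than aborting startup.
import java.io.File;
import java.io.IOException;

import org.fusesource.leveldbjni.JniDBFactory;
import org.iq80.leveldb.DB;
import org.iq80.leveldb.Options;

public class TimelineStoreOpenerSketch {
  public DB openOrRecover(File dbPath) throws IOException {
    Options options = new Options().createIfMissing(true);
    try {
      return JniDBFactory.factory.open(dbPath, options);
    } catch (IOException corruption) {
      // e.g. NativeDB$DBException: "Corruption: 9 missing files"
      JniDBFactory.factory.repair(dbPath, options);   // first try an in-place repair
      try {
        return JniDBFactory.factory.open(dbPath, options);
      } catch (IOException stillCorrupt) {
        // Last resort: start with an empty store instead of refusing to start.
        JniDBFactory.factory.destroy(dbPath, options);
        return JniDBFactory.factory.open(dbPath, options);
      }
    }
  }
}
{code}
Whether the right behavior is repair, discard, or refuse to start is of course 
the policy question this JIRA raises.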



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5724) [Umbrella] Better Queue Management in YARN

2017-01-04 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-5724:

Component/s: capacity scheduler

> [Umbrella] Better Queue Management in YARN
> --
>
> Key: YARN-5724
> URL: https://issues.apache.org/jira/browse/YARN-5724
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: capacity scheduler
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: Designdocv1-Configuration-basedQueueManagementinYARN.pdf
>
>
> This serves as an umbrella ticket for tasks related to better queue 
> management in YARN.
> Today, the only way to manage queues is for admins to edit configuration 
> files and then issue a refresh command. This brings many inconveniences. For 
> example, users cannot create/delete/modify their own queues without talking 
> to site-level admins.
> Even in today's configuration-based approach, there are several places that 
> need improvement: 
> *  It is possible today to add or modify queues without restarting the RM, 
> via a CS refresh. But to delete a queue, we have to restart the 
> ResourceManager.
> * When a queue is STOPPED, resources allocated to the queue could be handled 
> better. Currently, they'll only be used if the other queues are set up to go 
> over their capacity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5923) Unable to access logs for a running application if YARN_ACL_ENABLE is enabled

2017-01-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-5923:
--
Fix Version/s: 3.0.0-alpha2

> Unable to access logs for a running application if YARN_ACL_ENABLE is enabled
> -
>
> Key: YARN-5923
> URL: https://issues.apache.org/jira/browse/YARN-5923
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5923.1.patch, YARN-5923.2.patch, YARN-5923.3.patch
>
>
> 2016-11-07 23:20:41,423 WARN  webapp.GenericExceptionHandler 
> (GenericExceptionHandler.java:toResponse(98)) - INTERNAL_SERVER_ERROR
> javax.ws.rs.WebApplicationException: 
> org.apache.hadoop.yarn.exceptions.YarnException: User [dr.who] is not 
> authorized to view the logs for application application_1478218837976_0068
> at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.NMWebServices.getContainerLogsInfo(NMWebServices.java:226)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
> at 
> com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
> at 
> com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
> at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
> at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
> at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
> at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
> at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
> at 
> com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:886)
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
> at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.NMWebAppFilter.doFilter(NMWebAppFilter.java:72)
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
> at 
> com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
> at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
> at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
> at 
> com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1294)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at 
> 

[jira] [Updated] (YARN-5999) AMRMClientAsync will stop if any exceptions thrown on allocate call

2017-01-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-5999:
--
Fix Version/s: 3.0.0-alpha2

> AMRMClientAsync will stop if any exceptions thrown on allocate call 
> 
>
> Key: YARN-5999
> URL: https://issues.apache.org/jira/browse/YARN-5999
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5999.1.patch
>
>
> Currently, for any exception thrown on the allocate call of AMRMClientAsync, 
> it will stop both the heartbeat thread and the callback handler thread, 
> leaving the AMRMClient in an unusable state. The caller has to instantiate a 
> new AMRMClient. IMO, the threads should keep running; it should be up to the 
> caller whether to stop the AMRMClient or not.
> {code}
>   try {
> response = client.allocate(progress);
>   } catch (ApplicationAttemptNotFoundException e) {
> handler.onShutdownRequest();
> LOG.info("Shutdown requested. Stopping callback.");
> return;
>   } catch (Throwable ex) {
> LOG.error("Exception on heartbeat", ex);
> savedException = ex;
> // interrupt the handler thread in case it is waiting on the queue
> handlerThread.interrupt();
> return;
>   }
> {code}
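
A rough sketch of the direction suggested above: report the failure through the 
existing callback handler and keep the heartbeat loop alive. This is 
illustrative only, not the actual AMRMClientAsyncImpl code, and the 
response-queue plumbing is elided:
{code}
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;
import org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException;

class ResilientHeartbeatSketch {
  void heartbeatLoop(AMRMClient<AMRMClient.ContainerRequest> client,
      AMRMClientAsync.CallbackHandler handler, float progress) {
    while (!Thread.currentThread().isInterrupted()) {
      AllocateResponse response;
      try {
        response = client.allocate(progress);
      } catch (ApplicationAttemptNotFoundException e) {
        // The RM no longer knows this attempt; shutting down is still correct.
        handler.onShutdownRequest();
        return;
      } catch (Throwable ex) {
        // Surface the error to the caller but keep heartbeating; the caller can
        // stop the AMRMClientAsync from its onError() callback if it wants to.
        handler.onError(ex);
        continue;
      }
      // hand "response" to the callback-handler thread (elided in this sketch)
    }
  }
}
{code}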



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6009) RM fails to start during an upgrade - Failed to load/recover state (YarnException: Invalid application timeout, value=0 for type=LIFETIME)

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799125#comment-15799125
 ] 

Hadoop QA commented on YARN-6009:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m  4s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m  4s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6009 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845480/YARN-6009.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0083599930d9 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / a0a2761 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/14551/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14551/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/14551/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> RM fails to start during an upgrade - Failed to load/recover state 
> (YarnException: Invalid application 

[jira] [Commented] (YARN-6022) Revert changes of AbstractResourceRequest

2017-01-04 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799053#comment-15799053
 ] 

Daniel Templeton commented on YARN-6022:


The last patch looks clean.  Looking at it, though, 
{{request.setCapability(scheduler.getNormalizedResource(request.getCapability()))}}
 seems like a really clumsy construct.  Would it be less awkward to add a 
@Private normalize method to the request classes so we could call 
{{request.normalize(scheduler)}} instead?
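
For illustration, such a method might look roughly like the fragment below; 
this is a hypothetical sketch to be added to the request classes, not an 
existing API, and the scheduler parameter type is an assumption:
{code}
// Hypothetical sketch only; this method does not exist in the request classes today.
@Private
@Unstable
public void normalize(ResourceScheduler scheduler) {
  // Folds the call-site pattern
  //   request.setCapability(scheduler.getNormalizedResource(request.getCapability()))
  // into a single request.normalize(scheduler) call.
  setCapability(scheduler.getNormalizedResource(getCapability()));
}
{code}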

> Revert changes of AbstractResourceRequest
> -
>
> Key: YARN-6022
> URL: https://issues.apache.org/jira/browse/YARN-6022
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-6022.001.patch, YARN-6022.002.patch, 
> YARN-6022.003.patch
>
>
> YARN-5774 added AbstractResourceRequest to make internal scheduler changes 
> easier, but this is not the correct approach: with this change, we would need 
> to make AbstractResourceRequest public/stable, and end users could use it 
> like:
> {code}
> AbstractResourceRequest request = ...
> request.setCapability(...)
> {code}
> But AbstractResourceRequest should not be visible to applications at all. 
> We need to revert it from branch-2.8 / branch-2 / trunk. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6009) RM fails to start during an upgrade - Failed to load/recover state (YarnException: Invalid application timeout, value=0 for type=LIFETIME)

2017-01-04 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15799037#comment-15799037
 ] 

Rohith Sharma K S commented on YARN-6009:
-

There is an open JIRA for the test failure, i.e. YARN-5548. The test failure is 
unrelated to this patch. 

> RM fails to start during an upgrade - Failed to load/recover state 
> (YarnException: Invalid application timeout, value=0 for type=LIFETIME)
> --
>
> Key: YARN-6009
> URL: https://issues.apache.org/jira/browse/YARN-6009
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Gour Saha
>Assignee: Rohith Sharma K S
>Priority: Critical
> Attachments: YARN-6009.01.patch
>
>
> ResourceManager fails to start during an upgrade with the following 
> exceptions - 
> Exception 1:
> {color:red}
> {code}
> 2016-12-09 14:57:23,508 INFO  capacity.CapacityScheduler 
> (CapacityScheduler.java:initScheduler(328)) - Initialized CapacityScheduler 
> with calculator=class 
> org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, 
> minimumAllocation=<>, 
> maximumAllocation=<>, asynchronousScheduling=false, 
> asyncScheduleInterval=5ms
> 2016-12-09 14:57:23,509 WARN  ha.ActiveStandbyElector 
> (ActiveStandbyElector.java:becomeActive(863)) - Exception handling the 
> winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:129)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:859)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:463)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:611)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when 
> transitioning to Active mode
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:318)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:127)
> ... 4 more
> Caused by: org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.yarn.exceptions.YarnException: Invalid application timeout, 
> value=0 for type=LIFETIME
> at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:991)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1032)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1028)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1028)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:313)
> ... 5 more
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Invalid 
> application timeout, value=0 for type=LIFETIME
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.validateApplicationTimeouts(RMServerUtils.java:305)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:365)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:330)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:463)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1184)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:594)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> ... 13 more
> {code}
> {color}
> Exception 2:
> {color:red}
> {code}
> 2016-12-09 14:57:26,162 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(790)) - 
> application_1477927786494_0008 State change from NEW to FINISHED
> 2016-12-09 

[jira] [Commented] (YARN-6040) Remove usage of ResourceRequest from AppSchedulerInfo / SchedulerApplicationAttempt

2017-01-04 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798997#comment-15798997
 ] 

Arun Suresh commented on YARN-6040:
---

Thanks for the patch [~leftnoteasy],

A couple of comments:
# Maybe rename PendingAsk::allocationNumber to PendingAsk::count / cardinality?
# From your comment, it seems you are planning to move the canDelayTo() method 
to a separate policy? I was thinking the PlacementSet itself would encapsulate 
the delay scheduling policy, but we can discuss that later.
# In the FSPreemptionThread::run() method, for the potentialNodes (line 104 in 
the patch), you should be doing a getNodesByResourceName() on the rack name. I 
understand, though, that it requires adding the resource name into the 
PendingAsk.

> Remove usage of ResourceRequest from AppSchedulerInfo / 
> SchedulerApplicationAttempt
> ---
>
> Key: YARN-6040
> URL: https://issues.apache.org/jira/browse/YARN-6040
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-6040.001.patch, YARN-6040.002.patch, 
> YARN-6040.003.patch, YARN-6040.004.patch
>
>
> As mentioned in YARN-5906, schedulers currently use ResourceRequest heavily, 
> so it will be very hard to adopt the new PowerfulResourceRequest (YARN-4902).
> This JIRA is the 2nd step of the refactoring, which removes usage of 
> ResourceRequest from AppSchedulingInfo / SchedulerApplicationAttempt. Instead 
> of returning ResourceRequest, it returns a lightweight and API-independent 
> object - {{PendingAsk}}.
> The only remaining ResourceRequest API of AppSchedulingInfo will be used by 
> the web service to get the list of ResourceRequests.
> So after this patch, usage of ResourceRequest will be isolated inside 
> AppSchedulingInfo, making it more flexible to update the internal data 
> structure and upgrade the old ResourceRequest API to the new one.
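
For readers following along, a rough sketch of what such a lightweight object 
carries; the field and accessor names below are illustrative and may not match 
the committed {{PendingAsk}} class exactly:
{code}
// Illustrative sketch of a lightweight, API-independent "pending ask": just the
// per-allocation resource size and how many such allocations are still pending,
// with no dependency on the public ResourceRequest API.
import org.apache.hadoop.yarn.api.records.Resource;

public final class PendingAskSketch {
  private final Resource perAllocationResource; // size of a single allocation
  private final int count;                      // number of allocations still asked for

  public PendingAskSketch(Resource perAllocationResource, int count) {
    this.perAllocationResource = perAllocationResource;
    this.count = count;
  }

  public Resource getPerAllocationResource() {
    return perAllocationResource;
  }

  public int getCount() {
    return count;
  }
}
{code}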



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6009) RM fails to start during an upgrade - Failed to load/recover state (YarnException: Invalid application timeout, value=0 for type=LIFETIME)

2017-01-04 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798965#comment-15798965
 ] 

Jian He commented on YARN-6009:
---

Patch looks good to me. The failing UT passed locally for me.
Retrying Jenkins.

> RM fails to start during an upgrade - Failed to load/recover state 
> (YarnException: Invalid application timeout, value=0 for type=LIFETIME)
> --
>
> Key: YARN-6009
> URL: https://issues.apache.org/jira/browse/YARN-6009
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Gour Saha
>Assignee: Rohith Sharma K S
>Priority: Critical
> Attachments: YARN-6009.01.patch
>
>
> ResourceManager fails to start during an upgrade with the following 
> exceptions - 
> Exception 1:
> {color:red}
> {code}
> 2016-12-09 14:57:23,508 INFO  capacity.CapacityScheduler 
> (CapacityScheduler.java:initScheduler(328)) - Initialized CapacityScheduler 
> with calculator=class 
> org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, 
> minimumAllocation=<>, 
> maximumAllocation=<>, asynchronousScheduling=false, 
> asyncScheduleInterval=5ms
> 2016-12-09 14:57:23,509 WARN  ha.ActiveStandbyElector 
> (ActiveStandbyElector.java:becomeActive(863)) - Exception handling the 
> winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:129)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:859)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:463)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:611)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when 
> transitioning to Active mode
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:318)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:127)
> ... 4 more
> Caused by: org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.yarn.exceptions.YarnException: Invalid application timeout, 
> value=0 for type=LIFETIME
> at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:991)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1032)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1028)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1028)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:313)
> ... 5 more
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Invalid 
> application timeout, value=0 for type=LIFETIME
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.validateApplicationTimeouts(RMServerUtils.java:305)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:365)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:330)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:463)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1184)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:594)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> ... 13 more
> {code}
> {color}
> Exception 2:
> {color:red}
> {code}
> 2016-12-09 14:57:26,162 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(790)) - 
> application_1477927786494_0008 State change from NEW to FINISHED
> 2016-12-09 14:57:26,162 ERROR 

[jira] [Commented] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798859#comment-15798859
 ] 

Hadoop QA commented on YARN-5959:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 15 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 11m 
47s{color} | {color:red} root in YARN-5085 failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
39s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
23s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
25s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
17s{color} | {color:green} YARN-5085 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
45s{color} | {color:green} YARN-5085 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
17s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 10s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 39 new + 1421 unchanged - 17 fixed = 1460 total (was 1438) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
28s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 3 new + 913 unchanged - 0 fixed = 916 total (was 913) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
35s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 42m  
7s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m  
2s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}117m  3s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5959 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12845558/YARN-5959-YARN-5085.003.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 34744e0833bc 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (YARN-6041) Opportunistic containers : Combined patch for branch-2

2017-01-04 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-6041:
--
Attachment: YARN-6041-branch-2.003.patch

Updating patch based on [~kasha]'s suggestions

> Opportunistic containers : Combined patch for branch-2 
> ---
>
> Key: YARN-6041
> URL: https://issues.apache.org/jira/browse/YARN-6041
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-6041-branch-2.001.patch, 
> YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch
>
>
> This is a combined patch targeting branch-2 of the following JIRAs, which 
> have already been committed to trunk:
> YARN-5938. Refactoring OpportunisticContainerAllocator to use 
> SchedulerRequestKey instead of Priority and other misc fixes
> YARN-5646. Add documentation and update config parameter names for scheduling 
> of OPPORTUNISTIC containers.
> YARN-5982. Simplify opportunistic container parameters and metrics.
> YARN-5918. Handle Opportunistic scheduling allocate request failure when NM 
> is lost.
> YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager 
> container lifecycle.
> YARN-5823. Update NMTokens in case of requests with only opportunistic 
> containers.
> YARN-5377. Fix 
> TestQueuingContainerManager.testKillMultipleOpportunisticContainers.
> YARN-2995. Enhance UI to show cluster resource utilization of various 
> container Execution types.
> YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http 
> Address.
> YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method 
> to handle OPPORTUNISTIC container requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5959) RM changes to support change of container ExecutionType

2017-01-04 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-5959:
--
Attachment: YARN-5959-YARN-5085.003.patch

Updating patch.

Thanks for the detailed review [~leftnoteasy].
I agree with all your suggestions. I've updated the patch as follows:
# Reverted all changes to {{AbstractResourceRequest}}. We don't need the extra 
{{containerToUpdate}} field anymore since, as you pointed out, it was possible 
to accomplish everything via the {{SchedulerRequestKey}}.
# Modified the call flow to remove the update-ExecutionType call from the 
OCAAMS and moved it into the Scheduler itself.
# Modified the Allocation object to handle the promoted and demoted containers.
# Split ContainerUpdateType::UPDATE_EXECUTION_TYPE into PROMOTE_EXECUTION_TYPE 
and DEMOTE_EXECUTION_TYPE.

The size of the patch has increased a bit since I modified the allocate() 
method of the scheduler to take the promoted and demoted lists, and this 
touches a few test classes.

> RM changes to support change of container ExecutionType
> ---
>
> Key: YARN-5959
> URL: https://issues.apache.org/jira/browse/YARN-5959
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5959-YARN-5085.001.patch, 
> YARN-5959-YARN-5085.002.patch, YARN-5959-YARN-5085.003.patch, 
> YARN-5959.combined.001.patch, YARN-5959.wip.002.patch, 
> YARN-5959.wip.003.patch, YARN-5959.wip.patch
>
>
> RM-side changes to allow an AM to ask for a change of ExecutionType.
> Currently, there are two cases:
> # *Promotion* : OPPORTUNISTIC to GUARANTEED.
> # *Demotion* : GUARANTEED to OPPORTUNISTIC.
> This is similar to YARN-1197, which allows for a change in container resources. 
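
As a rough illustration of what this enables on the AM side; the request API 
was still being settled in this patch series, so the factory-method signature 
below is an assumption, not the final API:
{code}
// Hypothetical sketch: ask the RM to promote a running OPPORTUNISTIC container
// to GUARANTEED. The exact UpdateContainerRequest signature is assumed here.
UpdateContainerRequest promote = UpdateContainerRequest.newInstance(
    container.getVersion(),                      // container version known to the AM
    container.getId(),                           // which container to update
    ContainerUpdateType.PROMOTE_EXECUTION_TYPE,  // OPPORTUNISTIC -> GUARANTEED
    null,                                        // capability unchanged
    ExecutionType.GUARANTEED);                   // target execution type
allocateRequest.setUpdateRequests(Collections.singletonList(promote));
{code}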



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6027) Support fromId for flows/flowrun apps

2017-01-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798450#comment-15798450
 ] 

Varun Saxena edited comment on YARN-6027 at 1/4/17 3:24 PM:


Yes, flow activity is meant to display activity for a day. So if a flow is 
active on two days, it will be shown on each of those 2 days. It is not a list 
of flows but a list of flow activities, so the behavior is expected. We need to 
drill down from flow activities to flow runs to applications.
In the UI, we can show them as flow activities, just grouped by date. The page 
could be designed something like the table below. Flows will also be 
differentiated by user.
{noformat}
 _________________________
|  Date   | List of Flows |
|_________|_______________|
|  20 Jan | flow1         |
|         | flow2         |
|_________|_______________|
|  19 Jan | flow2         |
|         | flow3         |
|_________|_______________|
{noformat}

Frankly, it is unlikely that the number of flows per day would run into 
thousands if somebody is specifying flows in YARN tags. But as we are designing 
a generic system, we should not make assumptions. In an ad-hoc query system, 
each query may very well be taken as a separate flow if flows are not 
specified, and those can run into several thousands per day.
So some sort of pagination support can be added.

By the way, I am not sure why user comes before flow name in the row key. I 
think the order in the row key can be reversed, and we can then use fromFlowId 
as a new filter along with a date range (with reasonable restrictions) to 
achieve pagination. 
Pagination can also be done within a day. I mean, in the example given above, 
we can have a sub-table under the "List of Flows" column (in the UI).
[~sjlee0], do you remember why the user was kept before the flow name in the 
row key? To achieve user-level offline aggregation?

Regarding flow run details under a specific flow activity, I had in fact raised 
a JIRA to limit the number of flow runs returned under a flow activity. But 
realistically speaking, the number of runs may not be that large in a day. 
Also, it may not be feasible to do this on the HBase side with the current 
schema design. It can be done on the reader side, though, to reduce the 
payload. We anyway send minimal information about the flow run, as we only keep 
the run id and version in the flow activity table. 
Also, we need to see how we design our UI. If we want to support drilling down 
to flow runs for a specific date, we would in fact need all the flow runs for a 
day. How to get this info in chunks, and whether it is required, we can see.

I am not sure what you meant by "flow run information in flow entities should 
be treated as a filter". Can you elaborate?


was (Author: varun_saxena):
Yes, flow activity is meant to display activity for a day. So if a flow is 
active on two days, it will be shown in each of those 2 days. It is not a list 
of flows but a list of flow activities. So the behavior is expected. We need to 
drill down from flow activities to flow runs to applications.
We can in UI, show them as flow activities only clubbed by date. Page can be 
designed something like below ? List of flows will be differentiated by user.
{noformat}
 __
|  Date   |   List of Flows|
|_||
|  20 Jan |  flow1 |
| |  flow2 |
|_||
| ||
|  19 Jan |  flow2 |
| |  flow3 |
|_|| 
{noformat}

Frankly, it is unlikely that the number of flows per day would run into 
thousands if somebody is specifying flows in YARN tags. But as we are designing 
a generic system we should not make assumptions. In an ad-hoc query system, 
each query may very well be taken as a separate flow if flows are not 
specified. And that run into several thousands per day.
So some sort of pagination support can be added.

By the way, I am not sure why user comes before flow name in the row key. I 
think the order in row key can be reversed and we can then use fromFlowId as a 
new filter along with date range (with reasonable restrictions) to achieve 
pagination. 
And pagination can be done within a day. I mean in example given above, we can 
have a sub-table under List of flows column (in UI)
[~sjlee0], you remember why the user was kept before flow name in row key? To 
achieve user level offline aggregation?

Regarding flow run details under a specific flow activity, I had in fact raised 
a JIRA to limit the number of flow runs returned under a flow activity. But 
realistically speaking, number of runs may not be that large in a day. Also, it 
may not be feasible to do it at HBase side with current schema design. This can 
be done at reader side though to reduce payload. We anyways send minimal 
information about 

[jira] [Comment Edited] (YARN-6027) Support fromId for flows/flowrun apps

2017-01-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798450#comment-15798450
 ] 

Varun Saxena edited comment on YARN-6027 at 1/4/17 2:59 PM:


Yes, flow activity is meant to display activity for a day. So if a flow is 
active on two days, it will be shown on each of those 2 days. It is not a list 
of flows but a list of flow activities, so the behavior is expected. We need to 
drill down from flow activities to flow runs to applications.
In the UI, we can show them as flow activities, just grouped by date. The page 
could be designed something like the table below. The list of flows will be 
differentiated by user.
{noformat}
 _________________________
|  Date   | List of Flows |
|_________|_______________|
|  20 Jan | flow1         |
|         | flow2         |
|_________|_______________|
|  19 Jan | flow2         |
|         | flow3         |
|_________|_______________|
{noformat}

Frankly, it is unlikely that the number of flows per day would run into 
thousands if somebody is specifying flows in YARN tags. But as we are designing 
a generic system, we should not make assumptions. In an ad-hoc query system, 
each query may very well be taken as a separate flow if flows are not 
specified, and those can run into several thousands per day.
So some sort of pagination support can be added.

By the way, I am not sure why user comes before flow name in the row key. I 
think the order in the row key can be reversed, and we can then use fromFlowId 
as a new filter along with a date range (with reasonable restrictions) to 
achieve pagination. 
Pagination can also be done within a day. I mean, in the example given above, 
we can have a sub-table under the "List of Flows" column (in the UI).
[~sjlee0], do you remember why the user was kept before the flow name in the 
row key? To achieve user-level offline aggregation?

Regarding flow run details under a specific flow activity, I had in fact raised 
a JIRA to limit the number of flow runs returned under a flow activity. But 
realistically speaking, the number of runs may not be that large in a day. 
Also, it may not be feasible to do this on the HBase side with the current 
schema design. It can be done on the reader side, though, to reduce the 
payload. We anyway send minimal information about the flow run, as we only keep 
the run id and version in the flow activity table. 
Also, we need to see how we design our UI. If we want to support drilling down 
to flow runs for a specific date, we would in fact need all the flow runs for a 
day. How to get this info in chunks, and whether it is required, we can see.

I am not sure what you meant by "flow run information in flow entities should 
be treated as a filter". Can you elaborate?


was (Author: varun_saxena):
Yes, flow activity is meant to display activity for a day. So if a flow is 
active on two days, it will be shown in each of those 2 days. It is not a list 
of flows but a list of flow activities. So the behavior is expected. We need to 
drill down from flow activities to flow runs to applications.
We can in UI, show them as flow activities only clubbed by date. Page can be 
designed something like below ? List of flows will be differentiated by user.
{noformat}
 __
|  Date   |   List of Flows|
|_||
|  20 Jan |  flow1 |
| |  flow2 |
|_||
| ||
|  19 Jan |  flow2 |
| |  flow3 |
|_|| 
{noformat}

Frankly, it is unlikely that the number of flows per day would run into 
thousands if somebody is specifying flows in YARN tags. But as we are designing 
a generic system we should not make assumptions. In an ad-hoc query system, 
each query may very well be taken as a separate flow if flows are not 
specified. And that run into several thousands per day.
So some sort of pagination support can be added.

By the way, I am not sure why user comes before flow name in the row key. I 
think the order in row key can be reversed and we can then use fromFlowId as a 
new filter along with date range (with reasonable restrictions) to achieve 
pagination. 
And pagination can be done within a day. I mean in example given above, we can 
have a sub-table under List of flows column (in UI)
[~sjlee0], you remember why the user was kept before flow name in row key? To 
achieve user level offline aggregation?

Regarding flow run details under a specific flow activity, I had in fact raised 
a JIRA to limit the number of flow runs returned under a flow activity. But 
realistically speaking, number of runs may not be that large in a day. Also, it 
may not be feasible to do it at HBase side with current schema design. This can 
be done at reader side though to reduce payload. We anyways send minimal 
information 

[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps

2017-01-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798450#comment-15798450
 ] 

Varun Saxena commented on YARN-6027:


Yes, flow activity is meant to display activity for a day. So if a flow is 
active on two days, it will be shown on each of those 2 days. It is not a list 
of flows but a list of flow activities, so the behavior is expected. We need to 
drill down from flow activities to flow runs to applications.
In the UI, we can show them as flow activities, just grouped by date. The page 
could be designed something like the table below. The list of flows will be 
differentiated by user.
{noformat}
 _________________________
|  Date   | List of Flows |
|_________|_______________|
|  20 Jan | flow1         |
|         | flow2         |
|_________|_______________|
|  19 Jan | flow2         |
|         | flow3         |
|_________|_______________|
{noformat}

Frankly, it is unlikely that the number of flows per day would run into 
thousands if somebody is specifying flows in YARN tags. But as we are designing 
a generic system, we should not make assumptions. In an ad-hoc query system, 
each query may very well be taken as a separate flow if flows are not 
specified, and those can run into several thousands per day.
So some sort of pagination support can be added.

By the way, I am not sure why user comes before flow name in the row key. I 
think the order in the row key can be reversed, and we can then use fromFlowId 
as a new filter along with a date range (with reasonable restrictions) to 
achieve pagination. 
Pagination can also be done within a day. I mean, in the example given above, 
we can have a sub-table under the "List of Flows" column (in the UI).
[~sjlee0], do you remember why the user was kept before the flow name in the 
row key? To achieve user-level offline aggregation?

Regarding flow run details under a specific flow activity, I had in fact raised 
a JIRA to limit the number of flow runs returned under a flow activity. But 
realistically speaking, the number of runs may not be that large in a day. 
Also, it may not be feasible to do this on the HBase side with the current 
schema design. It can be done on the reader side, though, to reduce the 
payload. We anyway send minimal information about the flow run, as we only keep 
the run id and version in the flow activity table. 
Also, we need to see how we design our UI. If we want to support drilling down 
to flow runs for a specific date, we would in fact need all the flow runs.

I am not sure what you meant by "flow run information in flow entities should 
be treated as a filter". Can you elaborate?
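
For illustration only, the kind of paginated flow-activity query being 
discussed might look like the following; the {{fromid}} filter for the flows 
endpoint is exactly what this JIRA proposes, so the parameter below is a 
proposed shape rather than an existing API:
{noformat}
# first page: up to 100 flow activities for 20 Jan 2017
GET /ws/v2/timeline/clusters/<cluster>/flows?daterange=20170120-20170120&limit=100

# next page: resume from the id of the last row returned by the previous call
GET /ws/v2/timeline/clusters/<cluster>/flows?daterange=20170120-20170120&limit=100&fromid=<id of last returned row>
{noformat}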

> Support fromId for flows/flowrun apps
> -
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
>
> In YARN-5585 , fromId is supported for retrieving entities. We need similar 
> filter for flows/flowRun apps and flow run and flow as well. 
> Along with supporting fromId, this JIRA should also discuss following points
> * Should we throw an exception for entities/entity retrieval if duplicates 
> found?
> * TimelieEntity :
> ** Should equals method also check for idPrefix?
> ** Does idPrefix is part of identifiers?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2017-01-04 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798347#comment-15798347
 ] 

Jason Lowe commented on YARN-2902:
--

bq. In trunk we "ignore" a non-existing directory in delete_as_user() when the 
stat returns ENOENT; we do not do that in branch-2. I tracked it back to the 
backport of this jira into branch-2. The native code part of the change was not 
ported back into branch-2.

I'm not seeing that. The "backport" for branch-2 was a cherry-pick:
{noformat}
commit c75d8b164f05033d0c6ed4876cf0b8d57224aa8c
Author: Jason Lowe 
Date:   Thu Oct 29 16:33:29 2015 +

YARN-2902. Killing a container that is localizing can orphan resources in 
the DOWNLOADING state. Contributed by Varun Saxena
(cherry picked from commit e2267de2076245bd9857f6a30e3c731df017fef8)
{noformat}

And the diff between trunk and branch-2 shows no difference at all in the 
delete_as_user function.

Ah, I believe you mean branch-2.6 instead of branch-2.  There is a huge 
difference between trunk and branch-2.6 for delete_as_user, and I agree it 
appears the ENOENT change was dropped in the 2.6 patch.

bq. If it was an oversight does this require a new jira to backport the native 
code or does it get handled as an addendum to this one.

It would be a new JIRA against 2.6.4.  This change has already shipped.


> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Fix For: 2.8.0, 2.7.2, 2.6.4, 3.0.0-alpha1
>
> Attachments: YARN-2902-branch-2.6.01.patch, YARN-2902.002.patch, 
> YARN-2902.03.patch, YARN-2902.04.patch, YARN-2902.05.patch, 
> YARN-2902.06.patch, YARN-2902.07.patch, YARN-2902.08.patch, 
> YARN-2902.09.patch, YARN-2902.10.patch, YARN-2902.11.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed, 
> then resources are left in the DOWNLOADING state.  If no other container 
> comes along and requests these resources, they linger around with no 
> reference counts but aren't cleaned up during normal cache cleanup scans, 
> since the scan will never delete resources in the DOWNLOADING state even if 
> their reference count is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5980) Update documentation for single node hbase deploy

2017-01-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798326#comment-15798326
 ] 

Varun Saxena commented on YARN-5980:


Thanks [~vrushalic] for the patch. Not related to the patch, but in the line 
preceding the added ones we say: "The version of Apache HBase that is 
supported with Timeline Service v.2 is 1.1.x. The 1.0.x versions do not work 
with Timeline Service v.2. The 1.2.x versions have not been tested."

Post YARN-5976, we now support HBase 1.2.4, so these lines need to be updated.

> Update documentation for single node hbase deploy
> -
>
> Key: YARN-5980
> URL: https://issues.apache.org/jira/browse/YARN-5980
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Vrushali C
>Assignee: Vrushali C
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-5980.001.patch
>
>
> Per HBASE-17272, a single node hbase deployment (single jvm running daemons + 
> hdfs writes) will be added to hbase shortly. 
> We should update the timeline service documentation in the setup/deployment 
> context accordingly, this will help users who are a bit wary of hbase 
> deployments help get started with timeline service more easily.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps

2017-01-04 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798315#comment-15798315
 ] 

Rohith Sharma K S commented on YARN-6027:
-

Thanks folks for the discussion. 

I had an offline talk with Sunil regarding YARN UI integration with ATSv2. Some 
of the discussion points are:
# When flows are queried without any filters, duplicate flow entities are sent 
from the reader. Each duplicated flow contains aggregated flow run details. 
CMIIW, I think this is because when the same flow is run daily, there will be 
2 entries (assuming it has run on 2 days) in the FlowActivityTable. 
*While reading, if no filters are given, a full table scan happens on the 
FlowActivityTable.* This results in duplicate entries for the same flow name. 
To me, *the current behavior of retrieving flows should be restricted to the 
current day only* (not even the last 24 hours, which can cause duplicated 
entries). 
# Now consider single-day flow activities: if the number of flows run is huge, 
say 1000, the REST API result is only 100 flow names, where 100 is the limit. 
The user can query with the limit increased to 1000, but that is not ideal for 
UI rendering, which would go for a toss with issues like browser OOM. The issue 
is that the UI does not know how many flows exist, and the better solution here 
is to render page by page for a single day. *At least pagination should be 
supported for single-day flow activities.* I know that with the current HBase 
schema of the flowActivity table pagination would be difficult to achieve, but 
the API layer should offer a filter for it. Otherwise it is very painful for UI 
developers who rely on ATSv2 data (a rough sketch of such paging is below, 
after this list). 
# Date range and limit filters do not solve the UI rendering issues that 
pagination solves; they can only minimize the number of flows. A date range is 
supported, but within a day, ranges such as 10 AM to 11 AM are not supported. 
# I also see that flow entities contain all the flow run details. Do we really 
need to embed flow run details in flow entities? Doesn't it become heavy? I 
think flow run information in flow entities should be treated as a filter. 
There is, however, a separate API to get all the flow runs.
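
As a rough illustration of the page-by-page retrieval discussed in point 2, 
here is a minimal sketch in Java. It assumes the existing {{limit}} query 
parameter together with a hypothetical {{fromid}} parameter (the filter this 
JIRA proposes); the host, port, cluster name and response handling are 
placeholders rather than the actual reader API.

{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class FlowPager {
  // Hypothetical reader endpoint; host, port and cluster name are
  // assumptions for illustration only.
  private static final String BASE =
      "http://timeline-reader.example.com:8188/ws/v2/timeline/clusters/yarn-cluster/flows";

  /** Fetch one page of flows; pass null as fromId for the first page. */
  static String fetchPage(int limit, String fromId) throws Exception {
    String query = "?limit=" + limit
        + (fromId == null ? "" : "&fromid=" + fromId);
    HttpURLConnection conn =
        (HttpURLConnection) new URL(BASE + query).openConnection();
    conn.setRequestProperty("Accept", "application/json");
    StringBuilder body = new StringBuilder();
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        body.append(line);
      }
    }
    return body.toString();
  }

  public static void main(String[] args) throws Exception {
    // First page: the UI asks for 100 flows of the current day.
    String page1 = fetchPage(100, null);
    System.out.println("first page: " + page1.length() + " bytes");
    // For the next page the UI would parse page1, take the id of its last
    // flow entity, and pass it as fromId so the reader continues where the
    // previous page ended.
  }
}
{code}

The idea is that the UI keeps only the id of the last entity of each page, so 
the reader never has to return more than one page worth of flows per request.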

> Support fromId for flows/flowrun apps
> -
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar 
> filter for flows/flowRun apps, and for flow run and flow as well. 
> Along with supporting fromId, this JIRA should also discuss the following points
> * Should we throw an exception for entities/entity retrieval if duplicates are 
> found?
> * TimelineEntity:
> ** Should the equals method also check for idPrefix?
> ** Is idPrefix part of the identifiers?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6033) Add support for sections in container-executor configuration file

2017-01-04 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798202#comment-15798202
 ] 

Varun Vasudev commented on YARN-6033:
-

[~sunilg], [~sidharta-s] - can you please take a look? Thanks!

> Add support for sections in container-executor configuration file
> -
>
> Key: YARN-6033
> URL: https://issues.apache.org/jira/browse/YARN-6033
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: YARN-6033-YARN-5673.001.patch, 
> YARN-6033-YARN-5673.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6052) Yarn RM UI % of Queue at application level is wrong

2017-01-04 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798205#comment-15798205
 ] 

Sunil G commented on YARN-6052:
---

Looks like a bug. [~Prabhu Joseph], could you please provide a patch? I can 
help review. You can look at {{FiCaSchedulerApp.getResourceUsageReport()}}.

> Yarn RM UI % of Queue at application level is wrong
> ---
>
> Key: YARN-6052
> URL: https://issues.apache.org/jira/browse/YARN-6052
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: RM_UI.png
>
>
> Test Case:
> yarn.scheduler.capacity.root.capacity=100
> yarn.scheduler.capacity.root.queues=default,dummy
> yarn.scheduler.capacity.root.default.capacity=20
> yarn.scheduler.capacity.root.dummy.capacity=80
> yarn.scheduler.capacity.root.dummy.child.capacity=50
> yarn.scheduler.capacity.root.dummy.child2.capacity=50
> Memory Total is 20GB, the default queue share is 4GB and the dummy queue share is 
> 16GB. The child and child2 queues get an 8GB share each.
> A map reduce job is submitted to the child2 queue which asks for 2 containers of 
> 512 MB each. Now cluster Memory Used is 1GB.
> Root queue usage = 100 / (total memory / used memory) = 100 / (20 / 1) = 5%
> Dummy queue usage = 100 / (16 / 1) = 6.3%
> Dummy.Child2 queue usage = 100 / (8 / 1) = 12.5%
> At the application level, % of queue is calculated as 100 / (50% of root queue 
> capacity) = 100 / (50% of 20GB) = 10.0% instead of 
> 100 / (50% of dummy queue capacity) = 100 / (50% of 16GB) = 100 / 8 = 12.5%,
> where 50% is the dummy.child2 capacity.
> Attached RM UI screenshot.
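
To make the expected calculation concrete, here is a minimal sketch that 
reproduces the numbers above. The method and variable names are illustrative 
only and do not correspond to the actual 
{{FiCaSchedulerApp.getResourceUsageReport()}} code; the point is simply which 
capacity is used as the denominator.

{code:java}
public class QueuePercentageExample {
  /** Percentage of a queue's capacity that the given usage represents. */
  static float percentOfQueue(float usedGb, float queueCapacityGb) {
    return 100f * usedGb / queueCapacityGb;
  }

  public static void main(String[] args) {
    float clusterGb = 20f;                   // total cluster memory
    float dummyGb = 0.80f * clusterGb;       // dummy queue: 80% of root = 16GB
    float child2Gb = 0.50f * dummyGb;        // child2: 50% of dummy = 8GB
    float usedGb = 1f;                       // 2 containers x 512MB

    // What the RM UI currently reports at the application level:
    // 50% of the *root* capacity is used as the denominator.
    System.out.println("buggy   % of queue = "
        + percentOfQueue(usedGb, 0.50f * clusterGb));   // 10.0

    // What it should report: 50% of the *parent (dummy)* capacity.
    System.out.println("correct % of queue = "
        + percentOfQueue(usedGb, child2Gb));            // 12.5
  }
}
{code}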



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6053) RM Web Service shows startedTime, finishedTime as zero when RM is kerberized and ACL is set up

2017-01-04 Thread Prabhu Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-6053:

Attachment: RM_UI_ACL.png
RM_WEB_SERVICE_start_stop.png
RM_UI_start_stop.png

> RM Web Service shows startedTime, finishedTime as zero when RM is kerberized 
> and ACL is set up
> --
>
> Key: YARN-6053
> URL: https://issues.apache.org/jira/browse/YARN-6053
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Priority: Minor
> Attachments: RM_UI_ACL.png, RM_UI_start_stop.png, 
> RM_WEB_SERVICE_start_stop.png
>
>
> The RM UI is Kerberized and ACLs are set up. A user pjoseph has logged into the 
> RM UI and is able to see the other user prabhu's job startTime and finishTime, 
> but is not able to read the attempts of the application, which is expected as 
> ACLs are set up. But when using the RM Web Services, 
> http://kerberos-3.openstacklocal:8088/ws/v1/cluster/apps/application_1482325548661_0002
>  the startedTime,
> finishedTime and elapsedTime are 0 [AppInfo.java sets these to zero if the user 
> does not have access]. We can display the correct values since the RM UI 
> shows them anyway.
> Attached output of RM UI and RM WebService.
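
As a rough sketch of the proposed change, the snippet below shows the idea. The 
class shape, field names and the hasAccess flag are illustrative placeholders, 
not the actual {{AppInfo.java}} code; the point is that the timestamps would be 
populated regardless of the ACL check while access-controlled details stay 
hidden.

{code:java}
/** Illustrative stand-in for the report exposed over ws/v1/cluster/apps. */
public class AppInfoSketch {
  long startedTime;
  long finishedTime;
  long elapsedTime;
  String diagnostics;

  AppInfoSketch(long started, long finished, String diag, boolean hasAccess) {
    // Proposed: always expose the timestamps, since the RM UI shows them anyway.
    this.startedTime = started;
    this.finishedTime = finished;
    this.elapsedTime = finished > 0 ? finished - started : 0;
    // Keep hiding genuinely access-controlled details when the ACL check fails.
    this.diagnostics = hasAccess ? diag : "";
  }

  public static void main(String[] args) {
    AppInfoSketch restricted =
        new AppInfoSketch(1482325548661L, 1482325600000L, "ok", false);
    // Even without access, the timestamps are no longer reported as zero.
    System.out.println(restricted.startedTime + " " + restricted.finishedTime);
  }
}
{code}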



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


