[jira] [Commented] (YARN-6102) On failover RM can crash due to unregistered event to AsyncDispatcher

2017-07-10 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080787#comment-16080787
 ] 

Sunil G commented on YARN-6102:
---

bq. How about recreating RMContext for each transition?
I think this is the better and cleaner solution here. The other option would delay the 
switchover while events are still pending in the dispatcher queue.
This solution also does not impact the new active services, since old events 
will play out against the old context, which will later be reclaimed by GC.


> On failover RM can crash due to unregistered event to AsyncDispatcher
> -
>
> Key: YARN-6102
> URL: https://issues.apache.org/jira/browse/YARN-6102
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0, 2.7.3
>Reporter: Ajith S
>Assignee: Ajith S
>Priority: Critical
> Attachments: eventOrder.JPG, YARN-6102.01.patch
>
>
> {code}2017-01-17 16:42:17,911 FATAL [AsyncDispatcher event handler] 
> event.AsyncDispatcher (AsyncDispatcher.java:dispatch(200)) - Error in 
> dispatcher thread
> java.lang.Exception: No handler for registered for class 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEventType
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:196)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:120)
> at java.lang.Thread.run(Thread.java:745)
> 2017-01-17 16:42:17,914 INFO  [AsyncDispatcher ShutDown handler] 
> event.AsyncDispatcher (AsyncDispatcher.java:run(303)) - Exiting, bbye..{code}
> The same stack trace was also noticed when {{TestResourceTrackerOnHA}} exited 
> abnormally; after some analysis, I was able to reproduce it.
> Once the node heartbeat is sent to the RM, inside 
> {{org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest)}}, 
> before it is handed to the dispatcher through 
> {{this.rmContext.getDispatcher().getEventHandler().handle(nodeStatusEvent);}}, 
> an RM failover may be triggered, which resets the dispatcher.
> The new dispatcher, however, is first started and only then are the events 
> registered, at 
> {{org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(boolean)}}.
> So the event order looks like:
> 1. Send a node heartbeat to {{ResourceTrackerService}}
> 2. In {{ResourceTrackerService.nodeHeartbeat}}, before the event is passed to the 
> dispatcher, trigger an RM failover
> 3. During the failover, the current active resets the dispatcher in {{reinitialize}}, 
> i.e. {{resetDispatcher();}} + {{createAndInitActiveServices();}}
> Now, between {{resetDispatcher();}} and {{createAndInitActiveServices();}}, 
> {{ResourceTrackerService.nodeHeartbeat}} invokes the dispatcher.
> This causes the above error because, at the point when the {{STATUS_UPDATE}} 
> event is handed to the dispatcher in {{ResourceTrackerService}}, the new 
> dispatcher (from the failover) may be started but not yet registered for events.
> Using the same steps (pausing the JVM in a debugger), I was able to reproduce this 
> on a production cluster as well: the {{STATUS_UPDATE}} active-service event had 
> yet to be forwarded to the RM dispatcher when a failover was triggered and the 
> dispatcher reset was between {{resetDispatcher();}} and 
> {{createAndInitActiveServices();}}.
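
A minimal, self-contained Java sketch of the ordering hazard described above (toy classes only, not the real Hadoop {{AsyncDispatcher}}): if the dispatcher thread is started before handlers are registered, an event that slips in between the two steps hits the "no handler registered" path.

{code}
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Toy dispatcher used only to illustrate the start-before-register race. */
public class MiniDispatcher {
  enum NodeEventType { STATUS_UPDATE }

  interface Handler { void handle(NodeEventType event); }

  private final BlockingQueue<NodeEventType> queue = new LinkedBlockingQueue<>();
  private final Map<Class<?>, Handler> handlers = new ConcurrentHashMap<>();

  /** Starts the dispatch thread (analogous to starting the new dispatcher on failover). */
  void start() {
    Thread t = new Thread(() -> {
      try {
        while (true) {
          NodeEventType event = queue.take();
          Handler handler = handlers.get(event.getDeclaringClass());
          if (handler == null) {
            // This is the window hit on failover: the event arrives before register().
            System.err.println("FATAL: no handler registered for " + event.getDeclaringClass());
          } else {
            handler.handle(event);
          }
        }
      } catch (InterruptedException ignored) {
        // exit the dispatch loop
      }
    });
    t.setDaemon(true);
    t.start();
  }

  void register(Class<?> eventType, Handler handler) {
    handlers.put(eventType, handler);
  }

  void handle(NodeEventType event) {
    queue.add(event);
  }

  public static void main(String[] args) throws Exception {
    MiniDispatcher dispatcher = new MiniDispatcher();
    dispatcher.start();                               // dispatcher running, nothing registered yet
    dispatcher.handle(NodeEventType.STATUS_UPDATE);   // heartbeat slips in here -> "no handler" error
    Thread.sleep(100);
    dispatcher.register(NodeEventType.class, e -> System.out.println("handled " + e));
  }
}
{code}

Registering handlers before starting the dispatcher, or keeping old events bound to the old RMContext/dispatcher as suggested in the comment above, closes this window.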






[jira] [Updated] (YARN-6705) Add separate NM preemption thresholds for cpu and memory

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6705:
-
Attachment: (was: YARN-6675.00.patch)

> Add separate NM preemption thresholds for cpu and memory
> 
>
> Key: YARN-6705
> URL: https://issues.apache.org/jira/browse/YARN-6705
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-6705-YARN-1011.00.patch
>
>







[jira] [Commented] (YARN-6706) Refactor ContainerScheduler to make oversubscription change easier

2017-07-10 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080835#comment-16080835
 ] 

Haibo Chen commented on YARN-6706:
--

I would appreciate reviews from [~arun.sur...@gmail.com] and [~kkaranasos]; please 
check whether the changes look OK to you folks.

> Refactor ContainerScheduler to make oversubscription change easier
> --
>
> Key: YARN-6706
> URL: https://issues.apache.org/jira/browse/YARN-6706
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-6706.01.patch, YARN-6706-YARN-1011.00.patch, 
> YARN-6706-YARN-1011.01.patch
>
>







[jira] [Commented] (YARN-5412) Create a proxy chain for ResourceManager REST API in the Router

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080940#comment-16080940
 ] 

Hadoop QA commented on YARN-5412:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-2915 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
 3s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
17s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
56s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
18s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} YARN-2915 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
18s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 59s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 22 new + 235 unchanged - 1 fixed = 257 total (was 236) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
30s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 4 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-yarn-server-router in the patch failed. {color} 
|
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
38s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
44s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 41m  4s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 31s{color} 
| {color:red} hadoop-yarn-server-router in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}125m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesHttpStaticUserPermissions
 |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication |
|   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication
 |
|   | hadoop.yarn.server.router.webapp.TestRouterWebServices |
|   | hadoop.yarn.server.router.rmadmin.TestRouterRMAdminService |
| 

[jira] [Commented] (YARN-6705) Add separate NM preemption thresholds for cpu and memory

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080987#comment-16080987
 ] 

Hadoop QA commented on YARN-6705:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} YARN-1011 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
59s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
18s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
33s{color} | {color:green} YARN-1011 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
58s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in 
YARN-1011 has 1 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
47s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in YARN-1011 has 5 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} YARN-1011 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
25s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 46s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 3 new + 206 unchanged - 0 fixed = 209 total (was 206) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
29s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
27s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m  
6s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6705 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876469/YARN-6705-YARN-1011.00.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 5ba458c2dc23 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 

[jira] [Created] (YARN-6797) TimelineWriter does not fully consume the post response

2017-07-10 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6797:


 Summary: TimelineWriter does not fully consume the post response
 Key: YARN-6797
 URL: https://issues.apache.org/jira/browse/YARN-6797
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineclient
Affects Versions: 2.8.1
Reporter: Jason Lowe
Assignee: Jason Lowe


TimelineWriter does not fully consume the response to the POST request, and 
that ends up preventing the HTTP client from being reused for the next write of 
an entity.
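
A generic illustration of the underlying HTTP behaviour (plain {{java.net.HttpURLConnection}}, not the actual TimelineWriter code): the response body of the POST must be fully drained and closed before the JDK will hand the keep-alive connection back to its pool for reuse by the next write.

{code}
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class DrainResponseExample {
  public static void main(String[] args) throws Exception {
    // Hypothetical endpoint, used only for illustration.
    URL url = new URL("http://localhost:8188/ws/v1/timeline");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    conn.setRequestProperty("Content-Type", "application/json");

    try (OutputStream out = conn.getOutputStream()) {
      out.write("{\"entities\":[]}".getBytes(StandardCharsets.UTF_8));
    }

    int status = conn.getResponseCode();
    // Drain and close the response body; skipping this keeps the socket tied to this
    // exchange and prevents keep-alive reuse for the next entity write.
    try (InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream()) {
      if (in != null) {
        byte[] buf = new byte[4096];
        while (in.read(buf) != -1) {
          // discard
        }
      }
    }
    System.out.println("POST returned " + status);
  }
}
{code}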






[jira] [Commented] (YARN-6705) Add separate NM preemption thresholds for cpu and memory

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081343#comment-16081343
 ] 

Hadoop QA commented on YARN-6705:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} YARN-1011 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
21s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
52s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
56s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
49s{color} | {color:green} YARN-1011 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
15s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in 
YARN-1011 has 1 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
58s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in YARN-1011 has 5 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} YARN-1011 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
32s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
26s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
59s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 72m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6705 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876512/YARN-6705-YARN-1011.01.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 21bf04f7a74c 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081342#comment-16081342
 ] 

Hadoop QA commented on YARN-1014:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
44s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 5 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 44s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 41s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitorResourceChange
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-1014 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876520/YARN-1014.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 7d9ab9c0ce39 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 5496a34 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/16360/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/16360/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16360/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 

[jira] [Updated] (YARN-6378) Negative usedResources memory in CapacityScheduler

2017-07-10 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-6378:
---
Affects Version/s: (was: 2.7.2)
   2.6.0

> Negative usedResources memory in CapacityScheduler
> --
>
> Key: YARN-6378
> URL: https://issues.apache.org/jira/browse/YARN-6378
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
>
> Courtesy Thomas Nystrand, we found that on two of our clusters configured 
> with the CapacityScheduler, usedResources occasionally becomes negative. 
> e.g.
> {code}
> 2017-03-15 11:10:09,449 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignedContainer application attempt=appattempt_1487222361993_17177_01 
> container=Container: [ContainerId: container_1487222361993_17177_01_14, 
> NodeId: :27249, NodeHttpAddress: :8042, Resource: 
> , Priority: 2, Token: null, ] queue=: 
> capacity=0.2, absoluteCapacity=0.2, usedResources=, 
> usedCapacity=0.03409091, absoluteUsedCapacity=0.006818182, numApps=1, 
> numContainers=3 clusterResource= type=RACK_LOCAL
> {code}






[jira] [Comment Edited] (YARN-6378) Negative usedResources memory in CapacityScheduler

2017-07-10 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081355#comment-16081355
 ] 

Ravi Prakash edited comment on YARN-6378 at 7/10/17 11:43 PM:
--

Hi Filipp! Thanks for your report. Is the occurrence of the first negative 
usedResources correlated with applications being moved between queues in your 
case too? You can check this easily from the ResourceManager logs.


was (Author: raviprak):
Hi Filipp! Thanks for your report. Is the occurrence of the first negative 
usedResources correlated with applications being moved between queues in your 
case too?

> Negative usedResources memory in CapacityScheduler
> --
>
> Key: YARN-6378
> URL: https://issues.apache.org/jira/browse/YARN-6378
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
>
> Courtesy Thomas Nystrand, we found that on two of our clusters configured 
> with the CapacityScheduler, usedResources occasionally becomes negative. 
> e.g.
> {code}
> 2017-03-15 11:10:09,449 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignedContainer application attempt=appattempt_1487222361993_17177_01 
> container=Container: [ContainerId: container_1487222361993_17177_01_14, 
> NodeId: :27249, NodeHttpAddress: :8042, Resource: 
> , Priority: 2, Token: null, ] queue=: 
> capacity=0.2, absoluteCapacity=0.2, usedResources=, 
> usedCapacity=0.03409091, absoluteUsedCapacity=0.006818182, numApps=1, 
> numContainers=3 clusterResource= type=RACK_LOCAL
> {code}






[jira] [Commented] (YARN-6378) Negative usedResources memory in CapacityScheduler

2017-07-10 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081355#comment-16081355
 ] 

Ravi Prakash commented on YARN-6378:


Hi Filipp! Thanks for your report. Is the occurrence of the first negative 
usedResources correlated with applications being moved between queues in your 
case too?

> Negative usedResources memory in CapacityScheduler
> --
>
> Key: YARN-6378
> URL: https://issues.apache.org/jira/browse/YARN-6378
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
>
> Courtesy Thomas Nystrand, we found that on two of our clusters configured 
> with the CapacityScheduler, usedResources occasionally becomes negative. 
> e.g.
> {code}
> 2017-03-15 11:10:09,449 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignedContainer application attempt=appattempt_1487222361993_17177_01 
> container=Container: [ContainerId: container_1487222361993_17177_01_14, 
> NodeId: :27249, NodeHttpAddress: :8042, Resource: 
> , Priority: 2, Token: null, ] queue=: 
> capacity=0.2, absoluteCapacity=0.2, usedResources=, 
> usedCapacity=0.03409091, absoluteUsedCapacity=0.006818182, numApps=1, 
> numContainers=3 clusterResource= type=RACK_LOCAL
> {code}






[jira] [Commented] (YARN-5412) Create a proxy chain for ResourceManager REST API in the Router

2017-07-10 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081358#comment-16081358
 ] 

Carlo Curino commented on YARN-5412:



Giovanni, thanks for the contribution. Please find below a partial review (the 
patch is large, so I am going through it slowly and posting comments as I go, 
so you can parallelize addressing them).

BUG?
# {{DefaultRequestInterceptorREST.init}} does not propagate correctly to 
downstream interceptors... you might want to have a "localInit" method that 
locally initializes the components, while {{AbstractRESTRequestInterceptor}} 
takes care of propagating downstream by invoking the outer {{init}}.

MUST HAVE CLEANUPS
# In the next version of the patch, please remove the changes you already pushed in 
YARN-6792.
# {{DefaultRequestInterceptorREST}}: there is a lot of repeated code doing the 
"check status" stuff, which could be replaced by a single method that takes as 
input a string constant and the .class of the expected outcome (see the sketch 
after this list).
# {{RESTRequestInterceptor}}: there are a few (uncommented) methods which I am 
not sure belong here. Shall we move them to RMWebServiceProtocol? Help me 
understand why, e.g., getContainer should be here.
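
A hedged sketch of the kind of helper meant above (hypothetical names; the real patch's client plumbing and constants may differ): one generic method issues the REST call, checks the HTTP status against the expected constant, and converts the body to the expected class, so each repeated "check status" block collapses to a single call.

{code}
import javax.ws.rs.client.WebTarget;
import javax.ws.rs.core.Response;

/** Hypothetical helper; names and the JAX-RS client usage are illustrative only. */
final class RestCallHelper {

  private RestCallHelper() {
  }

  /**
   * Issues a GET to the RM at the given path, verifies the expected HTTP status,
   * and returns the response body converted to the requested type.
   */
  static <T> T getAndCheck(WebTarget rm, String path, int expectedStatus,
                           Class<T> returnType) {
    Response response = rm.path(path).request().get();
    try {
      if (response.getStatus() != expectedStatus) {
        throw new IllegalStateException(
            "Unexpected status " + response.getStatus() + " for " + path);
      }
      return response.readEntity(returnType);
    } finally {
      response.close();
    }
  }
}
{code}

Usage would then look like {{NodesInfo nodes = RestCallHelper.getAndCheck(rmTarget, "ws/v1/cluster/nodes", 200, NodesInfo.class);}} (again, hypothetical type and path), replacing each duplicated status-check block with one line.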

NITS
# {{yarn-default.xml}}: the comment is not very clear about how the list of 
classes makes up the pipeline.
# {{AbstractRESTRequestInterceptor}}: javadoc error ("can can").
# {{RMWSConsts.STATES}} is a confusing name; something like CONTAINER_STATES?
# {{RESTRequestInterceptor}}: the javadoc of {{setNextInterceptor}} is 
confusing. It says the last interceptor does not receive this call, while it 
will likely receive it with a null value, correct?

GOOD TO HAVE
# {{RMWebAppUtil.isStaticUser}} is invoked a lot in {{RMWebServices}}, and 
every time it reads from the conf. Can we cache the value of staticUser so 
this method becomes much lighter to invoke?
# {{DefaultRequestInterceptorREST}}: can we try to use generics to remove much 
of the near-duplicate code we currently have?

QUESTIONS:
# Can we scope down the import of the nodemanager project to tests only?
# Why does {{RESTRequestInterceptor.init}} have a "user" param? The javadoc is 
incomplete, and the param is not used by {{DefaultRequestInterceptorREST}}, so 
it is hard to understand.


[...] more to come (I will re-post the entire list as I add to it, so the 
numbering remains consistent... we can -strike out- elements as you address them).

> Create a proxy chain for ResourceManager REST API in the Router
> ---
>
> Key: YARN-5412
> URL: https://issues.apache.org/jira/browse/YARN-5412
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
> Attachments: YARN-5412-YARN-2915.1.patch, 
> YARN-5412-YARN-2915.proto.patch
>
>
> As detailed in the proposal in the umbrella JIRA, we are introducing a new 
> component that routes client request to appropriate ResourceManager(s). This 
> JIRA tracks the creation of a proxy for ResourceManager REST API in the 
> Router. This provides a placeholder for:
> 1) throttling mis-behaving clients (YARN-1546)
> 3) mask the access to multiple RMs (YARN-3659)
> We are planning to follow the interceptor pattern like we did in YARN-2884 to 
> generalize the approach and have only dynamic coupling for Federation.






[jira] [Commented] (YARN-6706) Refactor ContainerScheduler to make oversubscription change easier

2017-07-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080813#comment-16080813
 ] 

Karthik Kambatla commented on YARN-6706:


I feel we should get this into trunk directly instead of the YARN-1011 branch.

The current patch looks okay. Can we add a few more things:
# Make ResourceUtilizationTracker pluggable. That way, we could use a different 
tracker when oversubscription is enabled (a rough sketch follows this comment).
# ContainerScheduler
## Why do we need maxOppQueueLength given queuingLimit?
## Is there value in splitting runningContainers into runningGuaranteed and 
runningOpportunistic?
## The getOpportunisticContainersStatus method implementation feels awkward. How 
about capturing the state in a field here and having metrics etc. pull from 
here?
## startContainersFromQueue: the local variable resourcesAvailable is unnecessary.
# OpportunisticContainersStatus
## Let us clearly differentiate between allocated, used, and utilized. Maybe we 
should rename the current *Used* methods to *Allocated*?
## I prefer either the full name Opportunistic (in method names) or Opp (the 
shortest name that makes sense). Opport is neither short nor fully descriptive.
## Have we considered folding the ContainerQueuingLimit class into this?

/cc [~asuresh] and [~kkaranasos]
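
A minimal sketch of what a pluggable tracker could look like (hypothetical interface and method names; the real ResourceUtilizationTracker and ContainerScheduler in the NM differ): the scheduler depends only on the interface, so an oversubscription-aware implementation can be swapped in without touching the scheduling logic.

{code}
import java.util.HashMap;
import java.util.Map;

/** Hypothetical shapes; the real NM interfaces and configuration keys may differ. */
interface UtilizationTracker {
  void containerStarted(String containerId, long memBytes, int vcores);
  void containerFinished(String containerId);
  boolean hasRoomFor(long memBytes, int vcores);
}

/** Default tracker: admits containers based on allocated resources only. */
class AllocationBasedTracker implements UtilizationTracker {
  private final long memCapacity;
  private final int vcoreCapacity;
  private long memUsed;
  private int vcoresUsed;
  private final Map<String, long[]> perContainer = new HashMap<>();

  AllocationBasedTracker(long memCapacity, int vcoreCapacity) {
    this.memCapacity = memCapacity;
    this.vcoreCapacity = vcoreCapacity;
  }

  @Override public void containerStarted(String id, long memBytes, int vcores) {
    perContainer.put(id, new long[] {memBytes, vcores});
    memUsed += memBytes;
    vcoresUsed += vcores;
  }

  @Override public void containerFinished(String id) {
    long[] r = perContainer.remove(id);
    if (r != null) {
      memUsed -= r[0];
      vcoresUsed -= (int) r[1];
    }
  }

  @Override public boolean hasRoomFor(long memBytes, int vcores) {
    return memUsed + memBytes <= memCapacity && vcoresUsed + vcores <= vcoreCapacity;
  }
}

/** The scheduler is written against the interface, so an oversubscription-aware
 *  tracker (e.g. one driven by measured utilization) can replace the default. */
class MiniContainerScheduler {
  private final UtilizationTracker tracker;

  MiniContainerScheduler(UtilizationTracker tracker) {
    this.tracker = tracker;
  }

  boolean tryStart(String containerId, long memBytes, int vcores) {
    if (!tracker.hasRoomFor(memBytes, vcores)) {
      return false; // would be queued in the real ContainerScheduler
    }
    tracker.containerStarted(containerId, memBytes, vcores);
    return true;
  }
}
{code}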

> Refactor ContainerScheduler to make oversubscription change easier
> --
>
> Key: YARN-6706
> URL: https://issues.apache.org/jira/browse/YARN-6706
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-6706-YARN-1011.00.patch, 
> YARN-6706-YARN-1011.01.patch
>
>







[jira] [Commented] (YARN-6788) Improve performance of resource profile branch

2017-07-10 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080868#comment-16080868
 ] 

Wangda Tan commented on YARN-6788:
--

Thanks [~sunilg], comments:

Resource:
- {{public abstract Integer getResourceIndex(String resource)}} should be an 
internal API; how about only adding it as 
{{ResourceUtils.getResourceTypeIndex(String resourceName)}}? We should try to 
avoid storing any local resource_type_to_index information.

ResourcePBImpl:
- What does readOnlyResources mean?

bq. In Resource class, there are few impls for SimpleResource. I also had an 
offline talk with Varun Vasudev. May be its better to move all resource related 
impl from ResourcePBImpl to Resource class itself and can keep only proto 
related changes in PBImpl class. So SimpleResource could look into memory and 
cpu resource info objects.
I suggest not taking this shortcut; we have several built-in resource types 
being added to YARN, such as GPU/FPGA, and I don't want an unexpected 
performance regression after adding them.

Resources/DominantResourceCalculator (maybe there are more places that could be 
changed):
They currently use either {{setResourceValue(name, value)}} or 
{{getResourceInformation(rName)}}, both of which do frequent map-lookup 
operations. Instead, can we add public (but marked {{@Private}}) APIs to the 
Resource object that support get/setResourceInformation/Value by index, and use 
those internally for computations? (A rough sketch follows at the end of this 
comment.)
To me, name-based fields should not be used while doing computations; 
string-based names should only be used for human readability, such as in the 
UI or messages.

(Not performance related) There are many places in the code hardcoded to use 
{{set/getVirtualCores}} that assume there are only two resource types. I suggest 
reviewing all of them before merging the branch.
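
A rough sketch of the index-based access idea above (hypothetical class and method names; the real Resource/ResourceInformation classes and the ResourceUtils name-to-index mapping differ): values live in a plain array, the name-to-index resolution happens once, and hot-path computations never touch a map or a string.

{code}
/** Hypothetical sketch only; not the actual YARN Resource API. */
class IndexedResource {
  // Index 0 = memory, index 1 = vcores, further indices = extended types (GPU, FPGA, ...).
  private final long[] values;

  IndexedResource(int numResourceTypes) {
    this.values = new long[numResourceTypes];
  }

  // Hot-path accessors: no string hashing or map lookup involved.
  long getValue(int resourceIndex) {
    return values[resourceIndex];
  }

  void setValue(int resourceIndex, long value) {
    values[resourceIndex] = value;
  }
}

class IndexedResourceDemo {
  public static void main(String[] args) {
    // The name-to-index mapping is resolved once up front (e.g. from a
    // ResourceUtils-like registry), not on every get/set during scheduling.
    final int MEMORY = 0;
    final int VCORES = 1;

    IndexedResource used = new IndexedResource(2);
    used.setValue(MEMORY, 4096);
    used.setValue(VCORES, 3);

    IndexedResource capacity = new IndexedResource(2);
    capacity.setValue(MEMORY, 8192);
    capacity.setValue(VCORES, 8);

    // A dominant-share style computation can stay entirely index-based.
    double dominantShare = Math.max(
        (double) used.getValue(MEMORY) / capacity.getValue(MEMORY),
        (double) used.getValue(VCORES) / capacity.getValue(VCORES));
    System.out.println("dominant share = " + dominantShare);
  }
}
{code}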

> Improve performance of resource profile branch
> --
>
> Key: YARN-6788
> URL: https://issues.apache.org/jira/browse/YARN-6788
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Blocker
> Attachments: YARN-6788-YARN-3926.001.patch
>
>
> Currently we could see a 15% performance delta with this branch. 
> Few performance improvements to improve the same.
> Also this patch will handle 
> [comments|https://issues.apache.org/jira/browse/YARN-6761?focusedCommentId=16075418=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16075418]
>  from [~leftnoteasy].






[jira] [Commented] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081290#comment-16081290
 ] 

Hadoop QA commented on YARN-1014:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
42s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 5 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 15s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 47s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitorResourceChange
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-1014 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876510/YARN-1014.00.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 572f5bb2d371 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 5496a34 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/16358/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/16358/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| unit | 

[jira] [Commented] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081351#comment-16081351
 ] 

Hadoop QA commented on YARN-6777:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
56s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 52s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 234 unchanged - 0 fixed = 236 total (was 234) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
37s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
31s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 49s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}103m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
 |
|   | hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation |
|   | 
hadoop.yarn.server.resourcemanager.TestOpportunisticContainerAllocatorAMService 
|
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6777 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876332/YARN-6777.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 2a130b9c66ff 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-5731) Preemption does not work in few corner cases when reservations are placed

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081357#comment-16081357
 ] 

Hadoop QA commented on YARN-5731:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
53s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
24s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_131 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m  9s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_131. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}175m 12s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_131 Failed junit tests | 
hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicyForReservedContainers
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSurgicalPreemption
 |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | 
hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicy
 |
| JDK v1.7.0_131 Failed junit tests | 
hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicyForReservedContainers
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSurgicalPreemption
 |
|   | 

[jira] [Comment Edited] (YARN-6593) [API] Introduce Placement Constraint object

2017-07-10 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081175#comment-16081175
 ] 

Wangda Tan edited comment on YARN-6593 at 7/10/17 11:48 PM:


Thanks [~kkaranasos], 

I just took a closer look at the patch; some questions/comments (I consolidated 
all my earlier questions/comments into this one, so you only need to look at 
this comment and later ones).

1) Classes/methods marked {{@Public}} should be user-facing APIs or 
wire-protocol formats. I'm not sure if we should instead mark the following 
classes {{@Private}}:
- Visitable/Visitor/PlacementConstraintTransformations
- PlacementConstraintToProtoConverter (methods to transform a Constraint class 
to protocol should not be {{@Public}}, but the behavior of the process should 
be compatible).

And do you think we should explicitly add a comment to such classes saying 
they are marked {{@Public}} only because of wire-format compatibility?

2) Is there any reason to put PlacementConstraints in {{hadoop-yarn-common}} 
project instead of {{hadoop-yarn-api}}?

3) I'm not sure when you plan to use 
{{SingleConstraintTransformer}}/{{SpecializedConstraintTransformer}}. If you 
agree that they should not be user-facing, do you think we should move them to 
a separate JIRA while doing implementation?

4) Can we revert changes to TestContainerLaunch? 

5) In general, more javadocs need to be added for the user-facing APIs; we can do 
this once we have general agreement on the APIs.

= EDITs (July 10), added a few more points: =

6) As I commented on YARN-6594, one of the biggest issues with the old 
ResourceRequest is that it cannot properly handle resource updates. 
For example, at time T1 the app requests "host1" with #containers=2, and at time 
T2 the app decides to request 2 more containers on "host1", so it sends "host1" 
with #containers=4. However, by that time the scheduler has already allocated 2 
containers, so the scheduler receives the request and updates pending 
#containers to 4. When this happens, the app can receive many more allocated 
containers than it originally requested.

There are two separate problems:
6.1) Partially update a placement request within the tree. To alleviate the 
problem above, we can partially update a single placement constraint or multiple 
placement constraints. To do this, I think we can allocate a unique-constraint-id 
for each constraint internally and send only the diff to the RM each time; the 
end user should not even notice it.

For example:
{code}
A placement constraint specified by user
and {
  OR {
AND {
  target-constraint { host IN "host1" },
  cardinality-constraint { max_cardinality = 2}
},
AND {
  target-constraint { host IN "host2" }, 
  cardinality-constraint { max_cardinality = 4 }
}
}
{code}

A unique-constraint-id (UCID) will be assigned on the client side and sent to the 
server for each constraint. 
{code}
A placement constraint specified by user
and (ucid=1) {
  OR (ucid=2) {
AND (ucid=3) {
  target-constraint (ucid=4) { host IN "host1" },
  cardinality-constraint (ucid=5) { max_cardinality = 2}
},
AND (ucid=6) {
  target-constraint (ucid=7) { host IN "host2" }, 
  cardinality-constraint (ucid=8) { max_cardinality = 4 }
}
}
{code}

If the application updates max_cardinality to 3 for host1, only the diff will be 
sent to the scheduler:
{code}
diff: [
  {
cardinality-constraint (ucid=5) { max_cardinality = 3} 
  }
]
{code}

6.2) Increment/decrement #pending asks.
Partially updating constraints alone is not enough, because it cannot avoid the 
problem mentioned in 6). In addition to what is proposed in 6.1, do you think the 
protocol could express something like: +2 for maximum-cardinality for ucid=5? 
When the scheduler receives such a request, it would look at 
allocated-but-not-yet-acquired containers to avoid allocating more containers 
than requested.
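
To make the idea concrete, here is a minimal, purely illustrative sketch of 
applying a UCID-keyed delta on the scheduler side (class and method names are 
made up; this is not an actual YARN API):
{code}
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a UCID-keyed delta update as proposed in 6.1/6.2. */
public class ConstraintDeltaSketch {
  public static void main(String[] args) {
    // Scheduler-side view: ucid -> current max_cardinality.
    Map<Integer, Integer> maxCardinalityByUcid = new HashMap<Integer, Integer>();
    maxCardinalityByUcid.put(5, 2);  // cardinality-constraint (ucid=5) { max_cardinality = 2 }

    // The client sends only a delta: "+2 for maximum-cardinality for ucid=5".
    int ucid = 5;
    int delta = 2;

    // The scheduler applies the delta in place; before allocating more, it would
    // also subtract allocated-but-not-yet-acquired containers.
    maxCardinalityByUcid.put(ucid, maxCardinalityByUcid.get(ucid) + delta);
    System.out.println(maxCardinalityByUcid);  // prints {5=4}
  }
}
{code}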

Requesting: [~asuresh]/[~subru]/[~chris.douglas]/[~sunilg] for more suggestions.


was (Author: leftnoteasy):
Thanks [~kkaranasos], 

I just took a closer look at  the patch, some questions/comments (I 
consolidated all my above questions/comments to this one, so you only need to 
look at this comment and after).

1) For classes/methods are marked to be {{@public}}, it should be user-facing 
APIs or wire-protocol-format. I'm not sure if we should mark following classes 
to {{@private}}.
- Visitable/Visitor/PlacementConstraintTransformations
- PlacementConstraintToProtoConverter (Methods to transform a Constraint class 
to protocol should not be {{@public}}, but behavior of the process should be 
compatible).

And do you think we should explicitly add comment to following class to say 
they are marked to {{@public}} only because of wire-format compatibility?

2) Is there any reason to put PlacementConstraints in {{hadoop-yarn-common}} 
project instead of {{hadoop-yarn-api}}?

3) I'm not sure when you plan to use 
{{SingleConstraintTransformer}}/{{SpecializedConstraintTransformer}}. If you 
agree 

[jira] [Updated] (YARN-6705) Add separate NM preemption thresholds for cpu and memory

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6705:
-
Attachment: YARN-6705-YARN-1011.01.patch

New patch to address the checkstyle issue. The Findbugs warnings are unrelated.

> Add separate NM preemption thresholds for cpu and memory
> 
>
> Key: YARN-6705
> URL: https://issues.apache.org/jira/browse/YARN-6705
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-6705-YARN-1011.00.patch, 
> YARN-6705-YARN-1011.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6776) Refactor ApplicaitonMasterService to move actual processing logic to a separate class

2017-07-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081239#comment-16081239
 ] 

Hudson commented on YARN-6776:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11983 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11983/])
YARN-6776. Refactor ApplicaitonMasterService to move actual processing (arun 
suresh: rev 5496a34c0cb2b1a83cfa6b0aba5a77b05ff2d8f0)
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/DefaultAMSProcessor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestOpportunisticContainerAllocatorAMService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/OpportunisticContainerAllocatorAMService.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/ams/ApplicationMasterServiceProcessor.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/ams/ApplicationMasterServiceUtils.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/ams/package-info.java


> Refactor ApplicaitonMasterService to move actual processing logic to a 
> separate class
> -
>
> Key: YARN-6776
> URL: https://issues.apache.org/jira/browse/YARN-6776
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Minor
> Fix For: 3.0.0-beta1
>
> Attachments: YARN-6776.001.patch, YARN-6776.002.patch, 
> YARN-6776.003.patch, YARN-6776.004.patch
>
>
> Minor refactoring to move the processing logic of the 
> {{ApplicationMasterService}} into a separate class.
> The per appattempt locking as well as the extraction of the appAttemptId etc. 
> will remain in the ApplicationMasterService 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6798) NM startup failure with old state store due to version mismatch

2017-07-10 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang reassigned YARN-6798:


Assignee: Ray Chiang

> NM startup failure with old state store due to version mismatch
> ---
>
> Key: YARN-6798
> URL: https://issues.apache.org/jira/browse/YARN-6798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha4
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>
> YARN-6703 rolled back the state store version number for the RM from 2.0 to 
> 1.4.
> YARN-6127 bumped the version for the NM to 3.0
> private static final Version CURRENT_VERSION_INFO = 
> Version.newInstance(3, 0);
> YARN-5049 bumped the version for the NM to 2.0
> private static final Version CURRENT_VERSION_INFO = 
> Version.newInstance(2, 0);
> During an upgrade, all NMs died after upgrading a C6 cluster from alpha2 to 
> alpha4.
> {noformat}
> 2017-07-07 15:48:17,259 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: 
> Incompatible version for NM state: expecting NM state version 3.0, but 
> loading version 2.0
> at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:246)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:307)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:748)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:809)
> Caused by: java.io.IOException: Incompatible version for NM state: expecting 
> NM state version 3.0, but loading version 2.0
> at 
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.checkVersion(NMLeveldbStateStoreService.java:1454)
> at 
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:1308)
> at 
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:307)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> ... 5 more
> 2017-07-07 15:48:17,277 INFO 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
> /
> SHUTDOWN_MSG: Shutting down NodeManager at xxx.gce.cloudera.com/aa.bb.cc.dd
> /
> {noformat}
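
For context, a rough and purely illustrative sketch of the "same major version is 
compatible" rule that produces an error like the one above (not the actual 
NMLeveldbStateStoreService code):
{code}
import java.io.IOException;

/** Illustrative only; not the actual NMLeveldbStateStoreService logic. */
public class NMStateVersionCheckSketch {
  static void checkVersion(int currentMajor, int currentMinor,
                           int loadedMajor, int loadedMinor) throws IOException {
    // Typical rule: only the same major version is considered compatible, so an
    // NM expecting state version 3.0 refuses to load a 2.0 store.
    if (loadedMajor != currentMajor) {
      throw new IOException("Incompatible version for NM state: expecting NM state version "
          + currentMajor + "." + currentMinor + ", but loading version "
          + loadedMajor + "." + loadedMinor);
    }
  }

  public static void main(String[] args) throws IOException {
    checkVersion(3, 0, 2, 0);  // reproduces the failure mode described above
  }
}
{code}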



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6706) Refactor ContainerScheduler to make oversubscription change easier

2017-07-10 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081275#comment-16081275
 ] 

Haibo Chen commented on YARN-6706:
--

bq. Not sure if startPendingContainers() before and after en-queuing is usefull
You are referring to the else statement in 
ContainerScheduler#scheduleContainer(), right? The reason for calling 
startPendingContainers() both before and after enqueuing is so that we always 
respect the max queue limit for OPPORTUNISTIC containers.
If we always enqueued first and then called startPendingContainers(), we could 
end up going over the OPPORTUNISTIC container queue length.

For GUARANTEED containers, killOpportunisticContainers is needed if the 
GUARANTEED container stays in the queue after startPendingContainers(). If I 
misunderstood your comment, can you elaborate a little more?
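
To illustrate the ordering, here is a rough, self-contained sketch with made-up 
names (not the real ContainerScheduler code):
{code}
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only: why startPendingContainers() runs before and after enqueuing. */
public class OpportunisticQueueSketch {
  private final Deque<String> oppQueue = new ArrayDeque<String>();
  private final int maxOppQueueLength = 2;  // assumed config value
  private int freeSlots = 1;                // stand-in for free node resources

  /** Start as many queued containers as current resources allow. */
  private void startPendingContainers() {
    while (freeSlots > 0 && !oppQueue.isEmpty()) {
      oppQueue.poll();
      freeSlots--;
    }
  }

  /** OPPORTUNISTIC path: drain first so the length check sees the real backlog. */
  public boolean scheduleOpportunistic(String containerId) {
    startPendingContainers();                    // before enqueuing
    if (oppQueue.size() >= maxOppQueueLength) {
      return false;                              // would exceed the queue limit
    }
    oppQueue.add(containerId);
    startPendingContainers();                    // after enqueuing
    return true;
  }

  public static void main(String[] args) {
    OpportunisticQueueSketch s = new OpportunisticQueueSketch();
    System.out.println(s.scheduleOpportunistic("c1"));  // true: started immediately
    System.out.println(s.scheduleOpportunistic("c2"));  // true: queued
    System.out.println(s.scheduleOpportunistic("c3"));  // true: queued (limit = 2)
    System.out.println(s.scheduleOpportunistic("c4"));  // false: queue limit reached
  }
}
{code}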

> Refactor ContainerScheduler to make oversubscription change easier
> --
>
> Key: YARN-6706
> URL: https://issues.apache.org/jira/browse/YARN-6706
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-6706.01.patch, YARN-6706-YARN-1011.00.patch, 
> YARN-6706-YARN-1011.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-1014:
-
Attachment: YARN-1014.01.patch

Updated the patch to address the checkstyle issue. The Findbugs warnings are 
unrelated.

> Configure OOM Killer to kill OPPORTUNISTIC containers first
> ---
>
> Key: YARN-1014
> URL: https://issues.apache.org/jira/browse/YARN-1014
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Arun C Murthy
>Assignee: Karthik Kambatla
> Attachments: YARN-1014.00.patch, YARN-1014.01.patch
>
>
> YARN-2882 introduces the notion of OPPORTUNISTIC containers. These containers 
> should be killed first should the system run out of memory. 
> -
> Previous description:
> Once RM allocates 'speculative containers' we need to get LCE to schedule 
> them at lower priorities via cgroups.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6594) [API] Introduce SchedulingRequest object

2017-07-10 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081309#comment-16081309
 ] 

Wangda Tan commented on YARN-6594:
--

[~arun.sur...@gmail.com]

Agree with most of your points. Regarding ContainerRequest, I'm not sure we 
should keep it if we decide to deprecate AMRMClient. As in my comment above, 
it's highly bound to MR-like semantics.

I think we should not remove any of the AMRMClient/ContainerRequest logic, just 
deprecate it. We can add support on the RM side to convert placement-request 
/ container-request / resource-request to the unified representation. 
Applications can use AMRMClient as-is if they don't want to move.

I don't see any problem supporting logic like AM re-registration in the new 
implementation.

> [API] Introduce SchedulingRequest object
> 
>
> Key: YARN-6594
> URL: https://issues.apache.org/jira/browse/YARN-6594
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6594.001.patch
>
>
> This JIRA introduces a new SchedulingRequest object.
> It will be part of the {{AllocateRequest}} and will be used to define sizing 
> (e.g., number of allocations, size of allocations) and placement constraints 
> for allocations.
> Applications can use either this new object (when rich placement constraints 
> are required) or the existing {{ResourceRequest}} object.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-1014:
-
Attachment: YARN-1014.02.patch

Patch updated to fix the unit test failure.

> Configure OOM Killer to kill OPPORTUNISTIC containers first
> ---
>
> Key: YARN-1014
> URL: https://issues.apache.org/jira/browse/YARN-1014
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Arun C Murthy
>Assignee: Karthik Kambatla
> Attachments: YARN-1014.00.patch, YARN-1014.01.patch, 
> YARN-1014.02.patch
>
>
> YARN-2882 introduces the notion of OPPORTUNISTIC containers. These containers 
> should be killed first should the system run out of memory. 
> -
> Previous description:
> Once RM allocates 'speculative containers' we need to get LCE to schedule 
> them at lower priorities via cgroups.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6788) Improve performance of resource profile branch

2017-07-10 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-6788:
--
Priority: Blocker  (was: Major)

> Improve performance of resource profile branch
> --
>
> Key: YARN-6788
> URL: https://issues.apache.org/jira/browse/YARN-6788
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Blocker
>
> Currently we can see a 15% performance delta with this branch. 
> A few performance improvements are needed to address the same.
> Also this patch will handle 
> [comments|https://issues.apache.org/jira/browse/YARN-6761?focusedCommentId=16075418=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16075418]
>  from [~leftnoteasy].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6790) Yarn : Exception from container-launch : Container failed with state: EXITED_WITH_FAILURE

2017-07-10 Thread saurab (JIRA)
saurab created YARN-6790:


 Summary: Yarn : Exception from container-launch : Container failed 
with state: EXITED_WITH_FAILURE
 Key: YARN-6790
 URL: https://issues.apache.org/jira/browse/YARN-6790
 Project: Hadoop YARN
  Issue Type: Bug
 Environment: hadoop-2.8.0, tez-0.8.5
ram 8gb, Dell inspiron-15 3000 series intell-i5
Reporter: saurab


I wanted to run Hive queries through JDBC, but I am getting:

java.sql.SQLException: Error while processing statement: FAILED: Execution 
Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
Then I looked at the NodeManager log. Here are some key notes to consider:

1)Container container_1499666177243_0001_02_01 transitioned from RUNNING to 
EXITED_WITH_FAILURERESULT=FAILURE
2)DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE
And here is the complete stack trace:

2017-07-10 11:41:34,149 WARN 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception 
from container-launch with container ID: container_1499666177243_0001_02_01 
and exit code: 1
ExitCodeException exitCode=1: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
at org.apache.hadoop.util.Shell.run(Shell.java:869)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:236)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:305)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:84)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from 
container-launch.
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: 
container_1499666177243_0001_02_01
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 1
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Stack trace: 
ExitCodeException exitCode=1: 
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:   at 
org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:   at 
org.apache.hadoop.util.Shell.run(Shell.java:869)
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:   at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:   at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:236)
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:   at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:305)
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:   at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:84)
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:   at 
java.util.concurrent.FutureTask.run(FutureTask.java:266)
2017-07-10 11:41:34,152 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
2017-07-10 11:41:34,153 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
2017-07-10 11:41:34,153 INFO 
org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:   at 
java.lang.Thread.run(Thread.java:748)
2017-07-10 11:41:34,153 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
 Container exited with a non-zero exit code 1
2017-07-10 11:41:34,156 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
 Container container_1499666177243_0001_02_01 transitioned from RUNNING to 
EXITED_WITH_FAILURE
2017-07-10 11:41:34,156 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
 Cleaning up container container_1499666177243_0001_02_01
2017-07-10 11:41:34,199 WARN 
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=saurab   
OPERATION=Container Finished - Failed   

[jira] [Commented] (YARN-6102) On failover RM can crash due to unregistered event to AsyncDispatcher

2017-07-10 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080048#comment-16080048
 ] 

Rohith Sharma K S commented on YARN-6102:
-

I think there are 2 solutions to this issue. 
# Ensure RPC threads are stopped. This would be expensive, since every thread 
would need to join (or use some other mechanism).
# How about recreating RMContext for each transition? This would allow us to 
always take a new context and deal with it. cc: [~jianhe]


> On failover RM can crash due to unregistered event to AsyncDispatcher
> -
>
> Key: YARN-6102
> URL: https://issues.apache.org/jira/browse/YARN-6102
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0, 2.7.3
>Reporter: Ajith S
>Assignee: Ajith S
>Priority: Critical
> Attachments: eventOrder.JPG
>
>
> {code}2017-01-17 16:42:17,911 FATAL [AsyncDispatcher event handler] 
> event.AsyncDispatcher (AsyncDispatcher.java:dispatch(200)) - Error in 
> dispatcher thread
> java.lang.Exception: No handler for registered for class 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEventType
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:196)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:120)
> at java.lang.Thread.run(Thread.java:745)
> 2017-01-17 16:42:17,914 INFO  [AsyncDispatcher ShutDown handler] 
> event.AsyncDispatcher (AsyncDispatcher.java:run(303)) - Exiting, bbye..{code}
> The same stack i was also noticed in {{TestResourceTrackerOnHA}} exits 
> abnormally, after some analysis, i was able to reproduce.
> Once the nodeHeartBeat is sent to RM, inside 
> {{org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest)}},
>  before sending it to dispatcher through
> {{this.rmContext.getDispatcher().getEventHandler().handle(nodeStatusEvent);}} 
> if RM failover is called, the dispatcher is reset
> The new dispatcher is however first started and then the events are 
> registered at 
> {{org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(boolean)}}
> So event order will look like
> 1. Send Node heartbeat to {{ResourceTrackerService}}
> 2. In {{ResourceTrackerService.nodeHeartbeat}}, before passing to dispatcher 
> call RM failover
> 3. In RM Failover, current active will reset dispatcher @reinitialize i.e ( 
> {{resetDispatcher();}} + {{createAndInitActiveServices();}} )
> Now between {{resetDispatcher();}} and {{createAndInitActiveServices();}} , 
> the {{ResourceTrackerService.nodeHeartbeat}} invokes dipatcher
> This will cause the above error as at point of time when {{STATUS_UPDATE}} 
> event is given to dispatcher in {{ResourceTrackerService}} , the new 
> dispatcher(from the failover) may be started but not yet registered for events
> Using same steps(with pausing JVM at debug), i was able to reproduce this in 
> production cluster also. for {{STATUS_UPDATE}} active service event, when the 
> service is yet to forward the event to RM dispatcher but a failover is called 
> and dispatcher reset is between {{resetDispatcher();}} & 
> {{createAndInitActiveServices();}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6761) Fix build for YARN-3926 branch

2017-07-10 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080013#comment-16080013
 ] 

Daniel Templeton commented on YARN-6761:


Thanks for getting the patch in, [~sunilg].  Sorry I didn't get a chance to 
review it before the commit.  Your patch did exactly what I would have done. :)

I have a question about the way the {{ResourceInformation}} objects are handled 
in this patch and in general. Why do we need local copies of them?
 Shouldn't there only be one master source of truth for resource information, 
and shouldn't everyone just refer to that master source of truth?  I assume 
that's where you're going with YARN-6789.

> Fix build for YARN-3926 branch
> --
>
> Key: YARN-6761
> URL: https://issues.apache.org/jira/browse/YARN-6761
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: YARN-3926
>
> Attachments: YARN-6761-YARN-3926.001.patch
>
>
> After rebasing to trunk, due to the addition of YARN-6679, compilation of the 
> YARN-3926 branch is broken.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-6785) Resource.SimpleResource does not implement the new Resource methods

2017-07-10 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton resolved YARN-6785.

Resolution: Duplicate

> Resource.SimpleResource does not implement the new Resource methods
> ---
>
> Key: YARN-6785
> URL: https://issues.apache.org/jira/browse/YARN-6785
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: YARN-3926
>Reporter: Daniel Templeton
>Priority: Blocker
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6761) Fix build for YARN-3926 branch

2017-07-10 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080016#comment-16080016
 ] 

Sunil G commented on YARN-6761:
---

Thanks [~templedf] 

Yes, you are correct. In YARN-6788, I made a single source of truth and refer to 
the same everywhere. Hence I could save a bit of extra overhead in map iteration, 
etc. I'll be uploading a patch today for the same, with test results.

> Fix build for YARN-3926 branch
> --
>
> Key: YARN-6761
> URL: https://issues.apache.org/jira/browse/YARN-6761
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: YARN-3926
>
> Attachments: YARN-6761-YARN-3926.001.patch
>
>
> After rebasing to trunk, due to the addition of YARN-6679, compilation of the 
> YARN-3926 branch is broken.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5655) TestContainerManagerSecurity#testNMTokens is asserting

2017-07-10 Thread Sonia Garudi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079964#comment-16079964
 ] 

Sonia Garudi commented on YARN-5655:


Any update on this? I am seeing this error in trunk.

> TestContainerManagerSecurity#testNMTokens is asserting
> --
>
> Key: YARN-5655
> URL: https://issues.apache.org/jira/browse/YARN-5655
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Jason Lowe
>Assignee: Robert Kanter
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: mvn_test.out, YARN-5655.001.patch
>
>
> TestContainerManagerSecurity has been failing recently in 2.8:
> {noformat}
> Running org.apache.hadoop.yarn.server.TestContainerManagerSecurity
> Tests run: 2, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 80.928 sec 
> <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity
> testContainerManager[0](org.apache.hadoop.yarn.server.TestContainerManagerSecurity)
>   Time elapsed: 44.478 sec  <<< ERROR!
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:394)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:337)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157)
> testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity)
>   Time elapsed: 34.964 sec  <<< FAILURE!
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:333)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6771) Use classloader inside configuration class to make new classes

2017-07-10 Thread Jongyoul Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jongyoul Lee updated YARN-6771:
---
Affects Version/s: 3.0.0-alpha4

> Use classloader inside configuration class to make new classes 
> ---
>
> Key: YARN-6771
> URL: https://issues.apache.org/jira/browse/YARN-6771
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.1, 3.0.0-alpha4
>Reporter: Jongyoul Lee
> Fix For: 2.8.2
>
> Attachments: YARN-6771-1.patch, YARN-6771-2.patch, YARN-6771.patch
>
>
> While running {{RpcClientFactoryPBImpl.getClient}}, 
> {{RpcClientFactoryPBImpl}} uses {{localConf.getClassByName}}. But in the case of 
> using a custom classloader, we have to use {{conf.getClassByName}}, because the 
> custom classloader is already stored in the {{Configuration}} class.
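
As a rough illustration of the difference (hypothetical jar and class names; not 
the actual RpcClientFactoryPBImpl code):
{code}
import java.net.URL;
import java.net.URLClassLoader;
import org.apache.hadoop.conf.Configuration;

/** Illustrative only; the jar and class names below are made up. */
public class ConfClassLoaderSketch {
  public static void main(String[] args) throws Exception {
    ClassLoader custom = new URLClassLoader(
        new URL[] { new URL("file:///tmp/my-plugin.jar") },  // hypothetical plugin jar
        ConfClassLoaderSketch.class.getClassLoader());

    Configuration conf = new Configuration();
    conf.setClassLoader(custom);

    // Resolves through the custom classloader stored in the caller's conf
    // (succeeds only if such a jar/class actually exists).
    Class<?> viaConf = conf.getClassByName("org.example.MyClientPBImpl");

    // A freshly created local Configuration knows nothing about the custom
    // classloader, so the same lookup fails with ClassNotFoundException.
    Class<?> viaLocalConf = new Configuration().getClassByName("org.example.MyClientPBImpl");
  }
}
{code}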



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5412) Create a proxy chain for ResourceManager REST API in the Router

2017-07-10 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080700#comment-16080700
 ] 

Giovanni Matteo Fumarola commented on YARN-5412:


Attached v1, with some checkstyle warnings fixed and the correct pom.xml for the 
hadoop-yarn-server.router package. 

> Create a proxy chain for ResourceManager REST API in the Router
> ---
>
> Key: YARN-5412
> URL: https://issues.apache.org/jira/browse/YARN-5412
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
> Attachments: YARN-5412-YARN-2915.1.patch, 
> YARN-5412-YARN-2915.proto.patch
>
>
> As detailed in the proposal in the umbrella JIRA, we are introducing a new 
> component that routes client request to appropriate ResourceManager(s). This 
> JIRA tracks the creation of a proxy for ResourceManager REST API in the 
> Router. This provides a placeholder for:
> 1) throttling mis-behaving clients (YARN-1546)
> 3) mask the access to multiple RMs (YARN-3659)
> We are planning to follow the interceptor pattern like we did in YARN-2884 to 
> generalize the approach and have only dynamic coupling for Federation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6675) Add NM support to launch opportunistic containers based on oversubscription

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6675:
-
Attachment: YARN-6675.00.patch

Attached a patch that is built on top of YARN-6706-YARN-1011.01.patch

> Add NM support to launch opportunistic containers based on oversubscription
> ---
>
> Key: YARN-6675
> URL: https://issues.apache.org/jira/browse/YARN-6675
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-6675.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6793) Duplicated reservation in Fair Scheduler preemption

2017-07-10 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-6793:
---
Description: 
There is a delay between when preemption happens and when containers are killed. 
If the resources released from the nodes that are supposed to be preempted are 
not enough for the resource request at that time, a reservation happens again at 
that node.
E.g. the scheduler reserves  in node 1 for app 1. It will 
take 15s by default to kill containers in node 1 to fulfill that resource 
request. If  was released from node 1 before the 
killing, the scheduler reserves  again in node 1 for app1. 
The second reservation may never be unreserved. 

  was:
There is a delay between preemption happen and containers are killed. If some 
resources released from nodes which are supposed to be preempted at that time 
are not enough for the resource request, reservation happens again at that node.
E.g. scheduler reserves  in node 1 for app 1. It will 
take 15s by default to kill containers in node 1 for fulfill that resource 
requests. If  was released from node 1 before the 
killing, scheduler reserves  again in node 1 for app1.  


> Duplicated reservation in Fair Scheduler preemption 
> 
>
> Key: YARN-6793
> URL: https://issues.apache.org/jira/browse/YARN-6793
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.1, 3.0.0-alpha3
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>Priority: Critical
>
> There is a delay between when preemption happens and when containers are killed. 
> If the resources released from the nodes that are supposed to be preempted are 
> not enough for the resource request at that time, a reservation happens again at 
> that node.
> E.g. the scheduler reserves  in node 1 for app 1. It will 
> take 15s by default to kill containers in node 1 to fulfill that resource 
> request. If  was released from node 1 before the 
> killing, the scheduler reserves  again in node 1 for app1. 
> The second reservation may never be unreserved. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6706) Refactor ContainerScheduler to make oversubscription change easier

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6706:
-
Attachment: YARN-6706.01.patch

Uploaded a patch against trunk; it is the same as YARN-6706-YARN-1011.01.patch, 
since that applies cleanly.

> Refactor ContainerScheduler to make oversubscription change easier
> --
>
> Key: YARN-6706
> URL: https://issues.apache.org/jira/browse/YARN-6706
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-6706.01.patch, YARN-6706-YARN-1011.00.patch, 
> YARN-6706-YARN-1011.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6705) Add separate NM preemption thresholds for cpu and memory

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6705:
-
Attachment: YARN-6675.00.patch

> Add separate NM preemption thresholds for cpu and memory
> 
>
> Key: YARN-6705
> URL: https://issues.apache.org/jira/browse/YARN-6705
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-6705-YARN-1011.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6706) Refactor ContainerScheduler to make oversubscription change easier

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080916#comment-16080916
 ] 

Hadoop QA commented on YARN-6706:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
46s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 5 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m  
8s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 42s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6706 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876466/YARN-6706.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux dc1855100dc5 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 09653ea |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/16352/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16352/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16352/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Refactor ContainerScheduler to make oversubscription change easier
> --
>
> Key: YARN-6706
> URL: https://issues.apache.org/jira/browse/YARN-6706
> Project: 

[jira] [Commented] (YARN-5892) Support user-specific minimum user limit percentage in Capacity Scheduler

2017-07-10 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080842#comment-16080842
 ] 

Sunil G commented on YARN-5892:
---

bq.I tried backporting YARN-3140 to branch-2.8, but I had several deadlock 
problems.
OK, that's worrying. I understand the urgency, so I am fine with the approach 
suggested here. I'll help review this patch soon.

> Support user-specific minimum user limit percentage in Capacity Scheduler
> -
>
> Key: YARN-5892
> URL: https://issues.apache.org/jira/browse/YARN-5892
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Eric Payne
>Assignee: Eric Payne
> Fix For: 3.0.0-alpha3
>
> Attachments: Active users highlighted.jpg, YARN-5892.001.patch, 
> YARN-5892.002.patch, YARN-5892.003.patch, YARN-5892.004.patch, 
> YARN-5892.005.patch, YARN-5892.006.patch, YARN-5892.007.patch, 
> YARN-5892.008.patch, YARN-5892.009.patch, YARN-5892.010.patch, 
> YARN-5892.012.patch, YARN-5892.013.patch, YARN-5892.014.patch, 
> YARN-5892.015.patch, YARN-5892.branch-2.015.patch
>
>
> Currently, in the capacity scheduler, the {{minimum-user-limit-percent}} 
> property is per queue. A cluster admin should be able to set the minimum user 
> limit percent on a per-user basis within the queue.
> This functionality is needed so that when intra-queue preemption is enabled 
> (YARN-4945 / YARN-2113), some users can be deemed as more important than 
> other users, and resources from VIP users won't be as likely to be preempted.
> For example, if the {{getstuffdone}} queue has a MULP of 25 percent, but user 
> {{jane}} is a power user of queue {{getstuffdone}} and needs to be guaranteed 
> 75 percent, the properties for {{getstuffdone}} and {{jane}} would look like 
> this:
> {code}
>   <property>
>     <name>yarn.scheduler.capacity.root.getstuffdone.minimum-user-limit-percent</name>
>     <value>25</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.getstuffdone.jane.minimum-user-limit-percent</name>
>     <value>75</value>
>   </property>
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6705) Add separate NM preemption thresholds for cpu and memory

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6705:
-
Attachment: YARN-6705-YARN-1011.00.patch

> Add separate NM preemption thresholds for cpu and memory
> 
>
> Key: YARN-6705
> URL: https://issues.apache.org/jira/browse/YARN-6705
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-6705-YARN-1011.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5892) Support user-specific minimum user limit percentage in Capacity Scheduler

2017-07-10 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080811#comment-16080811
 ] 

Eric Payne commented on YARN-5892:
--

Thanks [~sunilg]
bq. YARN-3140 is something i would like to see in branch-2.8 as well. I ll pool 
some bandwidth for same, but i might need some more time (considering tests).
Sorry, but I don't think my statement about YARN-3140 was clear.
- I tried backporting YARN-3140 to branch-2.8, but I had several deadlock 
problems.
- I also tried backporting YARN-3140 to branch-2 and ran into deadlock problems 
as well, but not as many, and I didn't look into it as hard as I did for 
branch-2.8.
- We are anxious to get the user weights feature backported soon.
I don't think it is worth the effort to try to backport YARN-3140 to 
branch-2.8. However, for branch-2, I will make time to try the backport again.
{quote}
bq. On a related note (if no locking is implemented), I think the following 
logic is incorrect
May be we can pull this to another ticket and retrospect there. I think we need 
a second look in this in detail.
{quote}
{code}
  count += (user != null) ? user.getWeight() : 1.0f;
{code}
It's just a matter of changing the 1 to 0. I would like to go ahead and do that 
in the next patch.
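For clarity, the corrected line would then presumably read:
{code}
  count += (user != null) ? user.getWeight() : 0.0f;
{code}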

> Support user-specific minimum user limit percentage in Capacity Scheduler
> -
>
> Key: YARN-5892
> URL: https://issues.apache.org/jira/browse/YARN-5892
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Eric Payne
>Assignee: Eric Payne
> Fix For: 3.0.0-alpha3
>
> Attachments: Active users highlighted.jpg, YARN-5892.001.patch, 
> YARN-5892.002.patch, YARN-5892.003.patch, YARN-5892.004.patch, 
> YARN-5892.005.patch, YARN-5892.006.patch, YARN-5892.007.patch, 
> YARN-5892.008.patch, YARN-5892.009.patch, YARN-5892.010.patch, 
> YARN-5892.012.patch, YARN-5892.013.patch, YARN-5892.014.patch, 
> YARN-5892.015.patch, YARN-5892.branch-2.015.patch
>
>
> Currently, in the capacity scheduler, the {{minimum-user-limit-percent}} 
> property is per queue. A cluster admin should be able to set the minimum user 
> limit percent on a per-user basis within the queue.
> This functionality is needed so that when intra-queue preemption is enabled 
> (YARN-4945 / YARN-2113), some users can be deemed as more important than 
> other users, and resources from VIP users won't be as likely to be preempted.
> For example, if the {{getstuffdone}} queue has a MULP of 25 percent, but user 
> {{jane}} is a power user of queue {{getstuffdone}} and needs to be guaranteed 
> 75 percent, the properties for {{getstuffdone}} and {{jane}} would look like 
> this:
> {code}
>   <property>
>     <name>yarn.scheduler.capacity.root.getstuffdone.minimum-user-limit-percent</name>
>     <value>25</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.getstuffdone.jane.minimum-user-limit-percent</name>
>     <value>75</value>
>   </property>
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6776) Refactor ApplicaitonMasterService to move actual processing logic to a separate class

2017-07-10 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-6776:
--
Attachment: YARN-6776.004.patch

Updating the patch to fix an unused import.
The test case failures are unrelated. 

> Refactor ApplicaitonMasterService to move actual processing logic to a 
> separate class
> -
>
> Key: YARN-6776
> URL: https://issues.apache.org/jira/browse/YARN-6776
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Minor
> Attachments: YARN-6776.001.patch, YARN-6776.002.patch, 
> YARN-6776.003.patch, YARN-6776.004.patch
>
>
> Minor refactoring to move the processing logic of the 
> {{ApplicationMasterService}} into a separate class.
> The per appattempt locking as well as the extraction of the appAttemptId etc. 
> will remain in the ApplicationMasterService 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6792) Incorrect XML convertion in NodeIDsInfo and LabelsToNodesInfo

2017-07-10 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-6792:
---
Attachment: YARN-6792.v1.patch

> Incorrect XML convertion in NodeIDsInfo and LabelsToNodesInfo
> -
>
> Key: YARN-6792
> URL: https://issues.apache.org/jira/browse/YARN-6792
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Blocker
> Attachments: YARN-6792.v1.patch
>
>
> NodeIDsInfo contains a typo and there is a missing constructor in 
> LabelsToNodesInfo. These bugs do not allow a correct conversion to XML of 
> LabelsToNodesInfo.
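
As a side note, the dependence of the XML conversion on a no-arg constructor can 
be sketched like this (an illustrative JAXB bean only, not the actual 
LabelsToNodesInfo class):
{code}
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

/** Illustrative only; not the actual LabelsToNodesInfo DAO. */
@XmlRootElement(name = "labelsToNodesInfo")
@XmlAccessorType(XmlAccessType.FIELD)
public class LabelsToNodesInfoSketch {
  private List<String> nodeIds = new ArrayList<String>();

  // JAXB generally requires a no-arg constructor to instantiate the bean during
  // XML conversion; without one, the conversion fails.
  public LabelsToNodesInfoSketch() {
  }

  public LabelsToNodesInfoSketch(List<String> nodeIds) {
    this.nodeIds = nodeIds;
  }
}
{code}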



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6771) Use classloader inside configuration class to make new classes

2017-07-10 Thread Jongyoul Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jongyoul Lee updated YARN-6771:
---
Target Version/s: 3.0.0-alpha4, 2.8.2  (was: 2.8.2)

> Use classloader inside configuration class to make new classes 
> ---
>
> Key: YARN-6771
> URL: https://issues.apache.org/jira/browse/YARN-6771
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.1, 3.0.0-alpha4
>Reporter: Jongyoul Lee
> Fix For: 2.8.2
>
> Attachments: YARN-6771-1.patch, YARN-6771-2.patch, YARN-6771.patch
>
>
> While running {{RpcClientFactoryPBImpl.getClient}}, 
> {{RpcClientFactoryPBImpl}} uses {{localConf.getClassByName}}. But in the case of 
> using a custom classloader, we have to use {{conf.getClassByName}}, because the 
> custom classloader is already stored in the {{Configuration}} class.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5892) Support user-specific minimum user limit percentage in Capacity Scheduler

2017-07-10 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080777#comment-16080777
 ] 

Sunil G commented on YARN-5892:
---

Thanks [~eepayne]
YARN-3140 is something I would like to see in branch-2.8 as well. I'll pool 
some bandwidth for the same, but I might need some more time (considering tests).

bq.I don't think locking is necessary.
It's not a logical problem, I think. I was talking about improving it; we 
already have a reentrant lock there.

bq.On a related note (if no locking is implemented), I think the following 
logic is incorrect
Maybe we can pull this into another ticket and retrospect there. I think we need 
a second, more detailed look at this.

> Support user-specific minimum user limit percentage in Capacity Scheduler
> -
>
> Key: YARN-5892
> URL: https://issues.apache.org/jira/browse/YARN-5892
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Eric Payne
>Assignee: Eric Payne
> Fix For: 3.0.0-alpha3
>
> Attachments: Active users highlighted.jpg, YARN-5892.001.patch, 
> YARN-5892.002.patch, YARN-5892.003.patch, YARN-5892.004.patch, 
> YARN-5892.005.patch, YARN-5892.006.patch, YARN-5892.007.patch, 
> YARN-5892.008.patch, YARN-5892.009.patch, YARN-5892.010.patch, 
> YARN-5892.012.patch, YARN-5892.013.patch, YARN-5892.014.patch, 
> YARN-5892.015.patch, YARN-5892.branch-2.015.patch
>
>
> Currently, in the capacity scheduler, the {{minimum-user-limit-percent}} 
> property is per queue. A cluster admin should be able to set the minimum user 
> limit percent on a per-user basis within the queue.
> This functionality is needed so that when intra-queue preemption is enabled 
> (YARN-4945 / YARN-2113), some users can be deemed as more important than 
> other users, and resources from VIP users won't be as likely to be preempted.
> For example, if the {{getstuffdone}} queue has a MULP of 25 percent, but user 
> {{jane}} is a power user of queue {{getstuffdone}} and needs to be guaranteed 
> 75 percent, the properties for {{getstuffdone}} and {{jane}} would look like 
> this:
> {code}
>   <property>
>     <name>yarn.scheduler.capacity.root.getstuffdone.minimum-user-limit-percent</name>
>     <value>25</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.getstuffdone.jane.minimum-user-limit-percent</name>
>     <value>75</value>
>   </property>
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6792) Incorrect XML convertion in NodeIDsInfo and LabelsToNodesInfo

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080695#comment-16080695
 ] 

Hadoop QA commented on YARN-6792:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 0 new + 6 unchanged - 1 fixed = 6 total (was 7) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 43m 46s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6792 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876442/YARN-6792.v1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux a16ac08fddfc 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 09653ea |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/16350/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16350/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16350/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Incorrect XML 

[jira] [Created] (YARN-6795) Add per-node max allocation threshold with respect to its capacity

2017-07-10 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-6795:


 Summary: Add per-node max allocation threshold with respect to its 
capacity
 Key: YARN-6795
 URL: https://issues.apache.org/jira/browse/YARN-6795
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Haibo Chen
Assignee: Haibo Chen






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6796) Add unit test for NM to launch OPPORTUNISTIC container for oversubscription

2017-07-10 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-6796:


 Summary: Add unit test for NM to launch OPPORTUNISTIC container 
for oversubscription
 Key: YARN-6796
 URL: https://issues.apache.org/jira/browse/YARN-6796
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Haibo Chen
Assignee: Haibo Chen






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5412) Create a proxy chain for ResourceManager REST API in the Router

2017-07-10 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-5412:
---
Attachment: YARN-5412-YARN-2915.1.patch

> Create a proxy chain for ResourceManager REST API in the Router
> ---
>
> Key: YARN-5412
> URL: https://issues.apache.org/jira/browse/YARN-5412
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
> Attachments: YARN-5412-YARN-2915.1.patch, 
> YARN-5412-YARN-2915.proto.patch
>
>
> As detailed in the proposal in the umbrella JIRA, we are introducing a new 
> component that routes client requests to the appropriate ResourceManager(s). This 
> JIRA tracks the creation of a proxy for the ResourceManager REST API in the 
> Router. This provides a placeholder for:
> 1) throttling mis-behaving clients (YARN-1546)
> 3) masking the access to multiple RMs (YARN-3659)
> We are planning to follow the interceptor pattern, as we did in YARN-2884, to 
> generalize the approach and have only dynamic coupling for Federation.
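For illustration, a generic interceptor-chain sketch (hypothetical interfaces and names, not the actual Router classes introduced by this JIRA):

{code}
// Each interceptor can inspect or throttle a request and then delegate to the
// next element in the chain; the chain itself can be assembled dynamically.
interface RestInterceptor {
  void setNext(RestInterceptor next);
  String handle(String request);
}

class ThrottlingInterceptor implements RestInterceptor {
  private RestInterceptor next;

  @Override
  public void setNext(RestInterceptor next) {
    this.next = next;
  }

  @Override
  public String handle(String request) {
    // A real implementation would reject or delay mis-behaving clients here
    // before passing the call on to the next interceptor.
    return next == null ? "end-of-chain" : next.handle(request);
  }
}
{code}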



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6793) Duplicated reservation in Fair Scheduler preemption

2017-07-10 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-6793:
---
Description: 
There is a delay between preemption happen and containers are killed. If some 
resources released from nodes which are supposed to be preempted at that time 
are not enough for the resource request, reservation happens again at that node.
E.g. scheduler reserves  in node 1 for app 1. It will 
take 15s by default to kill containers in node 1 for fulfill that resource 
requests. If  was released from node 1 before the 
killing, scheduler reserves  again in node 1 for app1.  

  was:
There is a delay between preemption happen and containers are killed. If some 
resources released from nodes which are supposed to be preempted at that time 
are not enough for the resource request, reservation happens again at that node.
E.g. scheduler reserves  in node 1 for app 1. It will 
take 15s by default to kill containers in node 1 for fulfill that resource 
requests. If  released from node 1 before the killing, 
scheduler reserve  for app1 again in node 1. 


> Duplicated reservation in Fair Scheduler preemption 
> 
>
> Key: YARN-6793
> URL: https://issues.apache.org/jira/browse/YARN-6793
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.1, 3.0.0-alpha3
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>Priority: Critical
>
> There is a delay between preemption happen and containers are killed. If some 
> resources released from nodes which are supposed to be preempted at that time 
> are not enough for the resource request, reservation happens again at that 
> node.
> E.g. scheduler reserves  in node 1 for app 1. It will 
> take 15s by default to kill containers in node 1 for fulfill that resource 
> requests. If  was released from node 1 before the 
> killing, scheduler reserves  again in node 1 for app1.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6793) Duplicated reservation in Fair Scheduler preemption

2017-07-10 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-6793:
---
Description: 
There is a delay between preemption happen and containers are killed. If some 
resources released from nodes which are supposed to be preempted at that time 
are not enough for the resource request, reservation happens again at that node.
E.g. scheduler reserves  in node 1 for app 1. It will 
take 15s by default to kill containers in node 1 for fulfill that resource 
requests. If  released from node 1 before the killing, 
scheduler reserve  for app1 again in node 1. 

  was:
There is a delay between preemption happen and containers are killed. If some 
resources released from nodes which are supposed to be preempted at that time 
are not enough for the resource request, reservation happens again at that node.
E.g. scheduler reserve  in node 1 for app 1. It will take 
15s by default to kill containers in node 1 for fulfill that resource requests. 
If  released from node 1 before the killing, scheduler 
reserve  for app1 again in node 1. 


> Duplicated reservation in Fair Scheduler preemption 
> 
>
> Key: YARN-6793
> URL: https://issues.apache.org/jira/browse/YARN-6793
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.1, 3.0.0-alpha3
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>Priority: Critical
>
> There is a delay between preemption happen and containers are killed. If some 
> resources released from nodes which are supposed to be preempted at that time 
> are not enough for the resource request, reservation happens again at that 
> node.
> E.g. scheduler reserves  in node 1 for app 1. It will 
> take 15s by default to kill containers in node 1 for fulfill that resource 
> requests. If  released from node 1 before the killing, 
> scheduler reserve  for app1 again in node 1. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6793) Duplicated reservation in Fair Scheduler preemption

2017-07-10 Thread Yufei Gu (JIRA)
Yufei Gu created YARN-6793:
--

 Summary: Duplicated reservation in Fair Scheduler preemption 
 Key: YARN-6793
 URL: https://issues.apache.org/jira/browse/YARN-6793
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 3.0.0-alpha3, 2.8.1
Reporter: Yufei Gu
Assignee: Yufei Gu
Priority: Critical


There is a delay between preemption happen and containers are killed. If some 
resources released from nodes which are supposed to be preempted at that time 
are not enough for the resource request, reservation happens again at that node.
E.g. scheduler reserve  in node 1 for app 1. It will take 
15s by default to kill containers in node 1 for fulfill that resource requests. 
If  released from node 1 before the killing, scheduler 
reserve  for app1 again in node 1. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6796) Add unit test for NM to launch OPPORTUNISTIC container for oversubscription

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6796:
-
Issue Type: Sub-task  (was: Bug)
Parent: YARN-1011

> Add unit test for NM to launch OPPORTUNISTIC container for oversubscription
> ---
>
> Key: YARN-6796
> URL: https://issues.apache.org/jira/browse/YARN-6796
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6505) Define the strings used in SLS JSON input file format

2017-07-10 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/YARN-6505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Novák reassigned YARN-6505:
---

Assignee: Gergely Novák

> Define the strings used in SLS JSON input file format
> -
>
> Key: YARN-6505
> URL: https://issues.apache.org/jira/browse/YARN-6505
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Reporter: Yufei Gu
>Assignee: Gergely Novák
>  Labels: newbie
> Attachments: YARN-6505.001.patch
>
>
> We could put them in a Java file like what YarnConfiguration does.
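For illustration, a minimal constants holder in the spirit described above; the field names below are hypothetical placeholders, not the actual SLS JSON keys:

{code}
// Central place for the SLS JSON input field names, analogous in spirit to
// how YarnConfiguration centralizes configuration keys.
public final class SLSJsonKeys {
  public static final String JOB_START_MS = "job.start.ms";
  public static final String JOB_QUEUE_NAME = "job.queue.name";
  public static final String CONTAINER_TYPE = "container.type";

  private SLSJsonKeys() {
    // constants only, no instances
  }
}
{code}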



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6792) Incorrect XML convertion in NodeIDsInfo and LabelsToNodesInfo

2017-07-10 Thread Giovanni Matteo Fumarola (JIRA)
Giovanni Matteo Fumarola created YARN-6792:
--

 Summary: Incorrect XML convertion in NodeIDsInfo and 
LabelsToNodesInfo
 Key: YARN-6792
 URL: https://issues.apache.org/jira/browse/YARN-6792
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
Priority: Blocker


NodeIDsInfo contains a typo and LabelsToNodesInfo is missing a constructor. These 
bugs prevent a correct conversion of LabelsToNodesInfo to XML.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5412) Create a proxy chain for ResourceManager REST API in the Router

2017-07-10 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-5412:
---
Attachment: YARN-5412-YARN-2915.1.patch

> Create a proxy chain for ResourceManager REST API in the Router
> ---
>
> Key: YARN-5412
> URL: https://issues.apache.org/jira/browse/YARN-5412
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
> Attachments: YARN-5412-YARN-2915.1.patch, 
> YARN-5412-YARN-2915.proto.patch
>
>
> As detailed in the proposal in the umbrella JIRA, we are introducing a new 
> component that routes client requests to the appropriate ResourceManager(s). This 
> JIRA tracks the creation of a proxy for the ResourceManager REST API in the 
> Router. This provides a placeholder for:
> 1) throttling mis-behaving clients (YARN-1546)
> 3) masking the access to multiple RMs (YARN-3659)
> We are planning to follow the interceptor pattern, as we did in YARN-2884, to 
> generalize the approach and have only dynamic coupling for Federation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6793) Duplicated reservation in Fair Scheduler preemption

2017-07-10 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-6793:
---
Description: 
There is a delay between preemption happen and containers are killed. If 
resources released from nodes before container killing are not enough for the 
resource request preemption asking for, reservation happens again at that node.
E.g. scheduler reserves  in node 1 for app 1. It will 
take 15s by default to kill containers in node 1 for fulfill that resource 
requests. If  was released from node 1 before the 
killing, scheduler reserves  again in node 1 for app1. 
The second reservation may never be unreserved. 

  was:
There is a delay between preemption happen and containers are killed. If some 
resources released from nodes which are supposed to be preempted at that time 
are not enough for the resource request, reservation happens again at that node.
E.g. scheduler reserves  in node 1 for app 1. It will 
take 15s by default to kill containers in node 1 for fulfill that resource 
requests. If  was released from node 1 before the 
killing, scheduler reserves  again in node 1 for app1. 
The second reservation may never be unreserved. 


> Duplicated reservation in Fair Scheduler preemption 
> 
>
> Key: YARN-6793
> URL: https://issues.apache.org/jira/browse/YARN-6793
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.1, 3.0.0-alpha3
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>Priority: Critical
>
> There is a delay between preemption happen and containers are killed. If 
> resources released from nodes before container killing are not enough for the 
> resource request preemption asking for, reservation happens again at that 
> node.
> E.g. scheduler reserves  in node 1 for app 1. It will 
> take 15s by default to kill containers in node 1 for fulfill that resource 
> requests. If  was released from node 1 before the 
> killing, scheduler reserves  again in node 1 for app1. 
> The second reservation may never be unreserved. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6794) promote OPPORTUNISITC containers locally at the node where they're running

2017-07-10 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-6794:


 Summary: promote OPPORTUNISITC containers locally at the node 
where they're running
 Key: YARN-6794
 URL: https://issues.apache.org/jira/browse/YARN-6794
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, scheduler
Reporter: Haibo Chen
Assignee: Haibo Chen






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6793) Duplicated reservation in Fair Scheduler preemption

2017-07-10 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-6793:
---
Description: 
There is a delay between preemption happen and containers are killed. If 
resources released from nodes before container killing are not enough for the 
resource request preemption asking for, reservation happens again at that node.
E.g. scheduler reserves  in node 1 for app 1 while 
preemption. It will take 15s by default to kill containers in node 1 for 
fulfill that resource requests. If  was released from 
node 1 before the killing, scheduler reserves  again in 
node 1 for app1. The second reservation may never be unreserved. 

  was:
There is a delay between preemption happen and containers are killed. If 
resources released from nodes before container killing are not enough for the 
resource request preemption asking for, reservation happens again at that node.
E.g. scheduler reserves  in node 1 for app 1. It will 
take 15s by default to kill containers in node 1 for fulfill that resource 
requests. If  was released from node 1 before the 
killing, scheduler reserves  again in node 1 for app1. 
The second reservation may never be unreserved. 


> Duplicated reservation in Fair Scheduler preemption 
> 
>
> Key: YARN-6793
> URL: https://issues.apache.org/jira/browse/YARN-6793
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.1, 3.0.0-alpha3
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>Priority: Critical
>
> There is a delay between preemption happen and containers are killed. If 
> resources released from nodes before container killing are not enough for the 
> resource request preemption asking for, reservation happens again at that 
> node.
> E.g. scheduler reserves  in node 1 for app 1 while 
> preemption. It will take 15s by default to kill containers in node 1 for 
> fulfill that resource requests. If  was released from 
> node 1 before the killing, scheduler reserves  again in 
> node 1 for app1. The second reservation may never be unreserved. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4767) Network issues can cause persistent RM UI outage

2017-07-10 Thread Sergey Bahchissaraitsev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080459#comment-16080459
 ] 

Sergey Bahchissaraitsev commented on YARN-4767:
---

[~templedf] Can I ask why it was not applied to the 2.8.0 branch?

> Network issues can cause persistent RM UI outage
> 
>
> Key: YARN-4767
> URL: https://issues.apache.org/jira/browse/YARN-4767
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.7.2
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-4767.001.patch, YARN-4767.002.patch, 
> YARN-4767.003.patch, YARN-4767.004.patch, YARN-4767.005.patch, 
> YARN-4767.006.patch, YARN-4767.007.patch, YARN-4767.008.patch, 
> YARN-4767.009.patch, YARN-4767.010.patch, YARN-4767.011.patch
>
>
> If a network issue causes an AM web app to resolve the RM proxy's address to 
> something other than what's listed in the allowed proxies list, the 
> AmIpFilter will 302 redirect the RM proxy's request back to the RM proxy.  
> The RM proxy will then consume all available handler threads connecting to 
> itself over and over, resulting in an outage of the web UI.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6794) promote OPPORTUNISITC containers locally at the node where they're running

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6794:
-
Issue Type: Sub-task  (was: Bug)
Parent: YARN-1011

> promote OPPORTUNISITC containers locally at the node where they're running
> --
>
> Key: YARN-6794
> URL: https://issues.apache.org/jira/browse/YARN-6794
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, scheduler
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6792) Incorrect XML convertion in NodeIDsInfo and LabelsToNodesInfo

2017-07-10 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080617#comment-16080617
 ] 

Sunil G commented on YARN-6792:
---

I guess we need to add some more test cases here as well.

> Incorrect XML convertion in NodeIDsInfo and LabelsToNodesInfo
> -
>
> Key: YARN-6792
> URL: https://issues.apache.org/jira/browse/YARN-6792
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Blocker
> Attachments: YARN-6792.v1.patch
>
>
> NodeIDsInfo contains a typo and LabelsToNodesInfo is missing a constructor. These 
> bugs prevent a correct conversion of LabelsToNodesInfo to XML.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6505) Define the strings used in SLS JSON input file format

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080519#comment-16080519
 ] 

Hadoop QA commented on YARN-6505:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m  
5s{color} | {color:green} hadoop-sls in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6505 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876425/YARN-6505.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 202bf2f7c0ff 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 09653ea |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16349/testReport/ |
| modules | C: hadoop-tools/hadoop-sls U: hadoop-tools/hadoop-sls |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16349/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Define the strings used in SLS JSON input file format
> -
>
> Key: YARN-6505
> URL: https://issues.apache.org/jira/browse/YARN-6505
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Reporter: Yufei Gu
>Assignee: Gergely Novák
>  Labels: newbie
> Attachments: YARN-6505.001.patch
>
>
> We could put them in a Java file like what YarnConfiguration does.



--
This message was sent by Atlassian JIRA

[jira] [Updated] (YARN-6670) Add separate NM overallocation thresholds for cpu and memory

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6670:
-
Fix Version/s: YARN-1011

> Add separate NM overallocation thresholds for cpu and memory
> 
>
> Key: YARN-6670
> URL: https://issues.apache.org/jira/browse/YARN-6670
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Fix For: YARN-1011
>
> Attachments: YARN-6670-YARN-1011.00.patch, 
> YARN-6670-YARN-1011.01.patch, YARN-6670-YARN-1011.02.patch, 
> YARN-6670-YARN-1011.03.patch, YARN-6670-YARN-1011.04.patch, 
> YARN-6670-YARN-1011.05.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6670) Add separate NM overallocation thresholds for cpu and memory

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6670:
-
Target Version/s: YARN-1011  (was: 3.0.0-beta1)

> Add separate NM overallocation thresholds for cpu and memory
> 
>
> Key: YARN-6670
> URL: https://issues.apache.org/jira/browse/YARN-6670
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Fix For: YARN-1011
>
> Attachments: YARN-6670-YARN-1011.00.patch, 
> YARN-6670-YARN-1011.01.patch, YARN-6670-YARN-1011.02.patch, 
> YARN-6670-YARN-1011.03.patch, YARN-6670-YARN-1011.04.patch, 
> YARN-6670-YARN-1011.05.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6788) Improve performance of resource profile branch

2017-07-10 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-6788:
--
Attachment: YARN-6788-YARN-3926.001.patch

A first cut to improve performance, plus some cleanup.

A few early notes:
# Instead of a map, a simple array is used to keep the resource information 
(faster iteration, and access by index).
# The resource-information index is stored in a map that is used to access the 
above-mentioned array (kept in the ResourceUtils class as a static single point of 
access).
# Performance is within a 3% delta of trunk across various runs.
# In the Resource class there are a few impls for SimpleResource. I also had an 
offline talk with [~vvasudev]. It may be better to move all resource-related impl 
from ResourcePBImpl into the Resource class itself and keep only proto-related 
changes in the PBImpl class, so that SimpleResource could look into the memory and 
cpu resource-info objects.
# For now, I am using the {{tmpResource}} available in the Resource class to get the 
index.

Once an initial round of review is done and the last point above is confirmed, I'll 
update the second patch. (A small illustrative sketch of the array/index idea in 
points 1 and 2 follows below.)

cc/[~leftnoteasy] [~templedf] and [~vvasudev]
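For illustration only, a minimal sketch of the index-map-plus-array layout described in points 1 and 2 above (hypothetical names, not the attached patch):

{code}
import java.util.HashMap;
import java.util.Map;

public class ResourceInfoArraySketch {
  // Static, single point of access: resource name -> slot in the value array.
  private static final Map<String, Integer> RESOURCE_INDEX = new HashMap<>();
  static {
    RESOURCE_INDEX.put("memory-mb", 0);
    RESOURCE_INDEX.put("vcores", 1);
  }

  // Per-Resource values live in a flat array; access by index is cheaper than
  // hashing into or iterating over a map on every scheduler operation.
  private final long[] values = new long[RESOURCE_INDEX.size()];

  public long getValue(String resourceName) {
    return values[RESOURCE_INDEX.get(resourceName)];
  }

  public void setValue(String resourceName, long value) {
    values[RESOURCE_INDEX.get(resourceName)] = value;
  }
}
{code}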

> Improve performance of resource profile branch
> --
>
> Key: YARN-6788
> URL: https://issues.apache.org/jira/browse/YARN-6788
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Blocker
> Attachments: YARN-6788-YARN-3926.001.patch
>
>
> Currently we could see a 15% performance delta with this branch. 
> Few performance improvements to improve the same.
> Also this patch will handle 
> [comments|https://issues.apache.org/jira/browse/YARN-6761?focusedCommentId=16075418=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16075418]
>  from [~leftnoteasy].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5412) Create a proxy chain for ResourceManager REST API in the Router

2017-07-10 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-5412:
---
Attachment: (was: YARN-5412-YARN-2915.1.patch)

> Create a proxy chain for ResourceManager REST API in the Router
> ---
>
> Key: YARN-5412
> URL: https://issues.apache.org/jira/browse/YARN-5412
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
> Attachments: YARN-5412-YARN-2915.1.patch, 
> YARN-5412-YARN-2915.proto.patch
>
>
> As detailed in the proposal in the umbrella JIRA, we are introducing a new 
> component that routes client requests to the appropriate ResourceManager(s). This 
> JIRA tracks the creation of a proxy for the ResourceManager REST API in the 
> Router. This provides a placeholder for:
> 1) throttling mis-behaving clients (YARN-1546)
> 3) masking the access to multiple RMs (YARN-3659)
> We are planning to follow the interceptor pattern, as we did in YARN-2884, to 
> generalize the approach and have only dynamic coupling for Federation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6786) ResourcePBImpl imports cleanup

2017-07-10 Thread Yeliang Cang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081623#comment-16081623
 ] 

Yeliang Cang commented on YARN-6786:


Thanks, [~sunilg], for the review!

> ResourcePBImpl imports cleanup
> --
>
> Key: YARN-6786
> URL: https://issues.apache.org/jira/browse/YARN-6786
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: YARN-3926
>Reporter: Daniel Templeton
>Assignee: Yeliang Cang
>Priority: Trivial
>  Labels: newbie
> Attachments: YARN-6786-YARN-3926.001.patch
>
>
> There is an unused import and an import *.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2017-07-10 Thread Manikandan R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081657#comment-16081657
 ] 

Manikandan R commented on YARN-6492:


[~Naganarasimha], [~jhung] Thanks.

Attached a patch. In addition to the JUnit test cases, I ran a couple of tests in a 
local setup as well and was able to see the expected partition metrics as part of the 
JMX metrics.

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
> Attachments: YARN-6492.001.patch
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object, which captures metrics either in the default 
> partition or across all partitions (after YARN-6467 it will be in the default 
> partition).
> But having per-partition metrics would be very useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6492) Generate queue metrics for each partition

2017-07-10 Thread Manikandan R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-6492:
---
Attachment: YARN-6492.001.patch

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
> Attachments: YARN-6492.001.patch
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object, which captures metrics either in the default 
> partition or across all partitions (after YARN-6467 it will be in the default 
> partition).
> But having per-partition metrics would be very useful.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6801) NPE in RM while setting collectors map in NodeHeartbeatResponse

2017-07-10 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081680#comment-16081680
 ] 

Rohith Sharma K S commented on YARN-6801:
-

Thanks [~vrushalic] for working on this issue.
+1 LGTM. 

Out of curiosity, what is the scenario in which an applicationId is not present in 
the global data structure, i.e. *rmContext.getRMApps()*, but is present in 
RMNode#runningApps? Does it happen during an upgrade?

> NPE in RM while setting collectors map in NodeHeartbeatResponse
> ---
>
> Key: YARN-6801
> URL: https://issues.apache.org/jira/browse/YARN-6801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2
>Reporter: Vrushali C
>Assignee: Vrushali C
> Attachments: YARN-6801-YARN-5355.001.patch
>
>
> Null Pointer Exception seen in 
> ResourceTrackerService#setAppCollectorsMapToResponse call 
> {code}
> 2017-06-22 22:24:01,437 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 49 on 8031, call 
> org.apache.hadoop.yarn.server.api.ResourceTrackerPB.nodeHeartbeat from 
> 10.35.172.116:44399 Call#3929 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.setAppCollectorsMapToResponse(ResourceTrackerService.java:467)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(ResourceTrackerService.java:447)
> at 
> org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceTrackerPBServiceImpl.nodeHeartbeat(ResourceTrackerPBServiceImpl.java:68)
> at 
> org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$2.callBlockingMethod(ResourceTracker.java:81)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2084)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2080)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1645)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2078)
> {code}
> It correlates to RM invoking setAppCollectorsMapToResponse and calling 
> {code}
>   AppCollectorData appCollectorData = 
> rmApps.get(appId).getCollectorData();
> {code}
> If the app object is not present in the list of running app ids, then this 
> will throw NPE.
> Filing jira to fix it. 
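For illustration, a minimal null-guard sketch of the fix direction (hypothetical stand-in types, not necessarily what the attached patch does):

{code}
import java.util.Map;

// AppLike stands in for RMApp; the point is only the guard before the
// getCollectorData() dereference that produced the NPE above.
interface AppLike {
  Object getCollectorData();
}

public class CollectorLookupSketch {
  // Returns the collector data, or null when the application has already been
  // removed from the RM's application map.
  public static Object collectorDataOrNull(Map<String, AppLike> rmApps,
      String appId) {
    AppLike app = rmApps.get(appId);
    return app == null ? null : app.getCollectorData();
  }
}
{code}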



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare

2017-07-10 Thread daemon (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daemon updated YARN-6769:
-
Attachment: YARN-6769.003.patch

> Put the no demand queue after the most in FairSharePolicy#compare
> -
>
> Key: YARN-6769
> URL: https://issues.apache.org/jira/browse/YARN-6769
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: daemon
>Assignee: daemon
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: YARN-6769.001.patch, YARN-6769.002.patch, 
> YARN-6769.003.patch
>
>
> When the fair scheduler is used as the RM scheduler, all queues or applications 
> are sorted before a container is assigned.
> FairSharePolicy#compare is used as the comparator, but the comparator is not 
> perfect.
> It has the following problem:
> 1. When a queue uses resources beyond its minShare (minResources), it is placed 
> behind a queue whose demand is zero, so the zero-demand queue gets a greater 
> opportunity to receive resources even though it does not want them. This wastes 
> scheduling time when assigning containers to queues or applications.
> I have fixed it, and I will upload the patch to this JIRA.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081426#comment-16081426
 ] 

Hadoop QA commented on YARN-1014:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
47s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 5 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
15s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-1014 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876535/YARN-1014.02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux dcacb152e548 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 5496a34 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/16361/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16361/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16361/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Configure OOM Killer to kill OPPORTUNISTIC containers first
> ---
>
> Key: YARN-1014
> URL: https://issues.apache.org/jira/browse/YARN-1014
> Project: Hadoop YARN
>

[jira] [Updated] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-1014:
-
Target Version/s: 3.0.0-beta1

> Configure OOM Killer to kill OPPORTUNISTIC containers first
> ---
>
> Key: YARN-1014
> URL: https://issues.apache.org/jira/browse/YARN-1014
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Arun C Murthy
>Assignee: Karthik Kambatla
> Attachments: YARN-1014.00.patch, YARN-1014.01.patch, 
> YARN-1014.02.patch
>
>
> YARN-2882 introduces the notion of OPPORTUNISTIC containers. These containers 
> should be killed first should the system run out of memory. 
> -
> Previous description:
> Once RM allocates 'speculative containers' we need to get LCE to schedule 
> them at lower priorities via cgroups.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2017-07-10 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081456#comment-16081456
 ] 

Haibo Chen commented on YARN-1014:
--

The Findbugs warnings are unrelated.

> Configure OOM Killer to kill OPPORTUNISTIC containers first
> ---
>
> Key: YARN-1014
> URL: https://issues.apache.org/jira/browse/YARN-1014
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Arun C Murthy
>Assignee: Karthik Kambatla
> Attachments: YARN-1014.00.patch, YARN-1014.01.patch, 
> YARN-1014.02.patch
>
>
> YARN-2882 introduces the notion of OPPORTUNISTIC containers. These containers 
> should be killed first should the system run out of memory. 
> -
> Previous description:
> Once RM allocates 'speculative containers' we need to get LCE to schedule 
> them at lower priorities via cgroups.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6800) Add opportunity to start containers while periodically checking for preemption

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6800:
-
Issue Type: Sub-task  (was: Bug)
Parent: YARN-1011

> Add opportunity to start containers while periodically checking for preemption
> --
>
> Key: YARN-6800
> URL: https://issues.apache.org/jira/browse/YARN-6800
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6801) NPE in RM while setting collectors map in NodeHeartbeatResponse

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081566#comment-16081566
 ] 

Hadoop QA commented on YARN-6801:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} YARN-5355 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
48s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} YARN-5355 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
0s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 in YARN-5355 has 8 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} YARN-5355 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 38m 
14s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ac17dc |
| JIRA Issue | YARN-6801 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876546/YARN-6801-YARN-5355.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux a81b1f01c527 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-5355 / 3829100 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/16362/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-warnings.html
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16362/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16362/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> NPE in RM while setting collectors map in NodeHeartbeatResponse
> 

[jira] [Commented] (YARN-6769) Put the no demand queue after the most in FairSharePolicy#compare

2017-07-10 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081450#comment-16081450
 ] 

Yufei Gu commented on YARN-6769:


The patch generally looks good to me. Some thoughts:
# The logic in FairSharePolicy#compare looks good. However, the comment is 
verbose. Can you rephrase it to simply state that a schedulable with no demand 
is less needy (see the small ordering sketch below), and update the method 
Javadoc as well?
# Why do we need to change the recursive condition to {{genSchedulable.size() == 
4}} in method {{generateAndTest}}?
# Could you fix the checkstyle issue reported by Hadoop QA?
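
To illustrate the ordering in point 1, here is a minimal, self-contained sketch 
(plain Java, not the actual FairSharePolicy code; the Schedulable stand-in and 
its fields are invented for the example): schedulables with zero demand sort 
after everything that still wants resources.

{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class NoDemandOrderingSketch {
  // Hypothetical stand-in for a queue/app: only the fields the ordering needs.
  static class Schedulable {
    final String name;
    final long demandMb;  // outstanding resource demand
    final long usageMb;   // current usage, used as a placeholder tie-breaker
    Schedulable(String name, long demandMb, long usageMb) {
      this.name = name; this.demandMb = demandMb; this.usageMb = usageMb;
    }
  }

  // A schedulable with no demand is "less needy", so it compares as greater.
  static final Comparator<Schedulable> NO_DEMAND_LAST =
      Comparator.<Schedulable>comparingInt(s -> s.demandMb > 0 ? 0 : 1)
          .thenComparingLong(s -> s.usageMb);  // placeholder for the real fair-share logic

  public static void main(String[] args) {
    List<Schedulable> queues = new ArrayList<>();
    queues.add(new Schedulable("busy", 2048, 4096));
    queues.add(new Schedulable("idle", 0, 0));
    queues.add(new Schedulable("light", 512, 128));
    queues.sort(NO_DEMAND_LAST);
    queues.forEach(s -> System.out.println(s.name));  // prints: light, busy, idle
  }
}
{code}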

> Put the no demand queue after the most in FairSharePolicy#compare
> -
>
> Key: YARN-6769
> URL: https://issues.apache.org/jira/browse/YARN-6769
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: daemon
>Assignee: daemon
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: YARN-6769.001.patch, YARN-6769.002.patch
>
>
> When using FairScheduler as the RM scheduler, we sort all queues or 
> applications before assigning a container.
> FairSharePolicy#compare is used as the comparator, but the comparator is 
> not perfect.
> It has the following problem:
> 1. When a queue has used more resources than its minShare (minResources), it 
> is put behind a queue whose demand is zero,
> so the zero-demand queue gets a greater opportunity to receive resources even 
> though it does not want any. This wastes scheduling time when assigning 
> containers to queues or applications.
> I have fixed it and will upload the patch to this JIRA.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6801) NPE in RM while setting collectors map in NodeHeartbeatResponse

2017-07-10 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-6801:
-
Attachment: YARN-6801-YARN-5355.001.patch

Attaching v001 

> NPE in RM while setting collectors map in NodeHeartbeatResponse
> ---
>
> Key: YARN-6801
> URL: https://issues.apache.org/jira/browse/YARN-6801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2
>Reporter: Vrushali C
>Assignee: Vrushali C
> Attachments: YARN-6801-YARN-5355.001.patch
>
>
> Null Pointer Exception seen in 
> ResourceTrackerService#setAppCollectorsMapToResponse call 
> {code}
> 2017-06-22 22:24:01,437 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 49 on 8031, call 
> org.apache.hadoop.yarn.server.api.ResourceTrackerPB.nodeHeartbeat from 
> 10.35.172.116:44399 Call#3929 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.setAppCollectorsMapToResponse(ResourceTrackerService.java:467)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(ResourceTrackerService.java:447)
> at 
> org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceTrackerPBServiceImpl.nodeHeartbeat(ResourceTrackerPBServiceImpl.java:68)
> at 
> org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$2.callBlockingMethod(ResourceTracker.java:81)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2084)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2080)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1645)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2078)
> {code}
> It correlates to RM invoking setAppCollectorsMapToResponse and calling 
> {code}
>   AppCollectorData appCollectorData = 
> rmApps.get(appId).getCollectorData();
> {code}
> If the app object is not present in the list of running app ids, then this 
> will throw NPE.
> Filing jira to fix it. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6801) NPE in RM while setting collectors map in NodeHeartbeatResponse

2017-07-10 Thread Vrushali C (JIRA)
Vrushali C created YARN-6801:


 Summary: NPE in RM while setting collectors map in 
NodeHeartbeatResponse
 Key: YARN-6801
 URL: https://issues.apache.org/jira/browse/YARN-6801
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-5355, YARN-5355-branch-2
Reporter: Vrushali C
Assignee: Vrushali C



Null Pointer Exception seen in 
ResourceTrackerService#setAppCollectorsMapToResponse call 

{code}
2017-06-22 22:24:01,437 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
49 on 8031, call 
org.apache.hadoop.yarn.server.api.ResourceTrackerPB.nodeHeartbeat from 
10.35.172.116:44399 Call#3929 Retry#0
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.setAppCollectorsMapToResponse(ResourceTrackerService.java:467)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(ResourceTrackerService.java:447)
at 
org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceTrackerPBServiceImpl.nodeHeartbeat(ResourceTrackerPBServiceImpl.java:68)
at 
org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$2.callBlockingMethod(ResourceTracker.java:81)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2084)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2080)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1645)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2078)
{code}

It correlates to RM invoking setAppCollectorsMapToResponse and calling 

{code}
  AppCollectorData appCollectorData = rmApps.get(appId).getCollectorData();
{code}

If the app object is not present in the list of running app ids, then this will 
throw NPE.

Filing jira to fix it. 
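
A minimal sketch of the kind of guard that avoids the NPE described above, 
assuming the app may already have finished (or not yet be registered) when the 
heartbeat arrives; the map, key, and printouts below are simplified stand-ins, 
not the actual ResourceTrackerService code.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CollectorLookupSketch {
  // Stand-in for the RM's app object; only the field the example needs.
  static class RMApp { String collectorData = "collector-host:port"; }

  public static void main(String[] args) {
    Map<String, RMApp> rmApps = new ConcurrentHashMap<>();
    String appId = "application_1487222361993_17177";  // not in the map => get() returns null

    RMApp app = rmApps.get(appId);
    if (app == null) {
      // Guard before dereferencing: log and skip instead of throwing NPE.
      System.out.println("App " + appId + " not found in RM context, skipping collector info");
      return;
    }
    System.out.println("Collector for " + appId + ": " + app.collectorData);
  }
}
{code}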



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6800) Add opportunity to start containers while periodically checking for preemption

2017-07-10 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-6800:


 Summary: Add opportunity to start containers while periodically 
checking for preemption
 Key: YARN-6800
 URL: https://issues.apache.org/jira/browse/YARN-6800
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Haibo Chen
Assignee: Haibo Chen






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6378) Negative usedResources memory in CapacityScheduler

2017-07-10 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-6378:
---
Target Version/s: 2.8.2  (was: 2.8.1)

> Negative usedResources memory in CapacityScheduler
> --
>
> Key: YARN-6378
> URL: https://issues.apache.org/jira/browse/YARN-6378
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
>
> Courtesy Thomas Nystrand, we found that on two of our clusters configured 
> with the CapacityScheduler, usedResources occasionally becomes negative. 
> e.g.
> {code}
> 2017-03-15 11:10:09,449 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignedContainer application attempt=appattempt_1487222361993_17177_01 
> container=Container: [ContainerId: container_1487222361993_17177_01_14, 
> NodeId: :27249, NodeHttpAddress: :8042, Resource: 
> , Priority: 2, Token: null, ] queue=: 
> capacity=0.2, absoluteCapacity=0.2, usedResources=, 
> usedCapacity=0.03409091, absoluteUsedCapacity=0.006818182, numApps=1, 
> numContainers=3 clusterResource= type=RACK_LOCAL
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6625) yarn application -list returns a tracking URL for AM that doesn't work in secured and HA environment

2017-07-10 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081434#comment-16081434
 ] 

Robert Kanter commented on YARN-6625:
-

I think the approach in the v2 patch is better and simpler.  Here's some 
feedback:
- I think we should flip around the logic in {{isValidUrl}}.  There are a lot of 
other types of bad response codes we could get (e.g. 500 if there's some kind 
of unhandled exception) and really only one good type (i.e. 200); see the 
sketch after this list.
- Let's use the constants defined in {{HttpURLConnection}} for the codes instead 
of magic numbers.  
http://download.java.net/jdk7/archive/b123/docs/api/java/net/HttpURLConnection.html#HTTP_OK
- For for-each loops, I think the formatting is usually {{for (String rmId : 
rmIds)}}, not {{for (String rmId: rmIds)}}.
- Once {{isValidUrl}} returns {{true}} in the loop in {{findRedirectUrl}}, we 
should be able to {{break}}, right?  No reason to keep searching.
- The {{else}} and {{continue}} in the loop in {{findRedirectUrl}} don't seem to 
be needed.
- This newer solution should be unit testable.  You can set up a good and a bad 
(mock) servlet and call {{findRedirectUrl}} to see that you get the good one.
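
Putting the points above together, here is a rough, self-contained sketch (not 
the YARN-6625 patch; the class, method, and parameter names are invented for 
illustration): only {{HttpURLConnection.HTTP_OK}} counts as valid, and the scan 
stops at the first RM that answers.

{code}
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Arrays;
import java.util.List;

public class RedirectUrlSketch {
  static boolean isValidUrl(String url) {
    try {
      HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
      conn.setRequestMethod("HEAD");
      return conn.getResponseCode() == HttpURLConnection.HTTP_OK;  // only 200 is "valid"
    } catch (Exception e) {
      return false;  // connection refused, timeouts, malformed URLs, ...
    }
  }

  static String findRedirectUrl(List<String> rmWebAppAddresses, String proxyPath) {
    for (String rmAddress : rmWebAppAddresses) {
      String candidate = rmAddress + proxyPath;
      if (isValidUrl(candidate)) {
        return candidate;  // stop searching as soon as one RM answers with 200
      }
    }
    return null;
  }

  public static void main(String[] args) {
    List<String> rms = Arrays.asList(
        "http://rm1.example.com:8088", "http://rm2.example.com:8088");
    System.out.println(findRedirectUrl(rms, "/proxy/application_1494544954891_0002/"));
  }
}
{code}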

> yarn application -list returns a tracking URL for AM that doesn't work in 
> secured and HA environment
> 
>
> Key: YARN-6625
> URL: https://issues.apache.org/jira/browse/YARN-6625
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: amrmproxy
>Affects Versions: 3.0.0-alpha2
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-6625.001.patch, YARN-6625.002.patch
>
>
> The tracking URL given at the command line should work whether the cluster is 
> secured or not. The tracking URLs look like http://node-2.abc.com:47014 and the 
> AM web server is supposed to redirect them to an RM address such as 
> http://node-1.abc.com:8088/proxy/application_1494544954891_0002/, but it fails 
> to do that because the connection is rejected when the AM talks to the RM 
> admin service to get the HA status.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6776) Refactor ApplicaitonMasterService to move actual processing logic to a separate class

2017-07-10 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081194#comment-16081194
 ] 

Arun Suresh commented on YARN-6776:
---

Thanks for the rev Subru.. Committing this shortly

> Refactor ApplicaitonMasterService to move actual processing logic to a 
> separate class
> -
>
> Key: YARN-6776
> URL: https://issues.apache.org/jira/browse/YARN-6776
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Minor
> Attachments: YARN-6776.001.patch, YARN-6776.002.patch, 
> YARN-6776.003.patch, YARN-6776.004.patch
>
>
> Minor refactoring to move the processing logic of the 
> {{ApplicationMasterService}} into a separate class.
> The per appattempt locking as well as the extraction of the appAttemptId etc. 
> will remain in the ApplicationMasterService 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6594) [API] Introduce SchedulingRequest object

2017-07-10 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081225#comment-16081225
 ] 

Arun Suresh commented on YARN-6594:
---

I agree, AMRMClient is starting to get very heavy.
But fundamentally, I guess the biggest reasons for the code bloat are:
# AMRMClient has to handle the transformation of a ContainerRequest into the 
corresponding ResourceRequests (the node/rack and ANY requests) - I currently 
don't know how we can do this transparently to the end application. I propose 
we make ContainerRequest a first-class YARN API and deprecate the 
ResourceRequest object.
# On RM failover etc., the AMRMClient ensures that the AM re-registers and also 
that outstanding requests are resent.

[~subru] and I were discussing issues related to failover in a federated setup 
- and we were thinking the same thing: deprecate AMRMClient.

> [API] Introduce SchedulingRequest object
> 
>
> Key: YARN-6594
> URL: https://issues.apache.org/jira/browse/YARN-6594
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6594.001.patch
>
>
> This JIRA introduces a new SchedulingRequest object.
> It will be part of the {{AllocateRequest}} and will be used to define sizing 
> (e.g., number of allocations, size of allocations) and placement constraints 
> for allocations.
> Applications can use either this new object (when rich placement constraints 
> are required) or the existing {{ResourceRequest}} object.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6797) TimelineWriter does not fully consume the POST response

2017-07-10 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-6797:
-
Attachment: YARN-6797.001.patch

Attaching a patch that calls ClientResponse#bufferEntity so we fully consume the 
response from the input stream.  Verified that after this change the number of 
"WARN mortbay.log: SSL renegotiate denied" messages in the timeline server log 
is drastically reduced.  Not sure offhand how to write a unit test for this.
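
For reference, a hedged sketch of the idea using the Jersey 1.x client API that 
the 2.8 timeline client builds on (the endpoint URL and JSON payload here are 
placeholders, not the actual TimelineWriter code):

{code}
import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.ClientResponse;
import javax.ws.rs.core.MediaType;

public class TimelinePostSketch {
  public static void main(String[] args) {
    Client client = Client.create();
    ClientResponse resp = client
        .resource("http://timelineserver.example.com:8188/ws/v1/timeline/")
        .type(MediaType.APPLICATION_JSON)
        .post(ClientResponse.class, "{\"entities\":[]}");
    // Buffer (i.e. fully read) the entity so the underlying connection can be
    // returned to the pool and reused for the next POST; over HTTPS this avoids
    // a fresh SSL handshake per entity.
    resp.bufferEntity();
    System.out.println("Timeline POST status: " + resp.getStatus());
  }
}
{code}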

> TimelineWriter does not fully consume the POST response
> ---
>
> Key: YARN-6797
> URL: https://issues.apache.org/jira/browse/YARN-6797
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineclient
>Affects Versions: 2.8.1
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: YARN-6797.001.patch
>
>
> TimelineWriter does not fully consume the response to the POST request, and 
> that ends up preventing the HTTP client from being reused for the next write 
> of an entity.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6776) Refactor ApplicaitonMasterService to move actual processing logic to a separate class

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081109#comment-16081109
 ] 

Hadoop QA commented on YARN-6776:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
55s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 15 unchanged - 17 fixed = 15 total (was 32) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
32s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 21s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 94m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6776 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876472/YARN-6776.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0602fac42e4a 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 09653ea |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/16354/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16354/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Commented] (YARN-6706) Refactor ContainerScheduler to make oversubscription change easier

2017-07-10 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081132#comment-16081132
 ] 

Arun Suresh commented on YARN-6706:
---

Yup. Agree this can go into trunk.

The patch generally looks good. One comment:
In {{ContainerScheduler#scheduleContainer()}}, I see how 
startPendingContainers() ensures that we do not have to wait for a completed 
container before starting containers, but it may be better to move all of the 
{{startPendingContainers()}} invocations from inside the if..then..else to the 
last line of the method (see the sketch below). I am not sure that calling 
startPendingContainers() both before and after enqueuing is useful.
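
Roughly what that restructuring could look like, as a self-contained sketch with 
stubbed types (this is not the actual NM ContainerScheduler; the queues, memory 
accounting, and method names are simplified for illustration):

{code}
import java.util.ArrayDeque;
import java.util.Queue;

public class ContainerSchedulerSketch {
  static class Container {
    final String id; final boolean opportunistic; final int memMb;
    Container(String id, boolean opp, int memMb) {
      this.id = id; this.opportunistic = opp; this.memMb = memMb;
    }
  }

  private final Queue<Container> queuedGuaranteed = new ArrayDeque<>();
  private final Queue<Container> queuedOpportunistic = new ArrayDeque<>();
  private int freeMemMb = 4096;

  void scheduleContainer(Container c) {
    if (c.opportunistic) {
      queuedOpportunistic.add(c);
    } else {
      queuedGuaranteed.add(c);
    }
    // Single drain point at the end of the method, instead of a call per branch.
    startPendingContainers();
  }

  private void startPendingContainers() {
    startFrom(queuedGuaranteed);     // guaranteed containers first
    startFrom(queuedOpportunistic);  // then opportunistic ones, while resources last
  }

  private void startFrom(Queue<Container> q) {
    while (!q.isEmpty() && q.peek().memMb <= freeMemMb) {
      Container c = q.poll();
      freeMemMb -= c.memMb;
      System.out.println("started " + c.id + ", free=" + freeMemMb + "MB");
    }
  }

  public static void main(String[] args) {
    ContainerSchedulerSketch s = new ContainerSchedulerSketch();
    s.scheduleContainer(new Container("c1", false, 2048));
    s.scheduleContainer(new Container("c2", true, 3072));  // queued until memory frees up
  }
}
{code}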

> Refactor ContainerScheduler to make oversubscription change easier
> --
>
> Key: YARN-6706
> URL: https://issues.apache.org/jira/browse/YARN-6706
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-6706.01.patch, YARN-6706-YARN-1011.00.patch, 
> YARN-6706-YARN-1011.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5731) Preemption does not work in few corner cases when reservations are placed

2017-07-10 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-5731:
-
Attachment: YARN-5731-branch-2.8.001.patch

Attached patch for the fix.

> Preemption does not work in few corner cases when reservations are placed
> -
>
> Key: YARN-5731
> URL: https://issues.apache.org/jira/browse/YARN-5731
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 2.8.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5731-branch-2.8.001.patch
>
>
> Preemption does not kick in under the scenario below.
> Two queues A and B, each with 50% capacity, 100% maximum capacity, and user 
> limit factor 2. A job submitted to queue A has taken 95% of the resources in 
> the cluster. The remaining 5% was also reserved by the same job, as its demand 
> was still higher.
> Now submit a small job whose AM container size is smaller than the 
> above-mentioned 5%. The job waits and no preemption happens.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6594) [API] Introduce SchedulingRequest object

2017-07-10 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081197#comment-16081197
 ] 

Wangda Tan commented on YARN-6594:
--

[~kkaranasos], chatted with [~jianhe] offline. I think Jian's concern is more 
about how AMs use the new feature:

Existing applications interact with YARN through AMRMClient; however, 
AMRMClient is mostly designed to support MR-like workloads, and its design has 
many issues, such as an inconsistent resource-request table between server and 
client. I think now is a good opportunity to think about how to fix this 
last-mile problem. Do you think we should leverage AMRMClient, or should we 
create a brand-new ApplicationMaster Java API? I personally prefer to create a 
new one; to me, maintaining legacy code to support new features is much harder 
than adding a new implementation.

> [API] Introduce SchedulingRequest object
> 
>
> Key: YARN-6594
> URL: https://issues.apache.org/jira/browse/YARN-6594
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6594.001.patch
>
>
> This JIRA introduces a new SchedulingRequest object.
> It will be part of the {{AllocateRequest}} and will be used to define sizing 
> (e.g., number of allocations, size of allocations) and placement constraints 
> for allocations.
> Applications can use either this new object (when rich placement constraints 
> are required) or the existing {{ResourceRequest}} object.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6413) Decouple Yarn Registry API from ZK

2017-07-10 Thread Ellen Hui (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ellen Hui updated YARN-6413:

Attachment: 0002-Registry-API-v2.patch

Addressed comments from [~jianhe]:

* The V2 API has every method used by RegistryDNS except monitorRegistryEntries 
and registrPathListener, which were added in a yarn-native-services commit.
* Removed extra imports from RegistryOperations.
* Rearranged the path layout in order to flatten the namespace.
* Removed getSubpathsForUserServiceRecordKey.
* Removed the catch stub and replaced it with a proper log.

> Decouple Yarn Registry API from ZK
> --
>
> Key: YARN-6413
> URL: https://issues.apache.org/jira/browse/YARN-6413
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: amrmproxy, api, resourcemanager
>Reporter: Ellen Hui
>Assignee: Ellen Hui
> Attachments: 0001-Registry-API-v2.patch, 0002-Registry-API-v2.patch
>
>
> Right now the Yarn Registry API (defined in the RegistryOperations interface) 
> is a very thin layer over Zookeeper. This jira proposes changing the 
> interface to abstract away the implementation details so that we can write a 
> FS-based implementation of the registry service, which will be used to 
> support AMRMProxy HA.
> The new interface will use register/delete/resolve APIs instead of 
> Zookeeper-specific operations like mknode. 
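
As a rough illustration of what such a ZK-agnostic interface could look like 
(the method names follow the register/delete/resolve wording above; the exact 
signatures are purely illustrative and are not taken from the attached patch):

{code}
import java.io.IOException;
import java.util.List;

// Stand-in for the existing registry ServiceRecord type.
interface ServiceRecord { }

// Illustrative ZK-agnostic registry API: callers deal in service keys and
// records, never in Zookeeper paths or znode operations such as mknode.
interface RegistryApiSketch {
  void register(String serviceKey, ServiceRecord record) throws IOException;
  void delete(String serviceKey) throws IOException;
  ServiceRecord resolve(String serviceKey) throws IOException;
  List<String> list(String serviceKeyPrefix) throws IOException;
}
{code}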



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2017-07-10 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-1014:
-
Attachment: YARN-1014.00.patch

> Configure OOM Killer to kill OPPORTUNISTIC containers first
> ---
>
> Key: YARN-1014
> URL: https://issues.apache.org/jira/browse/YARN-1014
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Arun C Murthy
>Assignee: Karthik Kambatla
> Attachments: YARN-1014.00.patch
>
>
> YARN-2882 introduces the notion of OPPORTUNISTIC containers. These containers 
> should be killed first should the system run out of memory. 
> -
> Previous description:
> Once RM allocates 'speculative containers' we need to get LCE to schedule 
> them at lower priorities via cgroups.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6797) TimelineWriter does not fully consume the POST response

2017-07-10 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-6797:
-
Summary: TimelineWriter does not fully consume the POST response  (was: 
TimelineWriter does not fully consume the post response)

This becomes particularly problematic when the writer is communicating with the 
timeline server via HTTPS.  The lack of reuse causes each new HTTP client to 
renegotiate the SSL connection, and Jetty rejects the renegotiation and closes 
the connection.  The end result is that each posting of a timeline entity 
requires a new socket connection.

> TimelineWriter does not fully consume the POST response
> ---
>
> Key: YARN-6797
> URL: https://issues.apache.org/jira/browse/YARN-6797
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineclient
>Affects Versions: 2.8.1
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>
> TimelineWriter does not fully consume the response to the POST request, and 
> that ends up preventing the HTTP client from being reused for the next write 
> of an entity.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6794) promote OPPORTUNISITC containers locally at the node where they're running

2017-07-10 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081112#comment-16081112
 ] 

Arun Suresh commented on YARN-6794:
---

[~haibo.chen] / [~kasha]
I expect some overlap between this and YARN-5978. Let me clean up my patch based 
on what we have in prod and post it there.

> promote OPPORTUNISITC containers locally at the node where they're running
> --
>
> Key: YARN-6794
> URL: https://issues.apache.org/jira/browse/YARN-6794
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, scheduler
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6798) NM startup failure with old state store due to version mismatch

2017-07-10 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081105#comment-16081105
 ] 

Ray Chiang commented on YARN-6798:
--

[~kkaranasos], [~subru], [~botong], [~asuresh].  It looks like you bumped the NM 
version twice.  Is this behavior desirable, or is it preferable to have more 
compatible state store versions (i.e. 1.0 -> 1.1 -> 1.2 instead of 1.0 -> 2.0 -> 
3.0)?

Also, if anyone else has thoughts about NM rolling upgrades, please chime in.
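
For context, a small self-contained sketch of the trade-off. This uses an 
invented Version stand-in with the common "same major version == compatible" 
convention, not the actual org.apache.hadoop.yarn.server.records.Version class: 
a 1.0 -> 1.1 -> 1.2 history would let old state load, while 1.0 -> 2.0 -> 3.0 
produces the FATAL shown in the quoted log below.

{code}
public class StateStoreVersionSketch {
  // Invented stand-in: compatibility is decided purely by the major version.
  static class Version {
    final int major, minor;
    Version(int major, int minor) { this.major = major; this.minor = minor; }
    boolean isCompatibleTo(Version other) { return major == other.major; }
    @Override public String toString() { return major + "." + minor; }
  }

  public static void main(String[] args) {
    Version current = new Version(3, 0);  // what the upgraded NM expects
    Version onDisk  = new Version(2, 0);  // what the old state store recorded
    if (!current.isCompatibleTo(onDisk)) {
      System.out.println("Incompatible version for NM state: expecting NM state version "
          + current + ", but loading version " + onDisk);
    }
  }
}
{code}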

> NM startup failure with old state store due to version mismatch
> ---
>
> Key: YARN-6798
> URL: https://issues.apache.org/jira/browse/YARN-6798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha4
>Reporter: Ray Chiang
>
> YARN-6703 rolled back the state store version number for the RM from 2.0 to 
> 1.4.
> YARN-6127 bumped the version for the NM to 3.0
> private static final Version CURRENT_VERSION_INFO = 
> Version.newInstance(3, 0);
> YARN-5049 bumped the version for the NM to 2.0
> private static final Version CURRENT_VERSION_INFO = 
> Version.newInstance(2, 0);
> During an upgrade, all NMs died after upgrading a C6 cluster from alpha2 to 
> alpha4.
> {noformat}
> 2017-07-07 15:48:17,259 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: 
> Incompatible version for NM state: expecting NM state version 3.0, but 
> loading version 2.0
> at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:246)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:307)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:748)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:809)
> Caused by: java.io.IOException: Incompatible version for NM state: expecting 
> NM state version 3.0, but loading version 2.0
> at 
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.checkVersion(NMLeveldbStateStoreService.java:1454)
> at 
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:1308)
> at 
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:307)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> ... 5 more
> 2017-07-07 15:48:17,277 INFO 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
> /
> SHUTDOWN_MSG: Shutting down NodeManager at xxx.gce.cloudera.com/aa.bb.cc.dd
> /
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4767) Network issues can cause persistent RM UI outage

2017-07-10 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081107#comment-16081107
 ] 

Daniel Templeton commented on YARN-4767:


No reason, really.  If you need it in 2.8, feel free to open a new JIRA to 
backport it.

> Network issues can cause persistent RM UI outage
> 
>
> Key: YARN-4767
> URL: https://issues.apache.org/jira/browse/YARN-4767
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.7.2
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-4767.001.patch, YARN-4767.002.patch, 
> YARN-4767.003.patch, YARN-4767.004.patch, YARN-4767.005.patch, 
> YARN-4767.006.patch, YARN-4767.007.patch, YARN-4767.008.patch, 
> YARN-4767.009.patch, YARN-4767.010.patch, YARN-4767.011.patch
>
>
> If a network issue causes an AM web app to resolve the RM proxy's address to 
> something other than what's listed in the allowed proxies list, the 
> AmIpFilter will 302 redirect the RM proxy's request back to the RM proxy.  
> The RM proxy will then consume all available handler threads connecting to 
> itself over and over, resulting in an outage of the web UI.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6797) TimelineWriter does not fully consume the POST response

2017-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081130#comment-16081130
 ] 

Hadoop QA commented on YARN-6797:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 17s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 1 new + 
0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
18s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6797 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876486/YARN-6797.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux fac3e5ea522c 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 09653ea |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/16355/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16355/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16355/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TimelineWriter does not fully consume the POST response
> ---
>
> Key: YARN-6797
> URL: https://issues.apache.org/jira/browse/YARN-6797
> Project: Hadoop YARN

[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)

2017-07-10 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081151#comment-16081151
 ] 

Chris Trezzo commented on YARN-1492:


bq. Could you explain how the shared cache leverage the node manager local 
cache in detail?

The shared cache leverages the local cache via the normal LocalResource API. 
The YARN application specifies a shared cache path that it received from the 
shared cache as the LocalResource URI.

bq. Are those shared jars marked as PUBLIC?

Yes, currently all resources in the shared cache are world readable, so they 
are in that sense public. However, at the node manager level you could set the 
visibilities to PRIVATE or APPLICATION.

bq. Could you point me the source code that handle this?

The shared cache uses the normal localization code path (see 
ResourceLocalizationService). For the shared-cache-specific parts that upload a 
resource to the cache, see SharedCacheUploader. If you want to see an example of 
how a YARN application can implement support for the shared cache, see 
MAPREDUCE-5951 for how MapReduce does it.
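
To make the "normal LocalResource API" point concrete, here is a hedged sketch: 
the shared-cache path, size, and timestamp below are placeholders for whatever 
the application received or computed, and this is not taken from 
SharedCacheUploader or MAPREDUCE-5951.

{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;

public class SharedCacheLocalResourceSketch {
  public static void main(String[] args) {
    // Path handed back by the shared cache for an already-uploaded job jar (illustrative).
    Path sharedCachePath = new Path("hdfs:///sharedcache/a/b/c/abc123checksum/job.jar");

    LocalResource jar = LocalResource.newInstance(
        ConverterUtils.getYarnUrlFromPath(sharedCachePath),
        LocalResourceType.FILE,
        LocalResourceVisibility.PUBLIC,   // shared cache entries are world-readable
        1024 * 1024L,                     // size in bytes (placeholder)
        System.currentTimeMillis());      // timestamp (placeholder)

    // The resource is then added to the ContainerLaunchContext's localResources
    // map like any other LocalResource; localization follows the usual path.
    System.out.println("LocalResource for shared cache jar: " + jar);
  }
}
{code}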

> truly shared cache for jars (jobjar/libjar)
> ---
>
> Key: YARN-1492
> URL: https://issues.apache.org/jira/browse/YARN-1492
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.0.4-alpha
>Reporter: Sangjin Lee
>Assignee: Chris Trezzo
> Attachments: shared_cache_design.pdf, shared_cache_design_v2.pdf, 
> shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, 
> shared_cache_design_v5.pdf, shared_cache_design_v6.pdf, 
> YARN-1492-all-trunk-v1.patch, YARN-1492-all-trunk-v2.patch, 
> YARN-1492-all-trunk-v3.patch, YARN-1492-all-trunk-v4.patch, 
> YARN-1492-all-trunk-v5.patch
>
>
> Currently there is the distributed cache that enables you to cache jars and 
> files so that attempts from the same job can reuse them. However, sharing is 
> limited with the distributed cache because it is normally on a per-job basis. 
> On a large cluster, sometimes copying of jobjars and libjars becomes so 
> prevalent that it consumes a large portion of the network bandwidth, not to 
> speak of defeating the purpose of "bringing compute to where data is". This 
> is wasteful because in most cases code doesn't change much across many jobs.
> I'd like to propose and discuss feasibility of introducing a truly shared 
> cache so that multiple jobs from multiple users can share and cache jars. 
> This JIRA is to open the discussion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6799) Remove the duplicated code in CGroupsHandlerImp.java

2017-07-10 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-6799:
---
Issue Type: Improvement  (was: Bug)

> Remove the duplicated code in CGroupsHandlerImp.java
> 
>
> Key: YARN-6799
> URL: https://issues.apache.org/jira/browse/YARN-6799
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Yufei Gu
>Priority: Trivial
>
> The else clause in initializeCGroupController is not necessary.
> {code}
>   public void initializeCGroupController(CGroupController controller) throws
>   ResourceHandlerException {
> if (enableCGroupMount) {
>   // We have a controller that needs to be mounted
>   mountCGroupController(controller);
> } else {
>   String controllerPath = getControllerPath(controller);
>   if (controllerPath == null) {
> throw new ResourceHandlerException(
> String.format("Controller %s not mounted."
> + " You either need to mount it with %s"
> + " or mount cgroups before launching Yarn",
> controller.getName(), YarnConfiguration.
> NM_LINUX_CONTAINER_CGROUPS_MOUNT));
>   }
> }
> // We are working with a pre-mounted contoller
> // Make sure that Yarn cgroup hierarchy path exists
> initializePreMountedCGroupController(controller);
>   }
>   private void initializePreMountedCGroupController(CGroupController 
> controller)
>   throws ResourceHandlerException {
> // Check permissions to cgroup hierarchy and
> // create YARN cgroup if it does not exist, yet
> String controllerPath = getControllerPath(controller);
> if (controllerPath == null) {
>   throw new ResourceHandlerException(
>   String.format("Controller %s not mounted."
>   + " You either need to mount it with %s"
>   + " or mount cgroups before launching Yarn",
>   controller.getName(), YarnConfiguration.
>   NM_LINUX_CONTAINER_CGROUPS_MOUNT));
> }
> {code}
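
A sketch of the simplified method the description implies, keeping the 
identifiers from the snippet above and shown as a fragment of the same class 
rather than a standalone program; the duplicated null-check would live only in 
initializePreMountedCGroupController():

{code}
public void initializeCGroupController(CGroupController controller) throws
    ResourceHandlerException {
  if (enableCGroupMount) {
    // We have a controller that needs to be mounted
    mountCGroupController(controller);
  }
  // Verifies the controller path (throwing ResourceHandlerException if the
  // controller is not mounted) and makes sure the YARN cgroup hierarchy exists,
  // so the duplicated check in the old else branch is no longer needed.
  initializePreMountedCGroupController(controller);
}
{code}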



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


