[jira] [Commented] (YARN-4000) RM crashes with NPE if leaf queue becomes parent queue during restart

2015-09-16 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746967#comment-14746967
 ] 

Jian He commented on YARN-4000:
---

I see, a few more comments:
- QueueException -> QueueInvalidException
- If appDiagnosticsBeforeKilling already contains the associated diagnostics, 
do we still need this if/else?
{code}
if (appDiagnosticsBeforeKilling.isEmpty()) {
  diags = getAppKilledDiagnostics();
} else {
  diags = appDiagnosticsBeforeKilling;
}
{code}
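
If so, a minimal sketch of the collapsed form (hypothetical; it assumes the 
kill diagnostics are always recorded before this point):
{code}
// No empty-check needed: the caller already populated the diagnostics.
diags = appDiagnosticsBeforeKilling;
{code}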

> RM crashes with NPE if leaf queue becomes parent queue during restart
> -
>
> Key: YARN-4000
> URL: https://issues.apache.org/jira/browse/YARN-4000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-4000.01.patch, YARN-4000.02.patch
>
>
> This is a similar situation to YARN-2308.  If an application is active in 
> queue A and then the RM restarts with a changed capacity scheduler 
> configuration where queue A becomes a parent queue to other subqueues, then 
> the RM will crash with a NullPointerException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4135) Improve the assertion message in MockRM while failing after waiting for the state.

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746978#comment-14746978
 ] 

Hadoop QA commented on YARN-4135:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   6m 57s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 52s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 50s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 26s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 27s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  54m 15s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  73m 44s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.security.TestAMRMTokens |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756164/YARN-4135_2.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 2ffe2db |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9156/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9156/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9156/console |


This message was automatically generated.

> Improve the assertion message in MockRM while failing after waiting for the 
> state.
> --
>
> Key: YARN-4135
> URL: https://issues.apache.org/jira/browse/YARN-4135
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: nijel
>Assignee: nijel
>Priority: Minor
>  Labels: test
> Attachments: YARN-4135_1.patch, YARN-4135_2.patch
>
>
> In MockRM, when a test fails after waiting for the given state, the 
> application id or the attempt id can be printed for easier debugging.
> As of now it is hard to track a test failure in the log since there is no 
> relation between the test case and the application id.
> Any thoughts ? 
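
For illustration, a hypothetical sketch of the improved assertion inside 
MockRM#waitForState (names assumed from the existing helper):
{code}
// Carry the application id so a failed wait can be tied to its app in the log.
Assert.assertEquals("App " + app.getApplicationId()
    + " state is not correct (timedout)", finalState, app.getState());
{code}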



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4034) Render cluster Max Priority in scheduler metrics in RM web UI

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746979#comment-14746979
 ] 

Hadoop QA commented on YARN-4034:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756141/0001-YARN-4034.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 2ffe2db |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9157/console |


This message was automatically generated.

> Render cluster Max Priority in scheduler metrics in RM web UI
> -
>
> Key: YARN-4034
> URL: https://issues.apache.org/jira/browse/YARN-4034
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Minor
> Attachments: 0001-YARN-4034.patch, 0001-YARN-4034.patch, YARN-4034.PNG
>
>
> Currently the Scheduler Metrics section renders the common scheduler metrics 
> in the RM web UI. It would be helpful for the user to know the configured 
> cluster max priority from the web UI. 
> So, on the RM web UI front page, Scheduler Metrics can render the configured 
> max cluster priority.
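
For illustration, a hypothetical sketch of where the displayed value could come 
from (constant names assumed from YarnConfiguration):
{code}
// Read the configured cluster-wide max application priority for rendering.
int maxClusterPriority = conf.getInt(
    YarnConfiguration.MAX_CLUSTER_LEVEL_APPLICATION_PRIORITY,
    YarnConfiguration.DEFAULT_CLUSTER_LEVEL_APPLICATION_PRIORITY);
{code}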



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4034) Render cluster Max Priority in scheduler metrics in RM web UI

2015-09-16 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746989#comment-14746989
 ] 

Rohith Sharma K S commented on YARN-4034:
-

Patch needs to be rebased. I will upload a rebased patch.

> Render cluster Max Priority in scheduler metrics in RM web UI
> -
>
> Key: YARN-4034
> URL: https://issues.apache.org/jira/browse/YARN-4034
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Minor
> Attachments: 0001-YARN-4034.patch, 0001-YARN-4034.patch, YARN-4034.PNG
>
>
> Currently the Scheduler Metrics section renders the common scheduler metrics 
> in the RM web UI. It would be helpful for the user to know the configured 
> cluster max priority from the web UI. 
> So, on the RM web UI front page, Scheduler Metrics can render the configured 
> max cluster priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4165) An outstanding container request makes all nodes to be reserved causing all jobs pending

2015-09-16 Thread Weiwei Yang (JIRA)
Weiwei Yang created YARN-4165:
-

 Summary: An outstanding container request makes all nodes to be 
reserved causing all jobs pending
 Key: YARN-4165
 URL: https://issues.apache.org/jira/browse/YARN-4165
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, scheduler
Affects Versions: 2.7.1
Reporter: Weiwei Yang


We have a long-running service in YARN with an outstanding container request 
that YARN cannot satisfy (it requires more memory than any nodemanager can 
supply). YARN then reserves all nodes for this application; when I submit 
other jobs (which require relatively little memory, well within what a 
nodemanager can supply), all jobs are pending because YARN skips scheduling 
containers on the nodes that have been reserved.
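
For illustration, a hypothetical sketch of the trigger (sizes made up; this is 
not the reporter's actual code):
{code}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class OversizeRequest {
  public static void main(String[] args) {
    AMRMClient<ContainerRequest> amClient = AMRMClient.createAMRMClient();
    // Suppose every NM offers 8 GB: a 16 GB request can never be satisfied,
    // yet the scheduler keeps reserving nodes for it, starving other jobs.
    Resource oversize = Resource.newInstance(16384, 1);
    amClient.addContainerRequest(
        new ContainerRequest(oversize, null, null, Priority.newInstance(1)));
  }
}
{code}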



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4165) An outstanding container request makes all nodes to be reserved causing all jobs pending

2015-09-16 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang reassigned YARN-4165:
-

Assignee: Weiwei Yang

> An outstanding container request makes all nodes to be reserved causing all 
> jobs pending
> 
>
> Key: YARN-4165
> URL: https://issues.apache.org/jira/browse/YARN-4165
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager, scheduler
>Affects Versions: 2.7.1
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>
> We have a long-running service in YARN with an outstanding container request 
> that YARN cannot satisfy (it requires more memory than any nodemanager can 
> supply). YARN then reserves all nodes for this application; when I submit 
> other jobs (which require relatively little memory, well within what a 
> nodemanager can supply), all jobs are pending because YARN skips scheduling 
> containers on the nodes that have been reserved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4141) Runtime Application Priority change should not throw exception for applications at finishing states

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746972#comment-14746972
 ] 

Hadoop QA commented on YARN-4141:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 45s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 54s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 58s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 50s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 30s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  54m 39s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  94m 10s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756154/0004-YARN-4141.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 2ffe2db |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9155/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9155/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9155/console |


This message was automatically generated.

> Runtime Application Priority change should not throw exception for 
> applications at finishing states
> ---
>
> Key: YARN-4141
> URL: https://issues.apache.org/jira/browse/YARN-4141
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-4141.patch, 0002-YARN-4141.patch, 
> 0003-YARN-4141.patch, 0004-YARN-4141.patch
>
>
> As suggested by [~jlowe] in 
> [MAPREDUCE-5870-comment|https://issues.apache.org/jira/browse/MAPREDUCE-5870?focusedCommentId=14737035&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14737035], 
> it would be good if YARN could suppress exceptions from 
> change-application-priority calls for applications at their finishing stages.
> Currently this is difficult for clients to handle. This would be similar to 
> the kill-application behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4166) Support changing container cpu resource

2015-09-16 Thread Jian He (JIRA)
Jian He created YARN-4166:
-

 Summary: Support changing container cpu resource
 Key: YARN-4166
 URL: https://issues.apache.org/jira/browse/YARN-4166
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Jian He


Memory resizing is now supported; we need to support the same for cpu.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4164) Retrospect update ApplicationPriority API return type

2015-09-16 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746952#comment-14746952
 ] 

Sunil G commented on YARN-4164:
---

Hi [~rohithsharma]
Thanks for raising this. To an extent I also feel that this is fine.
From the client side, if we want to verify the change immediately after 
calling {{updateApplicationPriority}}, we can avoid an extra RPC call (as the 
scheduler is capable of capping to max-cluster-priority, it is good to respond 
back with what was actually changed). This is a clear advantage.

However, reporting back the changed value is not very conventional in the 
Hadoop APIs I have seen. But if it adds value, I think it's ok. Looping in 
[~jianhe].
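
For illustration, a hypothetical sketch of the client-side flow if the response 
carried the effective priority ({{getApplicationPriority}} is the proposed 
addition, not the current API):
{code}
UpdateApplicationPriorityRequest request =
    UpdateApplicationPriorityRequest.newInstance(appId, Priority.newInstance(15));
UpdateApplicationPriorityResponse response =
    client.updateApplicationPriority(request);
// May be lower than 15 if it was capped at cluster.max-priority.
Priority effective = response.getApplicationPriority();
{code}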




> Retrospect update ApplicationPriority API return type
> -
>
> Key: YARN-4164
> URL: https://issues.apache.org/jira/browse/YARN-4164
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>
> Currently the {{ApplicationClientProtocol#updateApplicationPriority()}} API 
> returns an empty UpdateApplicationPriorityResponse.
> But the RM updates the priority to cluster.max-priority if the given priority 
> is greater than cluster.max-priority. In this scenario, the updated priority 
> needs to be communicated back to the client, rather than keeping quiet while 
> the client assumes that the given priority itself was taken.
> The same scenario can happen during application submission too, but I feel 
> that when 
> ApplicationClientProtocol#updateApplicationPriority() is invoked explicitly, 
> the response should carry the updated priority. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4000) RM crashes with NPE if leaf queue becomes parent queue during restart

2015-09-16 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746985#comment-14746985
 ] 

Varun Saxena commented on YARN-4000:


[~jianhe], moreover there will be subclasses of RMAppEvent which also have 
their own version of diagnostics.
Maybe we would want to refactor those as well. Should we do that as part of 
this JIRA itself?

> RM crashes with NPE if leaf queue becomes parent queue during restart
> -
>
> Key: YARN-4000
> URL: https://issues.apache.org/jira/browse/YARN-4000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-4000.01.patch, YARN-4000.02.patch
>
>
> This is a similar situation to YARN-2308.  If an application is active in 
> queue A and then the RM restarts with a changed capacity scheduler 
> configuration where queue A becomes a parent queue to other subqueues, then 
> the RM will crash with a NullPointerException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4034) Render cluster Max Priority in scheduler metrics in RM web UI

2015-09-16 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-4034:

Attachment: 0002-YARN-4034.patch

Updated the rebased patch.

> Render cluster Max Priority in scheduler metrics in RM web UI
> -
>
> Key: YARN-4034
> URL: https://issues.apache.org/jira/browse/YARN-4034
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Minor
> Attachments: 0001-YARN-4034.patch, 0001-YARN-4034.patch, 
> 0002-YARN-4034.patch, YARN-4034.PNG
>
>
> Currently the Scheduler Metrics section renders the common scheduler metrics 
> in the RM web UI. It would be helpful for the user to know the configured 
> cluster max priority from the web UI. 
> So, on the RM web UI front page, Scheduler Metrics can render the configured 
> max cluster priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4152) NM crash with NPE when LogAggregationService#stopContainer called for absent container

2015-09-16 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-4152:
---
Summary: NM crash with NPE when LogAggregationService#stopContainer called 
for absent container  (was: NM crash when LogAggregationService#stopContainer 
called for absent container)

> NM crash with NPE when LogAggregationService#stopContainer called for absent 
> container
> --
>
> Key: YARN-4152
> URL: https://issues.apache.org/jira/browse/YARN-4152
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: 0001-YARN-4152.patch
>
>
> NM crashed during log aggregation.
> Ran a Pi job with 500 containers and killed the application in between.
> *Logs*
> {code}
> 2015-09-12 18:44:25,597 WARN 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code 
> from container container_e51_1442063466801_0001_01_99 is : 143
> 2015-09-12 18:44:25,670 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
>  Event EventType: KILL_CONTAINER sent to absent container 
> container_e51_1442063466801_0001_01_000101
> 2015-09-12 18:44:25,670 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl:
>  Removing container_e51_1442063466801_0001_01_000101 from application 
> application_1442063466801_0001
> 2015-09-12 18:44:25,670 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.stopContainer(LogAggregationService.java:422)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:456)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:68)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> at java.lang.Thread.run(Thread.java:745)
> 2015-09-12 18:44:25,692 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got 
> event CONTAINER_STOP for appId application_1442063466801_0001
> 2015-09-12 18:44:25,692 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Exiting, bbye..
> 2015-09-12 18:44:25,692 INFO 
> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=dsperf   
> OPERATION=Container Finished - SucceededTARGET=ContainerImpl
> RESULT=SUCCESS  APPID=application_1442063466801_0001
> CONTAINERID=container_e51_1442063466801_0001_01_000100
> {code}
> *Analysis*
> Looks like {{stopContainer}} is called even for an absent container: 
> {code}
>   case CONTAINER_FINISHED:
> LogHandlerContainerFinishedEvent containerFinishEvent =
> (LogHandlerContainerFinishedEvent) event;
> stopContainer(containerFinishEvent.getContainerId(),
> containerFinishEvent.getExitCode());
> break;
> {code}
> *Event EventType: KILL_CONTAINER sent to absent container 
> container_e51_1442063466801_0001_01_000101*
> Should skip when {{null==context.getContainers().get(containerId)}} 
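
A minimal sketch of the suggested guard (hypothetical; the method shape is 
inferred from the trace above):
{code}
private void stopContainer(ContainerId containerId, int exitCode) {
  // Skip absent containers instead of dereferencing a null Container below.
  if (this.context.getContainers().get(containerId) == null) {
    LOG.warn("Ignoring CONTAINER_FINISHED for absent container " + containerId);
    return;
  }
  // ... existing log-aggregation handling ...
}
{code}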



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4034) Render cluster Max Priority in scheduler metrics in RM web UI

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747149#comment-14747149
 ] 

Hadoop QA commented on YARN-4034:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 42s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m  5s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 27s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 30s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 52s | The applied patch generated  5 
new checkstyle issues (total was 137, now 142). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 31s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  59m 21s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 100m 41s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.server.resourcemanager.webapp.TestNodesPage |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756180/0002-YARN-4034.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 2ffe2db |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9159/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/9159/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9159/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9159/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9159/console |


This message was automatically generated.

> Render cluster Max Priority in scheduler metrics in RM web UI
> -
>
> Key: YARN-4034
> URL: https://issues.apache.org/jira/browse/YARN-4034
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Minor
> Attachments: 0001-YARN-4034.patch, 0001-YARN-4034.patch, 
> 0002-YARN-4034.patch, YARN-4034.PNG
>
>
> Currently the Scheduler Metrics section renders the common scheduler metrics 
> in the RM web UI. It would be helpful for the user to know the configured 
> cluster max priority from the web UI. 
> So, on the RM web UI front page, Scheduler Metrics can render the configured 
> max cluster priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4034) Render cluster Max Priority in scheduler metrics in RM web UI

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747225#comment-14747225
 ] 

Hadoop QA commented on YARN-4034:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756201/0003-YARN-4034.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / bf2f2b4 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9160/console |


This message was automatically generated.

> Render cluster Max Priority in scheduler metrics in RM web UI
> -
>
> Key: YARN-4034
> URL: https://issues.apache.org/jira/browse/YARN-4034
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Minor
> Attachments: 0001-YARN-4034.patch, 0001-YARN-4034.patch, 
> 0002-YARN-4034.patch, 0003-YARN-4034.patch, YARN-4034.PNG
>
>
> Currently the Scheduler Metrics section renders the common scheduler metrics 
> in the RM web UI. It would be helpful for the user to know the configured 
> cluster max priority from the web UI. 
> So, on the RM web UI front page, Scheduler Metrics can render the configured 
> max cluster priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4078) getPendingResourceRequestForAttempt is present in AbstractYarnScheduler should be present in YarnScheduler interface instead

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747020#comment-14747020
 ] 

Hudson commented on YARN-4078:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8464 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8464/])
YARN-4078. Add getPendingResourceRequestForAttempt in YarnScheduler interface. 
Contributed by Naganarasimha G R (jianhe: rev 
452079af8bc56195945e28b8cf76620f0aca01c3)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java


> getPendingResourceRequestForAttempt is present in AbstractYarnScheduler 
> should be present in YarnScheduler interface instead  
> -
>
> Key: YARN-4078
> URL: https://issues.apache.org/jira/browse/YARN-4078
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-4078.20150915-2.patch, YARN-4078.20150915.patch
>
>
> Currently getPendingResourceRequestForAttempt is present in 
> {{AbstractYarnScheduler}}.
> *But in AppInfo,  we are calling this method by typecasting it to 
> AbstractYarnScheduler, which is incorrect.*
> Because if a custom scheduler is to be added, it will implement 
> YarnScheduler, not AbstractYarnScheduler.
> This method should be moved to YarnScheduler or it should have a guarded 
> check like in other places (RMAppAttemptBlock.getBlackListedNodes) 
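
For reference, a hypothetical sketch of the guarded-check alternative (the 
return type is an assumption):
{code}
// In AppInfo, analogous to RMAppAttemptBlock.getBlackListedNodes:
if (scheduler instanceof AbstractYarnScheduler) {
  List<ResourceRequest> pending = ((AbstractYarnScheduler) scheduler)
      .getPendingResourceRequestForAttempt(attemptId);
  // ... render the pending requests ...
}
{code}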



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4166) Support changing container cpu resource

2015-09-16 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747021#comment-14747021
 ] 

Jian He commented on YARN-4166:
---

yes, that's the idea.

> Support changing container cpu resource
> ---
>
> Key: YARN-4166
> URL: https://issues.apache.org/jira/browse/YARN-4166
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, nodemanager, resourcemanager
>Reporter: Jian He
>
> Memory resizing is now supported; we need to support the same for cpu.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4166) Support changing container cpu resource

2015-09-16 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-4166:
---

Assignee: Naganarasimha G R

> Support changing container cpu resource
> ---
>
> Key: YARN-4166
> URL: https://issues.apache.org/jira/browse/YARN-4166
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, nodemanager, resourcemanager
>Reporter: Jian He
>Assignee: Naganarasimha G R
>
> Memory resizing is now supported; we need to support the same for cpu.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4078) getPendingResourceRequestForAttempt is present in AbstractYarnScheduler should be present in YarnScheduler interface instead

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747179#comment-14747179
 ] 

Hudson commented on YARN-4078:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #402 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/402/])
YARN-4078. Add getPendingResourceRequestForAttempt in YarnScheduler interface. 
Contributed by Naganarasimha G R (jianhe: rev 
452079af8bc56195945e28b8cf76620f0aca01c3)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java


> getPendingResourceRequestForAttempt is present in AbstractYarnScheduler 
> should be present in YarnScheduler interface instead  
> -
>
> Key: YARN-4078
> URL: https://issues.apache.org/jira/browse/YARN-4078
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-4078.20150915-2.patch, YARN-4078.20150915.patch
>
>
> Currently getPendingResourceRequestForAttempt is present in 
> {{AbstractYarnScheduler}}.
> *But in AppInfo,  we are calling this method by typecasting it to 
> AbstractYarnScheduler, which is incorrect.*
> Because if a custom scheduler is to be added, it will implement 
> YarnScheduler, not AbstractYarnScheduler.
> This method should be moved to YarnScheduler or it should have a guarded 
> check like in other places (RMAppAttemptBlock.getBlackListedNodes) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4167) NPE on RMActiveServices#serviceStop when store is null

2015-09-16 Thread Bibin A Chundatt (JIRA)
Bibin A Chundatt created YARN-4167:
--

 Summary: NPE on RMActiveServices#serviceStop when store is null
 Key: YARN-4167
 URL: https://issues.apache.org/jira/browse/YARN-4167
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt


Configure 
{{yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs}} 
so that it mismatches {{yarn.nm.liveness-monitor.expiry-interval-ms}}.

On startup, an NPE is thrown in {{RMActiveServices#serviceStop}}:

{noformat}
2015-09-16 12:23:29,504 INFO org.apache.hadoop.service.AbstractService: Service 
RMActiveServices failed in state INITED; cause: 
java.lang.IllegalArgumentException: 
yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs should 
be more than 3 X yarn.nm.liveness-monitor.expiry-interval-ms
java.lang.IllegalArgumentException: 
yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs should 
be more than 3 X yarn.nm.liveness-monitor.expiry-interval-ms
 at 
org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.<init>(RMContainerTokenSecretManager.java:82)
 at 
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.createContainerTokenSecretManager(RMSecretManagerService.java:109)
 at 
org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.<init>(RMSecretManagerService.java:57)
 at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createRMSecretManagerService(ResourceManager.java:)
 at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:423)
 at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:963)
 at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:256)
 at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1193)
2015-09-16 12:23:29,507 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error closing 
store.
java.lang.NullPointerException
 at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStop(ResourceManager.java:608)
 at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
 at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
 at 
org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
 at org.apache.hadoop.service.AbstractService.init(AbstractService.java:171)
 at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:963)
 at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:256)
 at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1193
{noformat}

*Impact Area*: RM failover with wrong configuration
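
For illustration, a consistent configuration (values are hypothetical; the 
rolling interval in seconds must exceed three times the expiry interval):
{code}
Configuration conf = new YarnConfiguration();
// NM liveness expiry: 10 minutes (milliseconds).
conf.setLong("yarn.nm.liveness-monitor.expiry-interval-ms", 600000L);
// Container-token key rolling: 1 day (seconds); 86400 s > 3 x 600 s.
conf.setLong(
    "yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs",
    86400L);
{code}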



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3433) Jersey tests failing with Port in Use -again

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747227#comment-14747227
 ] 

Hudson commented on YARN-3433:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8465 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8465/])
YARN-3433. Jersey tests failing with Port in Use -again.  (Brahma Reddy 
Battula) (stevel: rev bf2f2b4fc436ea5990e6fc78eb18091b9458e75a)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/JerseyTestBase.java


> Jersey tests failing with Port in Use -again
> 
>
> Key: YARN-3433
> URL: https://issues.apache.org/jira/browse/YARN-3433
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, test
>Affects Versions: 3.0.0
> Environment: ASF Jenkins
>Reporter: Steve Loughran
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-3433.patch
>
>
> ASF Jenkins jersey tests are failing with port-in-use exceptions.
> The YARN-2912 patch tried to fix it, but it defaults to port 9998 and doesn't 
> scan for a spare port, so it is too brittle on a busy server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4000) RM crashes with NPE if leaf queue becomes parent queue during restart

2015-09-16 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746999#comment-14746999
 ] 

Jian He commented on YARN-4000:
---

sounds good to me. thanks

> RM crashes with NPE if leaf queue becomes parent queue during restart
> -
>
> Key: YARN-4000
> URL: https://issues.apache.org/jira/browse/YARN-4000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-4000.01.patch, YARN-4000.02.patch
>
>
> This is a similar situation to YARN-2308.  If an application is active in 
> queue A and then the RM restarts with a changed capacity scheduler 
> configuration where queue A becomes a parent queue to other subqueues, then 
> the RM will crash with a NullPointerException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4000) RM crashes with NPE if leaf queue becomes parent queue during restart

2015-09-16 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746997#comment-14746997
 ] 

Jian He commented on YARN-4000:
---

sounds good to me. thanks

> RM crashes with NPE if leaf queue becomes parent queue during restart
> -
>
> Key: YARN-4000
> URL: https://issues.apache.org/jira/browse/YARN-4000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-4000.01.patch, YARN-4000.02.patch
>
>
> This is a similar situation to YARN-2308.  If an application is active in 
> queue A and then the RM restarts with a changed capacity scheduler 
> configuration where queue A becomes a parent queue to other subqueues, then 
> the RM will crash with a NullPointerException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4000) RM crashes with NPE if leaf queue becomes parent queue during restart

2015-09-16 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746998#comment-14746998
 ] 

Jian He commented on YARN-4000:
---

sounds good to me. thanks

> RM crashes with NPE if leaf queue becomes parent queue during restart
> -
>
> Key: YARN-4000
> URL: https://issues.apache.org/jira/browse/YARN-4000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-4000.01.patch, YARN-4000.02.patch
>
>
> This is a similar situation to YARN-2308.  If an application is active in 
> queue A and then the RM restarts with a changed capacity scheduler 
> configuration where queue A becomes a parent queue to other subqueues, then 
> the RM will crash with a NullPointerException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (YARN-4000) RM crashes with NPE if leaf queue becomes parent queue during restart

2015-09-16 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-4000:
--
Comment: was deleted

(was: sounds good to me. thanks)

> RM crashes with NPE if leaf queue becomes parent queue during restart
> -
>
> Key: YARN-4000
> URL: https://issues.apache.org/jira/browse/YARN-4000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-4000.01.patch, YARN-4000.02.patch
>
>
> This is a similar situation to YARN-2308.  If an application is active in 
> queue A and then the RM restarts with a changed capacity scheduler 
> configuration where queue A becomes a parent queue to other subqueues, then 
> the RM will crash with a NullPointerException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (YARN-4000) RM crashes with NPE if leaf queue becomes parent queue during restart

2015-09-16 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-4000:
--
Comment: was deleted

(was: sounds good to me. thanks)

> RM crashes with NPE if leaf queue becomes parent queue during restart
> -
>
> Key: YARN-4000
> URL: https://issues.apache.org/jira/browse/YARN-4000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-4000.01.patch, YARN-4000.02.patch
>
>
> This is a similar situation to YARN-2308.  If an application is active in 
> queue A and then the RM restarts with a changed capacity scheduler 
> configuration where queue A becomes a parent queue to other subqueues, then 
> the RM will crash with a NullPointerException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4166) Support changing container cpu resource

2015-09-16 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747001#comment-14747001
 ] 

Naganarasimha G R commented on YARN-4166:
-

Hi [~jianhe],
IIUC, cgroups control the CPU usage, so is the idea to support changing the 
cpu resource by modifying the cgroup cpu control files?
If so, can I work on this JIRA?
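
For illustration, a minimal sketch along those lines (the cgroup paths and the 
quota formula are assumptions, not an agreed design):
{code}
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CpuResize {
  // Rewrite the container's CFS quota so its cpu share tracks the new vcores.
  static void setCpuQuota(String containerId, int vcores) throws Exception {
    Path dir = Paths.get("/sys/fs/cgroup/cpu/hadoop-yarn", containerId);
    Files.write(dir.resolve("cpu.cfs_period_us"),
        "100000".getBytes(StandardCharsets.UTF_8));   // 100 ms period
    Files.write(dir.resolve("cpu.cfs_quota_us"),
        Integer.toString(vcores * 100000).getBytes(StandardCharsets.UTF_8));
  }
}
{code}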

> Support changing container cpu resource
> ---
>
> Key: YARN-4166
> URL: https://issues.apache.org/jira/browse/YARN-4166
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, nodemanager, resourcemanager
>Reporter: Jian He
>
> Memory resizing is now supported; we need to support the same for cpu.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4034) Render cluster Max Priority in scheduler metrics in RM web UI

2015-09-16 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-4034:

Attachment: 0003-YARN-4034.patch

Updated the patch fixing test case failures

> Render cluster Max Priority in scheduler metrics in RM web UI
> -
>
> Key: YARN-4034
> URL: https://issues.apache.org/jira/browse/YARN-4034
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Minor
> Attachments: 0001-YARN-4034.patch, 0001-YARN-4034.patch, 
> 0002-YARN-4034.patch, 0003-YARN-4034.patch, YARN-4034.PNG
>
>
> Currently the Scheduler Metrics section renders the common scheduler metrics 
> in the RM web UI. It would be helpful for the user to know the configured 
> cluster max priority from the web UI. 
> So, on the RM web UI front page, Scheduler Metrics can render the configured 
> max cluster priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4135) Improve the assertion message in MockRM while failing after waiting for the state.

2015-09-16 Thread nijel (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747106#comment-14747106
 ] 

nijel commented on YARN-4135:
-

bq. -1  yarn tests  54m 15s  Tests failed in 
hadoop-yarn-server-resourcemanager.
The timed-out test is not related to this change.

Thanks

> Improve the assertion message in MockRM while failing after waiting for the 
> state.
> --
>
> Key: YARN-4135
> URL: https://issues.apache.org/jira/browse/YARN-4135
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: nijel
>Assignee: nijel
>Priority: Minor
>  Labels: test
> Attachments: YARN-4135_1.patch, YARN-4135_2.patch
>
>
> In MockRM, when a test fails after waiting for the given state, the 
> application id or the attempt id can be printed for easier debugging.
> As of now it is hard to track a test failure in the log since there is no 
> relation between the test case and the application id.
> Any thoughts ? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4078) getPendingResourceRequestForAttempt is present in AbstractYarnScheduler should be present in YarnScheduler interface instead

2015-09-16 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747111#comment-14747111
 ] 

Naganarasimha G R commented on YARN-4078:
-

Thanks for reviewing and committing [~jianhe] & [~rohithsharma]

> getPendingResourceRequestForAttempt is present in AbstractYarnScheduler 
> should be present in YarnScheduler interface instead  
> -
>
> Key: YARN-4078
> URL: https://issues.apache.org/jira/browse/YARN-4078
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-4078.20150915-2.patch, YARN-4078.20150915.patch
>
>
> Currently getPendingResourceRequestForAttempt is present in 
> {{AbstractYarnScheduler}}.
> *But in AppInfo,  we are calling this method by typecasting it to 
> AbstractYarnScheduler, which is incorrect.*
> Because if a custom scheduler is to be added, it will implement 
> YarnScheduler, not AbstractYarnScheduler.
> This method should be moved to YarnScheduler or it should have a guarded 
> check like in other places (RMAppAttemptBlock.getBlackListedNodes) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4078) getPendingResourceRequestForAttempt is present in AbstractYarnScheduler should be present in YarnScheduler interface instead

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747203#comment-14747203
 ] 

Hudson commented on YARN-4078:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1136 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1136/])
YARN-4078. Add getPendingResourceRequestForAttempt in YarnScheduler interface. 
Contributed by Naganarasimha G R (jianhe: rev 
452079af8bc56195945e28b8cf76620f0aca01c3)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
* hadoop-yarn-project/CHANGES.txt


> getPendingResourceRequestForAttempt is present in AbstractYarnScheduler 
> should be present in YarnScheduler interface instead  
> -
>
> Key: YARN-4078
> URL: https://issues.apache.org/jira/browse/YARN-4078
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-4078.20150915-2.patch, YARN-4078.20150915.patch
>
>
> Currently getPendingResourceRequestForAttempt is present in 
> {{AbstractYarnScheduler}}.
> *But in AppInfo,  we are calling this method by typecasting it to 
> AbstractYarnScheduler, which is incorrect.*
> Because if a custom scheduler is to be added, it will implement 
> YarnScheduler, not AbstractYarnScheduler.
> This method should be moved to YarnScheduler or it should have a guarded 
> check like in other places (RMAppAttemptBlock.getBlackListedNodes) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4154) Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change

2015-09-16 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-4154:

Assignee: (was: Akira AJISAKA)

> Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change
> ---
>
> Key: YARN-4154
> URL: https://issues.apache.org/jira/browse/YARN-4154
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Jeff Zhang
>Priority: Blocker
>
> {code}
> [ERROR] 
> /mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
>  no suitable constructor found for 
> MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> {code}
> MR might have the same issue.
> \cc [~vinodkv]
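
A possible workaround sketch (hypothetical; the config-based toggle is an 
assumption):
{code}
// Use the 5-int constructor that exists in 2.6.1 and enable the timeline
// service via configuration instead of the removed boolean parameter.
MiniYARNCluster cluster =
    new MiniYARNCluster("MiniTezClusterWithTimeline", 1, 1, 1, 1);
YarnConfiguration conf = new YarnConfiguration();
conf.setBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED, true);
cluster.init(conf);
cluster.start();
{code}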



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4154) Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change

2015-09-16 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA resolved YARN-4154.
-
Resolution: Done

Seems to be done by 9c4a6e1 and dba2b6. Thanks Vinod!

> Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change
> ---
>
> Key: YARN-4154
> URL: https://issues.apache.org/jira/browse/YARN-4154
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Jeff Zhang
>Priority: Blocker
>
> {code}
> [ERROR] 
> /mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
>  no suitable constructor found for 
> MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> {code}
> MR might have the same issue.
> \cc [~vinodkv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4168) Test TestLogAggregationService.testLocalFileDeletionOnDiskFull failing

2015-09-16 Thread Steve Loughran (JIRA)
Steve Loughran created YARN-4168:


 Summary: Test 
TestLogAggregationService.testLocalFileDeletionOnDiskFull failing
 Key: YARN-4168
 URL: https://issues.apache.org/jira/browse/YARN-4168
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
 Environment: Jenkins
Reporter: Steve Loughran
Priority: Critical


{{TestLogAggregationService.testLocalFileDeletionOnDiskFull}} failing on 
[Jenkins build 
1136|https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-Yarn-trunk/1136/testReport/junit/org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation/TestLogAggregationService/testLocalFileDeletionOnDiskFull/]
{code}
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertFalse(Assert.java:64)
at org.junit.Assert.assertFalse(Assert.java:74)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.verifyLocalFileDeletion(TestLogAggregationService.java:229)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLocalFileDeletionOnDiskFull(TestLogAggregationService.java:285)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4169) jenkins trunk+java build failed in TestNodeStatusUpdaterForLabels

2015-09-16 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-4169:
---

Assignee: Naganarasimha G R

> jenkins trunk+java build failed in TestNodeStatusUpdaterForLabels
> -
>
> Key: YARN-4169
> URL: https://issues.apache.org/jira/browse/YARN-4169
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
> Environment: Jenkins
>Reporter: Steve Loughran
>Assignee: Naganarasimha G R
>Priority: Critical
>
> Test failing in [Jenkins build 
> 402|https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-Yarn-trunk-Java8/402/testReport/junit/org.apache.hadoop.yarn.server.nodemanager/TestNodeStatusUpdaterForLabels/testNodeStatusUpdaterForNodeLabels/]
> {code}
> java.lang.NullPointerException: null
>   at java.util.HashSet.<init>(HashSet.java:118)
>   at 
> org.apache.hadoop.yarn.nodelabels.NodeLabelTestBase.assertNLCollectionEquals(NodeLabelTestBase.java:103)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels.testNodeStatusUpdaterForNodeLabels(TestNodeStatusUpdaterForLabels.java:268)
> {code}
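
A null-safe variant of the failing helper, as a hypothetical sketch (the real 
fix may instead ensure the NM reports non-null labels):
{code}
public static void assertNLCollectionEquals(Collection<NodeLabel> expected,
    Collection<NodeLabel> actual) {
  // new HashSet<>(null) is what currently throws in HashSet's constructor.
  Assert.assertNotNull("actual node-label collection is null", actual);
  Assert.assertEquals(new HashSet<>(expected), new HashSet<>(actual));
}
{code}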



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4034) Render cluster Max Priority in scheduler metrics in RM web UI

2015-09-16 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-4034:

Attachment: 0004-YARN-4034.patch

> Render cluster Max Priority in scheduler metrics in RM web UI
> -
>
> Key: YARN-4034
> URL: https://issues.apache.org/jira/browse/YARN-4034
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Minor
> Attachments: 0001-YARN-4034.patch, 0001-YARN-4034.patch, 
> 0002-YARN-4034.patch, 0003-YARN-4034.patch, 0004-YARN-4034.patch, 
> YARN-4034.PNG
>
>
> Currently, the Scheduler Metrics section renders the common scheduler metrics in the RM web UI. 
> It would be helpful for the user to know the configured cluster max 
> priority from the web UI. 
> So, on the RM web UI front page, Scheduler Metrics can render the configured max 
> cluster priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3433) Jersey tests failing with Port in Use -again

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14768846#comment-14768846
 ] 

Hudson commented on YARN-3433:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #396 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/396/])
YARN-3433. Jersey tests failing with Port in Use -again.  (Brahma Reddy 
Battula) (stevel: rev bf2f2b4fc436ea5990e6fc78eb18091b9458e75a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/JerseyTestBase.java
* hadoop-yarn-project/CHANGES.txt


> Jersey tests failing with Port in Use -again
> 
>
> Key: YARN-3433
> URL: https://issues.apache.org/jira/browse/YARN-3433
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, test
>Affects Versions: 3.0.0
> Environment: ASF Jenkins
>Reporter: Steve Loughran
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-3433.patch
>
>
> ASF Jenkins jersey tests failing with port in use exceptions.
> The YARN-2912 patch tried to fix it, but it defaults to port 9998 and doesn't 
> scan for a spare port —so is too brittle on a busy server
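
For illustration, a minimal sketch of the spare-port scan the fix implies (a sketch only, not the committed {{JerseyTestBase}} change):

{code}
import java.net.ServerSocket;

// Sketch only: bind port 0 so the OS picks a free ephemeral port instead of
// hard-coding 9998.
try (ServerSocket probe = new ServerSocket(0)) {
  int jerseyPort = probe.getLocalPort();
  // hand jerseyPort to the Jersey test container in place of the fixed default
}
{code}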



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747317#comment-14747317
 ] 

Hadoop QA commented on YARN-4009:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  21m 28s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 10s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   3m  3s | Site still builds. |
| {color:red}-1{color} | checkstyle |   1m 47s | The applied patch generated  5 
new checkstyle issues (total was 0, now 5). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 36s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 57s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   3m 48s | Tests passed in 
hadoop-yarn-server-applicationhistoryservice. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in 
hadoop-yarn-server-common. |
| | |  56m 33s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756213/YARN-4009.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / bf2f2b4 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9162/artifact/patchprocess/diffcheckstylehadoop-yarn-server-common.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9162/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-applicationhistoryservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9162/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9162/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9162/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9162/console |


This message was automatically generated.

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI (running 
> in a browser) cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST API for getting the application and application attempt 
> information exposed by the APIs. 
> It would be very useful if CORS were enabled for the REST APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4155) TestLogAggregationService.testLogAggregationServiceWithInterval failing

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747331#comment-14747331
 ] 

Hadoop QA commented on YARN-4155:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   6m 41s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 59s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 38s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 15s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   7m 39s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  26m 38s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756225/0001-YARN-4155.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / bf2f2b4 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9165/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9165/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9165/console |


This message was automatically generated.

> TestLogAggregationService.testLogAggregationServiceWithInterval failing
> ---
>
> Key: YARN-4155
> URL: https://issues.apache.org/jira/browse/YARN-4155
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
> Environment: Jenkins
>Reporter: Steve Loughran
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: 0001-YARN-4155.patch, 0001-YARN-4155.patch
>
>
> Test failing on Jenkins: 
> {{TestLogAggregationService.testLogAggregationServiceWithInterval}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4169) jenkins trunk+java build failed in TestNodeStatusUpdaterForLabels

2015-09-16 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747332#comment-14747332
 ] 

Steve Loughran commented on YARN-4169:
--

This may be a transient race condition, but it is caused by 
{{resourceTracker.labels}} being null.

{code}
assertNLCollectionEquals(resourceTracker.labels,
    dummyLabelsProviderRef.getNodeLabels());
{code}

Even if this isn't reproducible:
# the assert has the equals test backwards: the expected value comes first
# the {{assertNLCollectionEquals}} check needs some {{assertNotNull()}} checks 
with meaningful errors on its arguments
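
For illustration, a minimal sketch of that hardening, assuming JUnit 4's {{org.junit.Assert}} and that the provider's labels are the expected side (names and messages are examples, not the eventual patch):

{code}
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.NodeLabel;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotNull;

// Sketch only: null-check both collections with meaningful messages, and put
// the expected value first in the comparison.
public static void assertNLCollectionEquals(Set<NodeLabel> expected,
    Set<NodeLabel> actual) {
  assertNotNull("expected node-label collection is null", expected);
  assertNotNull("actual node-label collection is null", actual);
  assertEquals(new HashSet<>(expected), new HashSet<>(actual));
}

// Call site with the argument order corrected:
assertNLCollectionEquals(dummyLabelsProviderRef.getNodeLabels(),
    resourceTracker.labels);
{code}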

> jenkins trunk+java build failed in TestNodeStatusUpdaterForLabels
> -
>
> Key: YARN-4169
> URL: https://issues.apache.org/jira/browse/YARN-4169
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
> Environment: Jenkins
>Reporter: Steve Loughran
>Priority: Critical
>
> Test failing in [Jenkins build 
> 402|https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-Yarn-trunk-Java8/402/testReport/junit/org.apache.hadoop.yarn.server.nodemanager/TestNodeStatusUpdaterForLabels/testNodeStatusUpdaterForNodeLabels/]
> {code}
> java.lang.NullPointerException: null
>   at java.util.HashSet.<init>(HashSet.java:118)
>   at 
> org.apache.hadoop.yarn.nodelabels.NodeLabelTestBase.assertNLCollectionEquals(NodeLabelTestBase.java:103)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels.testNodeStatusUpdaterForNodeLabels(TestNodeStatusUpdaterForLabels.java:268)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4034) Render cluster Max Priority in scheduler metrics in RM web UI

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747356#comment-14747356
 ] 

Hadoop QA commented on YARN-4034:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 52s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 55s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  1s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 51s | The applied patch generated  3 
new checkstyle issues (total was 137, now 140). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 30s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  54m 34s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  94m 13s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756211/0004-YARN-4034.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / bf2f2b4 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9161/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/9161/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9161/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9161/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9161/console |


This message was automatically generated.

> Render cluster Max Priority in scheduler metrics in RM web UI
> -
>
> Key: YARN-4034
> URL: https://issues.apache.org/jira/browse/YARN-4034
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Minor
> Attachments: 0001-YARN-4034.patch, 0001-YARN-4034.patch, 
> 0002-YARN-4034.patch, 0003-YARN-4034.patch, 0004-YARN-4034.patch, 
> YARN-4034.PNG
>
>
> Currently, the Scheduler Metrics section renders the common scheduler metrics in the RM web UI. 
> It would be helpful for the user to know the configured cluster max 
> priority from the web UI. 
> So, on the RM web UI front page, Scheduler Metrics can render the configured max 
> cluster priority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4078) getPendingResourceRequestForAttempt is present in AbstractYarnScheduler should be present in YarnScheduler interface instead

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14768858#comment-14768858
 ] 

Hudson commented on YARN-4078:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #379 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/379/])
YARN-4078. Add getPendingResourceRequestForAttempt in YarnScheduler interface. 
Contributed by Naganarasimha G R (jianhe: rev 
452079af8bc56195945e28b8cf76620f0aca01c3)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java


> getPendingResourceRequestForAttempt is present in AbstractYarnScheduler 
> should be present in YarnScheduler interface instead  
> -
>
> Key: YARN-4078
> URL: https://issues.apache.org/jira/browse/YARN-4078
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-4078.20150915-2.patch, YARN-4078.20150915.patch
>
>
> Currently getPendingResourceRequestForAttempt is present in 
> {{AbstractYarnScheduler}}.
> *But in AppInfo, we call this method after typecasting the scheduler to 
> AbstractYarnScheduler, which is incorrect.*
> If a custom scheduler is added, it will implement 
> YarnScheduler, not necessarily AbstractYarnScheduler.
> This method should be moved to YarnScheduler, or the call should have a guarded 
> check as in other places (RMAppAttemptBlock.getBlackListedNodes).
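
For context, a hedged sketch of the guarded-check alternative mentioned in the description, mirroring the {{RMAppAttemptBlock.getBlackListedNodes}} pattern (variable names are illustrative, not the patch's code):

{code}
import java.util.Collections;
import java.util.List;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

// Sketch only: call the method solely when the active scheduler actually
// extends AbstractYarnScheduler, instead of typecasting unconditionally.
List<ResourceRequest> pending = Collections.emptyList();
if (scheduler instanceof AbstractYarnScheduler) {
  pending = ((AbstractYarnScheduler) scheduler)
      .getPendingResourceRequestForAttempt(attemptId);
}
{code}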



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3433) Jersey tests failing with Port in Use -again

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14768859#comment-14768859
 ] 

Hudson commented on YARN-3433:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #379 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/379/])
YARN-3433. Jersey tests failing with Port in Use -again.  (Brahma Reddy 
Battula) (stevel: rev bf2f2b4fc436ea5990e6fc78eb18091b9458e75a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/JerseyTestBase.java
* hadoop-yarn-project/CHANGES.txt


> Jersey tests failing with Port in Use -again
> 
>
> Key: YARN-3433
> URL: https://issues.apache.org/jira/browse/YARN-3433
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, test
>Affects Versions: 3.0.0
> Environment: ASF Jenkins
>Reporter: Steve Loughran
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-3433.patch
>
>
> ASF Jenkins jersey tests failing with port in use exceptions.
> The YARN-2912 patch tried to fix it, but it defaults to port 9998 and doesn't 
> scan for a spare port —so is too brittle on a busy server



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-4009:

Attachment: YARN-4009.002.patch

Uploaded a newer version of the patch with some documentation fixes.

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI (running 
> in a browser) cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST API for getting the application and application attempt 
> information exposed by the APIs. 
> It would be very useful if CORS were enabled for the REST APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4078) getPendingResourceRequestForAttempt is present in AbstractYarnScheduler should be present in YarnScheduler interface instead

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747271#comment-14747271
 ] 

Hudson commented on YARN-4078:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #395 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/395/])
YARN-4078. Add getPendingResourceRequestForAttempt in YarnScheduler interface. 
Contributed by Naganarasimha G R (jianhe: rev 
452079af8bc56195945e28b8cf76620f0aca01c3)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java


> getPendingResourceRequestForAttempt is present in AbstractYarnScheduler 
> should be present in YarnScheduler interface instead  
> -
>
> Key: YARN-4078
> URL: https://issues.apache.org/jira/browse/YARN-4078
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-4078.20150915-2.patch, YARN-4078.20150915.patch
>
>
> Currently getPendingResourceRequestForAttempt is present in 
> {{AbstractYarnScheduler}}.
> *But in AppInfo, we call this method after typecasting the scheduler to 
> AbstractYarnScheduler, which is incorrect.*
> If a custom scheduler is added, it will implement 
> YarnScheduler, not necessarily AbstractYarnScheduler.
> This method should be moved to YarnScheduler, or the call should have a guarded 
> check as in other places (RMAppAttemptBlock.getBlackListedNodes).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4169) jenkins trunk+java build failed in TestNodeStatusUpdaterForLabels

2015-09-16 Thread Steve Loughran (JIRA)
Steve Loughran created YARN-4169:


 Summary: jenkins trunk+java build failed in 
TestNodeStatusUpdaterForLabels
 Key: YARN-4169
 URL: https://issues.apache.org/jira/browse/YARN-4169
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
 Environment: Jenkins
Reporter: Steve Loughran
Priority: Critical


Test failing in [Jenkins build 
402|https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-Yarn-trunk-Java8/402/testReport/junit/org.apache.hadoop.yarn.server.nodemanager/TestNodeStatusUpdaterForLabels/testNodeStatusUpdaterForNodeLabels/]

{code}
java.lang.NullPointerException: null
at java.util.HashSet.<init>(HashSet.java:118)
at 
org.apache.hadoop.yarn.nodelabels.NodeLabelTestBase.assertNLCollectionEquals(NodeLabelTestBase.java:103)
at 
org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels.testNodeStatusUpdaterForNodeLabels(TestNodeStatusUpdaterForLabels.java:268)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4078) getPendingResourceRequestForAttempt is present in AbstractYarnScheduler should be present in YarnScheduler interface instead

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747373#comment-14747373
 ] 

Hudson commented on YARN-4078:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2318 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2318/])
YARN-4078. Add getPendingResourceRequestForAttempt in YarnScheduler interface. 
Contributed by Naganarasimha G R (jianhe: rev 
452079af8bc56195945e28b8cf76620f0aca01c3)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java


> getPendingResourceRequestForAttempt is present in AbstractYarnScheduler 
> should be present in YarnScheduler interface instead  
> -
>
> Key: YARN-4078
> URL: https://issues.apache.org/jira/browse/YARN-4078
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-4078.20150915-2.patch, YARN-4078.20150915.patch
>
>
> Currently getPendingResourceRequestForAttempt is present in 
> {{AbstractYarnScheduler}}.
> *But in AppInfo, we call this method after typecasting the scheduler to 
> AbstractYarnScheduler, which is incorrect.*
> If a custom scheduler is added, it will implement 
> YarnScheduler, not necessarily AbstractYarnScheduler.
> This method should be moved to YarnScheduler, or the call should have a guarded 
> check as in other places (RMAppAttemptBlock.getBlackListedNodes).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-261) Ability to kill AM attempts

2015-09-16 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-261:
---
Attachment: 0001-YARN-261.patch

> Ability to kill AM attempts
> ---
>
> Key: YARN-261
> URL: https://issues.apache.org/jira/browse/YARN-261
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api
>Affects Versions: 2.0.3-alpha
>Reporter: Jason Lowe
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-261.patch, YARN-261--n2.patch, 
> YARN-261--n3.patch, YARN-261--n4.patch, YARN-261--n5.patch, 
> YARN-261--n6.patch, YARN-261--n7.patch, YARN-261.patch
>
>
> It would be nice if clients could ask for an AM attempt to be killed.  This 
> is analogous to the task attempt kill support provided by MapReduce.
> This feature would be useful in a scenario where AM retries are enabled, the 
> AM supports recovery, and a particular AM attempt is stuck.  Currently if 
> this occurs the user's only recourse is to kill the entire application, 
> requiring them to resubmit a new application and potentially breaking 
> downstream dependent jobs if it's part of a bigger workflow.  Killing the 
> attempt would allow a new attempt to be started by the RM without killing the 
> entire application, and if the AM supports recovery it could potentially save 
> a lot of work.  It could also be useful in workflow scenarios where the 
> failure of the entire application kills the workflow, but the ability to kill 
> an attempt can keep the workflow going if the subsequent attempt succeeds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-261) Ability to kill AM attempts

2015-09-16 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747384#comment-14747384
 ] 

Rohith Sharma K S commented on YARN-261:


Updated the rebased patch. I have verified the patch on a cluster and it is 
working fine. 
Some of the changes from earlier patches are:
# The client API now takes only the application attempt ID, whereas the earlier 
patch took an additional argument.
# The client CLI command is {{./yarn applicationattempt -fail <Application Attempt ID>}}
# The help message for failing an attempt is 
{code}
usage: applicationattempt
 -fail <Application Attempt ID>  Fails application attempt.
{code}
# When fail is invoked on an application attempt, that attempt's failure is 
counted against the max-attempts limit for launching new attempts.
# More functional tests can be added; I will add them in subsequent patches.

Kindly review the updated patch; a client-side usage sketch follows.
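
A hypothetical client-side sketch of the same operation, assuming the patch exposes a {{failApplicationAttempt(ApplicationAttemptId)}} call on {{YarnClient}} (the attempt ID is a made-up example; exception handling omitted):

{code}
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch only, not the committed API surface.
YarnClient client = YarnClient.createYarnClient();
client.init(new YarnConfiguration());
client.start();
ApplicationAttemptId attemptId = ApplicationAttemptId.fromString(
    "appattempt_1442390000000_0001_000001");  // made-up example ID
client.failApplicationAttempt(attemptId);     // RM fails only this attempt
client.stop();
{code}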

> Ability to kill AM attempts
> ---
>
> Key: YARN-261
> URL: https://issues.apache.org/jira/browse/YARN-261
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api
>Affects Versions: 2.0.3-alpha
>Reporter: Jason Lowe
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-261.patch, YARN-261--n2.patch, 
> YARN-261--n3.patch, YARN-261--n4.patch, YARN-261--n5.patch, 
> YARN-261--n6.patch, YARN-261--n7.patch, YARN-261.patch
>
>
> It would be nice if clients could ask for an AM attempt to be killed.  This 
> is analogous to the task attempt kill support provided by MapReduce.
> This feature would be useful in a scenario where AM retries are enabled, the 
> AM supports recovery, and a particular AM attempt is stuck.  Currently if 
> this occurs the user's only recourse is to kill the entire application, 
> requiring them to resubmit a new application and potentially breaking 
> downstream dependent jobs if it's part of a bigger workflow.  Killing the 
> attempt would allow a new attempt to be started by the RM without killing the 
> entire application, and if the AM supports recovery it could potentially save 
> a lot of work.  It could also be useful in workflow scenarios where the 
> failure of the entire application kills the workflow, but the ability to kill 
> an attempt can keep the workflow going if the subsequent attempt succeeds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4078) getPendingResourceRequestForAttempt is present in AbstractYarnScheduler should be present in YarnScheduler interface instead

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747293#comment-14747293
 ] 

Hudson commented on YARN-4078:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2342 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2342/])
YARN-4078. Add getPendingResourceRequestForAttempt in YarnScheduler interface. 
Contributed by Naganarasimha G R (jianhe: rev 
452079af8bc56195945e28b8cf76620f0aca01c3)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* hadoop-yarn-project/CHANGES.txt


> getPendingResourceRequestForAttempt is present in AbstractYarnScheduler 
> should be present in YarnScheduler interface instead  
> -
>
> Key: YARN-4078
> URL: https://issues.apache.org/jira/browse/YARN-4078
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-4078.20150915-2.patch, YARN-4078.20150915.patch
>
>
> Currently getPendingResourceRequestForAttempt is present in 
> {{AbstractYarnScheduler}}.
> *But in AppInfo, we call this method after typecasting the scheduler to 
> AbstractYarnScheduler, which is incorrect.*
> If a custom scheduler is added, it will implement 
> YarnScheduler, not necessarily AbstractYarnScheduler.
> This method should be moved to YarnScheduler, or the call should have a guarded 
> check as in other places (RMAppAttemptBlock.getBlackListedNodes).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3433) Jersey tests failing with Port in Use -again

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747347#comment-14747347
 ] 

Hudson commented on YARN-3433:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1137 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1137/])
YARN-3433. Jersey tests failing with Port in Use -again.  (Brahma Reddy 
Battula) (stevel: rev bf2f2b4fc436ea5990e6fc78eb18091b9458e75a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/JerseyTestBase.java
* hadoop-yarn-project/CHANGES.txt


> Jersey tests failing with Port in Use -again
> 
>
> Key: YARN-3433
> URL: https://issues.apache.org/jira/browse/YARN-3433
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, test
>Affects Versions: 3.0.0
> Environment: ASF Jenkins
>Reporter: Steve Loughran
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-3433.patch
>
>
> ASF Jenkins jersey tests failing with port in use exceptions.
> The YARN-2912 patch tried to fix it, but it defaults to port 9998 and doesn't 
> scan for a spare port —so is too brittle on a busy server



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4169) jenkins trunk+java build failed in TestNodeStatusUpdaterForLabels

2015-09-16 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747354#comment-14747354
 ] 

Naganarasimha G R commented on YARN-4169:
-

Hi [~steve_l],
This test is passing locally. I will try to analyze it further and also fix the 
other issues you mentioned.



> jenkins trunk+java build failed in TestNodeStatusUpdaterForLabels
> -
>
> Key: YARN-4169
> URL: https://issues.apache.org/jira/browse/YARN-4169
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
> Environment: Jenkins
>Reporter: Steve Loughran
>Assignee: Naganarasimha G R
>Priority: Critical
>
> Test failing in [Jenkins build 
> 402|https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-Yarn-trunk-Java8/402/testReport/junit/org.apache.hadoop.yarn.server.nodemanager/TestNodeStatusUpdaterForLabels/testNodeStatusUpdaterForNodeLabels/]
> {code}
> java.lang.NullPointerException: null
>   at java.util.HashSet.<init>(HashSet.java:118)
>   at 
> org.apache.hadoop.yarn.nodelabels.NodeLabelTestBase.assertNLCollectionEquals(NodeLabelTestBase.java:103)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels.testNodeStatusUpdaterForNodeLabels(TestNodeStatusUpdaterForLabels.java:268)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3433) Jersey tests failing with Port in Use -again

2015-09-16 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747353#comment-14747353
 ] 

Brahma Reddy Battula commented on YARN-3433:


[~ste...@apache.org], thanks for committing!

> Jersey tests failing with Port in Use -again
> 
>
> Key: YARN-3433
> URL: https://issues.apache.org/jira/browse/YARN-3433
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, test
>Affects Versions: 3.0.0
> Environment: ASF Jenkins
>Reporter: Steve Loughran
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-3433.patch
>
>
> ASF Jenkins jersey tests failing with port in use exceptions.
> The YARN-2912 patch tried to fix it, but it defaults to port 9998 and doesn't 
> scan for a spare port —so is too brittle on a busy server



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4170) AM need to be notified with priority in AllocateResponse

2015-09-16 Thread Sunil G (JIRA)
Sunil G created YARN-4170:
-

 Summary: AM need to be notified with priority in AllocateResponse 
 Key: YARN-4170
 URL: https://issues.apache.org/jira/browse/YARN-4170
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Sunil G
Assignee: Sunil G


As discussed in MAPREDUCE-5870, the Application Master needs to be notified of 
the application priority in the Allocate heartbeat.  This will help the AM know 
the priority and update the JobStatus when the client asks. 
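
A hypothetical AM-side sketch of how this could surface, assuming the change adds a {{getApplicationPriority()}} accessor to {{AllocateResponse}} (names are illustrative):

{code}
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Priority;

// Sketch only: read the current application priority off the allocate
// heartbeat response and cache it for later JobStatus queries.
AllocateResponse response = amRMClient.allocate(progressIndicator);
Priority currentPriority = response.getApplicationPriority();
jobStatus.setPriority(currentPriority);  // hypothetical AM-side bookkeeping
{code}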



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4167) NPE on RMActiveServices#serviceStop when store is null

2015-09-16 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-4167:
---
Priority: Minor  (was: Major)

> NPE on RMActiveServices#serviceStop when store is null
> --
>
> Key: YARN-4167
> URL: https://issues.apache.org/jira/browse/YARN-4167
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: 0001-YARN-4167.patch
>
>
> Configure 
> {{yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs}} 
> so that it mismatches {{yarn.nm.liveness-monitor.expiry-interval-ms}}.
> On startup, an NPE is thrown in {{RMActiveServices#serviceStop}}:
> {noformat}
> 2015-09-16 12:23:29,504 INFO org.apache.hadoop.service.AbstractService: 
> Service RMActiveServices failed in state INITED; cause: 
> java.lang.IllegalArgumentException: 
> yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs should 
> be more than 3 X yarn.nm.liveness-monitor.expiry-interval-ms
> java.lang.IllegalArgumentException: 
> yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs should 
> be more than 3 X yarn.nm.liveness-monitor.expiry-interval-ms
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.<init>(RMContainerTokenSecretManager.java:82)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.createContainerTokenSecretManager(RMSecretManagerService.java:109)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.<init>(RMSecretManagerService.java:57)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createRMSecretManagerService(ResourceManager.java:)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:423)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:963)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:256)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1193)
> 2015-09-16 12:23:29,507 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error closing 
> store.
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStop(ResourceManager.java:608)
>  at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>  at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>  at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:171)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:963)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:256)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1193
> {noformat}
> *Impact Area*: RM failover with wrong configuration



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4167) NPE on RMActiveServices#serviceStop when store is null

2015-09-16 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-4167:
---
Attachment: 0001-YARN-4167.patch

It's an improvement. 

> NPE on RMActiveServices#serviceStop when store is null
> --
>
> Key: YARN-4167
> URL: https://issues.apache.org/jira/browse/YARN-4167
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4167.patch
>
>
> Configure 
> {{yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs}} 
> so that it mismatches {{yarn.nm.liveness-monitor.expiry-interval-ms}}.
> On startup, an NPE is thrown in {{RMActiveServices#serviceStop}}:
> {noformat}
> 2015-09-16 12:23:29,504 INFO org.apache.hadoop.service.AbstractService: 
> Service RMActiveServices failed in state INITED; cause: 
> java.lang.IllegalArgumentException: 
> yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs should 
> be more than 3 X yarn.nm.liveness-monitor.expiry-interval-ms
> java.lang.IllegalArgumentException: 
> yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs should 
> be more than 3 X yarn.nm.liveness-monitor.expiry-interval-ms
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.<init>(RMContainerTokenSecretManager.java:82)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.createContainerTokenSecretManager(RMSecretManagerService.java:109)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.<init>(RMSecretManagerService.java:57)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createRMSecretManagerService(ResourceManager.java:)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:423)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:963)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:256)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1193)
> 2015-09-16 12:23:29,507 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error closing 
> store.
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStop(ResourceManager.java:608)
>  at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>  at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>  at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:171)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:963)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:256)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1193
> {noformat}
> *Impact Area*: RM failover with wrong configuration



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4167) NPE on RMActiveServices#serviceStop when store is null

2015-09-16 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747275#comment-14747275
 ] 

Bibin A Chundatt commented on YARN-4167:


Attaching a patch for the same. Please review.
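
For illustration, a minimal sketch of the kind of null guard such a patch might add in {{RMActiveServices#serviceStop}} (the field name {{rmStore}} and the {{LOG}} reference are hypothetical; this is not the attached patch):

{code}
// Sketch only: if serviceInit() failed before the state store was created,
// skip closing it instead of dereferencing a null field.
@Override
protected void serviceStop() throws Exception {
  if (rmStore != null) {
    try {
      rmStore.close();
    } catch (Exception e) {
      LOG.error("Error closing store.", e);
    }
  }
  super.serviceStop();
}
{code}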

> NPE on RMActiveServices#serviceStop when store is null
> --
>
> Key: YARN-4167
> URL: https://issues.apache.org/jira/browse/YARN-4167
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: 0001-YARN-4167.patch
>
>
> Configure 
> {{yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs}} 
> so that it mismatches {{yarn.nm.liveness-monitor.expiry-interval-ms}}.
> On startup, an NPE is thrown in {{RMActiveServices#serviceStop}}:
> {noformat}
> 2015-09-16 12:23:29,504 INFO org.apache.hadoop.service.AbstractService: 
> Service RMActiveServices failed in state INITED; cause: 
> java.lang.IllegalArgumentException: 
> yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs should 
> be more than 3 X yarn.nm.liveness-monitor.expiry-interval-ms
> java.lang.IllegalArgumentException: 
> yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs should 
> be more than 3 X yarn.nm.liveness-monitor.expiry-interval-ms
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.<init>(RMContainerTokenSecretManager.java:82)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.createContainerTokenSecretManager(RMSecretManagerService.java:109)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService.<init>(RMSecretManagerService.java:57)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createRMSecretManagerService(ResourceManager.java:)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:423)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:963)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:256)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1193)
> 2015-09-16 12:23:29,507 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error closing 
> store.
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStop(ResourceManager.java:608)
>  at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>  at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>  at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:171)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:963)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:256)
>  at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1193
> {noformat}
> *Impact Area*: RM failover with wrong configuration



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3212) RMNode State Transition Update with DECOMMISSIONING state

2015-09-16 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3212:
-
Attachment: YARN-3212-v6.2.patch

The unit test failure should be unrelated. Fixed the whitespace issue in the v6.2 patch.

> RMNode State Transition Update with DECOMMISSIONING state
> -
>
> Key: YARN-3212
> URL: https://issues.apache.org/jira/browse/YARN-3212
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: RMNodeImpl - new.png, YARN-3212-v1.patch, 
> YARN-3212-v2.patch, YARN-3212-v3.patch, YARN-3212-v4.1.patch, 
> YARN-3212-v4.patch, YARN-3212-v5.1.patch, YARN-3212-v5.patch, 
> YARN-3212-v6.1.patch, YARN-3212-v6.2.patch, YARN-3212-v6.patch
>
>
> As proposed in YARN-914, a new state of “DECOMMISSIONING” will be added; a node 
> can transition into it from the “running” state, triggered by a new 
> “decommissioning” event. 
> This new state can transition to “decommissioned” on a Resource_Update 
> when no apps are running on the NM, when the NM reconnects after a restart, 
> or when it receives a DECOMMISSIONED event (after the timeout from the CLI).
> In addition, it can go back to “running” if the user decides to cancel the previous 
> decommission by calling recommission on the same node. The reaction to other 
> events is similar to the RUNNING state.
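
A hedged sketch of the transitions described above, in the style of {{RMNodeImpl}}'s {{StateMachineFactory}} fragment (event names are illustrative, not necessarily the patch's exact constants):

{code}
// Sketch only: RUNNING -> DECOMMISSIONING on a graceful-decommission event,
// back to RUNNING on recommission, and to DECOMMISSIONED on timeout/update.
.addTransition(NodeState.RUNNING, NodeState.DECOMMISSIONING,
    RMNodeEventType.GRACEFUL_DECOMMISSION)
.addTransition(NodeState.DECOMMISSIONING, NodeState.RUNNING,
    RMNodeEventType.RECOMMISSION)
.addTransition(NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONED,
    RMNodeEventType.DECOMMISSION)
{code}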



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4155) TestLogAggregationService.testLogAggregationServiceWithInterval failing

2015-09-16 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-4155:
---
Attachment: 0001-YARN-4155.patch

Attaching the same patch again to trigger Jenkins.

> TestLogAggregationService.testLogAggregationServiceWithInterval failing
> ---
>
> Key: YARN-4155
> URL: https://issues.apache.org/jira/browse/YARN-4155
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
> Environment: Jenkins
>Reporter: Steve Loughran
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: 0001-YARN-4155.patch, 0001-YARN-4155.patch
>
>
> Test failing on Jenkins: 
> {{TestLogAggregationService.testLogAggregationServiceWithInterval}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3433) Jersey tests failing with Port in Use -again

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747313#comment-14747313
 ] 

Hudson commented on YARN-3433:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #403 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/403/])
YARN-3433. Jersey tests failing with Port in Use -again.  (Brahma Reddy 
Battula) (stevel: rev bf2f2b4fc436ea5990e6fc78eb18091b9458e75a)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/JerseyTestBase.java


> Jersey tests failing with Port in Use -again
> 
>
> Key: YARN-3433
> URL: https://issues.apache.org/jira/browse/YARN-3433
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, test
>Affects Versions: 3.0.0
> Environment: ASF Jenkins
>Reporter: Steve Loughran
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-3433.patch
>
>
> ASF Jenkins jersey tests failing with port in use exceptions.
> The YARN-2912 patch tried to fix it, but it defaults to port 9998 and doesn't 
> scan for a spare port —so is too brittle on a busy server



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3433) Jersey tests failing with Port in Use -again

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747359#comment-14747359
 ] 

Hudson commented on YARN-3433:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2343 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2343/])
YARN-3433. Jersey tests failing with Port in Use -again.  (Brahma Reddy 
Battula) (stevel: rev bf2f2b4fc436ea5990e6fc78eb18091b9458e75a)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/JerseyTestBase.java


> Jersey tests failing with Port in Use -again
> 
>
> Key: YARN-3433
> URL: https://issues.apache.org/jira/browse/YARN-3433
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, test
>Affects Versions: 3.0.0
> Environment: ASF Jenkins
>Reporter: Steve Loughran
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-3433.patch
>
>
> ASF Jenkins jersey tests failing with port in use exceptions.
> The YARN-2912 patch tried to fix it, but it defaults to port 9998 and doesn't 
> scan for a spare port —so is too brittle on a busy server



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-4009:

Attachment: YARN-4009.003.patch

Uploaded a new version of the patch that moves the filter into hadoop-common; 
there is no reason the filter needs to be YARN-specific.
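
A minimal sketch of such a YARN-agnostic filter, assuming the standard {{javax.servlet}} API (the header values are illustrative defaults, not the patch's configuration keys):

{code}
import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.HttpServletResponse;

// Sketch only: stamp permissive CORS headers on every response.
public class SimpleCorsFilter implements Filter {
  @Override
  public void doFilter(ServletRequest req, ServletResponse res,
      FilterChain chain) throws IOException, ServletException {
    HttpServletResponse httpRes = (HttpServletResponse) res;
    httpRes.setHeader("Access-Control-Allow-Origin", "*");
    httpRes.setHeader("Access-Control-Allow-Methods", "GET,HEAD,OPTIONS");
    httpRes.setHeader("Access-Control-Allow-Headers",
        "X-Requested-With,Content-Type,Accept,Origin");
    chain.doFilter(req, res);
  }
  @Override
  public void init(FilterConfig conf) {
  }
  @Override
  public void destroy() {
  }
}
{code}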

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI (running 
> in a browser) cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST API for getting the application and application attempt 
> information exposed by the APIs. 
> It would be very useful if CORS were enabled for the REST APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4165) An outstanding container request makes all nodes to be reserved causing all jobs pending

2015-09-16 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14768915#comment-14768915
 ] 

Jason Lowe commented on YARN-4165:
--

Is this with the fair scheduler?  I don't believe the capacity scheduler has 
this issue after YARN-957, which was fixed a long time ago.

> An outstanding container request makes all nodes to be reserved causing all 
> jobs pending
> 
>
> Key: YARN-4165
> URL: https://issues.apache.org/jira/browse/YARN-4165
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager, scheduler
>Affects Versions: 2.7.1
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>
> We have a long-running service in YARN with an outstanding container 
> request that YARN cannot satisfy (it requires more memory than any NodeManager 
> can supply). YARN then reserves all nodes for this application. When I submit 
> other jobs (requiring relatively little memory, which the NodeManagers can 
> supply), all jobs stay pending because YARN skips scheduling containers on the 
> nodes that have been reserved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-306) FIFO scheduler doesn't respect changing job priority

2015-09-16 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved YARN-306.
-
Resolution: Won't Fix

bq. Thinking that it may violate FIFO scheduling policy.
Well, if we're worried about that, then the same logic would apply to the 
CapacityScheduler.  It was also using a FIFO scheduling policy before priority 
support was added.  All that adding it to the FIFO scheduler would mean is that 
jobs would be ordered by priority, then FIFO within the same priority.

However, I agree with Jian that we probably don't need to add the support unless 
there is sufficient demand.  If users really want the FIFO scheduler with 
priority, then they can run the CapacityScheduler with a single queue, which 
should be effectively equivalent (a configuration sketch follows below).
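
For reference, a hedged sketch of that single-queue CapacityScheduler setup, using standard configuration keys (shown programmatically for brevity; not a recommendation for any particular cluster):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler;

// Sketch only: route everything through one queue so scheduling is FIFO
// within priority, approximating a priority-aware FIFO scheduler.
Configuration conf = new YarnConfiguration();
conf.set(YarnConfiguration.RM_SCHEDULER, CapacityScheduler.class.getName());
conf.set("yarn.scheduler.capacity.root.queues", "default");
conf.setInt("yarn.scheduler.capacity.root.default.capacity", 100);
{code}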

> FIFO scheduler doesn't respect changing job priority
> 
>
> Key: YARN-306
> URL: https://issues.apache.org/jira/browse/YARN-306
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Nishan Shetty
>Assignee: Rohith Sharma K S
>
> 1. Submit a job.
> 2. Change the job priority using setPriority() or the CLI command ./mapred 
> job -set-priority  
> Observe that the job priority is not changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4141) Runtime Application Priority change should not throw exception for applications at finishing states

2015-09-16 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14768994#comment-14768994
 ] 

Jason Lowe commented on YARN-4141:
--

Thanks for updating the patch.  One last nit: the EnumSets we're using are all 
effectively constants, and we should precompute them as static final constants 
rather than create them every time.  Otherwise the latest patch looks 
good to me.
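A small sketch of the pattern being suggested (the enum and state names are 
made up for illustration):
{code}
import java.util.EnumSet;

public class PriorityChecker {
  enum AppState { RUNNING, FINISHING, FINISHED, KILLED, FAILED }

  // Precomputed once; calling EnumSet.of(...) inside a hot method would
  // allocate a fresh set on every invocation.
  private static final EnumSet<AppState> COMPLETING_STATES =
      EnumSet.of(AppState.FINISHING, AppState.FINISHED,
                 AppState.KILLED, AppState.FAILED);

  static boolean isCompleting(AppState state) {
    return COMPLETING_STATES.contains(state);
  }
}
{code}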

> Runtime Application Priority change should not throw exception for 
> applications at finishing states
> ---
>
> Key: YARN-4141
> URL: https://issues.apache.org/jira/browse/YARN-4141
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-4141.patch, 0002-YARN-4141.patch, 
> 0003-YARN-4141.patch, 0004-YARN-4141.patch
>
>
> As suggested by [~jlowe] in 
> [MAPREDUCE-5870-comment|https://issues.apache.org/jira/browse/MAPREDUCE-5870?focusedCommentId=14737035=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14737035]
>  , it would be good if YARN could suppress exceptions during 
> change-application-priority calls for applications at their finishing stages.
> Currently it is difficult for clients to handle this. This would be similar 
> to the kill-application behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3212) RMNode State Transition Update with DECOMMISSIONING state

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14769010#comment-14769010
 ] 

Hadoop QA commented on YARN-3212:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 33s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 49s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  8s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 49s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  8s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 29s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  59m 12s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  98m 39s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756224/YARN-3212-v6.2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / bf2f2b4 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9167/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9167/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9167/console |


This message was automatically generated.

> RMNode State Transition Update with DECOMMISSIONING state
> -
>
> Key: YARN-3212
> URL: https://issues.apache.org/jira/browse/YARN-3212
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: RMNodeImpl - new.png, YARN-3212-v1.patch, 
> YARN-3212-v2.patch, YARN-3212-v3.patch, YARN-3212-v4.1.patch, 
> YARN-3212-v4.patch, YARN-3212-v5.1.patch, YARN-3212-v5.patch, 
> YARN-3212-v6.1.patch, YARN-3212-v6.2.patch, YARN-3212-v6.patch
>
>
> As proposed in YARN-914, a new state, “DECOMMISSIONING”, will be added; a 
> node can transition into it from the “running” state, triggered by a new 
> “decommissioning” event. 
> This new state can transition to “decommissioned” on a Resource_Update if 
> there are no running apps on this NM, when the NM reconnects after a restart, 
> or when it receives a DECOMMISSIONED event (after the timeout from the CLI).
> In addition, it can go back to “running” if the user decides to cancel the 
> previous decommission by calling recommission on the same node. The reaction 
> to other events is similar to the RUNNING state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790529#comment-14790529
 ] 

Hadoop QA commented on YARN-4009:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  23m 31s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | javac |   7m 55s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  0s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   3m  2s | Site still builds. |
| {color:red}-1{color} | checkstyle |   2m 10s | The applied patch generated  5 
new checkstyle issues (total was 0, now 5). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   5m 26s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  23m 27s | Tests passed in 
hadoop-common. |
| {color:red}-1{color} | yarn tests |   1m 56s | Tests failed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   3m 47s | Tests passed in 
hadoop-yarn-server-applicationhistoryservice. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in 
hadoop-yarn-server-common. |
| | |  85m  1s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-applicationhistoryservice |
| Failed unit tests | hadoop.yarn.logaggregation.TestAggregatedLogsBlock |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756252/YARN-4009.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / bf2f2b4 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9168/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/9168/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-applicationhistoryservice.html
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9168/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9168/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-applicationhistoryservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9168/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9168/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9168/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9168/console |


This message was automatically generated.

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI running 
> in a browser cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST API to get the application and application-attempt 
> information exposed by the APIs. 
> It would be very useful if CORS were enabled for the REST APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3433) Jersey tests failing with Port in Use -again

2015-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790533#comment-14790533
 ] 

Hudson commented on YARN-3433:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2319 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2319/])
YARN-3433. Jersey tests failing with Port in Use -again.  (Brahma Reddy 
Battula) (stevel: rev bf2f2b4fc436ea5990e6fc78eb18091b9458e75a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/JerseyTestBase.java
* hadoop-yarn-project/CHANGES.txt


> Jersey tests failing with Port in Use -again
> 
>
> Key: YARN-3433
> URL: https://issues.apache.org/jira/browse/YARN-3433
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, test
>Affects Versions: 3.0.0
> Environment: ASF Jenkins
>Reporter: Steve Loughran
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-3433.patch
>
>
> ASF Jenkins jersey tests failing with port in use exceptions.
> The YARN-2912 patch tried to fix it, but it defaults to port 9998 and doesn't 
> scan for a spare port, so it is too brittle on a busy server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-4009:

Attachment: YARN-4009.004.patch

Fixed findbugs warning.

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI running 
> in a browser cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST API to get the application and application-attempt 
> information exposed by the APIs. 
> It would be very useful if CORS were enabled for the REST APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4148) When killing app, RM releases app's resource before they are released by NM

2015-09-16 Thread Jun Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Gong updated YARN-4148:
---
Attachment: YARN-4148.001.patch

> When killing app, RM releases app's resource before they are released by NM
> ---
>
> Key: YARN-4148
> URL: https://issues.apache.org/jira/browse/YARN-4148
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Jun Gong
>Assignee: Jun Gong
> Attachments: YARN-4148.001.patch, YARN-4148.wip.patch
>
>
> When killing an app, the RM scheduler releases the app's resources as soon as 
> possible, and it might then allocate these resources to new requests. But the 
> NM has not released them at that time.
> The problem was found when we supported GPUs as a resource (YARN-4122).  Test 
> environment: an NM had 6 GPUs, app A used all 6 GPUs, and app B was 
> requesting 3 GPUs. We killed app A; the RM then released A's 6 GPUs and 
> allocated 3 GPUs to B. 
> But when B tried to start a container on the NM, the NM found it didn't have 
> 3 GPUs to allocate because it had not yet released A's GPUs.
> I think the problem also exists for CPU/memory. It might cause OOM when 
> memory is oversubscribed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4148) When killing app, RM releases app's resource before they are released by NM

2015-09-16 Thread Jun Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790730#comment-14790730
 ] 

Jun Gong commented on YARN-4148:


Attached a new patch.

Added a new config to specify whether the RM is permitted to release a 
container's resources before the NM does.
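A sketch of what such a switch could look like; the property name below is 
hypothetical, not necessarily the one in the patch:
{code}
import org.apache.hadoop.conf.Configuration;

/** Sketch only; the property name here is hypothetical. */
public class EarlyReleasePolicy {
  public static final String RM_RELEASE_BEFORE_NM =
      "yarn.resourcemanager.release-resources-before-nm.enabled";
  public static final boolean DEFAULT_RM_RELEASE_BEFORE_NM = true;

  private final boolean releaseEarly;

  public EarlyReleasePolicy(Configuration conf) {
    this.releaseEarly =
        conf.getBoolean(RM_RELEASE_BEFORE_NM, DEFAULT_RM_RELEASE_BEFORE_NM);
  }

  /** True if the scheduler may free resources before NM confirmation. */
  public boolean allowsEarlyRelease() {
    return releaseEarly;
  }
}
{code}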

> When killing app, RM releases app's resource before they are released by NM
> ---
>
> Key: YARN-4148
> URL: https://issues.apache.org/jira/browse/YARN-4148
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Jun Gong
>Assignee: Jun Gong
> Attachments: YARN-4148.001.patch, YARN-4148.wip.patch
>
>
> When killing an app, the RM scheduler releases the app's resources as soon as 
> possible, and it might then allocate these resources to new requests. But the 
> NM has not released them at that time.
> The problem was found when we supported GPUs as a resource (YARN-4122).  Test 
> environment: an NM had 6 GPUs, app A used all 6 GPUs, and app B was 
> requesting 3 GPUs. We killed app A; the RM then released A's 6 GPUs and 
> allocated 3 GPUs to B. 
> But when B tried to start a container on the NM, the NM found it didn't have 
> 3 GPUs to allocate because it had not yet released A's GPUs.
> I think the problem also exists for CPU/memory. It might cause OOM when 
> memory is oversubscribed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-261) Ability to kill AM attempts

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790611#comment-14790611
 ] 

Hadoop QA commented on YARN-261:


\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  22m 34s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   8m  4s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 19s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 52s | The applied patch generated  5 
new checkstyle issues (total was 32, now 37). |
| {color:red}-1{color} | whitespace |   0m 13s | The patch has 30  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   7m 44s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | mapreduce tests | 101m 47s | Tests failed in 
hadoop-mapreduce-client-jobclient. |
| {color:green}+1{color} | yarn tests |   0m 31s | Tests passed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |   7m  3s | Tests failed in 
hadoop-yarn-client. |
| {color:green}+1{color} | yarn tests |   2m 10s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   7m 53s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| {color:green}+1{color} | yarn tests |  59m 31s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | | 234m  8s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.mapred.TestNetworkedJob |
|   | hadoop.yarn.client.cli.TestYarnCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756232/0001-YARN-261.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / bf2f2b4 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9166/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/9166/artifact/patchprocess/whitespace.txt
 |
| hadoop-mapreduce-client-jobclient test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9166/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9166/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9166/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9166/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9166/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9166/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9166/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9166/console |


This message was automatically generated.

> Ability to kill AM attempts
> ---
>
> Key: YARN-261
> URL: https://issues.apache.org/jira/browse/YARN-261
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api
>Affects Versions: 2.0.3-alpha
>Reporter: Jason Lowe
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-261.patch, YARN-261--n2.patch, 
> YARN-261--n3.patch, YARN-261--n4.patch, YARN-261--n5.patch, 
> YARN-261--n6.patch, YARN-261--n7.patch, YARN-261.patch
>
>
> It would be nice if clients could ask for an AM attempt to be killed.  This 
> is analogous to the task attempt kill support provided by MapReduce.
> This feature would be useful in a scenario where AM retries are enabled, the 
> AM supports recovery, and a particular AM attempt is stuck.  Currently if 
> this occurs the user's only recourse is to kill the entire application, 
> requiring them 

[jira] [Commented] (YARN-306) FIFO scheduler doesn't respect changing job priority

2015-09-16 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790676#comment-14790676
 ] 

Rohith Sharma K S commented on YARN-306:


Agreed on implementing priority support in the FIFO scheduler on demand; a 
workaround also exists with the CS.
Sorry, I did not fully explain what I meant by violating the FIFO scheduling 
policy. In the CS, queues at the same level are chosen based on the child 
queues sorted by used capacity; if the used capacity is the same across the 
queues, they are sorted by guaranteed capacity. At the leaf-queue level, a 
FIFO policy is used, which has now changed to priority-first. Basically, the 
CS tries to choose among queues at the same level by capacity, but the FIFO 
scheduler has no queue concept and applications are assigned in FIFO order. 
In the CS, the FIFO policy is violated only at the leaf queue, which is one 
part of the CS, whereas the FIFO scheduler would be violating it entirely if 
priority were supported. A sketch of the queue ordering is below.
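To make that ordering concrete, a sketch (field and method names are 
illustrative, not the actual CS code):
{code}
import java.util.Comparator;
import java.util.List;

/**
 * Sketch only: pick among sibling queues by lowest used capacity,
 * falling back to guaranteed capacity on ties.
 */
class ChildQueue {
  String name;
  float usedCapacity;        // fraction of the queue's capacity in use
  float guaranteedCapacity;  // configured capacity share

  static void sortForAssignment(List<ChildQueue> siblings) {
    siblings.sort(
        Comparator.comparingDouble((ChildQueue q) -> q.usedCapacity)
                  .thenComparingDouble(q -> q.guaranteedCapacity));
  }
}
{code}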

> FIFO scheduler doesn't respect changing job priority
> 
>
> Key: YARN-306
> URL: https://issues.apache.org/jira/browse/YARN-306
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Nishan Shetty
>Assignee: Rohith Sharma K S
>
> 1. Submit a job.
> 2. Change the job priority using setPriority() or the CLI command ./mapred 
> job -set-priority  
> Observe that the job priority is not changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2015-09-16 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790688#comment-14790688
 ] 

Varun Saxena commented on YARN-2902:


bq. Container localizers should already have the concept of heartbeating and 
killing themselves if they don't hear from the NM within X seconds, and 
likewise the NM should kill localizers that don't heartbeat in a timely fashion.
For the container localizer, I think the IPC timeout should work, if 
configured. Will check it. 
On the NM side, it currently does not kill localizers. We can track the PID 
and kill it, as discussed earlier, if no HB comes for a configured period. We 
can do something similar to what we do for containers now: SIGTERM followed by 
SIGKILL, if required.
Should we add this, then? It would mean we have to relaunch a new localizer if 
the container is still running. Or should we fail the container?
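As a rough sketch of that graceful-then-forceful kill, assuming we hold a 
Process handle for the localizer (the real NM tracks PIDs and shells out, so 
this is illustrative only):
{code}
import java.util.concurrent.TimeUnit;

public class LocalizerStopper {
  /** Graceful stop, then force-kill on timeout (Java 8+ Process API). */
  public static void stop(Process localizer, long graceMs)
      throws InterruptedException {
    localizer.destroy();                        // SIGTERM on POSIX platforms
    if (!localizer.waitFor(graceMs, TimeUnit.MILLISECONDS)) {
      localizer.destroyForcibly();              // escalate to SIGKILL
    }
  }
}
{code}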

bq. I'm also not sure we need deletion task cancellation. As you point out it's 
not really necessary.
OK, will remove it.

bq. Also do we really need a flag to say whether we want it to ignore missing 
paths? Wondering if we should just ignore cases where the path doesn't exist. 
Yes, we can simply ignore missing paths. I did not change this so as not to 
break the previous behavior, though it doesn't seem like anyone depends on it.

bq. What if we have the localizer register the temporary working directory 
(i.e.: the _tmp paths) as deleteOnExit paths?
Currently the _tmp paths are deleted in a finally block in FSDownload#call. 
Wouldn't that be enough to handle the case of a normal JVM exit?

bq.  With this I don't think we need to change the localizer protocol – DIE 
means try to cleanup, but NM will always cleanup anyway so no need to wait 
around and try too hard. Its actually more important that the localizer gets 
out of the way in a timely manner than it is for it to cleanup since the NM 
will be the backup in case the localizer fails.
For this, the only concern I see is what I mentioned about the issue I found 
in FSDownload: the download task can keep running even after being cancelled, 
because the code is uninterruptible in places in FSDownload#call. 
In that case, we can never know when the cancelled task will complete and 
create files in the directory. There can be a race which leads, for instance, 
to the tmp directory being renamed. 
In the deletion task we can, in this case, first put the _tmp dir and then the 
real one, so that tmp is deleted first.
We can also add an extra number of seconds to the localizer HB timeout before 
scheduling file deletion, so that the localizer is killed (assuming we adopt 
the approach mentioned in the first point above) before we attempt deletion.

We would, however, need to change the localizer protocol as well.
Currently the localizer deletes the entry from its pending-resources map as 
soon as it sends a status, and the NM will send a DIE in two cases: 1) the 
container has been killed, or 2) the NM processes a HB and finds one of the 
statuses to be FETCH_FAILED. 
In this case we cannot know whether the resources whose fetch succeeded were 
actually processed by the NM or not, so I maintain a separate list of 
resources reported to the localizer, and hence a flag, so that even those 
resources can be deleted. 

> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-2902.002.patch, YARN-2902.03.patch, 
> YARN-2902.04.patch, YARN-2902.05.patch, YARN-2902.06.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed, 
> then resources are left in the DOWNLOADING state.  If no other container 
> comes along and requests these resources, they linger around with no 
> reference counts but aren't cleaned up during normal cache cleanup scans, 
> since the scan will never delete resources in the DOWNLOADING state even if 
> their reference count is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3212) RMNode State Transition Update with DECOMMISSIONING state

2015-09-16 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791007#comment-14791007
 ] 

Wangda Tan commented on YARN-3212:
--

Patch looks good, thanks [~djp]. Will commit in a few days if there are no 
objections.

> RMNode State Transition Update with DECOMMISSIONING state
> -
>
> Key: YARN-3212
> URL: https://issues.apache.org/jira/browse/YARN-3212
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: RMNodeImpl - new.png, YARN-3212-v1.patch, 
> YARN-3212-v2.patch, YARN-3212-v3.patch, YARN-3212-v4.1.patch, 
> YARN-3212-v4.patch, YARN-3212-v5.1.patch, YARN-3212-v5.patch, 
> YARN-3212-v6.1.patch, YARN-3212-v6.2.patch, YARN-3212-v6.patch
>
>
> As proposed in YARN-914, a new state, “DECOMMISSIONING”, will be added; a 
> node can transition into it from the “running” state, triggered by a new 
> “decommissioning” event. 
> This new state can transition to “decommissioned” on a Resource_Update if 
> there are no running apps on this NM, when the NM reconnects after a restart, 
> or when it receives a DECOMMISSIONED event (after the timeout from the CLI).
> In addition, it can go back to “running” if the user decides to cancel the 
> previous decommission by calling recommission on the same node. The reaction 
> to other events is similar to the RUNNING state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-16 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-4074:
--
Attachment: YARN-4074-YARN-2928.POC.006.patch

v.6 POC patch posted.

Renamed {{TimelineEntityReader.createTable()}} to 
{{TimelineEntityReader.getTable()}}, reusing the same instance for a given 
table.
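That is, roughly this lazy-caching shape (a sketch; {{BaseTable}} below stands 
in for the real table wrapper type):
{code}
/** Sketch only; BaseTable stands in for the real table wrapper type. */
interface BaseTable { }

abstract class TimelineEntityReader {
  private BaseTable table;

  /** Builds the table on first use, then reuses the same instance. */
  protected synchronized BaseTable getTable() {
    if (table == null) {
      table = createTableInstance();
    }
    return table;
  }

  /** Subclasses supply the concrete table. */
  protected abstract BaseTable createTableInstance();
}
{code}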

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (YARN-4154) Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change

2015-09-16 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reopened YARN-4154:
---

> Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change
> ---
>
> Key: YARN-4154
> URL: https://issues.apache.org/jira/browse/YARN-4154
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Jeff Zhang
>Priority: Blocker
>
> {code}
> [ERROR] 
> /mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
>  no suitable constructor found for 
> MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> {code}
> MR might have the same issue.
> \cc [~vinodkv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4154) Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change

2015-09-16 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-4154.
---
Resolution: Duplicate

Thanks [~ajisakaa]; not sure how I missed it earlier, as I was careful about 
these reverts.

Anyway, the new commit applies cleanly.

Ran compilation and TestJobHistoryEventHandler, TestMRTimelineEventHandling, 
TestDistributedShell, TestMiniYarnCluster before the push.

Closing this correctly as a duplicate.

> Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change
> ---
>
> Key: YARN-4154
> URL: https://issues.apache.org/jira/browse/YARN-4154
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Jeff Zhang
>Priority: Blocker
>
> {code}
> [ERROR] 
> /mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
>  no suitable constructor found for 
> MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> {code}
> MR might have the same issue.
> \cc [~vinodkv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2928) YARN Timeline Service: Next generation

2015-09-16 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-2928:
-
Assignee: Sangjin Lee  (was: Vrushali C)

> YARN Timeline Service: Next generation
> --
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx, 
> TimelineServiceStoragePerformanceTestSummaryYARN-2928.pdf
>
>
> We have the application timeline server implemented in YARN per YARN-1530 
> and YARN-321. Although it is a great feature, we have recognized several 
> critical issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4129) Refactor the SystemMetricPublisher in RM to better support newer events

2015-09-16 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790872#comment-14790872
 ] 

Naganarasimha G R commented on YARN-4129:
-

Hi [~djp], 
Can you take a look at this patch?



> Refactor the SystemMetricPublisher in RM to better support newer events
> ---
>
> Key: YARN-4129
> URL: https://issues.apache.org/jira/browse/YARN-4129
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: YARN-4129.YARN-2928.001.patch
>
>
> Currently, to add a new timeline event/entity on the RM side, one has to add 
> a method in the publisher and a method in the handler and create a new event 
> class, which looks cumbersome and redundant. Furthermore, not all events may 
> need to be published in V1 & V2. So we are adopting an approach similar to 
> what was adopted in YARN-3045 (NM side).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4148) When killing app, RM releases app's resource before they are released by NM

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790887#comment-14790887
 ] 

Hadoop QA commented on YARN-4148:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m  4s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  1s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 19s | The applied patch generated  1 
new checkstyle issues (total was 211, now 211). |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m  6s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |   0m 21s | Tests failed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |  54m 45s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  97m 11s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue |
|   | hadoop.yarn.server.resourcemanager.resourcetracker.TestNMReconnect |
|   | hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry |
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756300/YARN-4148.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / bf2f2b4 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9170/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9170/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9170/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9170/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9170/console |


This message was automatically generated.

> When killing app, RM releases app's resource before they are released by NM
> ---
>
> Key: YARN-4148
> URL: https://issues.apache.org/jira/browse/YARN-4148
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Jun Gong
>Assignee: Jun Gong
> Attachments: YARN-4148.001.patch, YARN-4148.wip.patch
>
>
> When killing an app, the RM scheduler releases the app's resources as soon as 
> possible, and it might then allocate these resources to new requests. But the 
> NM has not released them at that time.
> The problem was found when we supported GPUs as a resource (YARN-4122).  Test 
> environment: an NM had 6 GPUs, app A used all 6 GPUs, and app B was 
> requesting 3 GPUs. We killed app A; the RM then released A's 6 GPUs and 
> allocated 3 GPUs to B. 
> But when B tried to start a container on the NM, the NM found it didn't have 
> 3 GPUs to allocate because it had not yet released A's GPUs.
> I think the problem also exists for CPU/memory. It might cause OOM when 
> memory is oversubscribed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4129) Refactor the SystemMetricPublisher in RM to better support newer events

2015-09-16 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790889#comment-14790889
 ] 

Junping Du commented on YARN-4129:
--

Sure. I have put this on my review queue and will look at it after settling 
down YARN-3816.

> Refactor the SystemMetricPublisher in RM to better support newer events
> ---
>
> Key: YARN-4129
> URL: https://issues.apache.org/jira/browse/YARN-4129
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: YARN-4129.YARN-2928.001.patch
>
>
> Currently, to add a new timeline event/entity on the RM side, one has to add 
> a method in the publisher and a method in the handler and create a new event 
> class, which looks cumbersome and redundant. Furthermore, not all events may 
> need to be published in V1 & V2. So we are adopting an approach similar to 
> what was adopted in YARN-3045 (NM side).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4157) Merge YARN-1197 back to trunk

2015-09-16 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4157:
-
Attachment: YARN-1197.diff.4.patch

Attached ver.4 patch, synced to the latest trunk and fixed findbugs/javac 
warnings.

> Merge YARN-1197 back to trunk
> -
>
> Key: YARN-4157
> URL: https://issues.apache.org/jira/browse/YARN-4157
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, nodemanager, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-1197.diff.1.patch, YARN-1197.diff.2.patch, 
> YARN-1197.diff.3.patch, YARN-1197.diff.4.patch
>
>
> The purpose of this jira is to generate a uber patch from current YARN-1197 
> branch and run against trunk to fix any uncaught warnings and test failures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4173) Figure out which ones are the final values for metrics

2015-09-16 Thread Vrushali C (JIRA)
Vrushali C created YARN-4173:


 Summary: Figure out which ones are the final values for metrics
 Key: YARN-4173
 URL: https://issues.apache.org/jira/browse/YARN-4173
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vrushali C



For the flow run table (YARN-3901), we need to know which values are the final 
ones for metrics so that they can be tagged accordingly.

Filing this JIRA to deal with that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-2928) YARN Timeline Service: Next generation

2015-09-16 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C reassigned YARN-2928:


Assignee: Vrushali C  (was: Sangjin Lee)

> YARN Timeline Service: Next generation
> --
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx, 
> TimelineServiceStoragePerformanceTestSummaryYARN-2928.pdf
>
>
> We have the application timeline server implemented in YARN per YARN-1530 
> and YARN-321. Although it is a great feature, we have recognized several 
> critical issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4163) Audit getQueueInfo and getApplications calls

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790980#comment-14790980
 ] 

Hadoop QA commented on YARN-4163:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 38s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 51s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  2s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 48s | The applied patch generated  1 
new checkstyle issues (total was 50, now 49). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 27s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  58m  5s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  97m 32s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService
 |
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueMappings |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerQueueACLs
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756306/YARN-4163.2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / bf2f2b4 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9171/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9171/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9171/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9171/console |


This message was automatically generated.

> Audit getQueueInfo and getApplications calls
> 
>
> Key: YARN-4163
> URL: https://issues.apache.org/jira/browse/YARN-4163
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4163.2.patch, YARN-4163.patch
>
>
> getQueueInfo and getApplications sometimes seem to cause load spikes, but we 
> have not been able to confirm this because they are not audit logged. This 
> patch proposes adding them to the audit log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791024#comment-14791024
 ] 

Hadoop QA commented on YARN-4009:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  28m 25s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | javac |   7m 53s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  1s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   3m  2s | Site still builds. |
| {color:red}-1{color} | checkstyle |   2m 32s | The applied patch generated  5 
new checkstyle issues (total was 0, now 5). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   5m 31s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  23m 12s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | yarn tests |   2m  0s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   3m 13s | Tests passed in 
hadoop-yarn-server-applicationhistoryservice. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in 
hadoop-yarn-server-common. |
| | |  89m 44s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756279/YARN-4009.004.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / bf2f2b4 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9174/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9174/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9174/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-applicationhistoryservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9174/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9174/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9174/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9174/console |


This message was automatically generated.

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch
>
>
> Currently the REST APIs do not have CORS support. This means any UI running 
> in a browser cannot consume the REST APIs. For example, the Tez UI would like 
> to use the REST API to get the application and application-attempt 
> information exposed by the APIs. 
> It would be very useful if CORS were enabled for the REST APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4140) RM container allocation delayed incase of app submitted to Nodelabel partition

2015-09-16 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791044#comment-14791044
 ] 

Wangda Tan commented on YARN-4140:
--

[~bibinchundatt],
bq. The request order ANY will not be always the first one rt?
It won't always be the first one.

The pseudo code should handle request.label correctly whether the ANY request 
comes first or not (a sketch of that handling is below).
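As a sketch of that handling (the types and accessors below are illustrative, 
not the actual scheduler code):
{code}
import java.util.List;

/**
 * Sketch only: resolve the effective node-label expression from the
 * request at the ANY/off-switch resource name, wherever it appears
 * in the list, so request order does not matter.
 */
class LabelResolver {
  static final String ANY = "*";

  static String effectiveLabel(List<Request> requests) {
    for (Request r : requests) {
      if (ANY.equals(r.resourceName)) {
        return r.nodeLabelExpression;
      }
    }
    return "";  // default partition when no ANY request is present
  }

  static class Request {
    String resourceName;
    String nodeLabelExpression;
  }
}
{code}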

> RM container allocation delayed incase of app submitted to Nodelabel partition
> --
>
> Key: YARN-4140
> URL: https://issues.apache.org/jira/browse/YARN-4140
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4140.patch, 0002-YARN-4140.patch, 
> 0003-YARN-4140.patch, 0004-YARN-4140.patch
>
>
> Trying to run an application on a node-label partition, I found that the 
> application execution time is delayed by 5 to 10 min for 500 containers. 
> There were 3 machines in total; 2 machines were in the same partition and the 
> app was submitted to it.
> After enabling debug logging I was able to find the following:
> # From the AM, the container ask is for OFF-SWITCH.
> # The RM allocates all containers as NODE_LOCAL, as shown in the logs below.
> # Since I had about 500 containers, it took about 6 minutes to allocate the 
> 1st map after the AM allocation.
> # Tested with about 1K maps using a PI job; it took 17 minutes to allocate 
> the next container after the AM allocation.
> Once the 500-container allocation on NODE_LOCAL is done, the next container 
> allocation is done as OFF_SWITCH.
> {code}
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: 
> /default-rack, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: *, Relax 
> Locality: true, Node Label Expression: 3}
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: 
> host-10-19-92-143, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
>  showRequests: application=application_1441791998224_0001 request={Priority: 
> 20, Capability: , # Containers: 500, Location: 
> host-10-19-92-117, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> {code}
>  
> {code}
> 2015-09-09 14:35:45,467 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:45,831 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:46,469 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> 2015-09-09 14:35:46,832 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, 
> usedResources=, usedCapacity=0.0, 
> absoluteUsedCapacity=0.0, numApps=1, numContainers=1 -->  vCores:0>, NODE_LOCAL
> {code}
> {code}
> dsperf@host-127:/opt/bibin/dsperf/HAINSTALL/install/hadoop/resourcemanager/logs1>
>  cat 

[jira] [Updated] (YARN-4171) Resolve findbugs/javac warnings in YARN-1197 branch

2015-09-16 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4171:
-
Attachment: YARN-4171-YARN-1197.1.patch

Attached ver.1 patch.

> Resolve findbugs/javac warnings in YARN-1197 branch
> ---
>
> Key: YARN-4171
> URL: https://issues.apache.org/jira/browse/YARN-4171
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, nodemanager, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4171-YARN-1197.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2015-09-16 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791012#comment-14791012
 ] 

Junping Du commented on YARN-3816:
--

Thanks Naga, Varun and Li for the review and comments! Let me address them one 
by one. 
First, for Naga's comments:
bq. The following are not completely achieved, right? The number of containers 
launched/completed/failed and framework-specific metrics, e.g. 
HDFS_BYTES_READ, should be aggregated to show details of states at the 
framework level.
We are almost there. The number of containers is existing info that gets 
addressed in YARN-3880. Framework-specific metrics are another topic; we were 
still discussing different requirements for MapReduce and other apps, which is 
out of scope for this JIRA - that's why we have "YARN system metrics" in the 
title.

bq. In the doc, the ApplicationState table (aggregated from 
AppLevelTimelineCollector) has container aggregate metrics (allocated: 0, 
preempted: 0, failed: 0, reuse: 0); is this required at 
AppLevelTimelineCollector? I felt it should only be aggregated from 
RMTimelineCollector. Also, is time (start, last_modification, avg_execution) 
required as a metric? Maybe I misread the table description?
Like said above, YARN-3880 is supposed to track container number metrics. May 
be we can move discussion there?

bq. In the doc aggregation-design-discussion.pdf, you had mentioned that time 
average & max is what would be considered, but in the patch it seems only SUM is 
supported, neither avg nor max; is SUM more important than the others (or am I 
missing something)? I would also like to know the significance of this 
measurement, as I felt a per-container average would be more helpful, since it 
can be useful for calibrating the RM.
We had a discussion earlier and chose SUM as the first operation to support for 
aggregating metrics. There are definitely other useful operations that we could 
add and extend later.

bq. IIUC, based on the current design, aggregation seems to be happening at the 
collector end. In that case, do we require 
TimelineWriter.aggregate(TimelineEntity data, TimelineAggregationTrack track)? 
Is there any idea to push some logic to the writer for aggregation?
No. App aggregation is per collector, not per writer, as we currently share a 
single writer on the NM across all app collectors. I would prefer each collector 
thread to maintain its own state and calculations.

bq. TimelineAggregationBasis doesn't have a value for queue; as this is used in 
TimelineReaderWebServices, isn't it required for the reader?
If my understanding is correct, queue info is not a must for the app entity; we 
only require flow info, etc. However, I will double-check this on the reader 
side.

bq. Will it be required to accumulate time series data with single value data 
and vice versa? Would accumulation need to be done on the same type? If not, 
what are some real scenarios where this could happen?
In toAccumulate, we support accumulating time series data onto single-value data 
(the basis data), because we can assume the basis data is always a single value 
coming from the last accumulation result. If there are scenarios where we want 
the accumulated result to be time series data, we can add a separate method to 
extend this later. Makes sense?
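
As an aside for readers following along, here is a minimal sketch of that SUM 
accumulation, assuming a plain map for the time series; the class and method 
names are hypothetical, not the real TimelineMetric API:
{code}
// Hypothetical sketch of SUM accumulation: fold every point of a
// time-series metric into the single-value basis from the last round.
import java.util.Map;
import java.util.TreeMap;

public class AccumulationSketch {
  // basisValue: the single-value result of the previous accumulation.
  static long accumulateSum(long basisValue, Map<Long, Long> timeSeries) {
    long sum = basisValue;
    for (long v : timeSeries.values()) {
      sum += v;  // SUM is the first (and so far only) accumulation op
    }
    return sum;  // the new basis stays a single value
  }

  public static void main(String[] args) {
    Map<Long, Long> series = new TreeMap<Long, Long>();
    series.put(1000L, 40L);  // timestamp -> value
    series.put(2000L, 35L);
    System.out.println(accumulateSum(100L, series));  // prints 175
  }
}
{code}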

bq. Would it be better to have a set of operations that can be performed in 
TimelineMetric, so that accumulateTo automatically detects and accumulates for 
different operations? Currently it seems statically set to SUM in 
TimelineCollector.
We already support SUM and REP (replace). As with the above comment, we can add 
more operations later as more specific requirements emerge.

bq. Currently, for each putEntity call in the collector, we are not only 
aggregating & invoking accumulateTo but also sending the entity to be written by 
the writer; but in the doc it's mentioned that it will cache for 15 seconds and 
then update, right?
No. We chose to aggregate and accumulate immediately (this can be disabled by 
configuration), as in the current implementation. The earlier concern was 
performance delay, but that sounds unnecessary now. We can rethink this if we 
hit a perf bottleneck here in the future.

bq. Not sure why pid was added earlier for a container CPU and memory usage 
metric, and not sure why we are removing it. But it seems that for a given 
container we do not require the pid to be appended, as the metric will be unique 
to the container anyway. Is that the reason we are removing it?
Pid was wrongly added previously, as this info is useless: the enclosing 
TimelineEntity (the container entity) already has the container id, which makes 
these metrics unique enough. And we need the metric ID to keep the same type 
(CPU, memory, etc.) for aggregation and accumulation.

bq. Do we need to set aggregateTo to true for container metrics (cputotalCore% & 
pmemUsage) too? Also, we are currently not capturing vmemUsage; do we need to 
capture it?
We chose to record these two metrics only in previous

[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-16 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790830#comment-14790830
 ] 

Varun Saxena commented on YARN-4074:


The patch looks fine to me. I tested some parts related to flow as well.

I see that in the flow activity table, the row key is the cluster id followed by 
an inverted timestamp. This, I guess, is to retrieve entities within a certain 
time range. I haven't added this in the REST-related JIRA and don't see support 
even here. Will handle it after the PoC, I guess. Correct?
Also, I had added user as an optional query param in the REST API code. I think 
querying by user won't really be a good idea looking at the row key. Will remove 
it.

The major code added in this patch is the different table-based readers and a 
factory. It looks fine. However, the number of parameters to the methods is 
quite large, just like the Reader API. As you mentioned elsewhere, maybe I can 
refactor this later. We should club some things together into logical units like 
context, filters, etc. That will reduce the number of params.

In the factory class, we have a sequence of if-else statements. Although it's a 
matter of perspective, a sequence of if-else looks a little inelegant. But we 
may not have too many great options here. I thought of enums, i.e. having create 
methods with the implementation tied to each enum, but the entity type enum is 
not HBase specific. Any other option? I guess if-else should be fine for now, 
because not too many tables should be added in future, if any.

In the xxxEntityReader classes, maybe createTable should be renamed to getTable, 
because we are not really creating any table here; we are just getting/creating 
a table object.
Also, if I am not wrong, there is no need to create this object again and again. 
All it really holds is static information such as the table name and conf.


> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4172) Extend DominantResourceCalculator to account for all resources

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790861#comment-14790861
 ] 

Hadoop QA commented on YARN-4172:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m  8s | Findbugs (version ) appears to 
be broken on YARN-3926. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 49s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  3s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 20s | The applied patch generated  1 
new checkstyle issues (total was 0, now 1). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 13s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 24s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 59s | Tests passed in 
hadoop-yarn-common. |
| | |  44m 56s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756308/YARN-4172-YARN-3926.001.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-3926 / 6caa0a2 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9172/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9172/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9172/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9172/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9172/console |


This message was automatically generated.

> Extend DominantResourceCalculator to account for all resources
> --
>
> Key: YARN-4172
> URL: https://issues.apache.org/jira/browse/YARN-4172
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: YARN-4172-YARN-3926.001.patch
>
>
> Now that support for multiple resources is present in the resource class, we 
> need to modify DominantResourceCalculator to account for the new resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-16 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790948#comment-14790948
 ] 

Sangjin Lee commented on YARN-4074:
---

Thanks for your comments [~varun_saxena]!

bq. This, I guess, is to retrieve entities within a certain time range. I 
haven't added this in the REST-related JIRA and don't see support even here. 
Will handle it after the PoC, I guess. Correct?

The flow activity table is a time-based set of data. The timestamp (day marker 
really) is there to order the activity in time. It is feasible to query the 
flow activity table based on time (e.g. "give me all the activity in the past 3 
days"). I didn't get around to it, but it should be pretty straightforward to 
support that after the POC. I'll file a JIRA for adding that support.
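
For anyone new to the inverted-timestamp trick, here is a rough sketch of how 
such a row key can be built; the helper below is illustrative only, not the 
branch's actual row-key classes:
{code}
// Sketch: HBase sorts row keys as unsigned bytes ascending, so storing
// (Long.MAX_VALUE - dayTimestamp) as a fixed-width long makes the most
// recent day sort first. Field order and names are illustrative.
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class FlowActivityRowKeySketch {
  static byte[] rowKey(String cluster, long dayTs, String user, String flow) {
    byte[] prefix = (cluster + "!").getBytes(StandardCharsets.UTF_8);
    byte[] suffix = ("!" + user + "!" + flow).getBytes(StandardCharsets.UTF_8);
    ByteBuffer buf = ByteBuffer.allocate(prefix.length + 8 + suffix.length);
    buf.put(prefix);
    buf.putLong(Long.MAX_VALUE - dayTs);  // inverted: newest day first
    buf.put(suffix);
    return buf.array();
  }
}
{code}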

bq. Also, I had added user as an optional query param in the REST API code. I 
think querying by user won't really be a good idea looking at the row key. Will 
remove it.

Yes, the way the data is laid out, cluster + user will not be an efficient 
query, as time is the key component that comes before the user.

{quote}
The major code added in this patch is the different table-based readers and a 
factory. It looks fine. However, the number of parameters to the methods is 
quite large, just like the Reader API. As you mentioned elsewhere, maybe I can 
refactor this later. We should club some things together into logical units like 
context, filters, etc. That will reduce the number of params.
{quote}

That is spot on. I really didn't like having to repeat the long list of 
arguments. But since you're looking into a better way of capturing the filters 
and predicates, I'm not really changing things as part of this JIRA. Hope that 
is consistent with your understanding.

{quote}
In the factory class, we have a sequence of if-else statements. Although it's a 
matter of perspective, a sequence of if-else looks a little inelegant. But we 
may not have too many great options here. I thought of enums, i.e. having create 
methods with the implementation tied to each enum, but the entity type enum is 
not HBase specific. Any other option? I guess if-else should be fine for now, 
because not too many tables should be added in future, if any.
{quote}

I agree. I wanted to use switch-case statements, but the main issue was that the 
input is a string, not an enum. If it were an enum, it would have been 
trivial...
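
For context, the dispatch under discussion looks roughly like this sketch (the 
entity-type strings and reader types are placeholders, not the patch's actual 
classes):
{code}
// Sketch of the string-keyed factory dispatch; names are illustrative.
interface ReaderSketch { /* read methods elided */ }

class FlowActivityReaderSketch implements ReaderSketch { }
class FlowRunReaderSketch implements ReaderSketch { }
class GenericReaderSketch implements ReaderSketch { }

class ReaderFactorySketch {
  static ReaderSketch createReader(String entityType) {
    if ("YARN_FLOW_ACTIVITY".equals(entityType)) {
      return new FlowActivityReaderSketch();
    } else if ("YARN_FLOW_RUN".equals(entityType)) {
      return new FlowRunReaderSketch();
    } else {
      // Anything else falls through to the generic entity-table reader.
      return new GenericReaderSketch();
    }
  }
}
{code}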

{quote}
In the xxxEntityReader classes, maybe createTable should be renamed to getTable, 
because we are not really creating any table here; we are just getting/creating 
a table object.
Also, if I am not wrong, there is no need to create this object again and again. 
All it really holds is static information such as the table name and conf.
{quote}

Those are good suggestions. Yes, the {{BaseTable}} instances are thread-safe, 
and I think they can be reused. I'll update the patch to make those changes.
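
As a tiny illustration of the reuse point (assumed names, not the branch's 
actual {{BaseTable}} code):
{code}
// Sketch: since the table object only holds immutable state (table name
// and conf lookup key), build it once and hand out the shared instance.
class EntityTableSketch {
  private static final EntityTableSketch INSTANCE = new EntityTableSketch();

  private EntityTableSketch() { }

  // Renamed from "createTable": nothing is created per call.
  static EntityTableSketch getTable() {
    return INSTANCE;
  }
}
{code}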

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table

2015-09-16 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790987#comment-14790987
 ] 

Joep Rottinghuis commented on YARN-4062:


While discussing flush and compaction with [~vrushalic], I just realized that 
there might be a complication with cross-DC replication.

Potentially the RSs in two different datacenters might decide to flush/compact 
values for one row at the same time. We need to think through the consequences 
of them making different decisions (because one DC might have later information 
that hasn't been replicated across yet, such as an app completion, for example). 
Even if the ordering and the decisions are deterministic, we need to consider 
what happens if two regions modify the same row.
With hRaven we have been able to make master-master replication work because we 
were guaranteed that every row is "owned", and therefore manipulated, only 
locally.

Perhaps we can do the same here, where flushes and compactions happen only in 
the HBase cluster located in the datacenter where the row is owned: for example, 
only if the row key starts with the same datacenter as the one where the 
coprocessor runs. This would ensure that each row is flushed/compacted in only 
one DC, and the other DCs would be followers. A sketch of such an ownership 
check is below.

This would have to be configurable and disabled for installations with a single 
HBase instance that is written to remotely by multiple datacenters; otherwise no 
compaction would happen at all (which is at least functionally correct, even if 
not optimal for space usage).
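
A rough sketch of that ownership check (purely illustrative; the real 
coprocessor hooks and config would come from this JIRA and YARN-3901):
{code}
// Sketch: collapse cells on flush/compaction only when this cluster
// "owns" the row, i.e. the row key's cluster prefix matches the locally
// configured datacenter. All names here are hypothetical.
import java.nio.charset.StandardCharsets;

public class OwnershipGuardSketch {
  static boolean ownsRow(byte[] rowKey, String localCluster) {
    byte[] prefix = (localCluster + "!").getBytes(StandardCharsets.UTF_8);
    if (rowKey.length < prefix.length) {
      return false;
    }
    for (int i = 0; i < prefix.length; i++) {
      if (rowKey[i] != prefix[i]) {
        return false;  // owned elsewhere: act as a follower, skip collapse
      }
    }
    return true;  // safe to collapse/aggregate cells in this DC
  }
}
{code}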


> Add the flush and compaction functionality via coprocessors and scanners for 
> flow run table
> ---
>
> Key: YARN-4062
> URL: https://issues.apache.org/jira/browse/YARN-4062
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Vrushali C
>Assignee: Vrushali C
>
> As part of YARN-3901, coprocessor and scanner is being added for storing into 
> the flow_run table. It also needs a flush & compaction processing in the 
> coprocessor and perhaps a new scanner to deal with the data during flushing 
> and compaction stages. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791225#comment-14791225
 ] 

Hitesh Shah commented on YARN-4009:
---

Couple of questions: 

{code}
// Prepend the cross-origin filter initializer when cross-origin support
// is enabled for the timeline service and the filter is not already in
// the configured initializer list.
if (!initializers.contains(CrossOriginFilterInitializer.class.getName())) {
  if (conf.getBoolean(
      YarnConfiguration.TIMELINE_SERVICE_HTTP_CROSS_ORIGIN_ENABLED,
      YarnConfiguration.TIMELINE_SERVICE_HTTP_CROSS_ORIGIN_ENABLED_DEFAULT)) {
    initializers = CrossOriginFilterInitializer.class.getName() + ","
        + initializers;
    modifiedInitializers = true;
  }
}
{code}

I see this code in Timeline, which makes it easier to enable cross-origin 
support just for Timeline. I am assuming Timeline also looks at the Hadoop 
filters defined in core-site? What happens when both of these are enabled at the 
same time with different settings?

Not sure if there is a question of selectively enabling CORS support for 
different services, such as NN web services vs. RM web services.

Apart from the above, if a global config is good enough, the patch looks good.
 



> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch
>
>
> Currently the REST API's do not have CORS support. This means any UI (running 
> in browser) cannot consume the REST API's. For ex Tez UI would like to use 
> the REST API for getting application, application attempt information exposed 
> by the API's. 
> It would be very useful if CORS is enabled for the REST API's.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2015-09-16 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791115#comment-14791115
 ] 

Jason Lowe commented on YARN-2902:
--

bq. For the NM side, currently it does not kill localizers. We can track the PID 
and kill it, as discussed earlier, if the HB doesn't come for a configured 
period.
Yeah, I think long-term to make this a lot more stable and reliable we're going 
to need the ability for the NM to kill localizers explicitly rather than via 
request.  As you mentioned, the concern is that the localizer will not actually 
interrupt and stop the localization.  Having the NM forcibly kill the localizer 
means we can put less trust in the localizer to always get that right.  However 
that's probably a lot of work and churn to the code which makes it less 
palatable for a 2.7 inclusion.  Ideally we should target a minimal change for 
2.7 that gets us past the main problems we're having today, and we can add more 
bulletproofing in followup JIRAs for subsequent releases.

As far as properly handling DIE so we actually stop downloading, and the 
problems with canceling active transfers: can't we just have the localizer 
forcibly tear down the JVM?  If we're being told to DIE then I assume we really 
don't care about pending transfers completing and just want to get out.  If the 
NM is going to clean up after the localizer anyway, it seems we can drastically 
simplify DIE handling and just exit the JVM.  That seems like a change that's 
targeted enough to be appropriate for 2.7, instead of adding localizer kill 
support, etc.
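
Something as blunt as this sketch, in other words (an illustrative placeholder, 
not the actual ContainerLocalizer code):
{code}
// Sketch of "just exit on DIE": rather than cancel each pending
// transfer, tear the whole localizer JVM down and let the NM clean up
// the partial downloads. Names are hypothetical.
public class DieHandlerSketch {
  static void onDie() {
    System.err.println("Localizer told to DIE, exiting immediately");
    // halt() skips shutdown hooks, so a stuck transfer thread cannot
    // block the exit the way System.exit() might.
    Runtime.getRuntime().halt(0);
  }
}
{code}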

> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-2902.002.patch, YARN-2902.03.patch, 
> YARN-2902.04.patch, YARN-2902.05.patch, YARN-2902.06.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed then 
> resources are left in the DOWNLOADING state.  If no other container comes 
> along and requests these resources they linger around with no reference 
> counts but aren't cleaned up during normal cache cleanup scans since it will 
> never delete resources in the DOWNLOADING state even if their reference count 
> is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791228#comment-14791228
 ] 

Hitesh Shah commented on YARN-4009:
---

[~jeagles] Any comments on the patch?

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch
>
>
> Currently the REST API's do not have CORS support. This means any UI (running 
> in browser) cannot consume the REST API's. For ex Tez UI would like to use 
> the REST API for getting application, application attempt information exposed 
> by the API's. 
> It would be very useful if CORS is enabled for the REST API's.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791226#comment-14791226
 ] 

Hitesh Shah commented on YARN-4009:
---

[~jeagles] ? 

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch
>
>
> Currently the REST API's do not have CORS support. This means any UI (running 
> in browser) cannot consume the REST API's. For ex Tez UI would like to use 
> the REST API for getting application, application attempt information exposed 
> by the API's. 
> It would be very useful if CORS is enabled for the REST API's.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-09-16 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3985:

Attachment: YARN-3985.002.patch

All the failed tests passed locally for me. Rerunning the tests.

> Make ReservationSystem persist state using RMStateStore reservation APIs 
> -
>
> Key: YARN-3985
> URL: https://issues.apache.org/jira/browse/YARN-3985
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-3985.001.patch, YARN-3985.002.patch, 
> YARN-3985.002.patch, YARN-3985.002.patch
>
>
> YARN-3736 adds the RMStateStore apis to store and load reservation state. 
> This jira adds the actual storing of state from ReservationSystem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4009) CORS support for ResourceManager REST API

2015-09-16 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791251#comment-14791251
 ] 

Jonathan Eagles commented on YARN-4009:
---

I have reviewed the patch and have run with the timeline cross-origin filter for 
a long time now. From a usage standpoint, cross-origin support is only needed 
for browsers running JavaScript that reaches out to web services. Is there a 
need to allow the design to enable this only for web services (REST APIs) 
instead of the whole web server (built-in UIs and REST APIs)? Also, seconding 
[~hitesh]'s comment regarding turning this on per service. The patch looks good 
other than these design questions.
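
For what it's worth, scoping the filter per service could look something like 
this sketch of a FilterInitializer; the class name, config key, and filter class 
path are assumptions for illustration, not what the patch actually does:
{code}
// Hypothetical sketch: register the CORS filter through a dedicated
// FilterInitializer so it can be enabled for one daemon's web server
// without touching the others. addFilter() (vs. addGlobalFilter())
// limits it to user-facing URLs.
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.http.FilterContainer;
import org.apache.hadoop.http.FilterInitializer;

public class RmCorsFilterInitializer extends FilterInitializer {
  @Override
  public void initFilter(FilterContainer container, Configuration conf) {
    Map<String, String> params = new HashMap<String, String>();
    // Assumed config key, for illustration only.
    params.put("allowed-origins",
        conf.get("hadoop.http.cross-origin.allowed-origins", "*"));
    container.addFilter("Cross Origin Filter",
        "org.apache.hadoop.security.http.CrossOriginFilter", params);
  }
}
{code}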

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch
>
>
> Currently the REST API's do not have CORS support. This means any UI (running 
> in browser) cannot consume the REST API's. For ex Tez UI would like to use 
> the REST API for getting application, application attempt information exposed 
> by the API's. 
> It would be very useful if CORS is enabled for the REST API's.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3901) Populate flow run data in the flow_run & flow activity tables

2015-09-16 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-3901:
-
Attachment: (was: YARN-3901-YARN-2928.9.patch)

> Populate flow run data in the flow_run & flow activity tables
> -
>
> Key: YARN-3901
> URL: https://issues.apache.org/jira/browse/YARN-3901
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Vrushali C
>Assignee: Vrushali C
> Attachments: YARN-3901-YARN-2928.1.patch, 
> YARN-3901-YARN-2928.2.patch, YARN-3901-YARN-2928.3.patch, 
> YARN-3901-YARN-2928.4.patch, YARN-3901-YARN-2928.5.patch, 
> YARN-3901-YARN-2928.6.patch, YARN-3901-YARN-2928.7.patch, 
> YARN-3901-YARN-2928.8.patch, YARN-3901-YARN-2928.9.patch
>
>
> As per the schema proposed in YARN-3815 in 
> https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf
> filing jira to track creation and population of data in the flow run table. 
> Some points that are being considered:
> - Stores per-flow-run information aggregated across applications, plus the 
> flow version.
> - The RM's collector writes to it on app creation and app completion.
> - The per-app collector writes to it for metric updates, at a slower frequency 
> than the metric updates to the application table.
> - Primary key: cluster ! user ! flow ! flow run id
> - Only the latest version of flow-level aggregated metrics will be kept, even 
> if the entity and application levels keep a timeseries.
> - The running_apps column will be incremented on app creation, and 
> decremented on app completion.
> - For min_start_time the RM writer will simply write a value with a tag for 
> the applicationId. A coprocessor will return the min value of all written 
> values (see the sketch after this list).
> - Upon flushes and compactions, the min value among all the cells of this 
> column will be written to a cell without any tag (empty tag), and all the 
> other cells will be discarded.
> - Ditto for max_end_time, but then the max will be kept.
> - Tags are represented as #type:value. The type can be not set (0), or can 
> indicate running (1) or complete (2). In those cases (for metrics), only 
> complete app metrics are collapsed on compaction.
> - The m! values are aggregated (summed) upon read. Only when applications are 
> completed (indicated by tag type 2) can the values be collapsed.
> - The application ids that have completed and been aggregated into the flow 
> numbers are retained in a separate column for historical tracking: we don't 
> want to re-aggregate those upon replay.
> 
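
To illustrate the min_start_time collapse described in the list above, a toy 
sketch (this stands in for the coprocessor logic; the names are illustrative):
{code}
// Toy sketch: given all per-application tagged values written for the
// min_start_time column, keep only the minimum. On flush/compaction the
// result is rewritten with an empty tag and the tagged cells dropped.
import java.util.Arrays;
import java.util.List;

public class MinStartTimeSketch {
  static long collapseMin(List<Long> taggedCellValues) {
    long min = Long.MAX_VALUE;
    for (long v : taggedCellValues) {
      min = Math.min(min, v);  // reads also see the min across all tags
    }
    return min;
  }

  public static void main(String[] args) {
    System.out.println(collapseMin(Arrays.asList(1500L, 1200L, 1900L)));
    // prints 1200
  }
}
{code}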



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   >