[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-03-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192771#comment-15192771
 ] 

Sunil G commented on YARN-4807:
---

Yes [~kasha]. Many of the waitForState methods have the cumulative timeout
hardcoded, and the per-round wait is also hardcoded differently from method to
method. So the total number of rounds to wait for resource allocations is
effectively (timeout / per-round-wait). As I see it, the number of rounds we
wait also contributes a little to the duration of the tests.

So if we start with a 10ms sleep, we can also define how many rounds are
needed as a minimum. A good figure could be worked out with a few dry runs.

bq. I have seen test cases where there is no reference to an RM or MockRM.
Yes, there are places where we use this even from MockRM. So, as you
suggested, it's better to focus on improving the test time first; some cleanup
can be tried later.
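
For illustration, here is a minimal sketch of the polling pattern discussed
above, with one shared per-round sleep constant and a bounded round count. The
class and constant names are hypothetical, not the actual MockAM/MockRM code:

{code:java}
import java.util.function.BooleanSupplier;

// Hypothetical helper: one shared per-round sleep and a bounded round count,
// instead of per-method hard-coded sleep/timeout values.
final class WaitHelper {
  private static final int WAIT_MS = 10;      // per-round sleep
  private static final int MAX_ROUNDS = 200;  // 10 ms * 200 rounds = 2 s cap

  static boolean waitFor(BooleanSupplier condition) throws InterruptedException {
    for (int round = 0; round < MAX_ROUNDS; round++) {
      if (condition.getAsBoolean()) {
        return true;                          // desired state reached
      }
      Thread.sleep(WAIT_MS);                  // short sleep keeps tests fast
    }
    return false;                             // gave up after MAX_ROUNDS
  }
}
{code}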

> MockAM#waitForState sleep duration is too long
> --
>
> Key: YARN-4807
> URL: https://issues.apache.org/jira/browse/YARN-4807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
>  Labels: newbie
>
> MockAM#waitForState sleep duration (500 ms) is too long. Also, there is 
> significant duplication with MockRM#waitForState.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-03-13 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192753#comment-15192753
 ] 

Varun Vasudev commented on YARN-4807:
-

I'd prefer to start with Karthik's approach. Let's make the sleep time as low 
as possible. 

> MockAM#waitForState sleep duration is too long
> --
>
> Key: YARN-4807
> URL: https://issues.apache.org/jira/browse/YARN-4807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
>  Labels: newbie
>
> MockAM#waitForState sleep duration (500 ms) is too long. Also, there is 
> significant duplication with MockRM#waitForState.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4773) Log aggregation performs extraneous filesystem operations when rolling log aggregation is disabled

2016-03-13 Thread Jun Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192735#comment-15192735
 ] 

Jun Gong commented on YARN-4773:


Hi [~bibinchundatt], if the aggregation restarts, the previous tmp file will
be overwritten because the tmp file name does not change.
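
As a minimal illustration of why the restart case is safe under that
assumption, HDFS lets the writer recreate the same path with overwrite
enabled, replacing any stale file from an interrupted attempt. The path below
is hypothetical, not the actual NM remote log layout:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: re-creating a fixed ".tmp" path with overwrite=true replaces
// whatever was left behind by an earlier, interrupted aggregation attempt.
public class TmpOverwriteSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path tmp = new Path("/app-logs/user/logs/app_1/node_1.tmp"); // hypothetical
    fs.create(tmp, true).close();  // overwrite=true: stale file is replaced
  }
}
{code}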

> Log aggregation performs extraneous filesystem operations when rolling log 
> aggregation is disabled
> --
>
> Key: YARN-4773
> URL: https://issues.apache.org/jira/browse/YARN-4773
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Jun Gong
>Priority: Minor
> Attachments: YARN-4773.01.patch
>
>
> I noticed when log aggregation occurs for an application the nodemanager is 
> listing the application's log directory in HDFS.  Apparently this is for 
> removing old logs before uploading new ones.  This is a wasteful operation 
> when rolling log aggregation is disabled, since there will be no prior logs 
> in HDFS -- aggregation only occurs once when rolling log aggregation is 
> disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4808) SchedulerNode can use a few more cosmetic changes

2016-03-13 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192693#comment-15192693
 ] 

Inigo Goiri commented on YARN-4808:
---

I'd say it's slightly cleaner to use just {{RMNode}} to hold the utilization
data and not have to update it in a second place.
Not 100% sure we are updating everything, but
{{TestMiniYarnClusterNodeUtilization}} from YARN-3980 should be checking that,
so we should be good.

> SchedulerNode can use a few more cosmetic changes
> -
>
> Key: YARN-4808
> URL: https://issues.apache.org/jira/browse/YARN-4808
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-4808-1.patch
>
>
> We have made some cosmetic changes to SchedulerNode recently. While working 
> on YARN-4511, realized we could improve it a little more:
> # Remove volatile variables - don't see the need for them being volatile
> # Some methods end up doing very similar things, so consolidating them
> # Renaming totalResource to capacity. YARN-4511 plans to add inflatedCapacity 
> to include the un-utilized resources, and having two totals can be a little 
> confusing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4773) Log aggregation performs extraneous filesystem operations when rolling log aggregation is disabled

2016-03-13 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192685#comment-15192685
 ] 

Bibin A Chundatt commented on YARN-4773:


[~hex108]/[~jlowe]
{quote}
however we do not need call AppLogAggregatorImpl#cleanOldLogs because there 
have been no containers' logs uploaded before.
{quote}
If the NM got killed while aggregation was in progress and then restarted,
will there be a possibility of a .tmp file still existing in the folder?



> Log aggregation performs extraneous filesystem operations when rolling log 
> aggregation is disabled
> --
>
> Key: YARN-4773
> URL: https://issues.apache.org/jira/browse/YARN-4773
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Jun Gong
>Priority: Minor
> Attachments: YARN-4773.01.patch
>
>
> I noticed when log aggregation occurs for an application the nodemanager is 
> listing the application's log directory in HDFS.  Apparently this is for 
> removing old logs before uploading new ones.  This is a wasteful operation 
> when rolling log aggregation is disabled, since there will be no prior logs 
> in HDFS -- aggregation only occurs once when rolling log aggregation is 
> disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3054) Preempt policy in FairScheduler may cause mapreduce job never finish

2016-03-13 Thread Peng Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192683#comment-15192683
 ] 

Peng Zhang commented on YARN-3054:
--

Thanks [~kasha].
I agree with "have a preemption priority or even a preemption cost per
container".

In my temporary fix, I preempt the latest-scheduled container instead of low-
or high-priority containers.
I think this keeps the containers backing a certain amount of resources (at
least the steady fair share) stable, so MapReduce job progress can proceed.
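
A minimal sketch of that victim-selection idea, choosing the most recently
scheduled container so earlier, long-running containers keep making progress.
The ContainerInfo class and getScheduledTime() accessor are stand-ins, not
actual YARN classes:

{code:java}
import java.util.Comparator;
import java.util.List;

// Hypothetical victim picker: preempt the container that was scheduled last.
final class LatestFirstVictimPicker {
  static final class ContainerInfo {
    final String id;
    final long scheduledTime;
    ContainerInfo(String id, long scheduledTime) {
      this.id = id;
      this.scheduledTime = scheduledTime;
    }
    long getScheduledTime() { return scheduledTime; }
  }

  /** Returns the container scheduled last, or null if none are running. */
  static ContainerInfo pickVictim(List<ContainerInfo> running) {
    return running.stream()
        .max(Comparator.comparingLong(ContainerInfo::getScheduledTime))
        .orElse(null);
  }
}
{code}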

> Preempt policy in FairScheduler may cause mapreduce job never finish
> 
>
> Key: YARN-3054
> URL: https://issues.apache.org/jira/browse/YARN-3054
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Peng Zhang
>
> The preemption policy is tied to the scheduling policy now. Using the
> scheduling policy's comparator to find preemption candidates cannot
> guarantee that some subset of containers is never preempted, so tasks may be
> preempted repeatedly before they finish and the job cannot make any
> progress.
> I think preemption in YARN should give the assurances below:
> 1. MapReduce jobs can get additional resources when others are idle;
> 2. MapReduce jobs for one user in one queue can still make progress with
> their min share when others preempt resources back.
> Maybe always preempting the latest app and container can achieve this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4808) SchedulerNode can use a few more cosmetic changes

2016-03-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192599#comment-15192599
 ] 

Hadoop QA commented on YARN-4808:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 patch generated 1 new + 259 unchanged - 1 fixed = 260 total (was 260) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 38s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 8s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 159m 50s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  

[jira] [Commented] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.

2016-03-13 Thread Shiwei Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192582#comment-15192582
 ] 

Shiwei Guo commented on YARN-3933:
--

I noticed this, will fix it soon.

> Race condition when calling AbstractYarnScheduler.completedContainer.
> -
>
> Key: YARN-3933
> URL: https://issues.apache.org/jira/browse/YARN-3933
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0, 2.7.0, 2.5.2, 2.7.1
>Reporter: Lavkesh Lahngir
>Assignee: Shiwei Guo
> Attachments: YARN-3933.001.patch, YARN-3933.002.patch
>
>
> In our cluster we are seeing available memory and cores go negative.
> Initial inspection:
> Scenario no. 1:
> In the capacity scheduler, allocateContainersToNode() checks whether there
> are excess container reservations for an application that are no longer
> needed, and then calls queue.completedContainer(), which drives the
> resources negative even though they were never assigned in the first place.
> I am still looking through the code. Can somebody suggest how to simulate
> excess container assignments?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4809) De-duplicate container completion across schedulers

2016-03-13 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-4809:
--

 Summary: De-duplicate container completion across schedulers
 Key: YARN-4809
 URL: https://issues.apache.org/jira/browse/YARN-4809
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Karthik Kambatla


CapacityScheduler and FairScheduler implement containerCompleted the exact same 
way. Duplication across the schedulers can be avoided. 
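
A minimal sketch of the de-duplication idea: hoist the identical completion
logic into a shared base class that both schedulers extend. The class and
method names below are stand-ins, not the real AbstractYarnScheduler API:

{code:java}
// Shared base: the identical bookkeeping lives here exactly once.
abstract class BaseScheduler {
  protected void containerCompleted(String containerId) {
    releaseResources(containerId);
  }
  // Only scheduler-specific behavior stays in the subclasses.
  protected abstract void releaseResources(String containerId);
}

class CapacityLikeScheduler extends BaseScheduler {
  @Override
  protected void releaseResources(String containerId) {
    System.out.println("capacity: released " + containerId);
  }
}

class FairLikeScheduler extends BaseScheduler {
  @Override
  protected void releaseResources(String containerId) {
    System.out.println("fair: released " + containerId);
  }
}
{code}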



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4808) SchedulerNode can use a few more cosmetic changes

2016-03-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-4808:
---
Attachment: yarn-4808-1.patch

> SchedulerNode can use a few more cosmetic changes
> -
>
> Key: YARN-4808
> URL: https://issues.apache.org/jira/browse/YARN-4808
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-4808-1.patch
>
>
> We have made some cosmetic changes to SchedulerNode recently. While working 
> on YARN-4511, realized we could improve it a little more:
> # Remove volatile variables - don't see the need for them being volatile
> # Some methods end up doing very similar things, so consolidating them
> # Renaming totalResource to capacity. YARN-4511 plans to add inflatedCapacity 
> to include the un-utilized resources, and having two totals can be a little 
> confusing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4808) SchedulerNode can use a few more cosmetic changes

2016-03-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192557#comment-15192557
 ] 

Karthik Kambatla commented on YARN-4808:


[~elgoiri], [~leftnoteasy] - could you guys take a quick look? 

> SchedulerNode can use a few more cosmetic changes
> -
>
> Key: YARN-4808
> URL: https://issues.apache.org/jira/browse/YARN-4808
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-4808-1.patch
>
>
> We have made some cosmetic changes to SchedulerNode recently. While working 
> on YARN-4511, realized we could improve it a little more:
> # Remove volatile variables - don't see the need for them being volatile
> # Some methods end up doing very similar things, so consolidating them
> # Renaming totalResource to capacity. YARN-4511 plans to add inflatedCapacity 
> to include the un-utilized resources, and having two totals can be a little 
> confusing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4808) SchedulerNode can use a few more cosmetic changes

2016-03-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-4808:
---
Attachment: yarn-4808-1.patch

> SchedulerNode can use a few more cosmetic changes
> -
>
> Key: YARN-4808
> URL: https://issues.apache.org/jira/browse/YARN-4808
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-4808-1.patch
>
>
> We have made some cosmetic changes to SchedulerNode recently. While working 
> on YARN-4511, realized we could improve it a little more:
> # Remove volatile variables - don't see the need for them being volatile
> # Some methods end up doing very similar things, so consolidating them
> # Renaming totalResource to capacity. YARN-4511 plans to add inflatedCapacity 
> to include the un-utilized resources, and having two totals can be a little 
> confusing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4808) SchedulerNode can use a few more cosmetic changes

2016-03-13 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-4808:
--

 Summary: SchedulerNode can use a few more cosmetic changes
 Key: YARN-4808
 URL: https://issues.apache.org/jira/browse/YARN-4808
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.8.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla


We have made some cosmetic changes to SchedulerNode recently. While working on 
YARN-4511, realized we could improve it a little more:
# Remove volatile variables - don't see the need for them being volatile
# Some methods end up doing very similar things, so consolidating them
# Renaming totalResource to capacity. YARN-4511 plans to add inflatedCapacity 
to include the un-utilized resources, and having two totals can be a little 
confusing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1297) Miscellaneous Fair Scheduler speedups

2016-03-13 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192547#comment-15192547
 ] 

Yufei Gu commented on YARN-1297:


Thanks, [~asuresh].

> Miscellaneous Fair Scheduler speedups
> -
>
> Key: YARN-1297
> URL: https://issues.apache.org/jira/browse/YARN-1297
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Sandy Ryza
>Assignee: Yufei Gu
> Attachments: YARN-1297-1.patch, YARN-1297-2.patch, YARN-1297.3.patch, 
> YARN-1297.4.patch, YARN-1297.4.patch, YARN-1297.patch, YARN-1297.patch
>
>
> I ran the Fair Scheduler's core scheduling loop through a profiler tool and 
> identified a bunch of minimally invasive changes that can shave off a few 
> milliseconds.
> The main one is demoting a couple INFO log messages to DEBUG, which brought 
> my benchmark down from 16000 ms to 6000.
> A few others (which had way less of an impact) were
> * Most of the time in comparisons was being spent in Math.signum.  I switched 
> this to direct ifs and elses and it halved the percent of time spent in 
> comparisons.
> * I removed some unnecessary instantiations of Resource objects
> * I made it so that queues' usage wasn't calculated from the applications up 
> each time getResourceUsage was called.
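
A minimal sketch of the signum-to-branches change mentioned above; the method
names are illustrative, not the actual FairScheduler comparator:

{code:java}
// Comparing two usage ratios without Math.signum.
final class ShareComparison {
  // Before (slower in the hot scheduling loop):
  static int compareWithSignum(double a, double b) {
    return (int) Math.signum(a - b);
  }

  // After: direct branches avoid the floating-point signum call.
  static int compareWithBranches(double a, double b) {
    if (a < b) {
      return -1;
    } else if (a > b) {
      return 1;
    }
    return 0;
  }
}
{code}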



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-03-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192416#comment-15192416
 ] 

Karthik Kambatla commented on YARN-4807:


Actually, the problem seems worse. We have multiple {{waitForState}} methods
in MockRM - for app, appattempt, and container. There is some duplication
there, but what is worse, each method has its own sleepTime and cumulative
wait time. It would be nice to use a constant for the sleep time instead of
hard-coded values.

If it is not a problem, I would like us to start with a really small value
like 10 ms. It seems to be working for one of the {{waitForState}} methods in
MockRM.

Regarding removing the MockAM#waitForState methods altogether, I am not so
sure that would be straightforward or desirable. I have seen test cases where
there is no reference to an RM or MockRM.
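
A minimal sketch of what consolidating onto shared constants could look like:
one generic helper driven by a single sleep interval and a single cumulative
timeout. Names are hypothetical, not the actual MockRM code:

{code:java}
import java.util.function.Supplier;

// Hypothetical generic waiter usable for app, attempt, and container states.
final class StateWaiter {
  private static final long SLEEP_MS = 10;       // one shared per-round sleep
  private static final long TIMEOUT_MS = 20_000; // one shared cumulative cap

  static <T> boolean waitForState(Supplier<T> currentState, T expected)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + TIMEOUT_MS;
    while (System.currentTimeMillis() < deadline) {
      if (expected.equals(currentState.get())) {
        return true;
      }
      Thread.sleep(SLEEP_MS);
    }
    return false;  // callers can fail the test on timeout
  }
}
{code}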

> MockAM#waitForState sleep duration is too long
> --
>
> Key: YARN-4807
> URL: https://issues.apache.org/jira/browse/YARN-4807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
>  Labels: newbie
>
> MockAM#waitForState sleep duration (500 ms) is too long. Also, there is 
> significant duplication with MockRM#waitForState.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4795) ContainerMetrics drops records

2016-03-13 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated YARN-4795:
---
Attachment: YARN-4795.001.patch

> ContainerMetrics drops records
> --
>
> Key: YARN-4795
> URL: https://issues.apache.org/jira/browse/YARN-4795
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
> Attachments: YARN-4795.001.patch
>
>
> The metrics2 system was implemented to deal with persistent sources.  
> {{ContainerMetrics}} is an ephemeral source, and so it causes problems.  
> Specifically, the {{ContainerMetrics}} only reports metrics once after the 
> container has been stopped.  This behavior is a problem because the metrics2 
> system can ask sources for reports that will be quietly dropped by the sinks 
> that care.  (It's a metrics2 feature, not a bug.)  If that final report is 
> silently dropped, it's lost, because the {{ContainerMetrics}} won't report 
> anything else ever anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4802) ContainerMetrics drops records

2016-03-13 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton resolved YARN-4802.

Resolution: Duplicate

> ContainerMetrics drops records
> --
>
> Key: YARN-4802
> URL: https://issues.apache.org/jira/browse/YARN-4802
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> The metrics2 system was implemented to deal with persistent sources.  
> {{ContainerMetrics}} is an ephemeral source, and so it causes problems.  
> Specifically, the {{ContainerMetrics}} only reports metrics once after the 
> container has been stopped.  This behavior is a problem because the metrics2 
> system can ask sources for reports that will be quietly dropped by the sinks 
> that care.  (It's a metrics2 feature, not a bug.)  If that final report is 
> silently dropped, it's lost, because the {{ContainerMetrics}} won't report 
> anything else ever anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4803) ContainerMetrics drops records

2016-03-13 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton resolved YARN-4803.

Resolution: Duplicate

> ContainerMetrics drops records
> --
>
> Key: YARN-4803
> URL: https://issues.apache.org/jira/browse/YARN-4803
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> The metrics2 system was implemented to deal with persistent sources.  
> {{ContainerMetrics}} is an ephemeral source, and so it causes problems.  
> Specifically, the {{ContainerMetrics}} only reports metrics once after the 
> container has been stopped.  This behavior is a problem because the metrics2 
> system can ask sources for reports that will be quietly dropped by the sinks 
> that care.  (It's a metrics2 feature, not a bug.)  If that final report is 
> silently dropped, it's lost, because the {{ContainerMetrics}} won't report 
> anything else ever anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4801) ContainerMetrics drops records

2016-03-13 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton resolved YARN-4801.

Resolution: Duplicate

> ContainerMetrics drops records
> --
>
> Key: YARN-4801
> URL: https://issues.apache.org/jira/browse/YARN-4801
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> The metrics2 system was implemented to deal with persistent sources.  
> {{ContainerMetrics}} is an ephemeral source, and so it causes problems.  
> Specifically, the {{ContainerMetrics}} only reports metrics once after the 
> container has been stopped.  This behavior is a problem because the metrics2 
> system can ask sources for reports that will be quietly dropped by the sinks 
> that care.  (It's a metrics2 feature, not a bug.)  If that final report is 
> silently dropped, it's lost, because the {{ContainerMetrics}} won't report 
> anything else ever anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4796) ContainerMetrics drops records

2016-03-13 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton resolved YARN-4796.

Resolution: Duplicate

> ContainerMetrics drops records
> --
>
> Key: YARN-4796
> URL: https://issues.apache.org/jira/browse/YARN-4796
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> The metrics2 system was implemented to deal with persistent sources.  
> {{ContainerMetrics}} is an ephemeral source, and so it causes problems.  
> Specifically, the {{ContainerMetrics}} only reports metrics once after the 
> container has been stopped.  This behavior is a problem because the metrics2 
> system can ask sources for reports that will be quietly dropped by the sinks 
> that care.  (It's a metrics2 feature, not a bug.)  If that final report is 
> silently dropped, it's lost, because the {{ContainerMetrics}} won't report 
> anything else ever anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4800) ContainerMetrics drops records

2016-03-13 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton resolved YARN-4800.

Resolution: Duplicate

> ContainerMetrics drops records
> --
>
> Key: YARN-4800
> URL: https://issues.apache.org/jira/browse/YARN-4800
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> The metrics2 system was implemented to deal with persistent sources.  
> {{ContainerMetrics}} is an ephemeral source, and so it causes problems.  
> Specifically, the {{ContainerMetrics}} only reports metrics once after the 
> container has been stopped.  This behavior is a problem because the metrics2 
> system can ask sources for reports that will be quietly dropped by the sinks 
> that care.  (It's a metrics2 feature, not a bug.)  If that final report is 
> silently dropped, it's lost, because the {{ContainerMetrics}} won't report 
> anything else ever anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4797) ContainerMetrics drops records

2016-03-13 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton resolved YARN-4797.

Resolution: Duplicate

> ContainerMetrics drops records
> --
>
> Key: YARN-4797
> URL: https://issues.apache.org/jira/browse/YARN-4797
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> The metrics2 system was implemented to deal with persistent sources.  
> {{ContainerMetrics}} is an ephemeral source, and so it causes problems.  
> Specifically, the {{ContainerMetrics}} only reports metrics once after the 
> container has been stopped.  This behavior is a problem because the metrics2 
> system can ask sources for reports that will be quietly dropped by the sinks 
> that care.  (It's a metrics2 feature, not a bug.)  If that final report is 
> silently dropped, it's lost, because the {{ContainerMetrics}} won't report 
> anything else ever anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4798) ContainerMetrics drops records

2016-03-13 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton resolved YARN-4798.

Resolution: Duplicate

> ContainerMetrics drops records
> --
>
> Key: YARN-4798
> URL: https://issues.apache.org/jira/browse/YARN-4798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> The metrics2 system was implemented to deal with persistent sources.  
> {{ContainerMetrics}} is an ephemeral source, and so it causes problems.  
> Specifically, the {{ContainerMetrics}} only reports metrics once after the 
> container has been stopped.  This behavior is a problem because the metrics2 
> system can ask sources for reports that will be quietly dropped by the sinks 
> that care.  (It's a metrics2 feature, not a bug.)  If that final report is 
> silently dropped, it's lost, because the {{ContainerMetrics}} won't report 
> anything else ever anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4799) ContainerMetrics drops records

2016-03-13 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton resolved YARN-4799.

Resolution: Duplicate

> ContainerMetrics drops records
> --
>
> Key: YARN-4799
> URL: https://issues.apache.org/jira/browse/YARN-4799
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> The metrics2 system was implemented to deal with persistent sources.  
> {{ContainerMetrics}} is an ephemeral source, and so it causes problems.  
> Specifically, the {{ContainerMetrics}} only reports metrics once after the 
> container has been stopped.  This behavior is a problem because the metrics2 
> system can ask sources for reports that will be quietly dropped by the sinks 
> that care.  (It's a metrics2 feature, not a bug.)  If that final report is 
> silently dropped, it's lost, because the {{ContainerMetrics}} won't report 
> anything else ever anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs

2016-03-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192356#comment-15192356
 ] 

Steve Loughran commented on YARN-4545:
--

LGTM, +1

> Allow YARN distributed shell to use ATS v1.5 APIs
> -
>
> Key: YARN-4545
> URL: https://issues.apache.org/jira/browse/YARN-4545
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-4545-YARN-4265.001.patch, 
> YARN-4545-trunk.001.patch, YARN-4545-trunk.002.patch, 
> YARN-4545-trunk.003.patch, YARN-4545-trunk.004.patch, 
> YARN-4545-trunk.005.patch, YARN-4545-trunk.006.patch, 
> YARN-4545-trunk.007.patch, YARN-4545-trunk.008.patch, 
> YARN-4545-trunk.009.patch
>
>
> We can use YARN distributed shell as a demo for the ATS v1.5 APIs. We need to 
> allow distributed shell post data with ATS v1.5 API if 1.5 is enabled in the 
> system. We also need to provide a sample plugin to read those data out. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4678) Cluster used capacity is > 100 when container reserved

2016-03-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192334#comment-15192334
 ] 

Hadoop QA commented on YARN-4678:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 patch generated 8 new + 333 unchanged - 1 fixed = 341 total (was 334) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 20s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 54s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 158m 24s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesForCSWithPartitions |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesForCSWithPartitions |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | 

[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-03-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192270#comment-15192270
 ] 

Sunil G commented on YARN-4807:
---

As I see it, {{MockRM#waitForState}} also uses 500ms to 1s in various methods.
I think that can be reduced to 200ms. Recently [~rohithsharma] and I worked on
a few random test case failures in CS, and all were fixed once we made proper
use of {{waitForState}} with the correct timeout. I think 200ms is more than
enough, and we can remove the MockAM method and consolidate everything into
MockRM. Thoughts?

> MockAM#waitForState sleep duration is too long
> --
>
> Key: YARN-4807
> URL: https://issues.apache.org/jira/browse/YARN-4807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
>  Labels: newbie
>
> MockAM#waitForState sleep duration (500 ms) is too long. Also, there is 
> significant duplication with MockRM#waitForState.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Commented] (YARN-4751) In 2.7, Labeled queue usage not shown properly in capacity scheduler UI

2016-03-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192268#comment-15192268
 ] 

Sunil G commented on YARN-4751:
---

Thanks [~eepayne] for pointing out the dependencies. Yes, it comes with a
chain of patches. We are discussing in YARN-3216 whether to get that ready for
the 2.7 line. If we can confirm that, I can make a clean patch with the
minimal changes needed to get AM Resource Percent for labels, and it would
pick up almost all of the new changes we have done recently. But it is a
rather different approach, since we can't cherry-pick just the needed patch,
so it may make the 2.7 line more complex later. I would like to hear your
thoughts on that as well.

Coming to this patch, I think we only need to verify the resource usages in
the NodeLabel scenario. In trunk, {{TestCapacitySchedulerNodeLabelUpdate}}
covers various cases, and it uses {{NullRMNodeLabelsManager}}, so many events
won't be fired, which is good for unit tests. Could we consider this test
class?

> In 2.7, Labeled queue usage not shown properly in capacity scheduler UI
> ---
>
> Key: YARN-4751
> URL: https://issues.apache.org/jira/browse/YARN-4751
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, yarn
>Affects Versions: 2.7.3
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: 2.7 CS UI No BarGraph.jpg, 
> YARH-4752-branch-2.7.001.patch, YARH-4752-branch-2.7.002.patch
>
>
> In 2.6 and 2.7, the capacity scheduler UI does not have the queue graphs 
> separated by partition. When applications are running on a labeled queue, no 
> color is shown in the bar graph, and several of the "Used" metrics are zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4678) Cluster used capacity is > 100 when container reserved

2016-03-13 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-4678:
--
Attachment: 0003-YARN-4678.patch

Updating the patch to remove reservedCapacity from LeafQueue as well, so that
it is in line with ParentQueue / root. Since we already have more information
about reservedCapacity in ClusterMetrics and in the QueueInfo area, I think
this is fine. [~brahmareddy], could you please confirm?

> Cluster used capacity is > 100 when container reserved 
> ---
>
> Key: YARN-4678
> URL: https://issues.apache.org/jira/browse/YARN-4678
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Sunil G
> Attachments: 0001-YARN-4678.patch, 0002-YARN-4678.patch, 
> 0003-YARN-4678.patch
>
>
>  *Scenario:* 
> * Start cluster with Three NM's each having 8GB (cluster memory:24GB).
> * Configure queues with elasticity and userlimitfactor=10.
> * disable pre-emption.
> * run two job with different priority in different queue at the same time
> ** yarn jar hadoop-mapreduce-examples-2.7.2.jar pi -Dyarn.app.priority=LOW 
> -Dmapreduce.job.queuename=QueueA -Dmapreduce.map.memory.mb=4096 
> -Dyarn.app.mapreduce.am.resource.mb=1536 
> -Dmapreduce.job.reduce.slowstart.completedmaps=1.0 10 1
> ** yarn jar hadoop-mapreduce-examples-2.7.2.jar pi -Dyarn.app.priority=HIGH 
> -Dmapreduce.job.queuename=QueueB -Dmapreduce.map.memory.mb=4096 
> -Dyarn.app.mapreduce.am.resource.mb=1536 3 1
> * observe the cluster capacity which was used in RM web UI



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4781) Support intra-queue preemption for fairness ordering policy.

2016-03-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192255#comment-15192255
 ] 

Sunil G commented on YARN-4781:
---

Thanks [~leftnoteasy] for sharing more thoughts on this. IIUC, this will
mainly deal with imbalances in resource usage within a queue when
FairOrderingPolicy is used, correct? I think this will be very helpful. Maybe
the existing PCPP will take care of identifying these imbalances, or a new
SchedulerMonitor will be created for it. If a new SchedulerMonitor is planned,
YARN-2009 could also make use of it. Thoughts?

> Support intra-queue preemption for fairness ordering policy.
> 
>
> Key: YARN-4781
> URL: https://issues.apache.org/jira/browse/YARN-4781
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>
> We introduced fairness queue policy since YARN-3319, which will let large 
> applications make progresses and not starve small applications. However, if a 
> large application takes the queue’s resources, and containers of the large 
> app has long lifespan, small applications could still wait for resources for 
> long time and SLAs cannot be guaranteed.
> Instead of wait for application release resources on their own, we need to 
> preempt resources of queue with fairness policy enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192239#comment-15192239
 ] 

Hadoop QA commented on YARN-4712:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
21s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
56s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 patch generated 1 new + 22 unchanged - 2 fixed = 23 total (was 24) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 34s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 6s {color} | 
{color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 30s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793186/YARN-4712-YARN-2928.v1.005.patch
 |
| JIRA Issue | YARN-4712 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 6d596507dcba 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-13 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-4712:

Attachment: YARN-4712-YARN-2928.v1.005.patch

Hi [~sjlee0] & [~varun_saxena],
Considering the discussion so far, {{cpuUsagePercentPerCore}} would be ideal
for aggregation. I have uploaded a new patch doing that. If anything else
needs to be captured, please let me know.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch
>
>
> There are 2 issues with CPU usage collection:
> * I was able to observe that, many times, the CPU usage obtained from
> {{pTree.getCpuUsagePercent()}} is ResourceCalculatorProcessTree.UNAVAILABLE
> (i.e. -1), but ContainersMonitor still performs the calculation
> {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore
> / resourceCalculatorPlugin.getNumProcessors()}}, because of which the
> UNAVAILABLE check in {{NMTimelinePublisher.reportContainerResourceUsage}} is
> never triggered. Proper checks need to be added.
> * {{EntityColumnPrefix.METRIC}} always uses LongConverter, but
> ContainerMonitor is publishing decimal values for the CPU usage.
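
A minimal sketch of the guard the first bullet calls for: skip the division
and the publish step when the per-core usage is the UNAVAILABLE sentinel. The
class and field names are stand-ins for the ContainersMonitor /
NMTimelinePublisher code, and the constant simply mirrors the -1 value
mentioned above:

{code:java}
// Hypothetical guard: never derive a total-cores percentage from -1.
final class CpuUsageGuard {
  private static final float UNAVAILABLE = -1.0f; // mirrors the -1 sentinel

  static float totalCoresPercentage(float cpuUsagePercentPerCore,
                                    int numProcessors) {
    if (cpuUsagePercentPerCore == UNAVAILABLE || numProcessors <= 0) {
      return UNAVAILABLE;  // propagate "unknown" instead of publishing -1/N
    }
    return cpuUsagePercentPerCore / numProcessors;
  }
}
{code}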



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4805) Don't go through all schedulers in ParameterizedTestBase

2016-03-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192221#comment-15192221
 ] 

Karthik Kambatla commented on YARN-4805:


Comparing against earlier builds from today, this change seems to shave 22 
minutes off the build. 

[~jianhe] - will you be able to review this? Thanks. 

> Don't go through all schedulers in ParameterizedTestBase
> 
>
> Key: YARN-4805
> URL: https://issues.apache.org/jira/browse/YARN-4805
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-4805-1.patch
>
>
> ParameterizedSchedulerTestBase was created to make sure tests that were 
> written with CapacityScheduler in mind don't fail when run against 
> FairScheduler. Before this was introduced, tests would fail because 
> FairScheduler requires an allocation file. 
> However, the tests that extend it take about 10 minutes per scheduler. So, 
> instead of running against both schedulers, we could setup the scheduler 
> appropriately so the tests pass against both schedulers. 
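
A minimal sketch of the setup idea, under the assumption that pointing
FairScheduler at a generated empty allocation file is enough for these tests;
the helper class is hypothetical, and only the
yarn.scheduler.fair.allocation.file property name is standard:

{code:java}
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

// Hypothetical test setup: write a minimal allocation file and point the
// FairScheduler configuration at it so CS-oriented tests can run unchanged.
final class FairSchedulerTestSetup {
  static void writeMinimalAllocFile(File allocFile) throws IOException {
    try (FileWriter out = new FileWriter(allocFile)) {
      out.write("<?xml version=\"1.0\"?>\n<allocations>\n</allocations>\n");
    }
  }

  static void configure(Configuration conf, File allocFile) throws IOException {
    writeMinimalAllocFile(allocFile);
    conf.set("yarn.scheduler.fair.allocation.file",
        allocFile.getAbsolutePath());
  }
}
{code}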



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4796) ContainerMetrics drops records

2016-03-13 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192215#comment-15192215
 ] 

Varun Vasudev commented on YARN-4796:
-

[~templedf] - I think this got filed twice by mistake. YARN-4795 seems to be 
the same issue.

> ContainerMetrics drops records
> --
>
> Key: YARN-4796
> URL: https://issues.apache.org/jira/browse/YARN-4796
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> The metrics2 system was implemented to deal with persistent sources.  
> {{ContainerMetrics}} is an ephemeral source, and so it causes problems.  
> Specifically, the {{ContainerMetrics}} only reports metrics once after the 
> container has been stopped.  This behavior is a problem because the metrics2 
> system can ask sources for reports that will be quietly dropped by the sinks 
> that care.  (It's a metrics2 feature, not a bug.)  If that final report is 
> silently dropped, it's lost, because the {{ContainerMetrics}} won't report 
> anything else ever anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3122) Metrics for container's actual CPU usage

2016-03-13 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192201#comment-15192201
 ] 

Naganarasimha G R commented on YARN-3122:
-

Any thoughts on the above comment, [~adhoot] & [~kasha]? Is it a bug?

> Metrics for container's actual CPU usage
> 
>
> Key: YARN-3122
> URL: https://issues.apache.org/jira/browse/YARN-3122
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Fix For: 2.7.0
>
> Attachments: YARN-3122.001.patch, YARN-3122.002.patch, 
> YARN-3122.003.patch, YARN-3122.004.patch, YARN-3122.005.patch, 
> YARN-3122.006.patch, YARN-3122.007.patch, YARN-3122.prelim.patch, 
> YARN-3122.prelim.patch
>
>
> It would be nice to capture resource usage per container, for a variety of 
> reasons. This JIRA is to track CPU usage. 
> YARN-2965 tracks the resource usage on the node, and the two implementations 
> should reuse code as much as possible. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)