[jira] [Updated] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-03-14 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4108:
-
Attachment: YARN-4108.11.patch

Removed unrelated changes. (ver.11)

> CapacityScheduler: Improve preemption to preempt only those containers that 
> would satisfy the incoming request
> --
>
> Key: YARN-4108
> URL: https://issues.apache.org/jira/browse/YARN-4108
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4108-design-doc-V3.pdf, 
> YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.1.patch, 
> YARN-4108.10.patch, YARN-4108.11.patch, YARN-4108.2.patch, YARN-4108.3.patch, 
> YARN-4108.4.patch, YARN-4108.5.patch, YARN-4108.6.patch, YARN-4108.7.patch, 
> YARN-4108.8.patch, YARN-4108.9.patch, YARN-4108.poc.1.patch, 
> YARN-4108.poc.2-WIP.patch, YARN-4108.poc.3-WIP.patch, 
> YARN-4108.poc.4-WIP.patch
>
>
> This is a sibling JIRA of YARN-2154. We should make sure container preemption 
> is more effective.
> *Requirements:*
> 1) Can handle the case of user-limit preemption
> 2) Can handle resource placement requirements, such as hard-locality 
> (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I 
> don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross-user preemption (YARN-2113), 
> cross-application preemption (such as priority-based (YARN-1963) / 
> fairness-based (YARN-3319)).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-03-14 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4108:
-
Attachment: YARN-4108.10.patch

> CapacityScheduler: Improve preemption to preempt only those containers that 
> would satisfy the incoming request
> --
>
> Key: YARN-4108
> URL: https://issues.apache.org/jira/browse/YARN-4108
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4108-design-doc-V3.pdf, 
> YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.1.patch, 
> YARN-4108.10.patch, YARN-4108.2.patch, YARN-4108.3.patch, YARN-4108.4.patch, 
> YARN-4108.5.patch, YARN-4108.6.patch, YARN-4108.7.patch, YARN-4108.8.patch, 
> YARN-4108.9.patch, YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch, 
> YARN-4108.poc.3-WIP.patch, YARN-4108.poc.4-WIP.patch
>
>
> This is a sibling JIRA of YARN-2154. We should make sure container preemption 
> is more effective.
> *Requirements:*
> 1) Can handle the case of user-limit preemption
> 2) Can handle resource placement requirements, such as hard-locality 
> (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I 
> don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross-user preemption (YARN-2113), 
> cross-application preemption (such as priority-based (YARN-1963) / 
> fairness-based (YARN-3319)).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-03-14 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4108:
-
Attachment: (was: YARN-4108.10.patch)

> CapacityScheduler: Improve preemption to preempt only those containers that 
> would satisfy the incoming request
> --
>
> Key: YARN-4108
> URL: https://issues.apache.org/jira/browse/YARN-4108
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4108-design-doc-V3.pdf, 
> YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.1.patch, 
> YARN-4108.2.patch, YARN-4108.3.patch, YARN-4108.4.patch, YARN-4108.5.patch, 
> YARN-4108.6.patch, YARN-4108.7.patch, YARN-4108.8.patch, YARN-4108.9.patch, 
> YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch, YARN-4108.poc.3-WIP.patch, 
> YARN-4108.poc.4-WIP.patch
>
>
> This is a sibling JIRA of YARN-2154. We should make sure container preemption 
> is more effective.
> *Requirements:*
> 1) Can handle the case of user-limit preemption
> 2) Can handle resource placement requirements, such as hard-locality 
> (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I 
> don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross-user preemption (YARN-2113), 
> cross-application preemption (such as priority-based (YARN-1963) / 
> fairness-based (YARN-3319)).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-03-14 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4108:
-
Attachment: YARN-4108.10.patch

Thanks [~jianhe] for such detailed reviews.

Attached ver.10 patch, which addresses all your comments except:

bq. maybe check the queue resource here directly instead of this 
isAllowPreemption flag?

I deliberately avoided this because we want to draw a border between LeafQueue 
and ContainerAllocator. With this approach we only need to check queue capacity 
once per LeafQueue allocation; otherwise we would have to repeat the check in 
each application.
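To illustrate the border being described, here is a minimal sketch under stated assumptions (the class and method names below are hypothetical stand-ins for LeafQueue, ResourceLimits, and the per-application allocator): the queue performs the capacity check once per allocation pass, and applications only consume the resulting flag.

{code}
// Sketch: one capacity check at the queue level, propagated downward,
// instead of re-deriving it inside every application's allocation code.
final class ResourceLimitsSketch {
  private final boolean allowPreemption;

  ResourceLimitsSketch(boolean allowPreemption) {
    this.allowPreemption = allowPreemption;
  }

  boolean isAllowPreemption() {
    return allowPreemption;
  }
}

final class LeafQueueSketch {
  interface App {
    void assignContainers(ResourceLimitsSketch limits);
  }

  void allocate(long queueUsed, long queueGuaranteed, Iterable<App> apps) {
    // Checked once per LeafQueue allocation pass...
    boolean allowPreemption = queueUsed > queueGuaranteed;
    ResourceLimitsSketch limits = new ResourceLimitsSketch(allowPreemption);
    for (App app : apps) {
      // ...so applications never re-check queue capacity themselves.
      app.assignContainers(limits);
    }
  }
}
{code}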

> CapacityScheduler: Improve preemption to preempt only those containers that 
> would satisfy the incoming request
> --
>
> Key: YARN-4108
> URL: https://issues.apache.org/jira/browse/YARN-4108
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4108-design-doc-V3.pdf, 
> YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.1.patch, 
> YARN-4108.10.patch, YARN-4108.2.patch, YARN-4108.3.patch, YARN-4108.4.patch, 
> YARN-4108.5.patch, YARN-4108.6.patch, YARN-4108.7.patch, YARN-4108.8.patch, 
> YARN-4108.9.patch, YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch, 
> YARN-4108.poc.3-WIP.patch, YARN-4108.poc.4-WIP.patch
>
>
> This is a sibling JIRA of YARN-2154. We should make sure container preemption 
> is more effective.
> *Requirements:*
> 1) Can handle the case of user-limit preemption
> 2) Can handle resource placement requirements, such as hard-locality 
> (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I 
> don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross-user preemption (YARN-2113), 
> cross-application preemption (such as priority-based (YARN-1963) / 
> fairness-based (YARN-3319)).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4816) SystemClock API broken in 2.9.0

2016-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194726#comment-15194726
 ] 

Hudson commented on YARN-4816:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9462 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9462/])
YARN-4816. Fix incompatible change in SystemClock. (sseth: rev 
eba66a64d28b50a660d6f537c767677f5fa0f7ea)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/SystemClock.java
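
For readers following along, a minimal sketch of the shape such a compatibility fix can take, assuming the change re-adds a public constructor next to a singleton accessor (the authoritative change is the commit referenced above; the Clock interface is reproduced here so the sketch is self-contained):

{code}
// The real interface is org.apache.hadoop.yarn.util.Clock; reproduced here
// only to make the sketch compile standalone.
interface Clock {
  long getTime();
}

// Sketch only: restore source compatibility for callers that wrote
// `new SystemClock()` while steering new code toward a shared instance.
class SystemClock implements Clock {
  private static final SystemClock INSTANCE = new SystemClock();

  public static SystemClock getInstance() {
    return INSTANCE;
  }

  @Deprecated // kept so pre-2.9.0 callers keep compiling
  public SystemClock() {
  }

  @Override
  public long getTime() {
    return System.currentTimeMillis();
  }
}
{code}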


> SystemClock API broken in 2.9.0
> ---
>
> Key: YARN-4816
> URL: https://issues.apache.org/jira/browse/YARN-4816
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.9.0
>
> Attachments: YARN-4816.1.txt
>
>
> https://issues.apache.org/jira/browse/YARN-4526 removed the public 
> constructor on SystemClock - making it an incompatible change.
> cc [~kasha]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-14 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-4712:

Attachment: YARN-4712-YARN-2928.v1.006.patch

Hi [~sjlee0],
I have addressed your comments; please review.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, 
> YARN-4712-YARN-2928.v1.006.patch
>
>
> There are 2 issues with CPU usage collection:
> * Many times the CPU usage obtained from {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE (i.e. -1), yet ContainersMonitor 
> still performs the calculation {{cpuUsageTotalCoresPercentage = 
> cpuUsagePercentPerCore / resourceCalculatorPlugin.getNumProcessors()}}, so the 
> UNAVAILABLE check in {{NMTimelinePublisher.reportContainerResourceUsage}} is 
> never triggered. Proper checks need to be added (see the guard sketch after 
> this description).
> * {{EntityColumnPrefix.METRIC}} always uses LongConverter, but 
> ContainerMonitor publishes decimal values for CPU usage.
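
To make the first bullet concrete, a minimal guard sketch (illustrative only, not the attached patch; it assumes nothing beyond the -1 UNAVAILABLE sentinel described above):

{code}
final class CpuMetricGuard {
  // Sketch: propagate the UNAVAILABLE sentinel instead of dividing it,
  // so -1 never flows into the published metric.
  static float toTotalCoresPercentage(float cpuUsagePercentPerCore,
      int numProcessors) {
    if (cpuUsagePercentPerCore < 0) {
      // ResourceCalculatorProcessTree.UNAVAILABLE (-1): no sample yet
      return -1f; // caller must skip publishing in this case
    }
    return cpuUsagePercentPerCore / numProcessors;
  }
}
{code}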



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4815) ATS 1.5 timelineclient impl tries to create attempt directory for every event call

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194708#comment-15194708
 ] 

Hadoop QA commented on YARN-4815:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 28s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 8s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 0s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
7s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 29s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 50s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 50s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 3 new + 
215 unchanged - 0 fixed = 218 total (was 215) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 6s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_74. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 56s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 30s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 42s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
26s {color} | {color:green} Patch 

[jira] [Updated] (YARN-4711) NM is going down with NPEs due to single-thread processing of events by Timeline client

2016-03-14 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-4711:

Attachment: 4711Analysis.txt

Hi [~sjlee0], 
Sorry for the long delay! 
From the analysis I was able to identify that it happens on any exception from 
the web server while publishing the entity. Earlier I suspected it might be 
due to the time taken to publish, but another important cause could be that we 
retry publishing the entity irrespective of the exception type.
So basically the chain is 
{{ContainerManagerImpl.ContainerEventDispatcher.handle(ContainerEvent)}} -> 
{{nmMetricsPublisher.publishContainerEvent}} -> 
{{NMTimelinePublisher.ContainerEventHandler.handle(ContainerEvent)}}, and the 
synchronous container metric event dispatching in 
{{NMTimelinePublisher.dispatcher}} gets slowed down because 
{{TimelineClientImpl.putObjects}} retries on every exception.
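
A sketch of the exception triage this analysis points toward (the helper below is hypothetical; the real retry loop lives in {{TimelineClientImpl.putObjects}}): retry only failures that can plausibly succeed later, so one bad entity cannot pin the single dispatcher thread.

{code}
import java.net.ConnectException;
import java.net.SocketTimeoutException;

final class RetryTriage {
  // Sketch: transient connectivity failures are worth retrying with backoff;
  // anything else (client-side errors, serialization bugs) should be logged
  // and dropped instead of being retried forever on the dispatcher thread.
  static boolean isRetriable(Exception e) {
    return e instanceof ConnectException
        || e instanceof SocketTimeoutException;
  }
}
{code}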
  


> NM is going down with NPEs due to single-thread processing of events by 
> Timeline client
> 
>
> Key: YARN-4711
> URL: https://issues.apache.org/jira/browse/YARN-4711
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Critical
>  Labels: yarn-2928-1st-milestone
> Attachments: 4711Analysis.txt
>
>
> After YARN-3367, while testing the latest 2928 branch, I came across a few 
> NPEs that are shutting down the NM.
> {code}
> 2016-02-21 23:19:54,078 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ContainerEventHandler.handle(NMTimelinePublisher.java:306)
> at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ContainerEventHandler.handle(NMTimelinePublisher.java:296)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.putEntity(NMTimelinePublisher.java:213)
> at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.publishContainerFinishedEvent(NMTimelinePublisher.java:192)
> at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.access$400(NMTimelinePublisher.java:63)
> at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ApplicationEventHandler.handle(NMTimelinePublisher.java:289)
> at 
> org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ApplicationEventHandler.handle(NMTimelinePublisher.java:280)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> On analysis I found that there was a delay in processing events: after 
> YARN-3367, all events are processed by a single thread inside the 
> timeline client. 
> Additionally, I found one scenario where there is a possibility of an NPE:
> * TimelineEntity.toString() when {{real}} is not null



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4816) SystemClock API broken in 2.9.0

2016-03-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth reassigned YARN-4816:


Assignee: Siddharth Seth

> SystemClock API broken in 2.9.0
> ---
>
> Key: YARN-4816
> URL: https://issues.apache.org/jira/browse/YARN-4816
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: YARN-4816.1.txt
>
>
> https://issues.apache.org/jira/browse/YARN-4526 removed the public 
> constructor on SystemClock - making it an incompatible change.
> cc [~kasha]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4816) SystemClock API broken in 2.9.0

2016-03-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194673#comment-15194673
 ] 

Siddharth Seth commented on YARN-4816:
--

Thanks for the review [~kasha] - committing to master and branch-2.

> SystemClock API broken in 2.9.0
> ---
>
> Key: YARN-4816
> URL: https://issues.apache.org/jira/browse/YARN-4816
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Siddharth Seth
> Attachments: YARN-4816.1.txt
>
>
> https://issues.apache.org/jira/browse/YARN-4526 removed the public 
> constructor on SystemClock - making it an incompatible change.
> cc [~kasha]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4766) NM should not aggregate logs older than the retention policy

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194670#comment-15194670
 ] 

Hadoop QA commented on YARN-4766:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 34s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 59s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 56s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
5s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 32s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 3m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 54s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 54s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 54s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 37s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 2 new + 
121 unchanged - 2 fixed = 123 total (was 123) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 54s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 11m 34s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 33s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 36s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch 

[jira] [Commented] (YARN-4812) TestFairScheduler#testContinuousScheduling fails intermittently

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194658#comment-15194658
 ] 

Hadoop QA commented on YARN-4812:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 11s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 17s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 160m 4s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793398/yarn-4812-1.patch |
| JIRA Issue | YARN-4812 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| 

[jira] [Updated] (YARN-4818) AggregatedLogFormat.LogValue.write() incorrectly truncates files

2016-03-14 Thread Brook Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brook Zhou updated YARN-4818:
-
Summary: AggregatedLogFormat.LogValue.write() incorrectly truncates files  
(was: AggregatedLogFormat.LogValue writes only in blocks of buffer size)

> AggregatedLogFormat.LogValue.write() incorrectly truncates files
> 
>
> Key: YARN-4818
> URL: https://issues.apache.org/jira/browse/YARN-4818
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Brook Zhou
>Assignee: Brook Zhou
> Fix For: 2.8.0
>
>
> AggregatedLogFormat.LogValue.write() currently has a bug where it only writes 
> in blocks of the buffer size (65535). This is because 
> FileInputStream.read(byte[] buf) returns -1 if there are less than buf.length 
> bytes remaining. In cases where the file size is not an exact multiple of 
> 65535 bytes, the remaining bytes are truncated.
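
Whatever the exact trigger, a copy loop of the following shape cannot truncate: per the {{java.io.InputStream}} contract, {{read(byte[])}} returns the number of bytes actually read (possibly fewer than the buffer size) and -1 only at end of stream, so each chunk must be written with its own length. This is a generic sketch, not the actual Hadoop patch:

{code}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class SafeCopy {
  static void copy(InputStream in, OutputStream out) throws IOException {
    byte[] buf = new byte[65535];
    int len;
    while ((len = in.read(buf)) != -1) {
      // Write exactly what was read, including a short tail chunk.
      out.write(buf, 0, len);
    }
  }
}
{code}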



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4818) AggregatedLogFormat.LogValue writes only in blocks of buffer size

2016-03-14 Thread Brook Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brook Zhou updated YARN-4818:
-
Description: AggregatedLogFormat.LogValue.write() currently has a bug where 
it only writes in blocks of the buffer size (65535). This is because 
FileInputStream.read(byte[] buf) returns -1 if there are less than buf.length 
bytes remaining. In cases where the file size is not an exact multiple of 65535 
bytes, the remaining bytes are truncated.  (was: 
AggregatedLogFormat.LogValue.write() currently has a bug where it only writes 
in blocks of the buffer size (65535). This is because 
FileInputStream.read(byte[] buf) returns -1 if there are less than 65535 bytes 
remaining. In cases where the file is less than 65535 bytes, 0 bytes are 
written.)

> AggregatedLogFormat.LogValue writes only in blocks of buffer size
> -
>
> Key: YARN-4818
> URL: https://issues.apache.org/jira/browse/YARN-4818
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Brook Zhou
>Assignee: Brook Zhou
> Fix For: 2.8.0
>
>
> AggregatedLogFormat.LogValue.write() currently has a bug where it only writes 
> in blocks of the buffer size (65535). This is because 
> FileInputStream.read(byte[] buf) returns -1 if there are less than buf.length 
> bytes remaining. In cases where the file size is not an exact multiple of 
> 65535 bytes, the remaining bytes are truncated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4818) AggregatedLogFormat.LogValue writes only in blocks of buffer size

2016-03-14 Thread Brook Zhou (JIRA)
Brook Zhou created YARN-4818:


 Summary: AggregatedLogFormat.LogValue writes only in blocks of 
buffer size
 Key: YARN-4818
 URL: https://issues.apache.org/jira/browse/YARN-4818
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Brook Zhou
Assignee: Brook Zhou
 Fix For: 2.8.0


AggregatedLogFormat.LogValue.write() currently has a bug where it only writes 
in blocks of the buffer size (65535). This is because 
FileInputStream.read(byte[] buf) returns -1 if there are less than 65535 bytes 
remaining. In cases where the file is less than 65535 bytes, 0 bytes are 
written.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3150) [Documentation] Documenting the timeline service v2

2016-03-14 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194571#comment-15194571
 ] 

Li Lu commented on YARN-3150:
-

Thanks [~sjlee0]! Currently both [~xgong] and I are still waiting for some free 
cycles to work on the documentation for ATS v1.5. YARN-4694 will be the JIRA 
that tracks that work. 

> [Documentation] Documenting the timeline service v2
> ---
>
> Key: YARN-3150
> URL: https://issues.apache.org/jira/browse/YARN-3150
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Sangjin Lee
>  Labels: yarn-2928-1st-milestone
>
> Let's make sure we will have a document to describe what's new in TS v2, the 
> APIs, the client libs and so on. We should do better around documentation in 
> v2 than v1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4817) Change Log Level to DEBUG for putDomain call in ATS 1.5

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194568#comment-15194568
 ] 

Hadoop QA commented on YARN-4817:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 52s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 9s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 37s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793422/YARN-4817.1.patch |
| JIRA Issue | YARN-4817 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 73cd428530aa 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 

[jira] [Commented] (YARN-4815) ATS 1.5 timelineclient impl tries to create attempt directory for every event call

2016-03-14 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194569#comment-15194569
 ] 

Li Lu commented on YARN-4815:
-

OK, one question: why are we not using the Guava LRU cache here? Thanks! 

> ATS 1.5 timelineclient impl tries to create attempt directory for every 
> event call
> 
>
> Key: YARN-4815
> URL: https://issues.apache.org/jira/browse/YARN-4815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4815.1.patch
>
>
> The ATS 1.5 timelineclient impl tries to create the attempt directory on 
> every event call. Since one directory-creation call per attempt is enough, 
> this causes a perf issue.
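
A minimal sketch of the create-once behavior (all names below are hypothetical; Li Lu's comment above suggests a Guava LRU cache as an alternative that would also bound memory):

{code}
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;

final class AttemptDirCache {
  interface DirCreator {
    void create(String attemptId) throws IOException;
  }

  private final ConcurrentHashMap<String, Boolean> created =
      new ConcurrentHashMap<>();

  // Each event call becomes a cheap map lookup; the filesystem round trip
  // happens at most once per attempt. A duplicate create in a rare race is
  // harmless because directory creation is idempotent.
  void ensureAttemptDir(String attemptId, DirCreator creator)
      throws IOException {
    if (!created.containsKey(attemptId)) {
      creator.create(attemptId);
      created.putIfAbsent(attemptId, Boolean.TRUE);
    }
  }
}
{code}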



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4817) Change Log Level to DEBUG for putDomain call in ATS 1.5

2016-03-14 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194564#comment-15194564
 ] 

Li Lu commented on YARN-4817:
-

LGTM. +1. 

> Change Log Level to DEBUG for putDomain call in ATS 1.5
> ---
>
> Key: YARN-4817
> URL: https://issues.apache.org/jira/browse/YARN-4817
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Trivial
> Attachments: YARN-4817.1.patch
>
>
> We have already changed the log level to DEBUG for the putEntity call. Let us 
> make it consistent for the putDomain call.
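
For reference, the guarded-DEBUG convention being asked for, as a minimal sketch with assumed names (matching what was already done for the putEntity call):

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

final class DomainLogging {
  private static final Log LOG = LogFactory.getLog(DomainLogging.class);

  static void logPutDomain(String domainId) {
    // DEBUG instead of INFO, and guarded so the message string is not even
    // built unless debug logging is enabled.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Publishing domain " + domainId);
    }
  }
}
{code}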



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4814) ATS 1.5 timelineclient impl calls flush after every event write

2016-03-14 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194563#comment-15194563
 ] 

Li Lu commented on YARN-4814:
-

Oops, I didn't look at the patch when posting the last comment. What a big 
patch! LGTM. +1. Will commit in half a day if there are no objections. 

> ATS 1.5 timelineclient impl calls flush after every event write
> --
>
> Key: YARN-4814
> URL: https://issues.apache.org/jira/browse/YARN-4814
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4814.1.patch
>
>
> The ATS 1.5 timelineclient impl calls flush after every event write.
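
A sketch of the alternative the fix implies (all types below are stand-ins, not the timeline client's real classes): write every queued event, but flush once per drained batch instead of once per event.

{code}
import java.io.Flushable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

final class BatchedFlush {
  interface EventWriter<E> {
    void write(E event) throws IOException;
  }

  // Blocks for at least one event, drains whatever else is pending, writes
  // the whole batch, and issues a single flush at the end.
  static <E> void drainAndWrite(BlockingQueue<E> queue, EventWriter<E> writer,
      Flushable out) throws IOException, InterruptedException {
    List<E> batch = new ArrayList<>();
    batch.add(queue.take());
    queue.drainTo(batch);
    for (E event : batch) {
      writer.write(event);
    }
    out.flush();
  }
}
{code}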



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4814) ATS 1.5 timelineclient impl calls flush after every event write

2016-03-14 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194562#comment-15194562
 ] 

Li Lu commented on YARN-4814:
-

Sure. Will take a look at it soon. 

> ATS 1.5 timelineclient impl calls flush after every event write
> --
>
> Key: YARN-4814
> URL: https://issues.apache.org/jira/browse/YARN-4814
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4814.1.patch
>
>
> The ATS 1.5 timelineclient impl calls flush after every event write.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4785) inconsistent value type of the "type" field for LeafQueueInfo in the response of RM REST API - cluster/scheduler

2016-03-14 Thread Jayesh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194535#comment-15194535
 ] 

Jayesh commented on YARN-4785:
--

prod env: linux 2.6.32-431.el6.x86_64 (jdk 1.7.0_79)
more info: I have the following libs in the classpath
{code}
jackson-annotations-2.2.3.jar  
jackson-core-asl-1.8.8.jar 
jackson-jaxrs-1.8.8.jar   
jackson-xc-1.8.8.jar
jackson-core-2.2.3.jar 
jackson-databind-2.2.3.jar  
jackson-mapper-asl-1.8.8.jar
jersey-client-1.8.jar  
jersey-core-1.8.jar  
jersey-json-1.8.jar  
jersey-server-1.8.jar 
jersey-servlet-1.14.jar
{code}

dev env (where I can reproduce this issue): mac os Yosemite (10.11.3 (15D21)) 
- jdk 1.7.0_79
more info: this is bare hadoop code (HDP though - HDP-2.2.9.0-tag) on which I 
am running the test cases to reproduce.

Thanks for looking into this. Did you add a test for the type check in 
verifySubQueue()?

> inconsistent value type of the "type" field for LeafQueueInfo in the response 
> of RM REST API - cluster/scheduler
> 
>
> Key: YARN-4785
> URL: https://issues.apache.org/jira/browse/YARN-4785
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.6.0
>Reporter: Jayesh
>  Labels: REST_API
>
> I see an inconsistent value type (String vs. Array) for the "type" field for 
> LeafQueueInfo in the response of RM REST API - cluster/scheduler.
> As per the spec, it should always be String.
> Here is the sample output (non-relevant fields removed):
> {code}
> {
>   "scheduler": {
> "schedulerInfo": {
>   "type": "capacityScheduler",
>   "capacity": 100,
>   ...
>   "queueName": "root",
>   "queues": {
> "queue": [
>   {
> "type": "capacitySchedulerLeafQueueInfo",
> "capacity": 0.1,
> 
>   },
>   {
> "type": [
>   "capacitySchedulerLeafQueueInfo"
> ],
> "capacity": 0.1,
> "queueName": "test-queue",
> "state": "RUNNING",
> 
>   },
>   {
> "type": [
>   "capacitySchedulerLeafQueueInfo"
> ],
> "capacity": 2.5,
> 
>   },
>   {
> "capacity": 25,
> 
> "state": "RUNNING",
> "queues": {
>   "queue": [
> {
>   "capacity": 6,
>   "state": "RUNNING",
>   "queues": {
> "queue": [
>   {
> "type": "capacitySchedulerLeafQueueInfo",
> "capacity": 100,
> ...
>   }
> ]
>   },
>   
> },
> {
>   "capacity": 6,
>   ...
>   "state": "RUNNING",
>   "queues": {
> "queue": [
>   {
> "type": "capacitySchedulerLeafQueueInfo",
> "capacity": 100,
> ...
>   }
> ]
>   },
>   ...
> },
> ...
>   ]
> },
> ...
>   }
> ]
>   }
> }
>   }
> }
> {code}
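
Until the server response is made consistent, a client-side workaround is possible. Here is a minimal sketch with Jackson 2.x (which appears in the classpath listing above) that accepts "type" whether it arrives as a string or as a one-element array:

{code}
import com.fasterxml.jackson.databind.JsonNode;

final class QueueTypeReader {
  // Returns the queue's "type" whether it was serialized as a plain string
  // or wrapped in a one-element array; null when the field is absent
  // (e.g. on parent queues).
  static String readQueueType(JsonNode queue) {
    JsonNode type = queue.get("type");
    if (type == null) {
      return null;
    }
    if (type.isArray()) {
      return type.size() > 0 ? type.get(0).asText() : null;
    }
    return type.asText();
  }
}
{code}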



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-03-14 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194534#comment-15194534
 ] 

Jian He commented on YARN-4108:
---

- Revert the RMAppAttemptMetrics changes.
- Remove the setIsAlive method.
- Remove RMContainer#setLeafQueue too, as RMContainer#getQueueName is not used 
anywhere.
- usedConsideredKillable -> usedExceptKillable
- Remove CapacityScheduler#liveContainers.
- PreemptableEntity -> PreemptableQueue
- markContainerForKillableInternal does not need to be a separate method; it 
can be merged into markContainerForKillable.
- Is parentMaxAvailableResource the no_label resource?
{code}
// Deduct killable from used
Resources.addTo(parentMaxAvailableResource,
    getTotalKillableResource(nodePartition));
{code}
- availableConsidersKillable -> availableAndKillable
- Add comments explaining why killContainersToEnforceMaxQueueCapacity is 
needed.
- Maybe check the queue resource here directly instead of this 
isAllowPreemption flag?
 {{if (availableContainers == 0 && currentResoureLimits.isAllowPreemption()) {}}
- Add a test case that a container will not be preempted if the user limit is 
hit?

> CapacityScheduler: Improve preemption to preempt only those containers that 
> would satisfy the incoming request
> --
>
> Key: YARN-4108
> URL: https://issues.apache.org/jira/browse/YARN-4108
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4108-design-doc-V3.pdf, 
> YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.1.patch, 
> YARN-4108.2.patch, YARN-4108.3.patch, YARN-4108.4.patch, YARN-4108.5.patch, 
> YARN-4108.6.patch, YARN-4108.7.patch, YARN-4108.8.patch, YARN-4108.9.patch, 
> YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch, YARN-4108.poc.3-WIP.patch, 
> YARN-4108.poc.4-WIP.patch
>
>
> This is a sibling JIRA of YARN-2154. We should make sure container preemption 
> is more effective.
> *Requirements:*
> 1) Can handle the case of user-limit preemption
> 2) Can handle resource placement requirements, such as hard-locality 
> (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I 
> don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross-user preemption (YARN-2113), 
> cross-application preemption (such as priority-based (YARN-1963) / 
> fairness-based (YARN-3319)).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4809) De-duplicate container completion across schedulers

2016-03-14 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G reassigned YARN-4809:
-

Assignee: Sunil G

> De-duplicate container completion across schedulers
> ---
>
> Key: YARN-4809
> URL: https://issues.apache.org/jira/browse/YARN-4809
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Karthik Kambatla
>Assignee: Sunil G
>
> CapacityScheduler and FairScheduler implement containerCompleted in exactly 
> the same way. This duplication across the schedulers can be avoided. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3150) [Documentation] Documenting the timeline service v2

2016-03-14 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194487#comment-15194487
 ] 

Sangjin Lee commented on YARN-3150:
---

This is to start the discussion. The main documentation for Timeline Service is 
here: 
http://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/TimelineServer.html

This contains a fairly significant amount of information. I think we have some 
options. We could either (1) add v.2-related information within the same doc, 
or (2) create a separate doc that's linked off of this doc. I think a separate 
doc might be easier for users to consume. Otherwise, v.2-related info would be 
sprinkled throughout the existing document. Thoughts?

Also, I'm noticing it has not been updated with v.1.5. Is that planned? If so, 
where will it be done? cc [~gtCarrera9]

> [Documentation] Documenting the timeline service v2
> ---
>
> Key: YARN-3150
> URL: https://issues.apache.org/jira/browse/YARN-3150
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Sangjin Lee
>  Labels: yarn-2928-1st-milestone
>
> Let's make sure we will have a document to describe what's new in TS v2, the 
> APIs, the client libs and so on. We should do better around documentation in 
> v2 than v1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4815) ATS 1.5 timelineclient impl tries to create attempt directory for every event call

2016-03-14 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-4815:

Attachment: YARN-4815.1.patch

> ATS 1.5 timelineclient impl tries to create attempt directory for every 
> event call
> 
>
> Key: YARN-4815
> URL: https://issues.apache.org/jira/browse/YARN-4815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4815.1.patch
>
>
> The ATS 1.5 timelineclient impl tries to create the attempt directory on 
> every event call. Since one directory-creation call per attempt is enough, 
> this causes a perf issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4814) ATS 1.5 timelineclient impl calls flush after every event write

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194480#comment-15194480
 ] 

Hadoop QA commented on YARN-4814:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 52s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 9s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 10s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793412/YARN-4814.1.patch |
| JIRA Issue | YARN-4814 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 3e2880c2af76 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 

[jira] [Commented] (YARN-4816) SystemClock API broken in 2.9.0

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194475#comment-15194475
 ] 

Hadoop QA commented on YARN-4816:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 55s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 1s 
{color} | {color:blue} The patch file was not named according to hadoop's 
naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute 
for instructions. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 4s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 19s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 12s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793414/YARN-4816.1.txt |
| JIRA Issue | YARN-4816 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  

[jira] [Updated] (YARN-4766) NM should not aggregate logs older than the retention policy

2016-03-14 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-4766:
-
Attachment: yarn4766.002.patch

Addressed checkstyle issues.

> NM should not aggregate logs older than the retention policy
> 
>
> Key: YARN-4766
> URL: https://issues.apache.org/jira/browse/YARN-4766
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation, nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: yarn4766.001.patch, yarn4766.002.patch
>
>
> When log aggregation fails on the NM, the information for the attempt is 
> kept in the recovery DB. Log aggregation can fail for multiple reasons, which 
> are often related to HDFS space or permissions.
> On restart the recovery DB is read and, if an application attempt needs its 
> logs aggregated, the files are scheduled for aggregation without any checks. 
> The log files could be older than the retention limit, in which case we should 
> not aggregate them but immediately mark them for deletion from the local file 
> system (see the sketch below).
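
A minimal sketch of that age check, assuming retention is the standard 
{{yarn.log-aggregation.retain-seconds}} value and that file age is judged by 
modification time (names are illustrative, not the actual patch):

{code}
import org.apache.hadoop.fs.FileStatus;

// Aggregate only if the log file is still within the retention window;
// older files should go straight to the deletion service instead.
boolean shouldAggregate(FileStatus logFile, long retainSecs, long nowMs) {
  long cutoffMs = nowMs - retainSecs * 1000L;
  return logFile.getModificationTime() >= cutoffMs;
}
{code}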



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4736) Issues with HBaseTimelineWriterImpl

2016-03-14 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194418#comment-15194418
 ] 

Naganarasimha G R commented on YARN-4736:
-

Also it seems like [~anoop.hbase] has found the cause of this issue in 
HBASE-15436, so I think no handling is needed from the YARN side, right?

> Issues with HBaseTimelineWriterImpl
> ---
>
> Key: YARN-4736
> URL: https://issues.apache.org/jira/browse/YARN-4736
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Naganarasimha G R
>Assignee: Vrushali C
>Priority: Critical
>  Labels: yarn-2928-1st-milestone
> Attachments: NM_Hang_hbase1.0.3.tar.gz, hbaseException.log, 
> threaddump.log
>
>
> Faced some issues while running ATSv2 in a single-node Hadoop cluster, on 
> which HBase had also been launched with an embedded ZooKeeper.
> # Due to some NPE issues the NM was trying to shut down, but the NM daemon 
> process could not complete the shutdown because of the locks.
> # Got some exceptions related to HBase after the application finished 
> execution successfully.
> Will attach logs and the trace for the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4736) Issues with HBaseTimelineWriterImpl

2016-03-14 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-4736:

Attachment: NM_Hang_hbase1.0.3.tar.gz

I was able to reproduce Issue 2 with HBase 1.0.3 too, though not as frequently 
as with HBase 1.0.2. One more thing I noted: I was able to reproduce it when 
{{hbase.zookeeper.property.dataDir}} was not configured, i.e. when ZooKeeper's 
data dir is in */tmp/hbase-*.
If anything more is required, I will be ready to share it.
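
For reference, a minimal sketch of pinning the embedded ZooKeeper data dir 
outside */tmp* (the path is illustrative; the property can equally be set in 
hbase-site.xml):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Configure the embedded ZooKeeper data directory explicitly so it does not
// land under /tmp/hbase-* and get cleaned up underneath the daemons.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.property.dataDir", "/var/lib/hbase/zookeeper");
{code}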

> Issues with HBaseTimelineWriterImpl
> ---
>
> Key: YARN-4736
> URL: https://issues.apache.org/jira/browse/YARN-4736
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Naganarasimha G R
>Assignee: Vrushali C
>Priority: Critical
>  Labels: yarn-2928-1st-milestone
> Attachments: NM_Hang_hbase1.0.3.tar.gz, hbaseException.log, 
> threaddump.log
>
>
> Faced some issues while running ATSv2 in a single-node Hadoop cluster, on 
> which HBase had also been launched with an embedded ZooKeeper.
> # Due to some NPE issues the NM was trying to shut down, but the NM daemon 
> process could not complete the shutdown because of the locks.
> # Got some exceptions related to HBase after the application finished 
> execution successfully.
> Will attach logs and the trace for the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4815) ATS 1.5 timelineclinet impl try to create attempt directory for every event call

2016-03-14 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong reassigned YARN-4815:
---

Assignee: Xuan Gong

> ATS 1.5 timelineclinet impl try to create attempt directory for every event 
> call
> 
>
> Key: YARN-4815
> URL: https://issues.apache.org/jira/browse/YARN-4815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>
> ATS 1.5 timelineclient impl tries to create the attempt directory for every 
> event call. Since one directory-creation call per attempt is enough, this is 
> causing a perf issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4245) Clean up container-executor binary invocation interface

2016-03-14 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4245:
--
Summary: Clean up container-executor binary invocation interface  (was: 
Clean up container-executor invocation interface)

> Clean up container-executor binary invocation interface
> ---
>
> Key: YARN-4245
> URL: https://issues.apache.org/jira/browse/YARN-4245
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Sidharta Seethana
>Assignee: Sidharta Seethana
>
> The current container-executor invocation interface (especially for launching 
> containers) is cumbersome to use. Launching a container now requires 13-15 
> arguments. This becomes especially problematic when additional, potentially 
> optional, arguments are required. We need a better mechanism to deal with 
> this. One such mechanism could be to use a file containing key/value pairs 
> (similar to container-executor.cfg) corresponding to the arguments each 
> invocation needs. Such a mechanism would make it easier to add new optional 
> arguments to container-executor and better manage existing ones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4245) Clean up container-executor invocation interface

2016-03-14 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194345#comment-15194345
 ] 

Vinod Kumar Vavilapalli commented on YARN-4245:
---

bq. We need a better mechanism to deal with this. One such mechanism could be 
to use a file containing key/value pairs (similar to container-executor.cfg) 
corresponding to the arguments each invocation needs.
A little late, but +10!
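
To illustrate the idea, such a file could look like the following (keys and 
values are purely hypothetical, not a committed design):

{noformat}
# Hypothetical per-invocation argument file, modeled on container-executor.cfg
run-as-user=alice
app-id=application_1458000000000_0001
container-id=container_1458000000000_0001_01_000002
work-dir=/data/yarn/nm-local-dir/usercache/alice/appcache/application_1458000000000_0001
pid-file=/data/yarn/nm-local-dir/nmPrivate/container_1458000000000_0001_01_000002.pid
{noformat}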

> Clean up container-executor invocation interface
> 
>
> Key: YARN-4245
> URL: https://issues.apache.org/jira/browse/YARN-4245
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Sidharta Seethana
>Assignee: Sidharta Seethana
>
> The current container-executor invocation interface (especially for launching 
> containers) is cumbersome to use. Launching a container now requires 13-15 
> arguments. This becomes especially problematic when additional, potentially 
> optional, arguments are required. We need a better mechanism to deal with 
> this. One such mechanism could be to use a file containing key/value pairs 
> (similar to container-executor.cfg) corresponding to the arguments each 
> invocation needs. Such a mechanism would make it easier to add new optional 
> arguments to container-executor and better manage existing ones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3854) Add localization support for docker images

2016-03-14 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194342#comment-15194342
 ] 

Vinod Kumar Vavilapalli commented on YARN-3854:
---

[~sidharta-s], this one's old, but is it a dup of YARN-3289? If the goal is the 
same, we should close the newer of the two and change the JIRA hierarchy etc. 
of the older one as needed.

> Add localization support for docker images
> --
>
> Key: YARN-3854
> URL: https://issues.apache.org/jira/browse/YARN-3854
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Sidharta Seethana
>
> We need the ability to localize images from HDFS and load them for use when 
> launching docker containers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4817) Change Log Level to DEBUG for putDomain call in ATS 1.5

2016-03-14 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194329#comment-15194329
 ] 

Xuan Gong commented on YARN-4817:
-

Trivial patch.

> Change Log Level to DEBUG for putDomain call in ATS 1.5
> ---
>
> Key: YARN-4817
> URL: https://issues.apache.org/jira/browse/YARN-4817
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Trivial
> Attachments: YARN-4817.1.patch
>
>
> We have already changed the log level to DEBUG for the putEntity call. Let us 
> make it consistent for the putDomain call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4817) Change Log Level to DEBUG for putDomain call in ATS 1.5

2016-03-14 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-4817:

Attachment: YARN-4817.1.patch

> Change Log Level to DEBUG for putDomain call in ATS 1.5
> ---
>
> Key: YARN-4817
> URL: https://issues.apache.org/jira/browse/YARN-4817
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Trivial
> Attachments: YARN-4817.1.patch
>
>
> We have already changed the log level to DEBUG for the putEntity call. Let us 
> make it consistent for the putDomain call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4817) Change Log Level to DEBUG for putDomain call in ATS 1.5

2016-03-14 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-4817:

Priority: Trivial  (was: Major)

> Change Log Level to DEBUG for putDomain call in ATS 1.5
> ---
>
> Key: YARN-4817
> URL: https://issues.apache.org/jira/browse/YARN-4817
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Trivial
> Attachments: YARN-4817.1.patch
>
>
> We have already changed the log level to DEBUG for the putEntity call. Let us 
> make it consistent for the putDomain call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4817) Change Log Level to DEBUG for putDomain call in ATS 1.5

2016-03-14 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-4817:
---

 Summary: Change Log Level to DEBUG for putDomain call in ATS 1.5
 Key: YARN-4817
 URL: https://issues.apache.org/jira/browse/YARN-4817
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong


We have already changed the log level to DEBUG for the putEntity call. Let us 
make it consistent for the putDomain call.
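
A minimal sketch of the change (the message text is illustrative; {{LOG}} and 
{{domain}} stand in for the surrounding client code):

{code}
// Guard the putDomain log line the same way the putEntity path already does,
// so it only costs anything when DEBUG logging is actually enabled.
if (LOG.isDebugEnabled()) {
  LOG.debug("Storing the domain " + domain.getId());
}
{code}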



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4805) Don't go through all schedulers in ParameterizedTestBase

2016-03-14 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194326#comment-15194326
 ] 

Karthik Kambatla commented on YARN-4805:


Thanks Robert. Will check this in tomorrow if I don't hear any objections. 

> Don't go through all schedulers in ParameterizedTestBase
> 
>
> Key: YARN-4805
> URL: https://issues.apache.org/jira/browse/YARN-4805
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-4805-1.patch
>
>
> ParameterizedSchedulerTestBase was created to make sure tests that were 
> written with CapacityScheduler in mind don't fail when run against 
> FairScheduler. Before this was introduced, tests would fail because 
> FairScheduler requires an allocation file.
> However, the tests that extend it take about 10 minutes per scheduler. So, 
> instead of running against both schedulers, we could set up the scheduler 
> appropriately so the tests pass against both schedulers (see the sketch below).
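
A minimal sketch of that setup, assuming the base class writes a trivial 
allocation file whenever FairScheduler is the parameter ({{testWorkDir}} and 
{{conf}} are stand-ins for the test's own fields):

{code}
import java.io.File;
import java.io.PrintWriter;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration;

// Write an empty allocation file so FairScheduler can start; tests written
// with CapacityScheduler in mind then run unchanged against it.
File allocFile = new File(testWorkDir, "test-fair-scheduler.xml");
try (PrintWriter out = new PrintWriter(allocFile, "UTF-8")) {
  out.println("<?xml version=\"1.0\"?>");
  out.println("<allocations/>");
}
conf.set(FairSchedulerConfiguration.ALLOCATION_FILE, allocFile.getAbsolutePath());
{code}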



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4816) SystemClock API broken in 2.9.0

2016-03-14 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194325#comment-15194325
 ] 

Karthik Kambatla commented on YARN-4816:


My bad. Thanks for catching and fixing this, Sid. 

+1. 

> SystemClock API broken in 2.9.0
> ---
>
> Key: YARN-4816
> URL: https://issues.apache.org/jira/browse/YARN-4816
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Siddharth Seth
> Attachments: YARN-4816.1.txt
>
>
> https://issues.apache.org/jira/browse/YARN-4526 removed the public 
> constructor on SystemClock - making it an incompatible change.
> cc [~kasha]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4719) Add a helper library to maintain node state and allows common queries

2016-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194316#comment-15194316
 ] 

Hudson commented on YARN-4719:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9460 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9460/])
YARN-4719. Add a helper library to maintain node state and allows common 
(kasha: rev 20d389ce61eaacb5ddfb329015f50e96ad894f8d)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/NodeFilter.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ClusterNodeTracker.java


> Add a helper library to maintain node state and allows common queries
> -
>
> Key: YARN-4719
> URL: https://issues.apache.org/jira/browse/YARN-4719
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 2.9.0
>
> Attachments: yarn-4719-1.patch, yarn-4719-2.patch, yarn-4719-3.patch, 
> yarn-4719-4.patch, yarn-4719-5.patch, yarn-4719-6.patch, yarn-4719-7.patch
>
>
> The scheduler could use a helper library to maintain node state and allow 
> matching/sorting queries. Several reasons for this:
> # Today, a lot of the node state management is done separately in each 
> scheduler. Having a single library will take us that much closer to reducing 
> duplication among schedulers.
> # Adding a filtering/matching API would simplify node labels and locality 
> significantly (see the sketch below).
> # An API that returns a sorted list for a custom comparator would help 
> YARN-1011, where we want to sort by allocation and utilization for 
> continuous/asynchronous and opportunistic scheduling respectively.
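
A minimal sketch of the filter-style query such a helper enables. NodeFilter 
and ClusterNodeTracker are the new files in the commit above, but the method 
shapes here are illustrative assumptions, not the committed API:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode;

// A predicate over scheduler nodes, shared by all schedulers.
interface NodeFilter<N extends SchedulerNode> {
  boolean accept(N node);
}

// Central node bookkeeping: schedulers query this instead of each keeping
// its own node map.
class ClusterNodeTracker<N extends SchedulerNode> {
  private final Map<NodeId, N> nodes = new ConcurrentHashMap<NodeId, N>();

  List<N> getNodes(NodeFilter<N> filter) {
    List<N> matching = new ArrayList<N>();
    for (N node : nodes.values()) {
      if (filter == null || filter.accept(node)) {
        matching.add(node);
      }
    }
    return matching;
  }
}
{code}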



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4766) NM should not aggregate logs older than the retention policy

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194303#comment-15194303
 ] 

Hadoop QA commented on YARN-4766:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 33s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 51s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 7s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 3s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 3s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 3s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 33s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 11 new + 
123 unchanged - 2 fixed = 134 total (was 125) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 53s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 5s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with 
JDK v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 10s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 36s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch 

[jira] [Updated] (YARN-4816) SystemClock API broken in 2.9.0

2016-03-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated YARN-4816:
-
Attachment: YARN-4816.1.txt

Trivial patch. Re-introduces the public constructor and marks it as deprecated.

[~kasha] - please review.
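
A minimal sketch of the compatibility shim, assuming the singleton shape 
YARN-4526 introduced (not necessarily the exact patch):

{code}
import org.apache.hadoop.yarn.util.Clock;

// Keep the getInstance() accessor from YARN-4526, but restore the public
// constructor as deprecated so existing callers keep compiling.
public class SystemClock implements Clock {
  private static final SystemClock INSTANCE = new SystemClock();

  public static SystemClock getInstance() {
    return INSTANCE;
  }

  @Deprecated
  public SystemClock() {
  }

  @Override
  public long getTime() {
    return System.currentTimeMillis();
  }
}
{code}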

> SystemClock API broken in 2.9.0
> ---
>
> Key: YARN-4816
> URL: https://issues.apache.org/jira/browse/YARN-4816
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Siddharth Seth
> Attachments: YARN-4816.1.txt
>
>
> https://issues.apache.org/jira/browse/YARN-4526 removed the public 
> constructor on SystemClock - making it an incompatible change.
> cc [~kasha]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4805) Don't go through all schedulers in ParameterizedTestBase

2016-03-14 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194274#comment-15194274
 ] 

Robert Kanter commented on YARN-4805:
-

+1

> Don't go through all schedulers in ParameterizedTestBase
> 
>
> Key: YARN-4805
> URL: https://issues.apache.org/jira/browse/YARN-4805
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-4805-1.patch
>
>
> ParameterizedSchedulerTestBase was created to make sure tests that were 
> written with CapacityScheduler in mind don't fail when run against 
> FairScheduler. Before this was introduced, tests would fail because 
> FairScheduler requires an allocation file.
> However, the tests that extend it take about 10 minutes per scheduler. So, 
> instead of running against both schedulers, we could set up the scheduler 
> appropriately so the tests pass against both schedulers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4816) SystemClock API broken in 2.9.0

2016-03-14 Thread Siddharth Seth (JIRA)
Siddharth Seth created YARN-4816:


 Summary: SystemClock API broken in 2.9.0
 Key: YARN-4816
 URL: https://issues.apache.org/jira/browse/YARN-4816
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.9.0
Reporter: Siddharth Seth


https://issues.apache.org/jira/browse/YARN-4526 removed the public constructor 
on SystemClock - making it an incompatible change.

cc [~kasha]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4814) ATS 1.5 timelineclient impl call flush after every event write

2016-03-14 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-4814:

Attachment: YARN-4814.1.patch

> ATS 1.5 timelineclient impl call flush after every event write
> --
>
> Key: YARN-4814
> URL: https://issues.apache.org/jira/browse/YARN-4814
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4814.1.patch
>
>
> ATS 1.5 timelineclient impl call flush after every event write.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4814) ATS 1.5 timelineclient impl call flush after every event write

2016-03-14 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194270#comment-15194270
 ] 

Xuan Gong commented on YARN-4814:
-

[~gtCarrera9] Could you review it, please?

> ATS 1.5 timelineclient impl call flush after every event write
> --
>
> Key: YARN-4814
> URL: https://issues.apache.org/jira/browse/YARN-4814
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4814.1.patch
>
>
> ATS 1.5 timelineclient impl call flush after every event write.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started

2016-03-14 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194235#comment-15194235
 ] 

Eric Badger commented on YARN-4686:
---

As per my comment above:

TestYarnCLI, TestAMRMClient, TestYarnClient, TestNMClient, and TestGetGroups 
are failing in multiple recent precommit builds (YARN-4117, YARN-4630, 
YARN-4676). TestMiniYarnClusterNodeUtilization is tracked by YARN-4566. 
TestContainerManagerSecurity is failing in other recent precommit builds 
(YARN-4117, YARN-4566).

Those are the only tests that have failed, and they are all unrelated to the 
patch. [~jlowe] [~kasha] [~eepayne], please review the patch and give me your 
thoughts. Thanks!

> MiniYARNCluster.start() returns before cluster is completely started
> 
>
> Key: YARN-4686
> URL: https://issues.apache.org/jira/browse/YARN-4686
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Rohith Sharma K S
>Assignee: Eric Badger
> Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch, 
> YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch, 
> YARN-4686.005.patch
>
>
> TestRMNMInfo fails intermittently. Below is the trace for the failure:
> {noformat}
> testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo)  Time elapsed: 0.28 
> sec  <<< FAILURE!
> java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but 
> was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4815) ATS 1.5 timelineclinet impl try to create attempt directory for every event call

2016-03-14 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-4815:
---

 Summary: ATS 1.5 timelineclinet impl try to create attempt 
directory for every event call
 Key: YARN-4815
 URL: https://issues.apache.org/jira/browse/YARN-4815
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong


ATS 1.5 timelineclient impl tries to create the attempt directory for every 
event call. Since one directory-creation call per attempt is enough, this is 
causing a perf issue.
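
A minimal sketch of the fix idea: remember which attempt directories already 
exist ({{fs}} and {{getAttemptDir}} are hypothetical stand-ins for the 
client's internals):

{code}
import java.io.IOException;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;

// Create the attempt directory on the first event only; subsequent events
// for the same attempt skip the filesystem round trip entirely.
private final Set<ApplicationAttemptId> createdDirs =
    Collections.newSetFromMap(new ConcurrentHashMap<ApplicationAttemptId, Boolean>());

void ensureAttemptDirExists(ApplicationAttemptId attemptId) throws IOException {
  if (createdDirs.add(attemptId)) {
    fs.mkdirs(getAttemptDir(attemptId));  // hypothetical helpers
  }
}
{code}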



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4812) TestFairScheduler#testContinuousScheduling fails intermittently

2016-03-14 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194170#comment-15194170
 ] 

Karthik Kambatla commented on YARN-4812:


I ran it a few thousand times and didn't see it fail. Before this patch, it 
would take hardly 10 runs to see it fail.

> TestFairScheduler#testContinuousScheduling fails intermittently
> ---
>
> Key: YARN-4812
> URL: https://issues.apache.org/jira/browse/YARN-4812
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-4812-1.patch
>
>
> This test has failed in the past, and there seem to be more issues. 
> {noformat}
> java.lang.AssertionError: expected:<2> but was:<1>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testContinuousScheduling(TestFairScheduler.java:3816)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4814) ATS 1.5 timelineclient impl call flush after every event write

2016-03-14 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194161#comment-15194161
 ] 

Xuan Gong commented on YARN-4814:
-

Looks like the flush happens in ObjectMapper#writeValue.
{code}
@Override
public void writeValue(JsonGenerator jgen, Object value)
    throws IOException, JsonGenerationException, JsonMappingException
{
    SerializationConfig config = copySerializationConfig();
    if (config.isEnabled(SerializationConfig.Feature.CLOSE_CLOSEABLE)
        && (value instanceof Closeable)) {
        _writeCloseableValue(jgen, value, config);
    } else {
        _serializerProvider.serializeValue(config, jgen, value, _serializerFactory);
        if (config.isEnabled(SerializationConfig.Feature.FLUSH_AFTER_WRITE_VALUE)) {
            jgen.flush();
        }
    }
}
{code}

For performance purposes we already have the flush timer, so we do not need to 
flush on every write.

> ATS 1.5 timelineclient impl call flush after every event write
> --
>
> Key: YARN-4814
> URL: https://issues.apache.org/jira/browse/YARN-4814
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>
> ATS 1.5 timelineclient impl call flush after every event write.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4814) ATS 1.5 timelineclient impl call flush after every event write

2016-03-14 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194162#comment-15194162
 ] 

Xuan Gong commented on YARN-4814:
-

The simple fix could be to configure Feature.FLUSH_AFTER_WRITE_VALUE as false 
when we create the ObjectMapper object.
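
A minimal sketch of that configuration (Jackson 1.x, matching the 
{{org.codehaus.jackson}} code quoted above):

{code}
import org.codehaus.jackson.map.ObjectMapper;
import org.codehaus.jackson.map.SerializationConfig;

// Disable the implicit flush after each writeValue(); the existing flush
// timer then decides when buffered events actually reach the file.
ObjectMapper mapper = new ObjectMapper();
mapper.configure(SerializationConfig.Feature.FLUSH_AFTER_WRITE_VALUE, false);
{code}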


> ATS 1.5 timelineclient impl call flush after every event write
> --
>
> Key: YARN-4814
> URL: https://issues.apache.org/jira/browse/YARN-4814
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>
> ATS 1.5 timelineclient impl call flush after every event write.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4814) ATS 1.5 timelineclient impl call flush after every event write

2016-03-14 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-4814:
---

 Summary: ATS 1.5 timelineclient impl call flush after every event 
write
 Key: YARN-4814
 URL: https://issues.apache.org/jira/browse/YARN-4814
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong


ATS 1.5 timelineclient impl call flush after every event write.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194145#comment-15194145
 ] 

Hadoop QA commented on YARN-4686:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 7s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 28s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 23s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 23s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 35s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 1 new + 
30 unchanged - 0 fixed = 31 total (was 30) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 57s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with 
JDK v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 4s {color} | 
{color:red} hadoop-yarn-server-tests in the patch failed with JDK v1.8.0_74. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 38s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_74. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 8s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with 
JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 4s {color} | 
{color:red} hadoop-yarn-server-tests in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | 

[jira] [Created] (YARN-4813) TestRMWebServicesDelegationTokenAuthentication.testDoAs fails intermittently

2016-03-14 Thread Daniel Templeton (JIRA)
Daniel Templeton created YARN-4813:
--

 Summary: TestRMWebServicesDelegationTokenAuthentication.testDoAs 
fails intermittently
 Key: YARN-4813
 URL: https://issues.apache.org/jira/browse/YARN-4813
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.9.0
Reporter: Daniel Templeton


{noformat}
---
 T E S T S
---
Running 
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication
Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 11.627 sec <<< 
FAILURE! - in 
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication
testDoAs[0](org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication)
  Time elapsed: 0.208 sec  <<< ERROR!
java.io.IOException: Server returned HTTP response code: 403 for URL: 
http://localhost:8088/ws/v1/cluster/delegation-token
at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1626)
at 
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication$3.call(TestRMWebServicesDelegationTokenAuthentication.java:407)
at 
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication$3.call(TestRMWebServicesDelegationTokenAuthentication.java:398)
at 
org.apache.hadoop.security.authentication.KerberosTestUtils$1.run(KerberosTestUtils.java:120)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.authentication.KerberosTestUtils.doAs(KerberosTestUtils.java:117)
at 
org.apache.hadoop.security.authentication.KerberosTestUtils.doAsClient(KerberosTestUtils.java:133)
at 
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.getDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:398)
at 
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.testDoAs(TestRMWebServicesDelegationTokenAuthentication.java:357)


Results :

Tests in error: 
  
TestRMWebServicesDelegationTokenAuthentication.testDoAs:357->getDelegationToken:398
 » IO

Tests run: 8, Failures: 0, Errors: 1, Skipped: 0
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4809) De-duplicate container completion across schedulers

2016-03-14 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194119#comment-15194119
 ] 

Karthik Kambatla commented on YARN-4809:


Please feel free to take this up. I might be able to review. 

> De-duplicate container completion across schedulers
> ---
>
> Key: YARN-4809
> URL: https://issues.apache.org/jira/browse/YARN-4809
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Karthik Kambatla
>
> CapacityScheduler and FairScheduler implement containerCompleted the exact 
> same way. Duplication across the schedulers can be avoided. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4812) TestFairScheduler#testContinuousScheduling fails intermittently

2016-03-14 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-4812:
---
Attachment: yarn-4812-1.patch

Moved testContinuousScheduling to a separate class that uses mock clocks 
instead of depending on system clocks. 
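
A minimal self-contained sketch of the mock-clock idea (the actual test uses 
the project's controllable clock utilities):

{code}
// The test advances time explicitly instead of sleeping on the wall clock,
// which removes the timing race that made the test flaky.
class MockClock {
  private long timeMs;

  long getTime() {
    return timeMs;
  }

  void tickMsec(long ms) {
    timeMs += ms;
  }
}
{code}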

> TestFairScheduler#testContinuousScheduling fails intermittently
> ---
>
> Key: YARN-4812
> URL: https://issues.apache.org/jira/browse/YARN-4812
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-4812-1.patch
>
>
> This test has failed in the past, and there seem to be more issues. 
> {noformat}
> java.lang.AssertionError: expected:<2> but was:<1>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testContinuousScheduling(TestFairScheduler.java:3816)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4757) [Umbrella] Simplified discovery of services via DNS mechanisms

2016-03-14 Thread Jonathan Maron (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194047#comment-15194047
 ] 

Jonathan Maron commented on YARN-4757:
--

I'm trying to address all of these issues/concerns in the document I reference 
above - it'll probably be a good way to structure the discussion.  I hope to 
have it posted to this JIRA this week.  Some quick points:

- I'm trying to address security by leveraging the existing DNS security 
extensions (DNSSEC). The exposed DNS facility will have to accommodate both 
Java and non-Java clients, and as such should probably not provide proprietary 
or non-compliant security mechanisms. In addition, the DNS facility will more 
than likely need to interoperate with existing DNS resources (e.g. a corporate 
BIND server). DNS security is structured more around validating the 
authenticity of returned information than around authenticating identities. 
That said, I believe the approach I'm proposing will address the 
authentication concerns.

- As Allen mentioned - there are existing approaches for interacting with DNS 
name servers.  I have been utilizing dnsjava to prototype some approaches.

> [Umbrella] Simplified discovery of services via DNS mechanisms
> --
>
> Key: YARN-4757
> URL: https://issues.apache.org/jira/browse/YARN-4757
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jonathan Maron
>
> [See overview doc at YARN-4692, copying the sub-section (3.2.10.2) to track 
> all related efforts.]
> In addition to completing the present story of service-registry (YARN-913), 
> we also need to simplify access to the registry entries. The existing read 
> mechanisms of the YARN Service Registry are currently limited to a 
> registry-specific (Java) API and a REST interface. In practice, this makes it 
> very difficult to wire up existing clients and services. E.g., dynamic 
> configuration of dependent end-points of a service is not easy to implement 
> using the present registry-read mechanisms *without* code changes to 
> existing services.
> A good solution to this is to expose the registry information through a more 
> generic and widely used discovery mechanism: DNS. Service Discovery via DNS 
> uses the well-known DNS interfaces to browse the network for services. 
> YARN-913 in fact talked about such a DNS-based mechanism but left it as a 
> future task. (Task) Having the registry information exposed via DNS 
> simplifies the life of services.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4766) NM should not aggregate logs older than the retention policy

2016-03-14 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-4766:
-
Attachment: yarn4766.001.patch

> NM should not aggregate logs older than the retention policy
> 
>
> Key: YARN-4766
> URL: https://issues.apache.org/jira/browse/YARN-4766
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation, nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: yarn4766.001.patch
>
>
> When log aggregation fails on the NM, the information for the attempt is 
> kept in the recovery DB. Log aggregation can fail for multiple reasons, which 
> are often related to HDFS space or permissions.
> On restart the recovery DB is read and, if an application attempt needs its 
> logs aggregated, the files are scheduled for aggregation without any checks. 
> The log files could be older than the retention limit, in which case we should 
> not aggregate them but immediately mark them for deletion from the local file 
> system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4766) NM should not aggregate logs older than the retention policy

2016-03-14 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-4766:
-
Attachment: (was: yarn4766.001.patch)

> NM should not aggregate logs older than the retention policy
> 
>
> Key: YARN-4766
> URL: https://issues.apache.org/jira/browse/YARN-4766
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation, nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> When log aggregation fails on the NM, the information for the attempt is 
> kept in the recovery DB. Log aggregation can fail for multiple reasons, which 
> are often related to HDFS space or permissions.
> On restart the recovery DB is read and, if an application attempt needs its 
> logs aggregated, the files are scheduled for aggregation without any checks. 
> The log files could be older than the retention limit, in which case we should 
> not aggregate them but immediately mark them for deletion from the local file 
> system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4766) NM should not aggregate logs older than the retention policy

2016-03-14 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-4766:
-
Attachment: (was: yarn4766.001.patch)

> NM should not aggregate logs older than the retention policy
> 
>
> Key: YARN-4766
> URL: https://issues.apache.org/jira/browse/YARN-4766
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation, nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> When log aggregation fails on the NM, the information for the attempt is 
> kept in the recovery DB. Log aggregation can fail for multiple reasons, which 
> are often related to HDFS space or permissions.
> On restart the recovery DB is read and, if an application attempt needs its 
> logs aggregated, the files are scheduled for aggregation without any checks. 
> The log files could be older than the retention limit, in which case we should 
> not aggregate them but immediately mark them for deletion from the local file 
> system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4766) NM should not aggregate logs older than the retention policy

2016-03-14 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-4766:
-
Attachment: (was: yarn4766.001.patch)

> NM should not aggregate logs older than the retention policy
> 
>
> Key: YARN-4766
> URL: https://issues.apache.org/jira/browse/YARN-4766
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation, nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> When log aggregation fails on the NM, the information for the attempt is 
> kept in the recovery DB. Log aggregation can fail for multiple reasons, which 
> are often related to HDFS space or permissions.
> On restart the recovery DB is read and, if an application attempt needs its 
> logs aggregated, the files are scheduled for aggregation without any checks. 
> The log files could be older than the retention limit, in which case we should 
> not aggregate them but immediately mark them for deletion from the local file 
> system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started

2016-03-14 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-4686:
--
Target Version/s: 3.0.0, 2.8.0, 2.7.3  (was: 2.7.3)

> MiniYARNCluster.start() returns before cluster is completely started
> 
>
> Key: YARN-4686
> URL: https://issues.apache.org/jira/browse/YARN-4686
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Rohith Sharma K S
>Assignee: Eric Badger
> Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch, 
> YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch, 
> YARN-4686.005.patch
>
>
> TestRMNMInfo fails intermittently. Below is the trace for the failure:
> {noformat}
> testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo)  Time elapsed: 0.28 
> sec  <<< FAILURE!
> java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but 
> was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4757) [Umbrella] Simplified discovery of services via DNS mechanisms

2016-03-14 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193930#comment-15193930
 ] 

Allen Wittenauer commented on YARN-4757:


bq. I am not an expert on DNS, so it is good to hear that you have thought 
through this and done your homework.

I'm (probably) not doing the work either, but I've been working with DNS for an 
extremely long time... (as in "before Java existed" long time)

bq.  It still does not change the need for two-way authentication and making 
sure that we can restrict who registers for a service

Yup. I share this concern.  This is a security hole waiting to happen.

bq. I can tell Java does not come with built-in support; not the end of the 
world, but also likely non-trivial. 

I'm assuming that by built-in you mean a specific method for querying SRV 
records, since the Java libs clearly allow one to query for records, even if 
it is through things like the "fun" JNDI.  But fret not, others are already 
working in this space with lots of example code to look at.  See 
https://github.com/spotify/dns-java, 
https://github.com/couchbase/couchbase-java-client, http://www.dnsjava.org/, 
and several others.  This isn't new ground being covered here at all, and all 
of the above-referenced code should be under a compatible license.
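
For reference, even bare JNDI can pull SRV records with no extra libraries; a 
minimal sketch (the record name queried is an illustrative example, not a 
YARN naming scheme):

{code}
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

public class SrvLookup {
  public static void main(String[] args) throws Exception {
    Hashtable<String, String> env = new Hashtable<>();
    env.put(Context.INITIAL_CONTEXT_FACTORY,
        "com.sun.jndi.dns.DnsContextFactory");
    DirContext ctx = new InitialDirContext(env);
    // Illustrative record name; "SRV" restricts the query to that type.
    Attributes attrs =
        ctx.getAttributes("_registry._tcp.example.com", new String[] {"SRV"});
    Attribute srv = attrs.get("SRV");
    if (srv != null) {
      NamingEnumeration<?> values = srv.getAll();
      while (values.hasMore()) {
        // Each value reads "priority weight port target" (RFC 2782).
        System.out.println(values.next());
      }
    }
  }
}
{code}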

> [Umbrella] Simplified discovery of services via DNS mechanisms
> --
>
> Key: YARN-4757
> URL: https://issues.apache.org/jira/browse/YARN-4757
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jonathan Maron
>
> [See overview doc at YARN-4692, copying the sub-section (3.2.10.2) to track 
> all related efforts.]
> In addition to completing the present story of service-registry (YARN-913), 
> we also need to simplify access to the registry entries. The existing read 
> mechanisms of the YARN Service Registry are currently limited to a 
> registry-specific (Java) API and a REST interface. In practice, this makes 
> it very difficult to wire up existing clients and services. For example, 
> dynamic configuration of dependent endpoints of a service is not easy to 
> implement using the present registry-read mechanisms, *without* code changes 
> to existing services.
> A good solution to this is to expose the registry information through a more 
> generic and widely used discovery mechanism: DNS. Service Discovery via DNS 
> uses the well-known DNS interfaces to browse the network for services. 
> YARN-913 in fact talked about such a DNS-based mechanism but left it as a 
> future task. Having the registry information exposed via DNS simplifies the 
> life of services.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4719) Add a helper library to maintain node state and allows common queries

2016-03-14 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193857#comment-15193857
 ] 

Wangda Tan commented on YARN-4719:
--

+1, thanks [~kasha].

> Add a helper library to maintain node state and allows common queries
> -
>
> Key: YARN-4719
> URL: https://issues.apache.org/jira/browse/YARN-4719
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-4719-1.patch, yarn-4719-2.patch, yarn-4719-3.patch, 
> yarn-4719-4.patch, yarn-4719-5.patch, yarn-4719-6.patch, yarn-4719-7.patch
>
>
> The scheduler could use a helper library to maintain node state and allow 
> matching/sorting queries. Several reasons for this:
> # Today, a lot of the node state management is done separately in each 
> scheduler. Having a single library will take us that much closer to reducing 
> duplication among schedulers.
> # Adding a filtering/matching API would simplify node labels and locality 
> significantly. 
> # An API that returns a sorted list for a custom comparator would help 
> YARN-1011, where we want to sort by allocation and utilization for 
> continuous/asynchronous and opportunistic scheduling, respectively. 
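
To make the shape of such a library concrete, a hedged sketch of the kind of 
interface the description suggests (all names are illustrative, not the API 
in the attached patches):

{code}
import java.util.Comparator;
import java.util.List;

// Illustrative only -- not the interface proposed in the patches.
public interface NodeTracker<N> {

  /** A matching predicate, e.g. "has label X" or "is on rack Y". */
  interface NodeFilter<T> {
    boolean accept(T node);
  }

  void addNode(N node);

  void removeNode(N node);

  /** All tracked nodes that satisfy the filter. */
  List<N> getNodes(NodeFilter<N> filter);

  /** Tracked nodes ordered by a caller-supplied comparator, e.g. by
      allocation or by utilization (YARN-1011). */
  List<N> sortedNodes(Comparator<N> comparator);
}
{code}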



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1547) Prevent DoS of ApplicationMasterProtocol by putting in limits

2016-03-14 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-1547:
---
Attachment: (was: YARN-1547.pdf)

> Prevent DoS of ApplicationMasterProtocol by putting in limits
> -
>
> Key: YARN-1547
> URL: https://issues.apache.org/jira/browse/YARN-1547
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Giovanni Matteo Fumarola
>
> Points of DoS in ApplicationMasterProtocol
>  - Host and trackingURL in RegisterApplicationMasterRequest
>  - Diagnostics, final trackingURL in FinishApplicationMasterRequest
>  - Unlimited number of resourceAsks, containersToBeReleased and 
> resourceBlacklistRequest in AllocateRequest
> -- Unbounded number of priorities and/or resourceRequests in each ask.
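
A hedged sketch of the kind of cap this implies; the constant, its value, and 
truncating rather than rejecting are illustrative choices, not the patch:

{code}
// Illustrative only: bound a client-supplied string before storing it.
private static final int MAX_DIAGNOSTICS_LENGTH = 64 * 1024;

static String truncateDiagnostics(String diagnostics) {
  if (diagnostics != null && diagnostics.length() > MAX_DIAGNOSTICS_LENGTH) {
    return diagnostics.substring(0, MAX_DIAGNOSTICS_LENGTH);
  }
  return diagnostics;
}
{code}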



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4751) In 2.7, Labeled queue usage not shown properly in capacity scheduler UI

2016-03-14 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193832#comment-15193832
 ] 

Eric Payne commented on YARN-4751:
--

Thanks, [~sunilg]. I will look into trunk's version of 
{{TestCapacitySchedulerNodeLabelUpdate}}.

Regarding the larger issue of how to fix this problem in 2.7, I am fine if you 
want to provide a backport of some of the patches you mentioned. However, my 
biggest concern is the time and effort that would take, along with the added 
risk. As I mentioned above, there seem to be a lot of inter-dependencies that 
involve adding more features and fixes than just the one documented by this 
JIRA. I'm not sure we want all of that complexity going back into 2.7.

> In 2.7, Labeled queue usage not shown properly in capacity scheduler UI
> ---
>
> Key: YARN-4751
> URL: https://issues.apache.org/jira/browse/YARN-4751
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, yarn
>Affects Versions: 2.7.3
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: 2.7 CS UI No BarGraph.jpg, 
> YARH-4752-branch-2.7.001.patch, YARH-4752-branch-2.7.002.patch
>
>
> In 2.6 and 2.7, the capacity scheduler UI does not have the queue graphs 
> separated by partition. When applications are running on a labeled queue, no 
> color is shown in the bar graph, and several of the "Used" metrics are zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4757) [Umbrella] Simplified discovery of services via DNS mechanisms

2016-03-14 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193769#comment-15193769
 ] 

Robert Joseph Evans commented on YARN-4757:
---

[~aw], I am not an expert on DNS, so it is good to hear that you have thought 
through this and done your homework.  I read up a little on SRV records and it 
looks like a good fit.  It still does not change the need for two-way 
authentication and making sure that we can restrict who registers for a 
service, but because SRV records are not a drop-in replacement for A/CNAME 
records it should not be as big an issue.

Clients are likely going to need to make changes to support SRV records, and 
from what I can tell Java does not come with built-in support; not the end of 
the world, but also likely non-trivial.  Especially when it looks like the 
industry has not decided on how it wants to support HTTP. (Although I could be 
wrong on all of that, because like I said I am not an expert here.)

I just want to be sure that you are thinking things through, and it looks like 
you are, so I am happy.

> [Umbrella] Simplified discovery of services via DNS mechanisms
> --
>
> Key: YARN-4757
> URL: https://issues.apache.org/jira/browse/YARN-4757
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jonathan Maron
>
> [See overview doc at YARN-4692, copying the sub-section (3.2.10.2) to track 
> all related efforts.]
> In addition to completing the present story of service-registry (YARN-913), 
> we also need to simplify access to the registry entries. The existing read 
> mechanisms of the YARN Service Registry are currently limited to a 
> registry-specific (Java) API and a REST interface. In practice, this makes 
> it very difficult to wire up existing clients and services. For example, 
> dynamic configuration of dependent endpoints of a service is not easy to 
> implement using the present registry-read mechanisms, *without* code changes 
> to existing services.
> A good solution to this is to expose the registry information through a more 
> generic and widely used discovery mechanism: DNS. Service Discovery via DNS 
> uses the well-known DNS interfaces to browse the network for services. 
> YARN-913 in fact talked about such a DNS-based mechanism but left it as a 
> future task. Having the registry information exposed via DNS simplifies the 
> life of services.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-14 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193764#comment-15193764
 ] 

Sangjin Lee commented on YARN-4712:
---

Also some quick comments on the latest patch:

(ContainersMonitorImpl.java)
- l.469-473: We need to note other usages of {{cpuUsageTotalCoresPercentage}}. 
It is used in tracking the container resource utilization, as well as passed 
to {{ContainerMetrics.forContainer()}}. If we're no longer going to use this 
for the {{NMTimelinePublisher}}, we might need to handle it differently?

(NMTimelinePublisher.java)
- l.117: we should change the argument name from 
{{cpuUsageTotalCoresPercentage}} to {{cpuUsagePercentPerCore}}


> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch
>
>
> There are 2 issues with CPU usage collection: 
> * I was able to observe that many times the CPU usage obtained from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE (i.e. -1), but ContainersMonitor 
> still does the calculation, i.e. {{cpuUsageTotalCoresPercentage = 
> cpuUsagePercentPerCore / resourceCalculatorPlugin.getNumProcessors()}}, 
> because of which the UNAVAILABLE check in 
> {{NMTimelinePublisher.reportContainerResourceUsage}} is never triggered. So 
> proper checks need to be added.
> * {{EntityColumnPrefix.METRIC}} always uses LongConverter, but 
> ContainerMonitor publishes decimal values for the CPU usage.
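
A fragment sketch of the first fix being discussed; variable names follow the 
description above, while recordMetrics() and publish() are stand-ins for the 
real ContainerMetrics and NMTimelinePublisher calls:

{code}
// Guard sketch only: skip the division and the publish when the reading is
// UNAVAILABLE, so -1 never gets scaled past the downstream check.
float cpuUsagePercentPerCore = pTree.getCpuUsagePercent();
if (cpuUsagePercentPerCore != ResourceCalculatorProcessTree.UNAVAILABLE) {
  float cpuUsageTotalCoresPercentage =
      cpuUsagePercentPerCore / resourceCalculatorPlugin.getNumProcessors();
  recordMetrics(cpuUsageTotalCoresPercentage);  // e.g. ContainerMetrics
  publish(cpuUsagePercentPerCore);              // e.g. NMTimelinePublisher
}
{code}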



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4812) TestFairScheduler#testContinuousScheduling fails intermittently

2016-03-14 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-4812:
---
Description: 
This test has failed in the past, and there seem to be more issues. 

{noformat}
java.lang.AssertionError: expected:<2> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testContinuousScheduling(TestFairScheduler.java:3816)
{noformat}

  was:This test has failed in the past, and there seem to be more issues. 


> TestFairScheduler#testContinuousScheduling fails intermittently
> ---
>
> Key: YARN-4812
> URL: https://issues.apache.org/jira/browse/YARN-4812
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>
> This test has failed in the past, and there seem to be more issues. 
> {noformat}
> java.lang.AssertionError: expected:<2> but was:<1>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testContinuousScheduling(TestFairScheduler.java:3816)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4794) Distributed shell app gets stuck on stopping containers after App completes

2016-03-14 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4794:
--
Target Version/s: 2.8.0, 2.7.3, 2.9.0

Tentatively targeting all unreleased versions.

> Distributed shell app gets stuck on stopping containers after App completes
> ---
>
> Key: YARN-4794
> URL: https://issues.apache.org/jira/browse/YARN-4794
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sumana Sathish
>Assignee: Jian He
>Priority: Critical
>
> Distributed shell app gets stuck on stopping containers after App completes 
> with the following exception
> {code:title = app log}
> 15/12/10 14:52:20 INFO distributedshell.ApplicationMaster: Application 
> completed. Stopping running containers
> 15/12/10 14:52:20 WARN ipc.Client: Exception encountered while connecting to 
> the server : java.nio.channels.ClosedByInterruptException
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4771) Some containers can be skipped during log aggregation after NM restart

2016-03-14 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4771:
--
Priority: Critical  (was: Major)

Sounds bad, especially given the high possibility of leaking non-aggregated / 
non-deleted container logs on the file system; bumping priority.

> Some containers can be skipped during log aggregation after NM restart
> --
>
> Key: YARN-4771
> URL: https://issues.apache.org/jira/browse/YARN-4771
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.2
>Reporter: Jason Lowe
>Priority: Critical
> Attachments: YARN-4771.001.patch, YARN-4771.002.patch
>
>
> A container can be skipped during log aggregation after a work-preserving 
> nodemanager restart if the following events occur:
> # Container completes more than 
> yarn.nodemanager.duration-to-track-stopped-containers milliseconds before the 
> restart
> # At least one other container completes after the above container and before 
> the restart
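
For reference, the tracking window named in step 1 is an ordinary NM property; 
an illustrative setting (the value is an example, not a recommendation):

{code}
<property>
  <name>yarn.nodemanager.duration-to-track-stopped-containers</name>
  <!-- Illustrative: containers finished more than 10 minutes before a
       restart fall outside the NM's tracked set. -->
  <value>600000</value>
</property>
{code}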



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-14 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193732#comment-15193732
 ] 

Sangjin Lee commented on YARN-4712:
---

I also think that {{cpuUsagePercentPerCore}} might be a better metric to record 
than {{cpuUsageTotalCoresPercentage}}.

One way to understand the difference is that with the former the unit is the 
core, and with the latter it is the machine. Other aspects are entirely 
similar. Thus, it follows that {{cpuUsagePercentPerCore}} is a finer-grained 
value than {{cpuUsageTotalCoresPercentage}}.

For example, to come up with a relative utilization of an app against the full 
cluster, you need the number of cores as the denominator with the former, and 
the number of machines with the latter. Granted, obtaining the number of cores 
can be more difficult than the number of machines.

Either model breaks down when those units are no longer interchangeable. For 
example, {{cpuUsageTotalCoresPercentage}} yields inaccurate values if the 
machines are not of equal size (e.g. machines with different numbers of 
cores). {{cpuUsagePercentPerCore}} can report inaccurate utilization of the 
cluster if clock speeds differ between machines.

\[1\] cpuUsagePercentPerCore
- pro: more accurate and finer-grained reporting of utilization
- con: requires the number of cores to come up with the cluster-wide 
utilization of anything
- con: still doesn’t account for different core performance

\[2\] cpuUsageTotalCoresPercentage
- pro: easier to come up with cluster-wide utilization
- con: coarser-grained metric that breaks down the moment machines are not 
equivalent

\[other points\]
\[1\] stick with pure utilization
One point to consider is whether we should take into account the available 
capacity as opposed to the full machine capacity. There are a couple of ways 
the available capacity can differ from the full capacity. One is via 
{{nodeCpuPercentageForYARN}} (coming from the cpu-limit config). Another is 
via the allocated-vcores mechanism. Either way, one may, for example, allocate 
only 6 cores out of an 8-core machine. If a container is using 6 cores, the 
question is whether that should be reported as 100% utilization or 75% 
utilization.

Although an argument can be made for either outcome, I think it might be 
simpler to stick with a pure utilization approach. It would be easier to match 
those numbers against CPU measurements coming from direct means. We should 
consider CPU reported by the NM as plain utilization numbers.

\[2\] stick with physical cores vs. vcores
Another potentially complicating factor is whether we should consider using 
vcores. Using vcores would bring this closer to YARN’s resource scheduling 
model. However, IMO it would make things unnecessarily more complicated.

Again, in the vein of treating the CPU as plain utilization that can be matched 
against the direct measurements, I think we should stick with physical cores.
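
To make the denominators concrete, a small worked example (numbers are 
illustrative):

{noformat}
A container fully using 3 cores of an 8-core machine, in a cluster of
8 such machines (64 cores total):
  cpuUsagePercentPerCore        = 300
  cpuUsageTotalCoresPercentage  = 300 / 8 = 37.5
  cluster share, per-core metric    : 300  / (64 * 100) ~= 4.7%
  cluster share, per-machine metric : 37.5 / (8 * 100)  ~= 4.7%
The two agree only while every machine really has 8 identical cores.
{noformat}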

Thoughts?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch
>
>
> There are 2 issues with CPU usage collection: 
> * I was able to observe that many times the CPU usage obtained from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE (i.e. -1), but ContainersMonitor 
> still does the calculation, i.e. {{cpuUsageTotalCoresPercentage = 
> cpuUsagePercentPerCore / resourceCalculatorPlugin.getNumProcessors()}}, 
> because of which the UNAVAILABLE check in 
> {{NMTimelinePublisher.reportContainerResourceUsage}} is never triggered. So 
> proper checks need to be added.
> * {{EntityColumnPrefix.METRIC}} always uses LongConverter, but 
> ContainerMonitor publishes decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4812) TestFairScheduler#testContinuousScheduling fails intermittently

2016-03-14 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-4812:
--

 Summary: TestFairScheduler#testContinuousScheduling fails 
intermittently
 Key: YARN-4812
 URL: https://issues.apache.org/jira/browse/YARN-4812
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla


This test has failed in the past, and there seem to be more issues. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4757) [Umbrella] Simplified discovery of services via DNS mechanisms

2016-03-14 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193627#comment-15193627
 ] 

Allen Wittenauer commented on YARN-4757:


bq.  As far as I know there is no standard for including port(s) in a DNS 
entry. 

The proposed solution had better use SRV records and not just some stupidly 
naive approach with A/CNAME records.  SRV is built for long-lived service 
discovery using DNS and covers such things as port numbers, weighting, etc.
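
For reference, a single SRV record carries all of those fields (an 
illustrative record, not a proposed YARN naming scheme):

{noformat}
_registry._tcp.example.com. 300 IN SRV 10 5 8088 rm1.example.com.
;                                      |  | |    `-- target host
;                                      |  | `------- port
;                                      |  `--------- weight
;                                      `------------ priority
{noformat}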

> [Umbrella] Simplified discovery of services via DNS mechanisms
> --
>
> Key: YARN-4757
> URL: https://issues.apache.org/jira/browse/YARN-4757
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jonathan Maron
>
> [See overview doc at YARN-4692, copying the sub-section (3.2.10.2) to track 
> all related efforts.]
> In addition to completing the present story of service-registry (YARN-913), 
> we also need to simplify access to the registry entries. The existing read 
> mechanisms of the YARN Service Registry are currently limited to a 
> registry-specific (Java) API and a REST interface. In practice, this makes 
> it very difficult to wire up existing clients and services. For example, 
> dynamic configuration of dependent endpoints of a service is not easy to 
> implement using the present registry-read mechanisms, *without* code changes 
> to existing services.
> A good solution to this is to expose the registry information through a more 
> generic and widely used discovery mechanism: DNS. Service Discovery via DNS 
> uses the well-known DNS interfaces to browse the network for services. 
> YARN-913 in fact talked about such a DNS-based mechanism but left it as a 
> future task. Having the registry information exposed via DNS simplifies the 
> life of services.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started

2016-03-14 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-4686:
--
Attachment: YARN-4686.005.patch

I moved the total capacity check out of the waitForNodeManagersToConnect method 
and into the TestYarnClient#testReservationAPIs test so as not to fail other 
tests. 
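
A rough sketch of the kind of wait involved (a fragment; 
{{getTotalPlanCapacity()}} is a stand-in for however the test reads the 
reservation plan's capacity):

{code}
// Poll until the plan has non-zero capacity, failing after a bounded wait
// so a broken cluster start cannot hang the test.
long deadline = System.currentTimeMillis() + 10000L;
while (getTotalPlanCapacity() <= 0) {
  if (System.currentTimeMillis() > deadline) {
    org.junit.Assert.fail("Reservation plan capacity never became positive");
  }
  Thread.sleep(100);
}
{code}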

> MiniYARNCluster.start() returns before cluster is completely started
> 
>
> Key: YARN-4686
> URL: https://issues.apache.org/jira/browse/YARN-4686
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Rohith Sharma K S
>Assignee: Eric Badger
> Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch, 
> YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch, 
> YARN-4686.005.patch
>
>
> TestRMNMInfo fails intermittently. Below is the trace for the failure:
> {noformat}
> testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo)  Time elapsed: 0.28 
> sec  <<< FAILURE!
> java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but 
> was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4757) [Umbrella] Simplified discovery of services via DNS mechanisms

2016-03-14 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193577#comment-15193577
 ] 

Robert Joseph Evans commented on YARN-4757:
---

I am +1 on the idea of using DNS for long-lived service discovery, but we need 
to be very, very careful about security.  If we are not, all of the problems 
possible with https://en.wikipedia.org/wiki/DNS_spoofing would likely be 
possible with this too.  We need to be positive that we can restrict the names 
allowed so there are no conflicts with other servers on the network/internet.  
Additionally, if we make this super simple, which is the entire goal here, then 
we are covering up some potentially really serious issues with client code 
that a normal server running off YARN would not expect to have.  It really 
comes down to this: any service running on YARN that wants to be secure needs 
two-way authentication, where the client authenticates the server and the 
server authenticates clients.  There are timing attacks and other things that 
can happen when a process crashes and lets go of a port.  Internal web 
services feel especially vulnerable because, unless you enable SSL, they will 
be insecure; many groups avoid SSL on internal services because of the extra 
overhead of doing encryption.

Do you plan on handling ephemeral ports in some way? As far as I know there is 
no standard for including port(s) in a DNS entry.  If we do come up with 
something that is non-standard, doesn't that still necessitate client-side 
changes, which this JIRA expressly set out to avoid?  If we don't handle 
ephemeral ports, are we going to add in Mesos-like scheduling of ports?

> [Umbrella] Simplified discovery of services via DNS mechanisms
> --
>
> Key: YARN-4757
> URL: https://issues.apache.org/jira/browse/YARN-4757
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jonathan Maron
>
> [See overview doc at YARN-4692, copying the sub-section (3.2.10.2) to track 
> all related efforts.]
> In addition to completing the present story of service-registry (YARN-913), 
> we also need to simplify access to the registry entries. The existing read 
> mechanisms of the YARN Service Registry are currently limited to a 
> registry-specific (Java) API and a REST interface. In practice, this makes 
> it very difficult to wire up existing clients and services. For example, 
> dynamic configuration of dependent endpoints of a service is not easy to 
> implement using the present registry-read mechanisms, *without* code changes 
> to existing services.
> A good solution to this is to expose the registry information through a more 
> generic and widely used discovery mechanism: DNS. Service Discovery via DNS 
> uses the well-known DNS interfaces to browse the network for services. 
> YARN-913 in fact talked about such a DNS-based mechanism but left it as a 
> future task. Having the registry information exposed via DNS simplifies the 
> life of services.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4811) Generate histograms for actual container resource usage

2016-03-14 Thread Varun Vasudev (JIRA)
Varun Vasudev created YARN-4811:
---

 Summary: Generate histograms for actual container resource usage
 Key: YARN-4811
 URL: https://issues.apache.org/jira/browse/YARN-4811
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Varun Vasudev
Assignee: Varun Vasudev


The ContainerMetrics class stores some details about actual container resource 
usage. It would be useful to generate histograms for the actual resource usage 
as well.
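
A minimal sketch of the idea (not the ContainerMetrics implementation): bucket 
sampled readings so percentiles can be derived later.

{code}
import java.util.concurrent.atomic.AtomicLongArray;

// Illustrative histogram for sampled per-container usage readings.
public class UsageHistogram {
  private final long bucketWidthMb;
  private final AtomicLongArray buckets;

  public UsageHistogram(long bucketWidthMb, int numBuckets) {
    this.bucketWidthMb = bucketWidthMb;
    this.buckets = new AtomicLongArray(numBuckets);
  }

  /** Record one sampled physical-memory reading, in MB. */
  public void observe(long usageMb) {
    int i = (int) Math.min(usageMb / bucketWidthMb, buckets.length() - 1);
    buckets.incrementAndGet(i);
  }

  /** Count of samples that fell into bucket i. */
  public long count(int i) {
    return buckets.get(i);
  }
}
{code}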



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193505#comment-15193505
 ] 

Hadoop QA commented on YARN-4686:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 43s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 3s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 40s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 2s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 2s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 30s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 1 new + 
30 unchanged - 0 fixed = 31 total (was 30) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 5s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with 
JDK v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 14s {color} 
| {color:red} hadoop-yarn-server-tests in the patch failed with JDK v1.8.0_74. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 17m 31s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_74. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 37s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with 
JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 21s {color} 
| {color:red} hadoop-yarn-server-tests in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | 

[jira] [Commented] (YARN-4785) inconsistent value type of the "type" field for LeafQueueInfo in response of RM REST API - cluster/scheduler

2016-03-14 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193476#comment-15193476
 ] 

Varun Vasudev commented on YARN-4785:
-

[~jhsenjaliya] - Sorry to annoy you about this - can you give me some details 
about your environment? I added a test for the type field, but 
TestRMWebServicesCapacitySched passed for me.

I tested on Mac OS X 10.11.2 with JDK 1.7.0_71 and on Ubuntu 14.04 with 
OpenJDK 1.7.0_79 and 1.8.0_72.
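
For context, a hedged sketch of the sort of assertion such a test can make 
(using Jackson; the JSON path follows the sample in the description below, 
not necessarily the actual TestRMWebServicesCapacitySched code):

{code}
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import static org.junit.Assert.assertTrue;

public class QueueTypeCheck {
  /** Fails if any top-level queue reports "type" as a non-string node. */
  static void assertTypeIsString(String schedulerJson) throws Exception {
    JsonNode root = new ObjectMapper().readTree(schedulerJson);
    for (JsonNode queue : root.path("scheduler").path("schedulerInfo")
                              .path("queues").path("queue")) {
      // "type" must be a plain string, never a one-element array
      assertTrue(queue.path("type").isMissingNode()
          || queue.path("type").isTextual());
    }
  }
}
{code}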

> inconsistent value type of the "type" field for LeafQueueInfo in response of 
> RM REST API - cluster/scheduler
> 
>
> Key: YARN-4785
> URL: https://issues.apache.org/jira/browse/YARN-4785
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.6.0
>Reporter: Jayesh
>  Labels: REST_API
>
> I see inconsistent value types (String and Array) for the "type" field of 
> LeafQueueInfo in the response of the RM REST API - cluster/scheduler. 
> As per the spec it should always be a String.
> Here is the sample output (removed non-relevant fields):
> {code}
> {
>   "scheduler": {
> "schedulerInfo": {
>   "type": "capacityScheduler",
>   "capacity": 100,
>   ...
>   "queueName": "root",
>   "queues": {
> "queue": [
>   {
> "type": "capacitySchedulerLeafQueueInfo",
> "capacity": 0.1,
> 
>   },
>   {
> "type": [
>   "capacitySchedulerLeafQueueInfo"
> ],
> "capacity": 0.1,
> "queueName": "test-queue",
> "state": "RUNNING",
> 
>   },
>   {
> "type": [
>   "capacitySchedulerLeafQueueInfo"
> ],
> "capacity": 2.5,
> 
>   },
>   {
> "capacity": 25,
> 
> "state": "RUNNING",
> "queues": {
>   "queue": [
> {
>   "capacity": 6,
>   "state": "RUNNING",
>   "queues": {
> "queue": [
>   {
> "type": "capacitySchedulerLeafQueueInfo",
> "capacity": 100,
> ...
>   }
> ]
>   },
>   
> },
> {
>   "capacity": 6,
>   ...
>   "state": "RUNNING",
>   "queues": {
> "queue": [
>   {
> "type": "capacitySchedulerLeafQueueInfo",
> "capacity": 100,
> ...
>   }
> ]
>   },
>   ...
> },
> ...
>   ]
> },
> ...
>   }
> ]
>   }
> }
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs

2016-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193474#comment-15193474
 ] 

Hudson commented on YARN-4545:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9458 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9458/])
YARN-4545. Allow YARN distributed shell to use ATS v1.5 APIs. Li Lu via 
(junping_du: rev f291d82cd49c04a81380bc45c97c279d791b571c)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/DistributedShellTimelinePlugin.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/package-info.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/test/java/org/apache/hadoop/yarn/server/timeline/PluginStoreTestUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java
* hadoop-project/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/timeline/TimelineVersion.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/timeline/TimelineVersionWatcher.java


> Allow YARN distributed shell to use ATS v1.5 APIs
> -
>
> Key: YARN-4545
> URL: https://issues.apache.org/jira/browse/YARN-4545
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-4545-YARN-4265.001.patch, 
> YARN-4545-trunk.001.patch, YARN-4545-trunk.002.patch, 
> YARN-4545-trunk.003.patch, YARN-4545-trunk.004.patch, 
> YARN-4545-trunk.005.patch, YARN-4545-trunk.006.patch, 
> YARN-4545-trunk.007.patch, YARN-4545-trunk.008.patch, 
> YARN-4545-trunk.009.patch
>
>
> We can use YARN distributed shell as a demo for the ATS v1.5 APIs. We need 
> to allow distributed shell to post data with the ATS v1.5 API if v1.5 is 
> enabled in the system. We also need to provide a sample plugin to read that 
> data out. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs

2016-03-14 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193458#comment-15193458
 ] 

Junping Du commented on YARN-4545:
--

LGTM too. +1. Committing it now.

> Allow YARN distributed shell to use ATS v1.5 APIs
> -
>
> Key: YARN-4545
> URL: https://issues.apache.org/jira/browse/YARN-4545
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-4545-YARN-4265.001.patch, 
> YARN-4545-trunk.001.patch, YARN-4545-trunk.002.patch, 
> YARN-4545-trunk.003.patch, YARN-4545-trunk.004.patch, 
> YARN-4545-trunk.005.patch, YARN-4545-trunk.006.patch, 
> YARN-4545-trunk.007.patch, YARN-4545-trunk.008.patch, 
> YARN-4545-trunk.009.patch
>
>
> We can use YARN distributed shell as a demo for the ATS v1.5 APIs. We need 
> to allow distributed shell to post data with the ATS v1.5 API if v1.5 is 
> enabled in the system. We also need to provide a sample plugin to read that 
> data out. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4517) [YARN-3368] Add nodes page

2016-03-14 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193451#comment-15193451
 ] 

Varun Saxena commented on YARN-4517:


Thanks [~gtCarrera9] for the review.

bq. One main question: maybe we want to unify all application/container views? 
I noticed that right now, the "application views" from the application list and 
from the NM are different. Ideally, we'd like to provide one unified place to 
show one application, no matter whether the user arrives from the app list, 
flow list, NM app list or anywhere else? A similar story also applies to the 
container view?
The application and container states in the NM will be distinct from the RM.
The applications and containers seen here are running containers (applications 
are seen a bit longer based on keep-alive time - depends on config). This 
information, including NM states, can be useful by itself.
However, I think we can fit some container-related information (which is 
fetched from the NM) on the main container page.  We can leave out some 
unnecessary info too. This is more a mimic of what was there in the old UI.
Will have to discuss page layouts and organization in detail.

bq. Meanwhile, maybe it's time to start detailed page style designs. With 
unified app/container views, we need to address questions like where to put 
node id/app ids in the page, and how to organize all available data on the page?
Sure. Suggestions are welcome. I agree we need to discuss this in detail and 
reach a consensus on what goes where. Even I have not given a great deal of 
thought to this. Maybe after we move this branch's code into trunk, because 
getting that in is important for YARN-2928.

bq. I noticed one workflow related problem: once a NM is in shutdown state, it 
is not possible to go into the node page. What is the assumed debug workflow on 
this? 
The link to the node page has been disabled because we cannot reach the NM in 
this state. What information are you expecting to show here? A way to display 
NM logs, for instance?

bq. On my local machine, links to application logs are broken with just a 500 
error. Maybe we can improve this in future.
Where? On the app page?

bq. Seems like there's no need to show "node labels" if node label is not 
enabled?
In the UI, we cannot know whether node labels are enabled (unless we iterate 
over the whole output and assume that if no node has a label, labels are not 
enabled). Even if labels are enabled but not attached to any node, the output 
will be the same. Maybe once we start Ember from within the RM (there is a 
JIRA for this), we can think about using these configurations. 
Thoughts?

bq. I'm not sure about the meaning of the row in node status showing "Node 
Health Report". 
In the NM, we can configure a disk health check (on by default) and health 
check scripts. The node health report will contain output from those. It will 
contain information about which disks are bad, for instance. In the normal 
case, it will be empty. 
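
An illustrative snippet of the properties involved (values are examples only):

{code}
<property>
  <name>yarn.nodemanager.health-checker.script.path</name>
  <value>/etc/hadoop/nm-health-check.sh</value>
</property>
<property>
  <!-- Minimum fraction of disks that must be healthy for the NM to keep
       launching containers. -->
  <name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
  <value>0.25</value>
</property>
{code}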

> [YARN-3368] Add nodes page
> --
>
> Key: YARN-4517
> URL: https://issues.apache.org/jira/browse/YARN-4517
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Wangda Tan
>Assignee: Varun Saxena
>  Labels: webui
> Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, 
> Screenshot_after_4709.png, Screenshot_after_4709_1.png, 
> YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch
>
>
> We need nodes page added to next generation web UI, similar to existing 
> RM/nodes page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started

2016-03-14 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-4686:
--
Attachment: YARN-4686.004.patch

The new patch changes the waitForNodeManagersToConnect method in 
MiniYARNCluster.java so that it waits for the total plan capacity to be greater 
than 0 (to ensure that reservations can be made). It fixes the 
TestYarnClient#testReservationAPIs test failure locally on my machine. However, 
I'm not sure whether this check should be in the MiniYARNCluster itself or in 
the test code that calls it, since only a small number of tests will actually 
be worried about the reservation system. 

> MiniYARNCluster.start() returns before cluster is completely started
> 
>
> Key: YARN-4686
> URL: https://issues.apache.org/jira/browse/YARN-4686
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Rohith Sharma K S
>Assignee: Eric Badger
> Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch, 
> YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch
>
>
> TestRMNMInfo fails intermittently. Below is the trace for the failure:
> {noformat}
> testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo)  Time elapsed: 0.28 
> sec  <<< FAILURE!
> java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but 
> was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4810) NM applicationpage cause internal error 500

2016-03-14 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-4810:
---
Description: 
Use url /node/application/

*Case 1*
{noformat}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.nodemanager.webapp.dao.AppInfo.<init>(AppInfo.java:45)
at 
org.apache.hadoop.yarn.server.nodemanager.webapp.ApplicationPage$ApplicationBlock.render(ApplicationPage.java:82)
at 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
at 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
at 
org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
at 
org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:848)
at 
org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
at 
org.apache.hadoop.yarn.server.nodemanager.webapp.NMController.application(NMController.java:58)
... 44 more

{noformat}
*Case 2*
{noformat}
at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.util.NoSuchElementException
at 
com.google.common.base.AbstractIterator.next(AbstractIterator.java:75)
at 
org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:131)
at 
org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:126)
at 
org.apache.hadoop.yarn.server.nodemanager.webapp.ApplicationPage$ApplicationBlock.render(ApplicationPage.java:79)
at 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
at 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
at 
org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
at 
org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:848)
at 
org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
at 
org.apache.hadoop.yarn.server.nodemanager.webapp.NMController.application(NMController.java:58)
... 44 more

{noformat}



  was:
Use url /node/application/

{noformat}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.nodemanager.webapp.dao.AppInfo.<init>(AppInfo.java:45)
at 
org.apache.hadoop.yarn.server.nodemanager.webapp.ApplicationPage$ApplicationBlock.render(ApplicationPage.java:82)
at 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
at 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
at 
org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
at 
org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:848)
at 
org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
at 
org.apache.hadoop.yarn.server.nodemanager.webapp.NMController.application(NMController.java:58)
... 44 more

{noformat}




> NM applicationpage cause internal error 500
> ---
>
> Key: YARN-4810
> URL: https://issues.apache.org/jira/browse/YARN-4810
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>
> Use url /node/application/
> *Case 1*
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.dao.AppInfo.<init>(AppInfo.java:45)
> at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.ApplicationPage$ApplicationBlock.render(ApplicationPage.java:82)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
> at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
> at 
> 

[jira] [Commented] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted

2016-03-14 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192988#comment-15192988
 ] 

Surendra Singh Lilhore commented on YARN-4783:
--

Thanks [~jlowe] for the comment.

bq. in the general case we can't leave it around forever because it will 
eventually expire on its own. Therefore we can't support arbitrary delays 
between the application completing and the log aggregation starting.

Agree with you.

> Log aggregation failure for application when Nodemanager is restarted 
> --
>
> Key: YARN-4783
> URL: https://issues.apache.org/jira/browse/YARN-4783
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>
> Scenario:
> =
> 1. Start the NM with user dsperf:hadoop
> 2. Configure the linux-execute user as dsperf
> 3. Submit an application as the yarn user 
> 4. Once a few containers are allocated to NM 1
> 5. Nodemanager 1 is stopped (wait for expiry)
> 6. Start the node manager after the application is completed
> 7. Check that log aggregation happens for the container logs in the NM local 
> directory
> Expected Output:
> ===
> Log aggregation should be successful
> Actual Output:
> ===
> Log aggregation is not successful



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4810) NM applicationpage cause internal error 500

2016-03-14 Thread Bibin A Chundatt (JIRA)
Bibin A Chundatt created YARN-4810:
--

 Summary: NM applicationpage cause internal error 500
 Key: YARN-4810
 URL: https://issues.apache.org/jira/browse/YARN-4810
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt


Use url /node/application/

{noformat}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.nodemanager.webapp.dao.AppInfo.<init>(AppInfo.java:45)
at 
org.apache.hadoop.yarn.server.nodemanager.webapp.ApplicationPage$ApplicationBlock.render(ApplicationPage.java:82)
at 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
at 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
at 
org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
at 
org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:848)
at 
org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
at 
org.apache.hadoop.yarn.server.nodemanager.webapp.NMController.application(NMController.java:58)
... 44 more

{noformat}
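
Taken together with the NoSuchElementException in the updated description 
above, a hedged sketch of a defensive render path (renderError() and the 
surrounding wiring are placeholders, not the actual webapp helpers):

{code}
// Sketch only: validate the id, then null-check the app before building
// AppInfo, instead of letting both failure modes surface as a 500.
ApplicationId appId;
try {
  appId = ConverterUtils.toApplicationId(appIdStr);  // throws on malformed ids
} catch (Exception e) {
  renderError("Invalid application id: " + appIdStr);
  return;
}
Application app = nmContext.getApplications().get(appId);
if (app == null) {
  renderError("Unknown application: " + appIdStr);
  return;
}
render(new AppInfo(app));  // safe: app is known to be non-null here
{code}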





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2670) Adding feedback capability to capacity scheduler from external systems

2016-03-14 Thread Ha Son Hai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192982#comment-15192982
 ] 

Ha Son Hai commented on YARN-2670:
--

Is there any news on this JIRA? I'm also very interested in this.

> Adding feedback capability to capacity scheduler from external systems
> --
>
> Key: YARN-2670
> URL: https://issues.apache.org/jira/browse/YARN-2670
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
>
> The sheer growth in data volume and Hadoop cluster size make it a significant 
> challenge to diagnose and locate problems in a production-level cluster 
> environment efficiently and within a short period of time. Often times, the 
> distributed monitoring systems are not capable of detecting a problem well in 
> advance when a large-scale Hadoop cluster starts to deteriorate in 
> performance or becomes unavailable. Thus, incoming workloads, scheduled 
> between the time when the cluster starts to deteriorate and the time when 
> the problem is identified, suffer from longer execution times. As a result, 
> both the reliability and the throughput of the cluster are reduced 
> significantly. We address this problem by proposing a system called Astro, 
> which consists of a 
> predictive model and an extension to the Capacity scheduler. The predictive 
> model in Astro takes into account a rich set of cluster behavioral 
> information that are collected by monitoring processes and model them using 
> machine learning algorithms to predict future behavior of the cluster. The 
> Astro predictive model detects anomalies in the cluster and also identifies a 
> ranked set of metrics that have contributed the most towards the problem. The 
> Astro scheduler uses the prediction outcome and the list of metrics to decide 
> whether it needs to move and reduce workloads from the problematic cluster 
> nodes or to prevent additional workload allocations to them, in order to 
> improve both throughput and reliability of the cluster.
> This JIRA is only for adding feedback capabilities to Capacity Scheduler 
> which can take feedback from external systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4809) De-duplicate container completion across schedulers

2016-03-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192791#comment-15192791
 ] 

Sunil G commented on YARN-4809:
---

Hi [~kasha],
I could help take this up. Please let me know if you have already planned to 
work on it.

> De-duplicate container completion across schedulers
> ---
>
> Key: YARN-4809
> URL: https://issues.apache.org/jira/browse/YARN-4809
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Karthik Kambatla
>
> CapacityScheduler and FairScheduler implement containerCompleted the exact 
> same way. Duplication across the schedulers can be avoided. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)