[jira] [Commented] (YARN-3409) Add constraint node labels

2015-05-11 Thread Dian Fu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537676#comment-14537676
 ] 

Dian Fu commented on YARN-3409:
---

Just to post the requirements discussed in YARN-3557 here: it should be 
possible to add constraint node labels from both the RM and the NM. Some 
labels, such as the TRUSTED/UNTRUSTED labels described in YARN-3557, need to 
be added from the RM, while labels such as GPU, FPGA, LINUX, and WINDOWS are 
more suitable to be added from the NM. A large cluster may have all of these 
kinds of labels coexisting.

> Add constraint node labels
> --
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, capacityscheduler, client
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>
> Specifying only one label for each node (in other words, partitioning a 
> cluster) is a way to determine how the resources of a particular set of 
> nodes can be shared by a group of entities (like teams, departments, etc.). 
> Partitions of a cluster have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACLs/priorities can be applied per partition (e.g., only the market team 
> has access to the partition, or has priority to use it).
> - Capacity percentages can be applied per partition (e.g., the market team 
> has a 40% minimum capacity and the dev team has a 60% minimum capacity of 
> the partition).
> Constraints are orthogonal to partitions; they describe attributes of a 
> node's hardware/software purely for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (windows, linux, etc.)
> With this, an application can ask for a resource that has (glibc.version >= 
> 2.20 && JDK.version >= 8u20 && x86_64).
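
For illustration, a request under this model might look roughly like the 
sketch below. Passing a constraint-valued expression through 
setNodeLabelExpression() is what this JIRA proposes, not an existing YARN 
behavior.

{code}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

// Hypothetical sketch only: the constraint expression syntax below is
// the proposal under discussion, not current YARN functionality.
ResourceRequest req = ResourceRequest.newInstance(
    Priority.newInstance(1), ResourceRequest.ANY,
    Resource.newInstance(1024, 1), 1);
req.setNodeLabelExpression(
    "glibc.version >= 2.20 && JDK.version >= 8u20 && cpu == x86_64");
{code}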





[jira] [Commented] (YARN-3557) Support Intel Trusted Execution Technology(TXT) in YARN scheduler

2015-05-11 Thread Dian Fu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537681#comment-14537681
 ] 

Dian Fu commented on YARN-3557:
---

Hi [~leftnoteasy],
I have posted the requirements about supporting the configuration of 
constraint node labels from both the RM and the NM on YARN-3409. What are 
your thoughts on supporting script-based node label configuration on the RM 
side?

> Support Intel Trusted Execution Technology(TXT) in YARN scheduler
> -
>
> Key: YARN-3557
> URL: https://issues.apache.org/jira/browse/YARN-3557
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Dian Fu
> Attachments: Support TXT in YARN high level design doc.pdf
>
>
> Intel TXT defines platform-level enhancements that provide the building 
> blocks for creating trusted platforms. A TXT aware YARN scheduler can 
> schedule security sensitive jobs on TXT enabled nodes only. YARN-2492 
> provides the capacity to restrict YARN applications to run only on cluster 
> nodes that have a specified node label. This is a good mechanism that be 
> utilized for TXT aware YARN scheduler.





[jira] [Commented] (YARN-3170) YARN architecture document needs updating

2015-05-11 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537726#comment-14537726
 ] 

Tsuyoshi Ozawa commented on YARN-3170:
--

[~brahmareddy] thank you for updating the patch.

{quote} 
We call MapReduce running on YARN "MapReduce 2.0 (MRv2).
{quote}

A closing double quotation mark is missing. Please add it before the period. 

> YARN architecture document needs updating
> -
>
> Key: YARN-3170
> URL: https://issues.apache.org/jira/browse/YARN-3170
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>Assignee: Brahma Reddy Battula
>  Labels: BB2015-05-TBR
> Attachments: YARN-3170-002.patch, YARN-3170-003.patch, YARN-3170.patch
>
>
> The marketing paragraph at the top, "NextGen MapReduce", etc are all 
> marketing rather than actual descriptions. It also needs some general 
> updates, esp given it reads as though 0.23 was just released yesterday.





[jira] [Created] (YARN-3615) Yarn and Mapred queue CLI command support for Fairscheduler

2015-05-11 Thread Bibin A Chundatt (JIRA)
Bibin A Chundatt created YARN-3615:
--

 Summary: Yarn and Mapred queue CLI command support for 
Fairscheduler
 Key: YARN-3615
 URL: https://issues.apache.org/jira/browse/YARN-3615
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler, scheduler
Reporter: Bibin A Chundatt
Assignee: Naganarasimha G R


Add support for these CLI commands when the Fair Scheduler is configured.
Listed below are a few commands which need updating:

./yarn queue -status 

*Current output*
{code}
Queue Name : root.sls_queue_2
State : RUNNING
Capacity : 100.0%
Current Capacity : 100.0%
Maximum Capacity : -100.0%
Default Node Label expression :
Accessible Node Labels :

{code}
./mapred queue -info  
./mapred queue  -list

All of the listed commands currently display output based on the Capacity 
Scheduler.





[jira] [Updated] (YARN-3170) YARN architecture document needs updating

2015-05-11 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-3170:
---
Attachment: YARN-3170-004.patch

> YARN architecture document needs updating
> -
>
> Key: YARN-3170
> URL: https://issues.apache.org/jira/browse/YARN-3170
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>Assignee: Brahma Reddy Battula
>  Labels: BB2015-05-TBR
> Attachments: YARN-3170-002.patch, YARN-3170-003.patch, 
> YARN-3170-004.patch, YARN-3170.patch
>
>
> The marketing paragraph at the top, "NextGen MapReduce", etc are all 
> marketing rather than actual descriptions. It also needs some general 
> updates, esp given it reads as though 0.23 was just released yesterday.





[jira] [Commented] (YARN-3170) YARN architecture document needs updating

2015-05-11 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537772#comment-14537772
 ] 

Brahma Reddy Battula commented on YARN-3170:


[~ozawa] updated the patch. Kindly review. Thanks!

> YARN architecture document needs updating
> -
>
> Key: YARN-3170
> URL: https://issues.apache.org/jira/browse/YARN-3170
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>Assignee: Brahma Reddy Battula
>  Labels: BB2015-05-TBR
> Attachments: YARN-3170-002.patch, YARN-3170-003.patch, 
> YARN-3170-004.patch, YARN-3170.patch
>
>
> The marketing paragraph at the top, "NextGen MapReduce", etc are all 
> marketing rather than actual descriptions. It also needs some general 
> updates, esp given it reads as though 0.23 was just released yesterday.





[jira] [Commented] (YARN-3170) YARN architecture document needs updating

2015-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537780#comment-14537780
 ] 

Hadoop QA commented on YARN-3170:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   2m 53s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | release audit |   0m 20s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   2m 57s | Site still builds. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| | |   6m 13s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731885/YARN-3170-004.patch |
| Optional Tests | site |
| git revision | trunk / 3fa2efc |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7860/console |


This message was automatically generated.

> YARN architecture document needs updating
> -
>
> Key: YARN-3170
> URL: https://issues.apache.org/jira/browse/YARN-3170
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>Assignee: Brahma Reddy Battula
>  Labels: BB2015-05-TBR
> Attachments: YARN-3170-002.patch, YARN-3170-003.patch, 
> YARN-3170-004.patch, YARN-3170.patch
>
>
> The marketing paragraph at the top, "NextGen MapReduce", etc are all 
> marketing rather than actual descriptions. It also needs some general 
> updates, esp given it reads as though 0.23 was just released yesterday.





[jira] [Updated] (YARN-3513) Remove unused variables in ContainersMonitorImpl and add debug log for overall resource usage by all containers

2015-05-11 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-3513:

Attachment: YARN-3513.20150511-1.patch

Ok [~devaraj.k], updated the patch as per your suggestion.

> Remove unused variables in ContainersMonitorImpl and add debug log for 
> overall resource usage by all containers 
> 
>
> Key: YARN-3513
> URL: https://issues.apache.org/jira/browse/YARN-3513
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Trivial
>  Labels: BB2015-05-TBR, newbie
> Attachments: YARN-3513.20150421-1.patch, YARN-3513.20150503-1.patch, 
> YARN-3513.20150506-1.patch, YARN-3513.20150507-1.patch, 
> YARN-3513.20150508-1.patch, YARN-3513.20150508-1.patch, 
> YARN-3513.20150511-1.patch
>
>
> Some local variables in MonitoringThread.run(), {{vmemStillInUsage}} and 
> {{pmemStillInUsage}}, are only updated and never read.
> Instead, we should add a debug log for the overall resource usage of all 
> containers.
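
As a rough sketch of the kind of debug log being suggested (the field and 
method names below are recalled from ContainersMonitorImpl and may not match 
the attached patch exactly):

{code}
// Sketch, not the actual patch: accumulate usage over the monitored
// containers and emit a single debug line per monitoring cycle.
long vmemUsedTotal = 0;
long pmemUsedTotal = 0;
for (ProcessTreeInfo ptInfo : trackingContainers.values()) {
  ResourceCalculatorProcessTree pTree = ptInfo.getProcessTree();
  vmemUsedTotal += pTree.getCumulativeVmem();
  pmemUsedTotal += pTree.getCumulativeRssmem();
}
if (LOG.isDebugEnabled()) {
  LOG.debug("Total resource usage of all containers: vmem="
      + vmemUsedTotal + " bytes, pmem=" + pmemUsedTotal + " bytes");
}
{code}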





[jira] [Resolved] (YARN-817) If input path does not exist application/job id is getting assigned.

2015-05-11 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith resolved YARN-817.
-
Resolution: Invalid

> If input path does not exist application/job id is getting assigned.
> 
>
> Key: YARN-817
> URL: https://issues.apache.org/jira/browse/YARN-817
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.2-alpha, 2.0.1-alpha
>Reporter: Nishan Shetty
>Priority: Minor
>
> 1. Run a job, giving as input some path which does not exist.
> 2. An application/job id is getting assigned:
> 2013-06-12 16:00:24,494 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new 
> applicationId: 12
> Suggestion:
> Before assigning the job/app id, an input path check can be made.





[jira] [Commented] (YARN-817) If input path does not exist application/job id is getting assigned.

2015-05-11 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537817#comment-14537817
 ] 

Rohith commented on YARN-817:
-

The input path is used by the application JVM. The application client should 
handle this before submitting the application to YARN.
Closing as Invalid; reopen if there are any concerns about this.
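
For example, the client could guard the submission itself; a minimal sketch 
using the standard FileSystem API (the conf, inputDir, and job variables are 
assumed to come from the client's context):

{code}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: verify the input path exists before submitting, so no
// application id is allocated for a submission that cannot run.
Path input = new Path(inputDir);
FileSystem fs = FileSystem.get(conf);
if (!fs.exists(input)) {
  throw new IllegalArgumentException("Input path does not exist: " + input);
}
job.submit(); // org.apache.hadoop.mapreduce.Job
{code}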

> If input path does not exist application/job id is getting assigned.
> 
>
> Key: YARN-817
> URL: https://issues.apache.org/jira/browse/YARN-817
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.2-alpha, 2.0.1-alpha
>Reporter: Nishan Shetty
>Priority: Minor
>
> 1. Run a job, giving as input some path which does not exist.
> 2. An application/job id is getting assigned:
> 2013-06-12 16:00:24,494 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new 
> applicationId: 12
> Suggestion:
> Before assigning the job/app id, an input path check can be made.





[jira] [Commented] (YARN-3513) Remove unused variables in ContainersMonitorImpl and add debug log for overall resource usage by all containers

2015-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537832#comment-14537832
 ] 

Hadoop QA commented on YARN-3513:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 33s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 29s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 32s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 36s | The applied patch generated  1 
new checkstyle issues (total was 27, now 27). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m  1s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   5m 57s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  41m 44s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731901/YARN-3513.20150511-1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3fa2efc |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7861/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7861/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7861/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7861/console |


This message was automatically generated.

> Remove unused variables in ContainersMonitorImpl and add debug log for 
> overall resource usage by all containers 
> 
>
> Key: YARN-3513
> URL: https://issues.apache.org/jira/browse/YARN-3513
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Trivial
>  Labels: BB2015-05-TBR, newbie
> Attachments: YARN-3513.20150421-1.patch, YARN-3513.20150503-1.patch, 
> YARN-3513.20150506-1.patch, YARN-3513.20150507-1.patch, 
> YARN-3513.20150508-1.patch, YARN-3513.20150508-1.patch, 
> YARN-3513.20150511-1.patch
>
>
> Some local variables in MonitoringThread.run(), {{vmemStillInUsage}} and 
> {{pmemStillInUsage}}, are only updated and never read.
> Instead, we should add a debug log for the overall resource usage of all 
> containers.





[jira] [Commented] (YARN-3614) FileSystemRMStateStore throw exception when failed to remove application, that cause resourcemanager to crash

2015-05-11 Thread nijel (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537837#comment-14537837
 ] 

nijel commented on YARN-3614:
-

Hi @lachisis,
bq. when standby resourcemanager try to transitiontoActive, it will cost more 
than ten minutes to load applications
Is this a secure cluster?

> FileSystemRMStateStore throw exception when failed to remove application, 
> that cause resourcemanager to crash
> -
>
> Key: YARN-3614
> URL: https://issues.apache.org/jira/browse/YARN-3614
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.5.0
>Reporter: lachisis
>Priority: Critical
>
> FileSystemRMStateStore is only an auxiliary plug-in of the RM state store. 
> When it fails to remove an application, I think a warning is enough, but 
> currently the resourcemanager crashes.
> Recently, I configured 
> "yarn.resourcemanager.state-store.max-completed-applications" to limit the 
> number of applications in the state store. When the number of applications 
> exceeds the limit, some old applications are removed. If a removal fails, 
> the resourcemanager crashes.
> The following is the log: 
> 2015-05-11 06:58:43,815 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Removing 
> info for app: application_1430994493305_0053
> 2015-05-11 06:58:43,815 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore:
>  Removing info for app: application_1430994493305_0053 at: 
> /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053
> 2015-05-11 06:58:43,816 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error 
> removing app: application_1430994493305_0053
> java.lang.Exception: Failed to delete 
> /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:572)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:471)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:185)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:171)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:806)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:879)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:874)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
> at java.lang.Thread.run(Thread.java:745)
> 2015-05-11 06:58:43,819 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a 
> org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type 
> STATE_STORE_OP_FAILED. Cause:
> java.lang.Exception: Failed to delete 
> /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:572)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:471)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:185)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:171)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
>

[jira] [Commented] (YARN-3409) Add constraint node labels

2015-05-11 Thread David Villegas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537858#comment-14537858
 ] 

David Villegas commented on YARN-3409:
--

Thanks for your comment, Wangda. 

I agree that loadAvg may not be useful in all cases. The main idea behind 
dynamic label values is that the system would be more extensible, and human 
errors would be reduced, if some of the labels could be automatically 
populated. An example that comes to mind, based on Dian's comment, is the 
NodeManager's operating system. Rather than having an administrator set it, 
it could be pre-set to the actual OS by the NM.
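
As a rough illustration, the NM could derive such a label from the JVM's own 
properties; reportNodeLabel() below is an assumed hook for this discussion, 
not an existing NodeManager API:

{code}
// Sketch: auto-populate an OS label instead of relying on an
// administrator. reportNodeLabel() is hypothetical.
String os = System.getProperty("os.name").toLowerCase();
String osLabel = os.contains("windows") ? "WINDOWS" : "LINUX";
reportNodeLabel("os", osLabel);
{code}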

> Add constraint node labels
> --
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, capacityscheduler, client
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>
> Specifying only one label for each node (in other words, partitioning a 
> cluster) is a way to determine how the resources of a particular set of 
> nodes can be shared by a group of entities (like teams, departments, etc.). 
> Partitions of a cluster have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACLs/priorities can be applied per partition (e.g., only the market team 
> has access to the partition, or has priority to use it).
> - Capacity percentages can be applied per partition (e.g., the market team 
> has a 40% minimum capacity and the dev team has a 60% minimum capacity of 
> the partition).
> Constraints are orthogonal to partitions; they describe attributes of a 
> node's hardware/software purely for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (windows, linux, etc.)
> With this, an application can ask for a resource that has (glibc.version >= 
> 2.20 && JDK.version >= 8u20 && x86_64).





[jira] [Resolved] (YARN-3599) Fix the javadoc of DelegationTokenSecretManager in hadoop-yarn

2015-05-11 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-3599.
--
Resolution: Duplicate

> Fix the javadoc of DelegationTokenSecretManager in hadoop-yarn
> --
>
> Key: YARN-3599
> URL: https://issues.apache.org/jira/browse/YARN-3599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gabor Liptak
>Priority: Trivial
> Attachments: YARN-3599.1.patch, YARN-3599.patch
>
>






[jira] [Updated] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.

2015-05-11 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3587:
-
Hadoop Flags: Reviewed

> Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
> -
>
> Key: YARN-3587
> URL: https://issues.apache.org/jira/browse/YARN-3587
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Akira AJISAKA
>Assignee: Gabor Liptak
>Priority: Minor
>  Labels: newbie
> Fix For: 2.8.0
>
> Attachments: YARN-3587.1.patch, YARN-3587.patch
>
>
> In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager,  
> the javadoc of the constructor is as follows:
> {code}
>   /**
>* Create a secret manager
>* @param delegationKeyUpdateInterval the number of seconds for rolling new
>*secret keys.
>* @param delegationTokenMaxLifetime the maximum lifetime of the delegation
>*tokens
>* @param delegationTokenRenewInterval how often the tokens must be renewed
>* @param delegationTokenRemoverScanInterval how often the tokens are 
> scanned
>*for expired tokens
>*/
> {code}
> 1. "the number of seconds" should be "the number of milliseconds".
> 2. It's better to add time unit to the description of other parameters.
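
Applying both points, the corrected javadoc would read roughly as follows (a 
sketch; the committed wording may differ):

{code}
  /**
   * Create a secret manager
   * @param delegationKeyUpdateInterval the number of milliseconds for rolling
   *        new secret keys.
   * @param delegationTokenMaxLifetime the maximum lifetime of the delegation
   *        tokens, in milliseconds
   * @param delegationTokenRenewInterval how often the tokens must be renewed,
   *        in milliseconds
   * @param delegationTokenRemoverScanInterval how often the tokens are
   *        scanned for expired tokens, in milliseconds
   */
{code}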





[jira] [Resolved] (YARN-3598) Fix the javadoc of DelegationTokenSecretManager in hadoop-mapreduce

2015-05-11 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-3598.
--
Resolution: Duplicate

> Fix the javadoc of DelegationTokenSecretManager in hadoop-mapreduce
> ---
>
> Key: YARN-3598
> URL: https://issues.apache.org/jira/browse/YARN-3598
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gabor Liptak
>Priority: Trivial
> Attachments: YARN-3598.patch
>
>






[jira] [Resolved] (YARN-3596) Fix the javadoc of DelegationTokenSecretManager in hadoop-common

2015-05-11 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-3596.
--
Resolution: Duplicate

> Fix the javadoc of DelegationTokenSecretManager in hadoop-common
> 
>
> Key: YARN-3596
> URL: https://issues.apache.org/jira/browse/YARN-3596
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gabor Liptak
>Priority: Trivial
> Attachments: YARN-3596.patch
>
>






[jira] [Resolved] (YARN-3597) Fix the javadoc of DelegationTokenSecretManager in hadoop-hdfs

2015-05-11 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-3597.
--
Resolution: Duplicate

> Fix the javadoc of DelegationTokenSecretManager in hadoop-hdfs
> --
>
> Key: YARN-3597
> URL: https://issues.apache.org/jira/browse/YARN-3597
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gabor Liptak
>Priority: Trivial
> Attachments: YARN-3597.patch
>
>






[jira] [Commented] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.

2015-05-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537869#comment-14537869
 ] 

Hudson commented on YARN-3587:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7790 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7790/])
YARN-3587. Fix the javadoc of DelegationTokenSecretManager in yarn, etc. 
projects. Contributed by Gabor Liptak. (junping_du: rev 
7e543c27fa2881aa65967be384a6203bd5b2304f)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineDelegationTokenSecretManagerService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JHSDelegationTokenSecretManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/security/token/delegation/DelegationTokenSecretManager.java


> Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
> -
>
> Key: YARN-3587
> URL: https://issues.apache.org/jira/browse/YARN-3587
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Akira AJISAKA
>Assignee: Gabor Liptak
>Priority: Minor
>  Labels: newbie
> Fix For: 2.8.0
>
> Attachments: YARN-3587.1.patch, YARN-3587.patch
>
>
> In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager,  
> the javadoc of the constructor is as follows:
> {code}
>   /**
>* Create a secret manager
>* @param delegationKeyUpdateInterval the number of seconds for rolling new
>*secret keys.
>* @param delegationTokenMaxLifetime the maximum lifetime of the delegation
>*tokens
>* @param delegationTokenRenewInterval how often the tokens must be renewed
>* @param delegationTokenRemoverScanInterval how often the tokens are 
> scanned
>*for expired tokens
>*/
> {code}
> 1. "the number of seconds" should be "the number of milliseconds".
> 2. It's better to add time unit to the description of other parameters.





[jira] [Updated] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.

2015-05-11 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3587:
-
Summary: Fix the javadoc of DelegationTokenSecretManager in projects of 
yarn, etc.  (was: Fix the javadoc of DelegationTokenSecretManager in yarn 
project)

> Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
> -
>
> Key: YARN-3587
> URL: https://issues.apache.org/jira/browse/YARN-3587
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Akira AJISAKA
>Assignee: Gabor Liptak
>Priority: Minor
>  Labels: newbie
> Attachments: YARN-3587.1.patch, YARN-3587.patch
>
>
> In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager,  
> the javadoc of the constructor is as follows:
> {code}
>   /**
>* Create a secret manager
>* @param delegationKeyUpdateInterval the number of seconds for rolling new
>*secret keys.
>* @param delegationTokenMaxLifetime the maximum lifetime of the delegation
>*tokens
>* @param delegationTokenRenewInterval how often the tokens must be renewed
>* @param delegationTokenRemoverScanInterval how often the tokens are 
> scanned
>*for expired tokens
>*/
> {code}
> 1. "the number of seconds" should be "the number of milliseconds".
> 2. It's better to add time unit to the description of other parameters.





[jira] [Commented] (YARN-3276) Refactor and fix null casting in some map cast for TimelineEntity (old and new) and fix findbug warnings

2015-05-11 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537911#comment-14537911
 ] 

Junping Du commented on YARN-3276:
--

Thanks [~zjshen] for review and comments!
bq. TimelineServiceUtils -> TimelineServiceHelper?
Sure. Will update it.

bq.  Is mapreduce using it? Maybe simply @Private
In my understanding, @Private means it can be used by "Common", "HDFS", 
"MapReduce", and "YARN", so it could be broader than the current limitation? 
I didn't remove MapReduce here because, judging from other places, it seems 
we always keep MapReduce there as a practice even when there is no obvious 
reference from the MR project. Maybe it's better to keep it here as it is?

bq. TimelineEvent are not covered?
Nice catch! Will update it.

bq. AllocateResponsePBImpl change is not related?
Yes. There are several findbugs warnings (this one and the change in 
TimelineMetric.java) involved in the previous patch on branch YARN-2928. I 
think it would be overkill to file a separate JIRA to fix these simple 
issues, so I put the fix here and updated the title a little bit. Does that 
make sense?
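
For context, the kind of null-safe cast helper under discussion might look 
like the sketch below (the method name follows the review suggestion; the 
actual patch may differ):

{code}
import java.util.HashMap;
import java.util.Map;

// Sketch of a null-safe Map-to-HashMap cast, avoiding the NPE that a
// raw cast of a null map used to trigger.
public static <K, V> HashMap<K, V> mapCastToHashMap(Map<K, V> map) {
  if (map == null) {
    return null;
  }
  if (map instanceof HashMap) {
    return (HashMap<K, V>) map;
  }
  return new HashMap<K, V>(map);
}
{code}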

> Refactor and fix null casting in some map cast for TimelineEntity (old and 
> new) and fix findbug warnings
> 
>
> Key: YARN-3276
> URL: https://issues.apache.org/jira/browse/YARN-3276
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-3276-YARN-2928.v3.patch, 
> YARN-3276-YARN-2928.v4.patch, YARN-3276-v2.patch, YARN-3276-v3.patch, 
> YARN-3276.patch
>
>
> Per discussion in YARN-3087, we need to refactor some similar logic to cast 
> map to hashmap and get rid of NPE issue.





[jira] [Updated] (YARN-3276) Refactor and fix null casting in some map cast for TimelineEntity (old and new) and fix findbug warnings

2015-05-11 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3276:
-
Attachment: YARN-3276-YARN-2928.v5.patch

Fixed most of [~zjshen]'s comments in the v5 patch.

> Refactor and fix null casting in some map cast for TimelineEntity (old and 
> new) and fix findbug warnings
> 
>
> Key: YARN-3276
> URL: https://issues.apache.org/jira/browse/YARN-3276
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-3276-YARN-2928.v3.patch, 
> YARN-3276-YARN-2928.v4.patch, YARN-3276-YARN-2928.v5.patch, 
> YARN-3276-v2.patch, YARN-3276-v3.patch, YARN-3276.patch
>
>
> Per discussion in YARN-3087, we need to refactor some similar logic to cast 
> map to hashmap and get rid of NPE issue.





[jira] [Resolved] (YARN-401) ClientRMService.getQueueInfo can return stale application reports

2015-05-11 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved YARN-401.
-
Resolution: Duplicate

This was fixed by YARN-2978.

> ClientRMService.getQueueInfo can return stale application reports
> -
>
> Key: YARN-401
> URL: https://issues.apache.org/jira/browse/YARN-401
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.2-alpha, 0.23.6
>Reporter: Jason Lowe
>Priority: Minor
>
> ClientRMService.getQueueInfo is modifying a QueueInfo object when application 
> reports are requested.  Unfortunately this QueueInfo object could be a 
> persisting object in the scheduler, and modifying it in this way can lead to 
> stale application reports being returned to the client.  Here's an example 
> scenario with CapacityScheduler:
> # A client asks for queue info on queue X with application reports
> # ClientRMService.getQueueInfo modifies the queue's QueueInfo object and sets 
> application reports on it
> # Another client asks for recursive queue info from the root queue without 
> application reports
> # Since the old application reports are still attached to queue X's QueueInfo 
> object, these stale reports appear in the QueueInfo data for queue X in the 
> results
> Normally if the client is not asking for application reports it won't be 
> looking for or acting upon any application reports that happen to appear in 
> the queue info result.  However we shouldn't be returning application 
> reports in the first place, and when we do, they shouldn't be stale.





[jira] [Commented] (YARN-3422) relatedentities always return empty list when primary filter is set

2015-05-11 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537962#comment-14537962
 ] 

Billie Rinaldi commented on YARN-3422:
--

That's true, changing the name to indicate direction would also be helpful.  I 
think that fixing this limitation would complicate the write path significantly 
and is probably not worthwhile in ATS v1.  If someone were to implement it, we 
would need to take before and after performance measurements and possibly make 
the new feature optional.

> relatedentities always return empty list when primary filter is set
> ---
>
> Key: YARN-3422
> URL: https://issues.apache.org/jira/browse/YARN-3422
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-3422.1.patch
>
>
> When you curl for ats entities with a primary filter, the relatedentities 
> fields always return empty list





[jira] [Commented] (YARN-3276) Refactor and fix null casting in some map cast for TimelineEntity (old and new) and fix findbug warnings

2015-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537978#comment-14537978
 ] 

Hadoop QA commented on YARN-3276:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 36s | Pre-patch YARN-2928 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   9m 12s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 49s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 20s | The applied patch generated  2 
new checkstyle issues (total was 105, now 107). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 42s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 50s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 23s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 57s | Tests passed in 
hadoop-yarn-common. |
| | |  47m 14s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731922/YARN-3276-YARN-2928.v5.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / b3b791b |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7862/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7862/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7862/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7862/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7862/console |


This message was automatically generated.

> Refactor and fix null casting in some map cast for TimelineEntity (old and 
> new) and fix findbug warnings
> 
>
> Key: YARN-3276
> URL: https://issues.apache.org/jira/browse/YARN-3276
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-3276-YARN-2928.v3.patch, 
> YARN-3276-YARN-2928.v4.patch, YARN-3276-YARN-2928.v5.patch, 
> YARN-3276-v2.patch, YARN-3276-v3.patch, YARN-3276.patch
>
>
> Per discussion in YARN-3087, we need to refactor some similar logic to cast 
> map to hashmap and get rid of NPE issue.





[jira] [Updated] (YARN-3360) Add JMX metrics to TimelineDataManager

2015-05-11 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-3360:
-
Attachment: YARN-3360.002.patch

Updated patch to trunk.

> Add JMX metrics to TimelineDataManager
> --
>
> Key: YARN-3360
> URL: https://issues.apache.org/jira/browse/YARN-3360
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>  Labels: BB2015-05-TBR
> Attachments: YARN-3360.001.patch, YARN-3360.002.patch
>
>
> The TimelineDataManager currently has no metrics, outside of the standard JVM 
> metrics.  It would be very useful to at least log basic counts of method 
> calls, time spent in those calls, and number of entities/events involved.
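
A minimal sketch of what such a metrics source could look like with Hadoop's 
metrics2 library (the class and metric names are assumed, not taken from the 
attached patch):

{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;
import org.apache.hadoop.metrics2.lib.MutableRate;

// Sketch only: counts calls and tracks time per operation; the real
// patch may expose different operations and names.
@Metrics(about = "TimelineDataManager metrics", context = "yarn")
public class TimelineDataManagerMetrics {
  @Metric("getEntities calls") MutableCounterLong getEntitiesOps;
  @Metric("getEntities processing time") MutableRate getEntitiesTime;

  static TimelineDataManagerMetrics create() {
    return DefaultMetricsSystem.instance().register(
        "TimelineDataManagerMetrics", "Metrics for TimelineDataManager",
        new TimelineDataManagerMetrics());
  }
}
{code}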





[jira] [Commented] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.

2015-05-11 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538017#comment-14538017
 ] 

Akira AJISAKA commented on YARN-3587:
-

Agree with [~djp]. Late +1 from me. Thanks [~djp], [~jianhe], and [~gliptak] 
for contribution!

> Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
> -
>
> Key: YARN-3587
> URL: https://issues.apache.org/jira/browse/YARN-3587
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Akira AJISAKA
>Assignee: Gabor Liptak
>Priority: Minor
>  Labels: newbie
> Fix For: 2.8.0
>
> Attachments: YARN-3587.1.patch, YARN-3587.patch
>
>
> In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager,  
> the javadoc of the constructor is as follows:
> {code}
>   /**
>* Create a secret manager
>* @param delegationKeyUpdateInterval the number of seconds for rolling new
>*secret keys.
>* @param delegationTokenMaxLifetime the maximum lifetime of the delegation
>*tokens
>* @param delegationTokenRenewInterval how often the tokens must be renewed
>* @param delegationTokenRemoverScanInterval how often the tokens are 
> scanned
>*for expired tokens
>*/
> {code}
> 1. "the number of seconds" should be "the number of milliseconds".
> 2. It's better to add time unit to the description of other parameters.





[jira] [Commented] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.

2015-05-11 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538030#comment-14538030
 ] 

Junping Du commented on YARN-3587:
--

Thanks [~ajisakaa]! :)

> Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
> -
>
> Key: YARN-3587
> URL: https://issues.apache.org/jira/browse/YARN-3587
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Akira AJISAKA
>Assignee: Gabor Liptak
>Priority: Minor
>  Labels: newbie
> Fix For: 2.8.0
>
> Attachments: YARN-3587.1.patch, YARN-3587.patch
>
>
> In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager,  
> the javadoc of the constructor is as follows:
> {code}
>   /**
>* Create a secret manager
>* @param delegationKeyUpdateInterval the number of seconds for rolling new
>*secret keys.
>* @param delegationTokenMaxLifetime the maximum lifetime of the delegation
>*tokens
>* @param delegationTokenRenewInterval how often the tokens must be renewed
>* @param delegationTokenRemoverScanInterval how often the tokens are 
> scanned
>*for expired tokens
>*/
> {code}
> 1. "the number of seconds" should be "the number of milliseconds".
> 2. It's better to add time unit to the description of other parameters.





[jira] [Commented] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.

2015-05-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538034#comment-14538034
 ] 

Hudson commented on YARN-3587:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #192 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/192/])
YARN-3587. Fix the javadoc of DelegationTokenSecretManager in yarn, etc. 
projects. Contributed by Gabor Liptak. (junping_du: rev 
7e543c27fa2881aa65967be384a6203bd5b2304f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineDelegationTokenSecretManagerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/security/token/delegation/DelegationTokenSecretManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JHSDelegationTokenSecretManager.java


> Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
> -
>
> Key: YARN-3587
> URL: https://issues.apache.org/jira/browse/YARN-3587
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Akira AJISAKA
>Assignee: Gabor Liptak
>Priority: Minor
>  Labels: newbie
> Fix For: 2.8.0
>
> Attachments: YARN-3587.1.patch, YARN-3587.patch
>
>
> In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager,  
> the javadoc of the constructor is as follows:
> {code}
>   /**
>* Create a secret manager
>* @param delegationKeyUpdateInterval the number of seconds for rolling new
>*secret keys.
>* @param delegationTokenMaxLifetime the maximum lifetime of the delegation
>*tokens
>* @param delegationTokenRenewInterval how often the tokens must be renewed
>* @param delegationTokenRemoverScanInterval how often the tokens are 
> scanned
>*for expired tokens
>*/
> {code}
> 1. "the number of seconds" should be "the number of milliseconds".
> 2. It's better to add time unit to the description of other parameters.





[jira] [Commented] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.

2015-05-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538046#comment-14538046
 ] 

Hudson commented on YARN-3587:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2140 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2140/])
YARN-3587. Fix the javadoc of DelegationTokenSecretManager in yarn, etc. 
projects. Contributed by Gabor Liptak. (junping_du: rev 
7e543c27fa2881aa65967be384a6203bd5b2304f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineDelegationTokenSecretManagerService.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/security/token/delegation/DelegationTokenSecretManager.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JHSDelegationTokenSecretManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java


> Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
> -
>
> Key: YARN-3587
> URL: https://issues.apache.org/jira/browse/YARN-3587
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Akira AJISAKA
>Assignee: Gabor Liptak
>Priority: Minor
>  Labels: newbie
> Fix For: 2.8.0
>
> Attachments: YARN-3587.1.patch, YARN-3587.patch
>
>
> In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager,  
> the javadoc of the constructor is as follows:
> {code}
>   /**
>* Create a secret manager
>* @param delegationKeyUpdateInterval the number of seconds for rolling new
>*secret keys.
>* @param delegationTokenMaxLifetime the maximum lifetime of the delegation
>*tokens
>* @param delegationTokenRenewInterval how often the tokens must be renewed
>* @param delegationTokenRemoverScanInterval how often the tokens are 
> scanned
>*for expired tokens
>*/
> {code}
> 1. "the number of seconds" should be "the number of milliseconds".
> 2. It's better to add time unit to the description of other parameters.





[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-05-11 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538058#comment-14538058
 ] 

Junping Du commented on YARN-3044:
--

Sorry for coming to this late. The latest patch LGTM too. [~sjlee0], feel 
free to go ahead and commit this!
However, [~vinodkv]'s comment "We can take a dual pronged approach here? That 
or we make the RM-publisher itself a distributed push." sounds reasonable to 
me but hasn't been fully addressed in this JIRA. Shall we open a new JIRA for 
further discussion on this?

> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>  Labels: BB2015-05-TBR
> Attachments: YARN-3044-YARN-2928.004.patch, 
> YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, 
> YARN-3044-YARN-2928.007.patch, YARN-3044.20150325-1.patch, 
> YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.





[jira] [Commented] (YARN-3360) Add JMX metrics to TimelineDataManager

2015-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538078#comment-14538078
 ] 

Hadoop QA commented on YARN-3360:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 39s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 28s | The applied patch generated  
19 new checkstyle issues (total was 7, now 26). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 47s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   3m  8s | Tests passed in 
hadoop-yarn-server-applicationhistoryservice. |
| | |  38m 45s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731939/YARN-3360.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 7e543c2 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7863/artifact/patchprocess/diffcheckstylehadoop-yarn-server-applicationhistoryservice.txt
 |
| hadoop-yarn-server-applicationhistoryservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7863/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7863/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7863/console |


This message was automatically generated.

> Add JMX metrics to TimelineDataManager
> --
>
> Key: YARN-3360
> URL: https://issues.apache.org/jira/browse/YARN-3360
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>  Labels: BB2015-05-TBR
> Attachments: YARN-3360.001.patch, YARN-3360.002.patch
>
>
> The TimelineDataManager currently has no metrics, outside of the standard JVM 
> metrics.  It would be very useful to at least log basic counts of method 
> calls, time spent in those calls, and number of entities/events involved.





[jira] [Commented] (YARN-2942) Aggregated Log Files should be combined

2015-05-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538087#comment-14538087
 ] 

Jason Lowe commented on YARN-2942:
--

My apologies for taking so long to respond.  I took a look at the v6 and v7 
proposals.  If I understand them correctly they both propose that the NMs 
upload the original per-node aggregated log to HDFS and then something (either 
the NMs or the RM) later comes along and creates the aggregate-of-aggregates 
log with a side-index for faster searching and ability to correct for failed 
appends.  These are reasonable ideas, and I prefer the simpler approach.  
However, I didn't see details on solving the race condition where a log reader 
comes along, sees from the index file that the desired log isn't in the 
aggregate-of-aggregates, then opens the log and reads from it just as the log 
is deleted by the entity appending to the aggregate-of-aggregates.  Since we 
don't have UNIX-style refcounting of open files in HDFS, deleting the log while 
the reader is trying to read from it is going to be disruptive.

One thing to consider in the proposals -- do we want a threshold for a per-node 
log file where we do not try to append it to the aggregate-of-aggregates file?  
We have an internal solution where we create per-application har files of the 
logs, and that process intentionally skips files that are already "big enough" 
on their own.  That saves significant time and network traffic that would go 
into aggregating files already beefy enough to justify their own existence; 
we're primarily concerned with cleaning up the tiny per-node, per-app logs.
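
A minimal sketch of such a size-threshold check; the class name, method, and 256MB default are illustrative assumptions, not taken from any of the attached proposals:
{code}
import org.apache.hadoop.fs.FileStatus;

// Skip per-node logs that are already "big enough" when building the
// aggregate-of-aggregates; only the small ones are worth combining.
public class CombineThreshold {
  // Hypothetical default; a real implementation would make this configurable.
  private static final long COMBINE_THRESHOLD_BYTES = 256L * 1024 * 1024;

  static boolean shouldCombine(FileStatus perNodeLog) {
    return perNodeLog.getLen() < COMBINE_THRESHOLD_BYTES;
  }
}
{code}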

Another issue from log aggregation we've seen in practice is that the proposals 
don't address the significant write load the per-node aggregate files place on 
the namenode.  This isn't an absolute requirement for the design, but we've 
noticed it's not just about the number of files and blocks being created but 
also the overall write load associated with those files.  It would be really 
nice to reduce that load significantly.  Thinking off the top of my head, one 
possibility is to have the RM coordinate log aggregation across the nodes.  It 
would work something like this:
- NMs do not upload logs for an application to the aggregate file until told to 
do so by the RM (probably in NM heartbeat response)
- NMs provide periodic progress reports in their heartbeat on how aggregation 
is proceeding and when it succeeds/fails.
- RM coordinates and tracks aggregation process (which NM is "active", revoking 
NMs that have taken too long without progress, etc.)
- Logs would remain on NM local disk and served from there until they are 
uploaded into the app aggregate file, similar to how they work today with the 
per-node aggregate file

This has the advantages of only uploading the logs to HDFS once, only as a 
single aggregate file (plus index), and doesn't require ZooKeeper.  A 
significant downside is that it prolongs the average time the logs will be 
available on HDFS for an application due to the serialized upload process.

> Aggregated Log Files should be combined
> ---
>
> Key: YARN-2942
> URL: https://issues.apache.org/jira/browse/YARN-2942
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.6.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: CombinedAggregatedLogsProposal_v3.pdf, 
> CombinedAggregatedLogsProposal_v6.pdf, CombinedAggregatedLogsProposal_v7.pdf, 
> CompactedAggregatedLogsProposal_v1.pdf, 
> CompactedAggregatedLogsProposal_v2.pdf, 
> ConcatableAggregatedLogsProposal_v4.pdf, 
> ConcatableAggregatedLogsProposal_v5.pdf, YARN-2942-preliminary.001.patch, 
> YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, 
> YARN-2942.003.patch
>
>
> Turning on log aggregation allows users to easily store container logs in 
> HDFS and subsequently view them in the YARN web UIs from a central place.  
> Currently, there is a separate log file for each Node Manager.  This can be a 
> problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
> accumulating many (possibly small) files per YARN application.  The current 
> “solution” for this problem is to configure YARN (actually the JHS) to 
> automatically delete these files after some amount of time.  
> We should improve this by compacting the per-node aggregated log files into 
> one log file per application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-05-11 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538146#comment-14538146
 ] 

Zhijie Shen commented on YARN-3044:
---

[~sjlee0], would you mind holding the commit for a while? I want to take a look 
at the last patch too:-)

> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>  Labels: BB2015-05-TBR
> Attachments: YARN-3044-YARN-2928.004.patch, 
> YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, 
> YARN-3044-YARN-2928.007.patch, YARN-3044.20150325-1.patch, 
> YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-05-11 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538152#comment-14538152
 ] 

Sangjin Lee commented on YARN-3044:
---

No problem. Take your time.

[~djp], I'll file a JIRA.

> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>  Labels: BB2015-05-TBR
> Attachments: YARN-3044-YARN-2928.004.patch, 
> YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, 
> YARN-3044-YARN-2928.007.patch, YARN-3044.20150325-1.patch, 
> YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3616) determine how to generate YARN container events

2015-05-11 Thread Sangjin Lee (JIRA)
Sangjin Lee created YARN-3616:
-

 Summary: determine how to generate YARN container events
 Key: YARN-3616
 URL: https://issues.apache.org/jira/browse/YARN-3616
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee


The initial design called for the node manager to write YARN container events 
to take advantage of the distributed writes. RM acting as a sole writer of all 
YARN container events would have significant scalability problems.

Still, there are some types of events that are not captured by the NM. The 
current implementation has both: RM writing container events and NM writing 
container events.

We need to sort this out, and decide how we can write all needed container 
events in a scalable manner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-05-11 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538160#comment-14538160
 ] 

Sangjin Lee commented on YARN-3044:
---

YARN-3616 filed.

> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>  Labels: BB2015-05-TBR
> Attachments: YARN-3044-YARN-2928.004.patch, 
> YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, 
> YARN-3044-YARN-2928.007.patch, YARN-3044.20150325-1.patch, 
> YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3170) YARN architecture document needs updating

2015-05-11 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538162#comment-14538162
 ] 

Allen Wittenauer commented on YARN-3170:


I'll be honest:  I greatly dislike that entire first paragraph. MRv2 needs to 
be struck from the vocabulary.  It was a marketing ploy to get YARN's 
acceptance into Hadoop as a subproject.   It also helps underscore the problems 
of what to call the "new" MR API in the actual MR subproject.  I'm inclined to 
think the entire paragraph should just get deleted.

The second paragraph should be rewritten.  There's little value in comparing 
YARN to earlier versions of Hadoop at this point.  Don't describe YARN in terms 
of the JobTracker.  If I'm new to Hadoop, I have no idea what the heck a JT 
even is.

{code} An application is either a single job in the classical sense of 
Map-Reduce jobs or a DAG of jobs.{code}

* Drop "in the classical sense of Map-Reduce jobs". 

{code} The ResourceManager and per-node slave, the NodeManager (*NM*), form the 
data-computation framework. The ResourceManager is the ultimate authority that 
arbitrates resources among all the applications in the system.
{code}

* Drop "per-node slave,"
* Drop "(*NM*)"
* Add a description of the node manager after the description of the resource 
manager in this paragraph.

> YARN architecture document needs updating
> -
>
> Key: YARN-3170
> URL: https://issues.apache.org/jira/browse/YARN-3170
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>Assignee: Brahma Reddy Battula
>  Labels: BB2015-05-TBR
> Attachments: YARN-3170-002.patch, YARN-3170-003.patch, 
> YARN-3170-004.patch, YARN-3170.patch
>
>
> The marketing paragraph at the top, "NextGen MapReduce", etc are all 
> marketing rather than actual descriptions. It also needs some general 
> updates, esp given it reads as though 0.23 was just released yesterday.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not cached in RMApps

2015-05-11 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538165#comment-14538165
 ] 

Junping Du commented on YARN-3505:
--

bq. If this happens, that means the log aggregation still happens in some of 
NMs.
I see. Agree that we don't need to do any cleanup in this case. 
Some minor comments on the updated patch:

In aggregateLogReport.java,
{code}
-  if (report.getDiagnosticMessage() != null
-  && !report.getDiagnosticMessage().isEmpty()) {
-curReport
-  .setDiagnosticMessage(curReport.getDiagnosticMessage() == null
-  ? report.getDiagnosticMessage() : curReport
-.getDiagnosticMessage() + report.getDiagnosticMessage());
+if (!curReport.getLogAggregationStatus().equals(
+  LogAggregationStatus.SUCCEEDED)
+&& !curReport.getLogAggregationStatus().equals(
+  LogAggregationStatus.FAILED)
+&& (report.getLogAggregationStatus().equals(
+  LogAggregationStatus.SUCCEEDED)
+|| report.getLogAggregationStatus().equals(
+  LogAggregationStatus.FAILED))) {
+  statusChanged = true; // anchor 1 for comments
+}
+if (report.getLogAggregationStatus() != 
LogAggregationStatus.RUNNING
+|| curReport.getLogAggregationStatus() !=
+LogAggregationStatus.RUNNING_WITH_FAILURE) {
+  curReport.setLogAggregationStatus(report
+.getLogAggregationStatus()); // anchor 2 for comments
+}
{code}
Are we missing a curReport.setLogAggregationStatus() call at anchor 1 above? 
We should set SUCCEEDED or FAILED on curReport, shouldn't we? In addition, why 
don't we set statusChanged at anchor 2 above? If statusChanged is only meant 
to signal that the status moved to a final state (SUCCEEDED or FAILED), then 
we should rename it to something like stateChangedToFinal, which sounds more 
obvious. 
BTW, can we simplify the logic here so that the status gets updated except in 
only two cases?
1. curReport.getLogAggregationStatus() == report.getLogAggregationStatus()
2. curReport.getLogAggregationStatus() == RUNNING_WITH_FAILURE && 
report.getLogAggregationStatus() == RUNNING


In updateLogAggregationDiagnosticMessages(),
{code}
if (report.getLogAggregationStatus()
+  == LogAggregationStatus.RUNNING || report.getLogAggregationStatus()
+  == LogAggregationStatus.SUCCEEDED || report.getLogAggregationStatus()
+  == LogAggregationStatus.FAILED) {
{code}
Why doesn't the "report.getLogAggregationStatus() == LogAggregationStatus.FAILED" 
case go to the other branch, like LogAggregationStatus.RUNNING_WITH_FAILURE?

{code}
+LogAggregationDiagnosticsForNMs.put(nodeId, diagnostics);
{code}
Move this into the "diagnostics == null" block, right after "diagnostics = new 
ArrayList();", because we only need to call this the first time we put 
diagnostics info. The same applies to failureMessages.
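
For illustration, a sketch of the simpler update rule suggested above, reusing curReport/report from the patch excerpt; this is an assumption of what the rule would look like, not code from the patch:
{code}
// Copy the reported status onto the cached report unless (a) nothing
// changed, or (b) a plain RUNNING report would overwrite
// RUNNING_WITH_FAILURE.
LogAggregationStatus cur = curReport.getLogAggregationStatus();
LogAggregationStatus incoming = report.getLogAggregationStatus();
boolean skipUpdate = (cur == incoming)
    || (cur == LogAggregationStatus.RUNNING_WITH_FAILURE
        && incoming == LogAggregationStatus.RUNNING);
if (!skipUpdate) {
  curReport.setLogAggregationStatus(incoming);
}
{code}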

> Node's Log Aggregation Report with SUCCEED should not cached in RMApps
> --
>
> Key: YARN-3505
> URL: https://issues.apache.org/jira/browse/YARN-3505
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 2.8.0
>Reporter: Junping Du
>Assignee: Xuan Gong
>Priority: Critical
> Attachments: YARN-3505.1.patch, YARN-3505.2.patch, 
> YARN-3505.2.rebase.patch, YARN-3505.3.patch
>
>
> Per discussions in YARN-1402, we shouldn't cache all node's log aggregation 
> reports in RMApps for always, especially for those finished with SUCCEED.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3606) Spark container fails to launch if spark-assembly.jar file has different timestamp

2015-05-11 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538190#comment-14538190
 ] 

Steve Loughran commented on YARN-3606:
--

Looking at the timestamp is the strategy chosen based on a key assumption: 
there is a single artifact to localise by downloading from a single shared 
filesystem. Trying to use local filesystems, each with a cached copy of the 
artifact, isn't what the NM expects to be doing. If it is, then the normal 
localisation checks aren't going to hold.


I think the checksum is probably omitted as you have to read the whole file to 
see if it has changed; plus there's the cost of actually recalculating that 
checksum prior to launching every container. Timestamps aren't too great 
though: the check as it stands will reject the same file with two different 
times *or* two differently sized files with the same timestamp.
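
For reference, a sketch of the kind of timestamp guard under discussion; it mirrors the spirit of FSDownload's verification rather than its exact code, and the class and method names are illustrative:
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.yarn.api.records.LocalResource;

public class TimestampCheck {
  static void verify(FileStatus onSrcFs, LocalResource requested)
      throws IOException {
    if (onSrcFs.getModificationTime() != requested.getTimestamp()) {
      // Identical bytes with a different mtime fail here, which is the
      // behaviour reported in this issue.
      throw new IOException("Resource " + onSrcFs.getPath()
          + " changed on src filesystem (expected " + requested.getTimestamp()
          + ", was " + onSrcFs.getModificationTime() + ")");
    }
  }
}
{code}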

> Spark container fails to launch if spark-assembly.jar file has different 
> timestamp
> --
>
> Key: YARN-3606
> URL: https://issues.apache.org/jira/browse/YARN-3606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.6.0
> Environment: YARN 2.6.0
> Spark 1.3.1
>Reporter: Michael Le
>Priority: Minor
>
> In a YARN cluster, when submitting a Spark job, the Spark job will fail to 
> run because YARN fails to launch containers on the other nodes (not the node 
> where the job submission took place).
> YARN checks for similar spark-assembly.jar file by looking at the timestamps. 
> This check will fail when the spark-assembly.jar is the same but copied to 
> the location at different time.
> YARN throws this exception:
> 15/05/07 20:13:22 INFO yarn.ExecutorRunnable: Setting up executor with 
> commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill 
> %p', -Xms1024m, -Xmx1024m, -Djava.io.tmpdir={{PWD}}/tmp, 
> '-Dspark.driver.port=52357', -Dspark.yarn.app.container.log.dir=, 
> org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, 
> akka.tcp://sparkDriver@xxx:52357/user/CoarseGrainedScheduler, --executor-id, 
> 4, --hostname, xxx, --cores, 1, --app-id, application_1431047540996_0001, 
> --user-class-path, file:$PWD/__app__.jar, 1>, /stdout, 2>, 
> /stderr)
> 15/05/07 20:13:22 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> xxx:34165
> 15/05/07 20:13:27 INFO yarn.YarnAllocator: Completed container 
> container_1431047540996_0001_02_05 (state: COMPLETE, exit status: -1000)
> 15/05/07 20:13:27 INFO yarn.YarnAllocator: Container marked as failed: 
> container_1431047540996_0001_02_05. Exit status: -1000. Diagnostics: 
> Resource 
> file:/home/spark/spark-1.3.1-bin-hadoop2.6/lib/spark-assembly-1.3.1-hadoop2.6.0.jar
>  changed on src filesystem (expected 1430944255000, was 1430944249000
> java.io.IOException: Resource 
> file:/home/spark/spark-1.3.1-bin-hadoop2.6/lib/spark-assembly-1.3.1-hadoop2.6.0.jar
>  changed on src filesystem (expected 1430944255000, was 1430944249000
> at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
> at 
> org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
> at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
> at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Problem can be easily replicated by setting up two nodes and copying the 
> spark-assembly.jar to each node but changing the timestamp of the file on one 
> of the nodes. Then execute spark-shell --master yarn-client. Observe the 
> nodemanager log on the other node to find the error.
> Work around is to make sure the jar file has the same timestamp. But it looks 
> like perhaps the function that does the copy and check of the jar file 
> (org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253) should 
> check for file similarity using a checksum rather than timestamp.



--

[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation

2015-05-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538216#comment-14538216
 ] 

Wangda Tan commented on YARN-3434:
--

Jenkins hasn't gotten back; I sent a mail to hadoop-dev for help.

> Interaction between reservations and userlimit can result in significant ULF 
> violation
> --
>
> Key: YARN-3434
> URL: https://issues.apache.org/jira/browse/YARN-3434
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.6.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.8.0
>
> Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch
>
>
> ULF was set to 1.0
> User was able to consume 1.4X queue capacity.
> It looks like when this application launched, it reserved about 1000 
> containers, 8G each, within about 5 seconds. I think this allowed the 
> logic in assignToUser() to allow the userlimit to be surpassed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation

2015-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538232#comment-14538232
 ] 

Hadoop QA commented on YARN-3434:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731239/YARN-3434-branch2.7.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / b9cebfc |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7864/console |


This message was automatically generated.

> Interaction between reservations and userlimit can result in significant ULF 
> violation
> --
>
> Key: YARN-3434
> URL: https://issues.apache.org/jira/browse/YARN-3434
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.6.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.8.0
>
> Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch
>
>
> ULF was set to 1.0
> User was able to consume 1.4X queue capacity.
> It looks like when this application launched, it reserved about 1000 
> containers, 8G each, within about 5 seconds. I think this allowed the 
> logic in assignToUser() to allow the userlimit to be surpassed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3170) YARN architecture document needs updating

2015-05-11 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-3170:
---
Attachment: YARN-3170-005.patch

> YARN architecture document needs updating
> -
>
> Key: YARN-3170
> URL: https://issues.apache.org/jira/browse/YARN-3170
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>Assignee: Brahma Reddy Battula
>  Labels: BB2015-05-TBR
> Attachments: YARN-3170-002.patch, YARN-3170-003.patch, 
> YARN-3170-004.patch, YARN-3170-005.patch, YARN-3170.patch
>
>
> The marketing paragraph at the top, "NextGen MapReduce", etc are all 
> marketing rather than actual descriptions. It also needs some general 
> updates, esp given it reads as though 0.23 was just released yesterday.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3170) YARN architecture document needs updating

2015-05-11 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538242#comment-14538242
 ] 

Brahma Reddy Battula commented on YARN-3170:


[~aw] Thanks for taking a look into this issue. Updated the patch based on 
your comments; kindly review. Let me know if any other rework is needed in the 
second paragraph (mainly the first line).

> YARN architecture document needs updating
> -
>
> Key: YARN-3170
> URL: https://issues.apache.org/jira/browse/YARN-3170
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>Assignee: Brahma Reddy Battula
>  Labels: BB2015-05-TBR
> Attachments: YARN-3170-002.patch, YARN-3170-003.patch, 
> YARN-3170-004.patch, YARN-3170-005.patch, YARN-3170.patch
>
>
> The marketing paragraph at the top, "NextGen MapReduce", etc are all 
> marketing rather than actual descriptions. It also needs some general 
> updates, esp given it reads as though 0.23 was just released yesterday.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3616) determine how to generate YARN container events

2015-05-11 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-3616:
---

Assignee: Naganarasimha G R

> determine how to generate YARN container events
> ---
>
> Key: YARN-3616
> URL: https://issues.apache.org/jira/browse/YARN-3616
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>
> The initial design called for the node manager to write YARN container events 
> to take advantage of the distributed writes. RM acting as a sole writer of 
> all YARN container events would have significant scalability problems.
> Still, there are some types of events that are not captured by the NM. The 
> current implementation has both: RM writing container events and NM writing 
> container events.
> We need to sort this out, and decide how we can write all needed container 
> events in a scalable manner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation

2015-05-11 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538252#comment-14538252
 ] 

Thomas Graves commented on YARN-3434:
-

What's your question exactly?  For branch patches, Jenkins has never been 
hooked up. We generally download the patch, build it, possibly run the tests 
that apply, and commit.

> Interaction between reservations and userlimit can result in significant ULF 
> violation
> --
>
> Key: YARN-3434
> URL: https://issues.apache.org/jira/browse/YARN-3434
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.6.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.8.0
>
> Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch
>
>
> ULF was set to 1.0
> User was able to consume 1.4X queue capacity.
> It looks like when this application launched, it reserved about 1000 
> containers, 8G each, within about 5 seconds. I think this allowed the 
> logic in assignToUser() to allow the userlimit to be surpassed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3170) YARN architecture document needs updating

2015-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538259#comment-14538259
 ] 

Hadoop QA commented on YARN-3170:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   2m 53s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | release audit |   0m 20s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   2m 55s | Site still builds. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| | |   6m 11s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731957/YARN-3170-005.patch |
| Optional Tests | site |
| git revision | trunk / b9cebfc |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7865/console |


This message was automatically generated.

> YARN architecture document needs updating
> -
>
> Key: YARN-3170
> URL: https://issues.apache.org/jira/browse/YARN-3170
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>Assignee: Brahma Reddy Battula
>  Labels: BB2015-05-TBR
> Attachments: YARN-3170-002.patch, YARN-3170-003.patch, 
> YARN-3170-004.patch, YARN-3170-005.patch, YARN-3170.patch
>
>
> The marketing paragraph at the top, "NextGen MapReduce", etc are all 
> marketing rather than actual descriptions. It also needs some general 
> updates, esp given it reads as though 0.23 was just released yesterday.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation

2015-05-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538265#comment-14538265
 ] 

Wangda Tan commented on YARN-3434:
--

Just read [~aw]'s comment: 
https://issues.apache.org/jira/browse/HADOOP-11746?focusedCommentId=14499458&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14499458.
 Now it can only support branches after branch-2.7. So I will run all RM tests 
locally for YARN-3434 and commit.

> Interaction between reservations and userlimit can result in significant ULF 
> violation
> --
>
> Key: YARN-3434
> URL: https://issues.apache.org/jira/browse/YARN-3434
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.6.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.8.0
>
> Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch
>
>
> ULF was set to 1.0
> User was able to consume 1.4X queue capacity.
> It looks like when this application launched, it reserved about 1000 
> containers, 8G each, within about 5 seconds. I think this allowed the 
> logic in assignToUser() to allow the userlimit to be surpassed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3616) determine how to generate YARN container events

2015-05-11 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538303#comment-14538303
 ] 

Naganarasimha G R commented on YARN-3616:
-

I would like to continue working on this issue :).
Also, to capture one important point from [~Vinodkv]'s review:
bq. The missing dots occur when a container's life-cycle ends either on the RM 
or the AM. We can take a dual pronged approach here? That or we make the 
RM-publisher itself a distributed push.
IMO the dual-pronged approach would be better: we can rely on NMs to post 
normal life-cycle events, and in the rare cases where the NM can't handle it, 
the RM publishes events directly to ATS.
Also, a distributed push might not work here, because in the cases Vinod 
mentioned the NM might not be able to handle publishing: the TimelineCollector 
might not have been created, as no container was created on the NM side for 
that app. Correct me if I am wrong.

> determine how to generate YARN container events
> ---
>
> Key: YARN-3616
> URL: https://issues.apache.org/jira/browse/YARN-3616
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>
> The initial design called for the node manager to write YARN container events 
> to take advantage of the distributed writes. RM acting as a sole writer of 
> all YARN container events would have significant scalability problems.
> Still, there are some types of events that are not captured by the NM. The 
> current implementation has both: RM writing container events and NM writing 
> container events.
> We need to sort this out, and decide how we can write all needed container 
> events in a scalable manner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3595) Performance optimization using connection cache of Phoenix timeline writer

2015-05-11 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538341#comment-14538341
 ] 

Li Lu commented on YARN-3595:
-

Hi [~sjlee0], thanks for the suggestions. I think you're right that most 
complexities come from having a cache rather than a pool for those connections. 
I'll look into alternative solutions. 

> Performance optimization using connection cache of Phoenix timeline writer
> --
>
> Key: YARN-3595
> URL: https://issues.apache.org/jira/browse/YARN-3595
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
>
> The story about the connection cache in Phoenix timeline storage is a little 
> bit long. In YARN-3033 we planned to have a shared writer layer for all 
> collectors in the same collector manager. In this way we can better reuse the 
> same heavy-weight storage layer connection; therefore it's more friendly to 
> conventional storage layers, whose connections are typically heavy-weight. 
> Phoenix, on the other hand, implements its own connection interface layer to 
> be light-weight and thread-unsafe. To make these connections work with our 
> "multiple collector, single writer" model, we're adding a thread-indexed 
> connection cache. However, many performance-critical factors are yet to be 
> tested. 
> In this JIRA we're tracing performance optimization efforts using this 
> connection cache. Previously we had a draft, but there was one implementation 
> challenge on cache evictions: There may be races between Guava cache's 
> removal listener calls (which close the connection) and normal references to 
> the connection. We need to carefully define the way they synchronize. 
> Performance-wise, at the very beginning stage we may need to understand:
> # If the current, thread-based indexing is an appropriate approach, or we can 
> use some better ways to index the connections. 
> # the best size of the cache, presumably as the proposed default value of a 
> configuration. 
> # how long we need to preserve a connection in the cache. 
> Please feel free to add to this list. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation

2015-05-11 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538346#comment-14538346
 ] 

Allen Wittenauer commented on YARN-3434:


You can run test-patch.sh locally and specify the branch using --branch.

> Interaction between reservations and userlimit can result in significant ULF 
> violation
> --
>
> Key: YARN-3434
> URL: https://issues.apache.org/jira/browse/YARN-3434
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.6.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.8.0
>
> Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch
>
>
> ULF was set to 1.0
> User was able to consume 1.4X queue capacity.
> It looks like when this application launched, it reserved about 1000 
> containers, 8G each, within about 5 seconds. I think this allowed the 
> logic in assignToUser() to allow the userlimit to be surpassed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation

2015-05-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538367#comment-14538367
 ] 

Wangda Tan commented on YARN-3434:
--

Thanks Allen! Trying it.

> Interaction between reservations and userlimit can result in significant ULF 
> violation
> --
>
> Key: YARN-3434
> URL: https://issues.apache.org/jira/browse/YARN-3434
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.6.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.8.0
>
> Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch
>
>
> ULF was set to 1.0
> User was able to consume 1.4X queue capacity.
> It looks like when this application launched, it reserved about 1000 
> containers, each 8G each, within about 5 seconds. I think this allowed the 
> logic in assignToUser() to allow the userlimit to be surpassed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-05-11 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538437#comment-14538437
 ] 

Zhijie Shen commented on YARN-3044:
---

Sorry to put in my comments at the last minute:

1. I'm still not sure why it is necessary to have RMContainerEntity. Whether 
the container entity comes from the RM or the NM, it's about the container's 
info. Any reason we want to differentiate the two? On the reader side, if I 
want to list all containers of an app, should I return RMContainerEntity or 
ContainerEntity? I'm inclined to have only ContainerEntity, with the RM and NM 
putting different info/events about it based on their knowledge.

2. Shouldn't the v1 and v2 publishers differ only at publishEvent? It seems 
that we duplicate more code than that. And perhaps defining and implementing 
SystemMetricsEvent.toTimelineEvent can further clean up the code.

3. I saw v2 is going to send the config, but where is the config coming from? 
Did we conclude who sends the config and how? IAC, sending the config seems to 
be half done. Also, we can use {{entity.addConfigs(event.getConfig());}}; 
there's no need to iterate over the config collection and put each config one 
by one.

4. yarn.system-metrics-publisher.rm.publish.container-metrics -> 
yarn.rm.system-metrics-publisher.emit-container-events?
{code}
public static final String RM_PUBLISH_CONTAINER_METRICS_ENABLED = YARN_PREFIX
    + "system-metrics-publisher.rm.publish.container-metrics";
public static final boolean DEFAULT_RM_PUBLISH_CONTAINER_METRICS_ENABLED =
    false;
{code}
Moreover, I also think we should not have 
"yarn.system-metrics-publisher.enabled" either, and should reuse the existing 
config. This isn't limited to the RM metrics publisher; it applies to all 
existing ATS services. The better practice, IMHO, is to reuse the existing 
config, and we can have a global config (or env var) timeline-service.version 
to determine whether the service is enabled with the v1 or v2 implementation. 
Anyway, it's a separate problem; I'll file a separate jira for it.

5. Methods/inner classes in SystemMetricsPublisher don't need to be changed to 
"public". Default (package-private) access is enough to reach them?

> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>  Labels: BB2015-05-TBR
> Attachments: YARN-3044-YARN-2928.004.patch, 
> YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, 
> YARN-3044-YARN-2928.007.patch, YARN-3044.20150325-1.patch, 
> YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend

2015-05-11 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538466#comment-14538466
 ] 

Vinod Kumar Vavilapalli commented on YARN-3134:
---

Tx folks, this is great progress!

> [Storage implementation] Exploiting the option of using Phoenix to access 
> HBase backend
> ---
>
> Key: YARN-3134
> URL: https://issues.apache.org/jira/browse/YARN-3134
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Li Lu
> Fix For: YARN-2928
>
> Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
> YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
> YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
> YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
> YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
> YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, 
> YARN-3134-YARN-2928.007.patch, YARN-3134DataSchema.pdf, 
> hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out
>
>
> Quote the introduction on Phoenix web page:
> {code}
> Apache Phoenix is a relational database layer over HBase delivered as a 
> client-embedded JDBC driver targeting low latency queries over HBase data. 
> Apache Phoenix takes your SQL query, compiles it into a series of HBase 
> scans, and orchestrates the running of those scans to produce regular JDBC 
> result sets. The table metadata is stored in an HBase table and versioned, 
> such that snapshot queries over prior versions will automatically use the 
> correct schema. Direct use of the HBase API, along with coprocessors and 
> custom filters, results in performance on the order of milliseconds for small 
> queries, or seconds for tens of millions of rows.
> {code}
> It may simplify how our implementation reads/writes data from/to HBase, and 
> make it easy to build indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3618) Fix unused variable to get CPU frequency on Windows systems

2015-05-11 Thread Georg Berendt (JIRA)
Georg Berendt created YARN-3618:
---

 Summary: Fix unused variable to get CPU frequency on Windows 
systems
 Key: YARN-3618
 URL: https://issues.apache.org/jira/browse/YARN-3618
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows 7 x64 SP1
Reporter: Georg Berendt
Priority: Minor


In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, there 
is an unused variable for CPU frequency.

" /** {@inheritDoc} */
  @Override
  public long getCpuFrequency() {
refreshIfNeeded();
return -1;   
  }"

Please change '-1' to use 'cpuFrequencyKhz'.

org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java
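
A sketch of the requested change, assuming cpuFrequencyKhz is the field that refreshIfNeeded() populates:
{code}
/** {@inheritDoc} */
@Override
public long getCpuFrequency() {
  refreshIfNeeded();
  return cpuFrequencyKhz; // previously hard-coded to -1
}
{code}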





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3617) Fix unused variable to get CPU frequency on Windows systems

2015-05-11 Thread Georg Berendt (JIRA)
Georg Berendt created YARN-3617:
---

 Summary: Fix unused variable to get CPU frequency on Windows 
systems
 Key: YARN-3617
 URL: https://issues.apache.org/jira/browse/YARN-3617
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows 7 x64 SP1
Reporter: Georg Berendt
Priority: Minor


In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, there 
is an unused variable for CPU frequency.

" /** {@inheritDoc} */
  @Override
  public long getCpuFrequency() {
refreshIfNeeded();
return -1;   
  }"

Please change '-1' to use 'cpuFrequencyKhz'.

org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-05-11 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-3619:


 Summary: ContainerMetrics unregisters during getMetrics and leads 
to ConcurrentModificationException
 Key: YARN-3619
 URL: https://issues.apache.org/jira/browse/YARN-3619
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.7.0
Reporter: Jason Lowe


ContainerMetrics is able to unregister itself during the getMetrics method, but 
that method can be called by MetricsSystemImpl.sampleMetrics which is trying to 
iterate the sources.  This leads to a ConcurrentModificationException log like 
this:
{noformat}
2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
impl.MetricsSystemImpl: java.util.ConcurrentModificationException
{noformat}
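
A stand-alone Java illustration of the failure mode, as an analogy rather than the actual YARN code path: a source that unregisters itself while the caller iterates the source collection.
{code}
import java.util.LinkedHashMap;
import java.util.Map;

public class CmeDemo {
  public static void main(String[] args) {
    Map<String, Runnable> sources = new LinkedHashMap<>();
    // This source removes itself when sampled, like ContainerMetrics
    // unregistering during getMetrics.
    sources.put("container", () -> sources.remove("container"));
    sources.put("node", () -> { });
    for (Runnable source : sources.values()) {
      source.run(); // ConcurrentModificationException on the next iteration
    }
  }
}
{code}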



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-05-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538581#comment-14538581
 ] 

Jason Lowe commented on YARN-3619:
--

This appears to have been caused by YARN-2984.  [~kasha] would you mind taking 
a look?

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-05-11 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reassigned YARN-3619:
--

Assignee: Karthik Kambatla

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: Karthik Kambatla
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3620) MetricsSystemImpl fails to show backtrace when an error occurs

2015-05-11 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-3620:


 Summary: MetricsSystemImpl fails to show backtrace when an error 
occurs
 Key: YARN-3620
 URL: https://issues.apache.org/jira/browse/YARN-3620
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jason Lowe
Assignee: Jason Lowe


While investigating YARN-3619 it was frustrating that MetricsSystemImpl was 
logging a ConcurrentModificationException but without any backtrace.  Logging a 
backtrace would be very beneficial to tracking down the cause of the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3621) FairScheduler doesn't count AM vcores towards max-share

2015-05-11 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-3621:
--

 Summary: FairScheduler doesn't count AM vcores towards max-share
 Key: YARN-3621
 URL: https://issues.apache.org/jira/browse/YARN-3621
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.7.1
Reporter: Karthik Kambatla


FairScheduler seems to not count AM vcores towards max-vcores. On a queue with 
maxVcores set to 1, I am able to run a sleep job. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3622) Enable application client to communicate with new timeline service

2015-05-11 Thread Zhijie Shen (JIRA)
Zhijie Shen created YARN-3622:
-

 Summary: Enable application client to communicate with new 
timeline service
 Key: YARN-3622
 URL: https://issues.apache.org/jira/browse/YARN-3622
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen


A YARN application has a client and an AM. We have a story to make 
TimelineClient work inside the AM for v2, but not for the client. 
TimelineClient inside the app client needs to be taken care of too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3623) Having the config to indicate the timeline service version

2015-05-11 Thread Zhijie Shen (JIRA)
Zhijie Shen created YARN-3623:
-

 Summary: Having the config to indicate the timeline service version
 Key: YARN-3623
 URL: https://issues.apache.org/jira/browse/YARN-3623
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen


So far the RM, MR AM, and DA AM have added/changed configs to enable the 
feature of writing timeline data to the v2 server. It would be good to have a 
YARN timeline-service.version config, like timeline-service.enable, to 
indicate the version of the timeline service running with a given YARN 
cluster. It's beneficial for users moving from v1 to v2, as they don't need to 
change the existing config, but just switch this config from v1 to v2. And 
each framework doesn't need to have its own v1/v2 config.
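
For illustration, a sketch of how a framework might branch on such a config; the key yarn.timeline-service.version and its float form are assumptions based on the proposal above, not an existing property:
{code}
import org.apache.hadoop.conf.Configuration;

public class TimelineVersionCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Hypothetical key and default, per the proposal above.
    float version = conf.getFloat("yarn.timeline-service.version", 1.0f);
    if (version >= 2.0f) {
      System.out.println("wire up the v2 timeline writer");
    } else {
      System.out.println("fall back to the v1 timeline client");
    }
  }
}
{code}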



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not cached in RMApps

2015-05-11 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-3505:

Attachment: YARN-3505.4.patch

> Node's Log Aggregation Report with SUCCEED should not cached in RMApps
> --
>
> Key: YARN-3505
> URL: https://issues.apache.org/jira/browse/YARN-3505
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 2.8.0
>Reporter: Junping Du
>Assignee: Xuan Gong
>Priority: Critical
> Attachments: YARN-3505.1.patch, YARN-3505.2.patch, 
> YARN-3505.2.rebase.patch, YARN-3505.3.patch, YARN-3505.4.patch
>
>
> Per discussions in YARN-1402, we shouldn't cache all node's log aggregation 
> reports in RMApps for always, especially for those finished with SUCCEED.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not cached in RMApps

2015-05-11 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538716#comment-14538716
 ] 

Xuan Gong commented on YARN-3505:
-

Uploaded a new patch to address all the comments.


> Node's Log Aggregation Report with SUCCEED should not cached in RMApps
> --
>
> Key: YARN-3505
> URL: https://issues.apache.org/jira/browse/YARN-3505
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 2.8.0
>Reporter: Junping Du
>Assignee: Xuan Gong
>Priority: Critical
> Attachments: YARN-3505.1.patch, YARN-3505.2.patch, 
> YARN-3505.2.rebase.patch, YARN-3505.3.patch, YARN-3505.4.patch
>
>
> Per discussions in YARN-1402, we shouldn't cache all node's log aggregation 
> reports in RMApps for always, especially for those finished with SUCCEED.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3624) ApplicationHistoryServer reverses the order of the filters it gets

2015-05-11 Thread Mit Desai (JIRA)
Mit Desai created YARN-3624:
---

 Summary: ApplicationHistoryServer reverses the order of the 
filters it gets
 Key: YARN-3624
 URL: https://issues.apache.org/jira/browse/YARN-3624
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Mit Desai
Assignee: Mit Desai


ApplicationHistoryServer should not alter the order in which it gets the 
filter chain. Additional filters should be added at the end of the chain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3624) ApplicationHistoryServer reverses the order of the filters it gets

2015-05-11 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-3624:

Attachment: YARN-3624.patch

Attaching the patch.

> ApplicationHistoryServer reverses the order of the filters it gets
> --
>
> Key: YARN-3624
> URL: https://issues.apache.org/jira/browse/YARN-3624
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Mit Desai
>Assignee: Mit Desai
> Attachments: YARN-3624.patch
>
>
> ApplicationHistoryServer should not alter the order in which it gets the 
> filter chain. Additional filters should be added at the end of the chain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation

2015-05-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538740#comment-14538740
 ] 

Wangda Tan commented on YARN-3434:
--

Ran it locally; all tests passed. Committing.

> Interaction between reservations and userlimit can result in significant ULF 
> violation
> --
>
> Key: YARN-3434
> URL: https://issues.apache.org/jira/browse/YARN-3434
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.6.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.8.0
>
> Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch
>
>
> ULF was set to 1.0
> User was able to consume 1.4X queue capacity.
> It looks like when this application launched, it reserved about 1000 
> containers, 8G each, within about 5 seconds. I think this allowed the 
> logic in assignToUser() to allow the userlimit to be surpassed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3625) RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put

2015-05-11 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created YARN-3625:
-

 Summary: RollingLevelDBTimelineStore Incorrectly Forbids Related 
Entity in Same Put
 Key: YARN-3625
 URL: https://issues.apache.org/jira/browse/YARN-3625
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3625) RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put

2015-05-11 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-3625:
--
Attachment: YARN-3625.1.patch

> RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put
> --
>
> Key: YARN-3625
> URL: https://issues.apache.org/jira/browse/YARN-3625
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-3625.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3625) RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put

2015-05-11 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-3625:
--
Description: RollingLevelDBTimelineStore batches all entities in the same 
put to improve performance. However, this causes an error when relating to an 
entity in the same put.

> RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put
> --
>
> Key: YARN-3625
> URL: https://issues.apache.org/jira/browse/YARN-3625
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-3625.1.patch
>
>
> RollingLevelDBTimelineStore batches all entities in the same put to improve 
> performance. However, this causes an error when an entity relates to another 
> entity in the same put.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3529) Add miniHBase cluster and Phoenix support to ATS v2 unit tests

2015-05-11 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538789#comment-14538789
 ] 

Zhijie Shen commented on YARN-3529:
---

Thanks for the patch, Li! Some comments:

1. You should define the dependency version under {{./hadoop-project/pom.xml}}. 
Then you can remove version info like 
{{$\{phoenix.version\}}}.

2. Do we need to make all of the following configurable, not only in the unit 
test? At least for the POC, do we need to configure connString to point to a 
real HBase cluster? (A sketch of one option follows below.)
{code}
@VisibleForTesting
static String connString = "jdbc:phoenix:localhost:2181:/hbase";
@VisibleForTesting
static Properties connProperties = new Properties();
{code}

3. In TestPhoenixTimelineWriterImpl, shall we tear down the HBase cluster as 
well after dropping the tables?
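
For illustration, a minimal sketch of reading the connection string from 
configuration rather than the hard-coded constant (the property name is 
hypothetical, not an existing YARN key):
{code}
// Hypothetical property name, for illustration only.
static final String PHOENIX_CONN_PROPERTY =
    "yarn.timeline-service.phoenix.connection-string";

Connection getConnection(Configuration conf) throws SQLException {
  String connString = conf.get(PHOENIX_CONN_PROPERTY,
      "jdbc:phoenix:localhost:2181:/hbase");
  return DriverManager.getConnection(connString, connProperties);
}
{code}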

> Add miniHBase cluster and Phoenix support to ATS v2 unit tests
> --
>
> Key: YARN-3529
> URL: https://issues.apache.org/jira/browse/YARN-3529
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: AbstractMiniHBaseClusterTest.java, 
> YARN-3529-YARN-2928.000.patch, output_minicluster2.txt
>
>
> After we have our HBase and Phoenix writer implementations, we may want to 
> find a way to set up HBase and Phoenix in our unit tests. We need to do this 
> integration before the branch got merged back to trunk. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-1886) Exceptions in the RM log while cleaning up app attempt

2015-05-11 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He resolved YARN-1886.
---
Resolution: Duplicate

> Exceptions in the RM log while cleaning up app attempt
> --
>
> Key: YARN-1886
> URL: https://issues.apache.org/jira/browse/YARN-1886
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Arpit Gupta
>
> Noticed exceptions in the RM log while HA tests were running where we killed 
> RM/AM/NameNode etc.
> RM failed over and the new active RM tried to kill the old app attempt and 
> ran into this exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1886) Exceptions in the RM log while cleaning up app attempt

2015-05-11 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538800#comment-14538800
 ] 

Jian He commented on YARN-1886:
---

YARN-1885 fixed this problem; closing this.

> Exceptions in the RM log while cleaning up app attempt
> --
>
> Key: YARN-1886
> URL: https://issues.apache.org/jira/browse/YARN-1886
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Arpit Gupta
>
> Noticed exceptions in the RM log while HA tests were running where we killed 
> RM/AM/NameNode etc.
> RM failed over and the new active RM tried to kill the old app attempt and 
> ran into this exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation

2015-05-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538803#comment-14538803
 ] 

Hudson commented on YARN-3434:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7799 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7799/])
Moved YARN-3434. (Interaction between reservations and userlimit can result in 
significant ULF violation.) From 2.8.0 to 2.7.1 (wangda: rev 
1952f9395870e7b631d43418e075e774b9d2)
* hadoop-yarn-project/CHANGES.txt


> Interaction between reservations and userlimit can result in significant ULF 
> violation
> --
>
> Key: YARN-3434
> URL: https://issues.apache.org/jira/browse/YARN-3434
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.6.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.8.0, 2.7.1
>
> Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, 
> YARN-3434.patch, YARN-3434.patch
>
>
> ULF was set to 1.0.
> The user was able to consume 1.4X the queue capacity.
> It looks like when this application launched, it reserved about 1000 
> containers, 8G each, within about 5 seconds. I think this allowed the logic 
> in assignToUser() to let the user limit be surpassed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2000) Fix ordering of starting services inside the RM

2015-05-11 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538822#comment-14538822
 ] 

Jian He commented on YARN-2000:
---

bq. Probably we can have state-store stop last so that all the other services 
are stopped first and won't accept more requests and send events to state-store.
Even if state-store stops first, the API calls such as submitApplication won't 
return true until the state-store operation completes. 
Nothing to be done; closing.
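
For context, a CompositeService stops its child services in the reverse of the 
order they were added, so stop order can already be controlled by the add 
order; a minimal sketch (service names are illustrative):
{code}
// Children are stopped in reverse add order, so adding the state store
// first would mean it is the last service to stop.
addService(stateStore);                  // stopped last
addService(scheduler);
addService(resourceTrackerService);
addService(applicationMasterService);    // stopped first
{code}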

> Fix ordering of starting services inside the RM
> ---
>
> Key: YARN-2000
> URL: https://issues.apache.org/jira/browse/YARN-2000
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
>
> The order of starting services in the RM would be:
> - Recovery of the apps/attempts
> - Start the scheduler and add scheduler apps/attempts
> - Start ResourceTrackerService and re-populate the containers in the 
> scheduler based on the container info from NMs
> - ApplicationMasterService either doesn't start, or starts but blocks until 
> all the previous NMs register.
> Other than these, there are other services like ClientRMService and Webapps 
> whose start order we need to think about too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-2000) Fix ordering of starting services inside the RM

2015-05-11 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He resolved YARN-2000.
---
Resolution: Invalid

> Fix ordering of starting services inside the RM
> ---
>
> Key: YARN-2000
> URL: https://issues.apache.org/jira/browse/YARN-2000
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
>
> The order of starting services in the RM would be:
> - Recovery of the apps/attempts
> - Start the scheduler and add scheduler apps/attempts
> - Start ResourceTrackerService and re-populate the containers in the 
> scheduler based on the container info from NMs
> - ApplicationMasterService either doesn't start, or starts but blocks until 
> all the previous NMs register.
> Other than these, there are other services like ClientRMService and Webapps 
> whose start order we need to think about too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-05-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538836#comment-14538836
 ] 

Wangda Tan commented on YARN-3362:
--

Hi Naga,
Thanks for updating.

1) To your question: 
https://issues.apache.org/jira/browse/YARN-3362?focusedCommentId=14537181&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14537181,
you can refer to YARN-2824 for more information about why the default capacity 
for labeled resources is set to zero.
The default max-capacity is 100 because a queue can use such resources without 
configuring them. Let me know if you have more questions.

2) About showing the resources of partitions, I think it's very helpful. I 
think you can include the used resources of each partition as well. You can 
file a separate ticket if that is hard to add within this one.

3) About "Hide Hierarchy", I think it's good for queue capacity comparison, but 
admins may get confused after checking "Hide Hierarchy"; it would be better to 
add it somewhere else instead of modifying the queue UI itself.

> Add node label usage in RM CapacityScheduler web UI
> ---
>
> Key: YARN-3362
> URL: https://issues.apache.org/jira/browse/YARN-3362
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, resourcemanager, webapp
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue 
> Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, 
> 2015.05.10_3362_Queue_Hierarchy.png, CSWithLabelsView.png, 
> No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 
> at 11.42.17 AM.png, YARN-3362.20150428-3-modified.patch, 
> YARN-3362.20150428-3.patch, YARN-3362.20150506-1.patch, 
> YARN-3362.20150507-1.patch, YARN-3362.20150510-1.patch, 
> YARN-3362.20150511-1.patch, capacity-scheduler.xml
>
>
> We don't have node label usage in the RM CapacityScheduler web UI now; 
> without this, it is hard for users to understand what happened on nodes that 
> have labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-3618) Fix unused variable to get CPU frequency on Windows systems

2015-05-11 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula resolved YARN-3618.

Resolution: Duplicate

> Fix unused variable to get CPU frequency on Windows systems
> ---
>
> Key: YARN-3618
> URL: https://issues.apache.org/jira/browse/YARN-3618
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows 7 x64 SP1
>Reporter: Georg Berendt
>Priority: Minor
>  Labels: easyfix
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, 
> there is an unused variable for CPU frequency.
> " /** {@inheritDoc} */
>   @Override
>   public long getCpuFrequency() {
> refreshIfNeeded();
> return -1;   
>   }"
> Please change '-1' to use 'cpuFrequencyKhz'.
> org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be

2015-05-11 Thread Craig Welch (JIRA)
Craig Welch created YARN-3626:
-

 Summary: On Windows localized resources are not moved to the front 
of the classpath when they should be
 Key: YARN-3626
 URL: https://issues.apache.org/jira/browse/YARN-3626
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
 Environment: Windows
Reporter: Craig Welch
Assignee: Craig Welch


In response to the mapreduce.job.user.classpath.first setting, the classpath is 
ordered differently so that localized resources will appear before system 
classpath resources when tasks execute.  On Windows this does not work because 
the localized resources are not linked into their final location when the 
classpath jar is created.  To compensate for that, localized jar resources are 
added directly to the classpath generated for the jar rather than being 
discovered from the localized directories.  Unfortunately, they are always 
appended to the classpath, and so are never preferred over system resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be

2015-05-11 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538849#comment-14538849
 ] 

Craig Welch commented on YARN-3626:
---

To resolve this, the situation should be detected and, when applicable, 
localized resources should be put at the beginning of the classpath rather than 
the end.

> On Windows localized resources are not moved to the front of the classpath 
> when they should be
> --
>
> Key: YARN-3626
> URL: https://issues.apache.org/jira/browse/YARN-3626
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
> Environment: Windows
>Reporter: Craig Welch
>Assignee: Craig Welch
>
> In response to the mapreduce.job.user.classpath.first setting, the classpath 
> is ordered differently so that localized resources will appear before system 
> classpath resources when tasks execute.  On Windows this does not work 
> because the localized resources are not linked into their final location when 
> the classpath jar is created.  To compensate for that, localized jar 
> resources are added directly to the classpath generated for the jar rather 
> than being discovered from the localized directories.  Unfortunately, they 
> are always appended to the classpath, and so are never preferred over system 
> resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3618) Fix unused variable to get CPU frequency on Windows systems

2015-05-11 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538844#comment-14538844
 ] 

Brahma Reddy Battula commented on YARN-3618:


Resolved as duplicate of YARN-3617, as both are the same.
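
For reference, a minimal sketch of the fix tracked in YARN-3617, assuming 
cpuFrequencyKhz is populated by refreshIfNeeded():
{code}
/** {@inheritDoc} */
@Override
public long getCpuFrequency() {
  refreshIfNeeded();
  return cpuFrequencyKhz;  // return the refreshed value instead of the -1 placeholder
}
{code}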

> Fix unused variable to get CPU frequency on Windows systems
> ---
>
> Key: YARN-3618
> URL: https://issues.apache.org/jira/browse/YARN-3618
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows 7 x64 SP1
>Reporter: Georg Berendt
>Priority: Minor
>  Labels: easyfix
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, 
> there is an unused variable for CPU frequency.
> " /** {@inheritDoc} */
>   @Override
>   public long getCpuFrequency() {
> refreshIfNeeded();
> return -1;   
>   }"
> Please change '-1' to use 'cpuFrequencyKhz'.
> org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1297) Miscellaneous Fair Scheduler speedups

2015-05-11 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-1297:
--
Attachment: YARN-1297.4.patch

Updating the patch to fix the test failure:

* Had missed accounting for app container recovery during scheduler recovery.
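
For context, a minimal sketch of the signum-to-branches change described in the 
issue summary below (variable names are illustrative, not the actual patch):
{code}
// Before: Math.signum forces a floating-point computation on every comparison.
int res = (int) Math.signum(usage1 - usage2);

// After: direct branches avoid the floating-point call entirely.
int res;
if (usage1 < usage2) {
  res = -1;
} else if (usage1 > usage2) {
  res = 1;
} else {
  res = 0;
}
{code}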

> Miscellaneous Fair Scheduler speedups
> -
>
> Key: YARN-1297
> URL: https://issues.apache.org/jira/browse/YARN-1297
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Sandy Ryza
>Assignee: Arun Suresh
>  Labels: BB2015-05-TBR
> Attachments: YARN-1297-1.patch, YARN-1297-2.patch, YARN-1297.3.patch, 
> YARN-1297.4.patch, YARN-1297.patch, YARN-1297.patch
>
>
> I ran the Fair Scheduler's core scheduling loop through a profiler tool and 
> identified a bunch of minimally invasive changes that can shave off a few 
> milliseconds.
> The main one is demoting a couple of INFO log messages to DEBUG, which 
> brought my benchmark down from 16000 ms to 6000 ms.
> A few others (which had way less of an impact) were
> * Most of the time in comparisons was being spent in Math.signum.  I switched 
> this to direct ifs and elses and it halved the percent of time spent in 
> comparisons.
> * I removed some unnecessary instantiations of Resource objects
> * I made it so that queues' usage wasn't calculated from the applications up 
> each time getResourceUsage was called.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be

2015-05-11 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3626:
--
Attachment: YARN-3626.0.patch

The attached patch propagates the conditional as a YARN configuration option 
and moves localized resources to the front of the classpath when appropriate.
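
For illustration, a minimal sketch of the approach (the configuration key and 
variable names are hypothetical, not necessarily what the patch uses):
{code}
// Hypothetical flag propagated from mapreduce.job.user.classpath.first.
boolean userClasspathFirst =
    conf.getBoolean("yarn.app.container.user-classpath-first", false);

List<String> classpath = new ArrayList<String>();
if (userClasspathFirst) {
  classpath.addAll(localizedJars);    // localized resources take precedence
  classpath.addAll(systemClasspath);
} else {
  classpath.addAll(systemClasspath);  // current behavior: appended at the end
  classpath.addAll(localizedJars);
}
{code}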

> On Windows localized resources are not moved to the front of the classpath 
> when they should be
> --
>
> Key: YARN-3626
> URL: https://issues.apache.org/jira/browse/YARN-3626
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
> Environment: Windows
>Reporter: Craig Welch
>Assignee: Craig Welch
> Attachments: YARN-3626.0.patch
>
>
> In response to the mapreduce.job.user.classpath.first setting, the classpath 
> is ordered differently so that localized resources will appear before system 
> classpath resources when tasks execute.  On Windows this does not work 
> because the localized resources are not linked into their final location when 
> the classpath jar is created.  To compensate for that, localized jar 
> resources are added directly to the classpath generated for the jar rather 
> than being discovered from the localized directories.  Unfortunately, they 
> are always appended to the classpath, and so are never preferred over system 
> resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3618) Fix unused variable to get CPU frequency on Windows systems

2015-05-11 Thread Georg Berendt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538855#comment-14538855
 ] 

Georg Berendt commented on YARN-3618:
-

Sorry, when posting, the dialogue must have sent two POSTs.

> Fix unused variable to get CPU frequency on Windows systems
> ---
>
> Key: YARN-3618
> URL: https://issues.apache.org/jira/browse/YARN-3618
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows 7 x64 SP1
>Reporter: Georg Berendt
>Priority: Minor
>  Labels: easyfix
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, 
> there is an unused variable for CPU frequency.
> " /** {@inheritDoc} */
>   @Override
>   public long getCpuFrequency() {
> refreshIfNeeded();
> return -1;   
>   }"
> Please change '-1' to use 'cpuFrequencyKhz'.
> org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.

2015-05-11 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated YARN-221:
-
Attachment: YARN-221-trunk-v5.patch

Here is the new patch with updated unit tests.

> NM should provide a way for AM to tell it not to aggregate logs.
> 
>
> Key: YARN-221
> URL: https://issues.apache.org/jira/browse/YARN-221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Robert Joseph Evans
>Assignee: Ming Ma
>  Labels: BB2015-05-TBR
> Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, 
> YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch
>
>
> The NodeManager should provide a way for an AM to tell it that either the 
> logs should not be aggregated, that they should be aggregated with a high 
> priority, or that they should be aggregated but with a lower priority.  The 
> AM should be able to do this in the ContainerLaunch context to provide a 
> default value, but should also be able to update the value when the container 
> is released.
> This would allow the NM to not aggregate logs in some cases, and to avoid 
> connecting to the NN at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-05-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538878#comment-14538878
 ] 

Wangda Tan commented on YARN-3362:
--

The latest patch LGTM.

> Add node label usage in RM CapacityScheduler web UI
> ---
>
> Key: YARN-3362
> URL: https://issues.apache.org/jira/browse/YARN-3362
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, resourcemanager, webapp
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue 
> Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, 
> 2015.05.10_3362_Queue_Hierarchy.png, CSWithLabelsView.png, 
> No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 
> at 11.42.17 AM.png, YARN-3362.20150428-3-modified.patch, 
> YARN-3362.20150428-3.patch, YARN-3362.20150506-1.patch, 
> YARN-3362.20150507-1.patch, YARN-3362.20150510-1.patch, 
> YARN-3362.20150511-1.patch, capacity-scheduler.xml
>
>
> We don't have node label usage in the RM CapacityScheduler web UI now; 
> without this, it is hard for users to understand what happened on nodes that 
> have labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3521) Support return structured NodeLabel objects in REST API when call getClusterNodeLabels

2015-05-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538890#comment-14538890
 ] 

Wangda Tan commented on YARN-3521:
--

Thanks for updating, [~sunilg],
Latest patch LGTM, +1.

> Support return structured NodeLabel objects in REST API when call 
> getClusterNodeLabels
> --
>
> Key: YARN-3521
> URL: https://issues.apache.org/jira/browse/YARN-3521
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Sunil G
> Attachments: 0001-YARN-3521.patch, 0002-YARN-3521.patch, 
> 0003-YARN-3521.patch, 0004-YARN-3521.patch, 0005-YARN-3521.patch, 
> 0006-YARN-3521.patch, 0007-YARN-3521.patch
>
>
> In YARN-3413, the yarn cluster CLI returns NodeLabel objects instead of 
> Strings; we should make the same change on the REST API side to keep them 
> consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not cached in RMApps

2015-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538897#comment-14538897
 ] 

Hadoop QA commented on YARN-3505:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 12s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 50s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 51s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m  7s | The applied patch generated  1 
new checkstyle issues (total was 1, now 2). |
| {color:red}-1{color} | checkstyle |   2m 22s | The applied patch generated  2 
new checkstyle issues (total was 70, now 63). |
| {color:green}+1{color} | whitespace |   0m 21s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 41s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 35s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 21s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   0m 24s | Tests passed in 
hadoop-yarn-server-common. |
| {color:green}+1{color} | yarn tests |   6m 10s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| {color:red}-1{color} | yarn tests |  51m 55s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 102m  7s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCResponseId |
|   | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12732030/YARN-3505.4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ea11590 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 
https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/diffcheckstylehadoop-yarn-server-common.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7866/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7866/console |


This message was automatically generated.

> Node's Log Aggregation Report with SUCCEED should not cached in RMApps
> --
>
> Key: YARN-3505
> URL: https://issues.apache.org/jira/browse/YARN-3505
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 2.8.0
>Reporter: Junping Du
>Assignee: Xuan Gong
>Priority: Critical
> Attachments: YARN-3505.1.patch, YARN-3505.2.patch, 
> YARN-3505.2.rebase.patch, YARN-3505.3.patch, YARN-3505.4.patch
>
>
> Per discussions in YARN-1402, we shouldn't cache all nodes' log aggregation 
> reports in RMApps forever, especially those finished with SUCCEED.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3624) ApplicationHistoryServer reverses the order of the filters it gets

2015-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538898#comment-14538898
 ] 

Hadoop QA commented on YARN-3624:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 48s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 38s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 27s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 49s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   3m  3s | Tests passed in 
hadoop-yarn-server-applicationhistoryservice. |
| | |  38m 59s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12732032/YARN-3624.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 444836b |
| hadoop-yarn-server-applicationhistoryservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7867/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7867/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7867/console |


This message was automatically generated.

> ApplicationHistoryServer reverses the order of the filters it gets
> --
>
> Key: YARN-3624
> URL: https://issues.apache.org/jira/browse/YARN-3624
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Mit Desai
>Assignee: Mit Desai
> Attachments: YARN-3624.patch
>
>
> ApplicationHistoryServer should not alter the order in which it gets the 
> filter chain. Additional filters should be added at the end of the chain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3545) Investigate the concurrency issue with the map of timeline collector

2015-05-11 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-3545:

Attachment: YARN-3545-YARN-2928.000.patch

In this patch I'm using a concurrent hash map to replace the synchronized hash 
map. After removing the global lock, we need to consider two cases: concurrent 
putIfAbsent calls, and a concurrent putIfAbsent call racing with a get call. 

The case of a putIfAbsent call racing with a get call is addressed by an 
initialization barrier, since the contention is low. With this solution, in the 
best case each read costs only one volatile variable read, instead of acquiring 
the lock inside the synchronized map. 

The case of multiple concurrent putIfAbsent calls is addressed by speculatively 
allocating a collector and trying to putIfAbsent it into the hash map. The 
method then calls postPut and publishes the new collector to all readers if the 
putIfAbsent call succeeds (returns null). If the putIfAbsent call fails, 
someone else has already allocated a collector and we need to use that one. To 
speed up this case, I used a "fast path" such that the method only tries to 
allocate a collector if there was none at the beginning of the method. 

I'd appreciate comments since I may miss something here...
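
To make the above concrete, a minimal sketch of the speculative putIfAbsent 
pattern (the method shape is illustrative; postPut stands for the publish step 
named above):
{code}
private final ConcurrentMap<ApplicationId, TimelineCollector> collectors =
    new ConcurrentHashMap<ApplicationId, TimelineCollector>();

TimelineCollector putIfAbsent(ApplicationId appId, TimelineCollector collector) {
  // Fast path: reuse a collector that is already published for this app.
  TimelineCollector existing = collectors.get(appId);
  if (existing != null) {
    return existing;
  }
  // Slow path: speculatively publish ours; null means our put won the race.
  existing = collectors.putIfAbsent(appId, collector);
  if (existing == null) {
    postPut(appId, collector);  // initialize and publish to readers
    return collector;
  }
  return existing;  // another thread won the race; use its collector
}
{code}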

> Investigate the concurrency issue with the map of timeline collector
> 
>
> Key: YARN-3545
> URL: https://issues.apache.org/jira/browse/YARN-3545
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Li Lu
> Attachments: YARN-3545-YARN-2928.000.patch
>
>
> See the discussion in YARN-3390 for details. Let's continue the discussion 
> here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3625) RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put

2015-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538905#comment-14538905
 ] 

Hadoop QA commented on YARN-3625:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 58s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 45s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 27s | The applied patch generated  1 
new checkstyle issues (total was 6, now 6). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 49s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   3m 12s | Tests passed in 
hadoop-yarn-server-applicationhistoryservice. |
| | |  39m 25s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12732038/YARN-3625.1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 444836b |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7868/artifact/patchprocess/diffcheckstylehadoop-yarn-server-applicationhistoryservice.txt
 |
| hadoop-yarn-server-applicationhistoryservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7868/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7868/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7868/console |


This message was automatically generated.

> RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put
> --
>
> Key: YARN-3625
> URL: https://issues.apache.org/jira/browse/YARN-3625
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-3625.1.patch
>
>
> RollingLevelDBTimelineStore batches all entities in the same put to improve 
> performance. However, this causes an error when an entity relates to another 
> entity in the same put.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)

2015-05-11 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538911#comment-14538911
 ] 

Zhijie Shen commented on YARN-2900:
---

[~mitdesai], have you got a chance to fix {{java.lang.IllegalStateException: 
STREAM}}?

> Application (Attempt and Container) Not Found in AHS results in Internal 
> Server Error (500)
> ---
>
> Key: YARN-2900
> URL: https://issues.apache.org/jira/browse/YARN-2900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Attachments: YARN-2900-b2.patch, YARN-2900.patch, YARN-2900.patch, 
> YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
> YARN-2900.patch, YARN-2900.patch
>
>
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
>   ... 59 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3539) Compatibility doc to state that ATS v1 is a stable REST API

2015-05-11 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538954#comment-14538954
 ] 

Zhijie Shen commented on YARN-3539:
---

[~ste...@apache.org], did you have a chance to look at my last comment? The doc 
still seems to have some minor issues.

> Compatibility doc to state that ATS v1 is a stable REST API
> ---
>
> Key: YARN-3539
> URL: https://issues.apache.org/jira/browse/YARN-3539
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.7.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-11826-001.patch, HADOOP-11826-002.patch, 
> TimelineServer.html, YARN-3539-003.patch, YARN-3539-004.patch, 
> YARN-3539-005.patch, YARN-3539-006.patch, YARN-3539-007.patch, 
> YARN-3539-008.patch, timeline_get_api_examples.txt
>
>
> The ATS v2 discussion and YARN-2423 have raised the question: "how stable are 
> the ATSv1 APIs"?
> The existing compatibility document actually states that the History Server 
> is [a stable REST 
> API|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#REST_APIs],
>  which effectively means that ATSv1 has already been declared as a stable API.
> Clarify this by patching the compatibility document appropriately.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.

2015-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538959#comment-14538959
 ] 

Hadoop QA commented on YARN-221:


\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 48s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 14s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 49s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 45s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 56s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   7m 56s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  51m 46s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12732060/YARN-221-trunk-v5.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 444836b |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7869/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7869/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7869/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7869/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7869/console |


This message was automatically generated.

> NM should provide a way for AM to tell it not to aggregate logs.
> 
>
> Key: YARN-221
> URL: https://issues.apache.org/jira/browse/YARN-221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Robert Joseph Evans
>Assignee: Ming Ma
>  Labels: BB2015-05-TBR
> Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, 
> YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch
>
>
> The NodeManager should provide a way for an AM to tell it that either the 
> logs should not be aggregated, that they should be aggregated with a high 
> priority, or that they should be aggregated but with a lower priority.  The 
> AM should be able to do this in the ContainerLaunch context to provide a 
> default value, but should also be able to update the value when the container 
> is released.
> This would allow the NM to not aggregate logs in some cases, and to avoid 
> connecting to the NN at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2921) MockRM#waitForState methods can be too slow and flaky

2015-05-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538956#comment-14538956
 ] 

Wangda Tan commented on YARN-2921:
--

Hi [~ozawa],
Some comments:
- In MockAM.waitForState, I don't quite understand the change: 1. Why is 
minWaitMSec needed? 2. Why fail the method when {{waitedMsecs >= 
timeoutMsecs}} is true? I think it should check the current state against the 
expected state.
- In the two MockRM.waitForState methods, I think we should also check 
app.getState() instead of the elapsed time, correct?
- In TestRMRestart, you can use GenericTestUtils.waitFor instead (a sketch 
follows below).
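
For reference, a minimal sketch of the GenericTestUtils.waitFor pattern (the 
app handle and target state are illustrative):
{code}
import com.google.common.base.Supplier;
import org.apache.hadoop.test.GenericTestUtils;

// Poll every 100 ms, failing after 20 s, instead of sleeping a fixed interval.
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    return app.getState() == RMAppState.RUNNING;
  }
}, 100, 20000);
{code}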

> MockRM#waitForState methods can be too slow and flaky
> -
>
> Key: YARN-2921
> URL: https://issues.apache.org/jira/browse/YARN-2921
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.6.0, 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Tsuyoshi Ozawa
> Attachments: YARN-2921.001.patch, YARN-2921.002.patch, 
> YARN-2921.003.patch, YARN-2921.004.patch, YARN-2921.005.patch, 
> YARN-2921.006.patch, YARN-2921.007.patch
>
>
> MockRM#waitForState methods currently sleep for too long (2 seconds and 1 
> second). This leads to slow tests and sometimes failures if the 
> App/AppAttempt moves to another state. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3489) RMServerUtils.validateResourceRequests should only obtain queue info once

2015-05-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538961#comment-14538961
 ] 

Wangda Tan commented on YARN-3489:
--

Committing.

> RMServerUtils.validateResourceRequests should only obtain queue info once
> -
>
> Key: YARN-3489
> URL: https://issues.apache.org/jira/browse/YARN-3489
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
>  Labels: BB2015-05-RFC
> Attachments: YARN-3489.01.patch, YARN-3489.02.patch, 
> YARN-3489.03.patch
>
>
> Since the label support was added, we now get the queue info for each request 
> being validated in SchedulerUtils.validateResourceRequest.  If 
> validateResourceRequests needs to validate a lot of requests at a time (e.g. 
> a large cluster with lots of varied locality in the requests) then it will 
> get the queue info for each request.  Since we build the queue info each 
> time, this generates a lot of unnecessary garbage, as the queue isn't 
> changing between requests.  We should grab the queue info once and pass it 
> down rather than building it again for each request.
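
For context, a minimal sketch of the refactor (method shapes are illustrative, 
not the committed signatures):
{code}
// Before: every request pays for a fresh QueueInfo lookup and allocation.
for (ResourceRequest req : requests) {
  QueueInfo queueInfo = scheduler.getQueueInfo(queueName, false, false);
  SchedulerUtils.validateResourceRequest(req, maxResource, queueInfo);
}

// After: build the queue info once and pass it down for all requests.
QueueInfo queueInfo = scheduler.getQueueInfo(queueName, false, false);
for (ResourceRequest req : requests) {
  SchedulerUtils.validateResourceRequest(req, maxResource, queueInfo);
}
{code}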



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3545) Investigate the concurrency issue with the map of timeline collector

2015-05-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539001#comment-14539001
 ] 

Hadoop QA commented on YARN-3545:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 11s | Pre-patch YARN-2928 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 42s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 45s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 33s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 43s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 38s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   0m 40s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 23s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  37m  3s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-timelineservice |
|  |  Spinning on TimelineCollector.initialized in 
org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.initializationBarrier(TimelineCollector)
  At TimelineCollectorManager.java: At TimelineCollectorManager.java:[line 161] 
|
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12732071/YARN-3545-YARN-2928.000.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / b3b791b |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7870/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-timelineservice.html
 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7870/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7870/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7870/console |


This message was automatically generated.

> Investigate the concurrency issue with the map of timeline collector
> 
>
> Key: YARN-3545
> URL: https://issues.apache.org/jira/browse/YARN-3545
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Li Lu
> Attachments: YARN-3545-YARN-2928.000.patch
>
>
> See the discussion in YARN-3390 for details. Let's continue the discussion 
> here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3489) RMServerUtils.validateResourceRequests should only obtain queue info once

2015-05-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539007#comment-14539007
 ] 

Hudson commented on YARN-3489:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7800 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7800/])
YARN-3489. RMServerUtils.validateResourceRequests should only obtain queue info 
once. (Varun Saxena via wangda) (wangda: rev 
d6f6741296639a73f5306e3ebefec84a40ca03e5)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java


> RMServerUtils.validateResourceRequests should only obtain queue info once
> -
>
> Key: YARN-3489
> URL: https://issues.apache.org/jira/browse/YARN-3489
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
>  Labels: BB2015-05-RFC
> Attachments: YARN-3489.01.patch, YARN-3489.02.patch, 
> YARN-3489.03.patch
>
>
> Since the label support was added, we now get the queue info for each request 
> being validated in SchedulerUtils.validateResourceRequest.  If 
> validateResourceRequests needs to validate a lot of requests at a time (e.g. 
> a large cluster with lots of varied locality in the requests) then it will 
> get the queue info for each request.  Since we build the queue info each 
> time, this generates a lot of unnecessary garbage, as the queue isn't 
> changing between requests.  We should grab the queue info once and pass it 
> down rather than building it again for each request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3617) Fix unused variable to get CPU frequency on Windows systems

2015-05-11 Thread J.Andreina (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

J.Andreina reassigned YARN-3617:


Assignee: J.Andreina

> Fix unused variable to get CPU frequency on Windows systems
> ---
>
> Key: YARN-3617
> URL: https://issues.apache.org/jira/browse/YARN-3617
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows 7 x64 SP1
>Reporter: Georg Berendt
>Assignee: J.Andreina
>Priority: Minor
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, 
> there is an unused variable for CPU frequency.
> " /** {@inheritDoc} */
>   @Override
>   public long getCpuFrequency() {
> refreshIfNeeded();
> return -1;   
>   }"
> Please change '-1' to use 'cpuFrequencyKhz'.
> org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

