[jira] [Updated] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess

2018-03-28 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7734:
---
Attachment: YARN-7734.001.patch

> YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
> -
>
> Key: YARN-7734
> URL: https://issues.apache.org/jira/browse/YARN-7734
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-7734.001.patch
>
>
> It adds a call to LogAggregationFileControllerFactory, but the mocked context 
> in the unit test is not filled in with the configuration.
> {code}
> [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage
> [ERROR] 
> testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage)
>   Time elapsed: 0.208 s  <<< ERROR!
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.<init>(LogAggregationFileControllerFactory.java:68)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.<init>(ContainerLogsPage.java:100)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}






[jira] [Commented] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains deplicate key

2018-03-28 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416921#comment-16416921
 ] 

Sunil G commented on YARN-6257:
---

Thanks [~Tao Yang]

{{the health metrics of capacity scheduler. This object existed but can't be 
actually used as the operationsInfo made illegal JSON data from 2.8.x to 3.1.x, 
and was corrected from 3.2.0}}

I think such an explanation would be better. I am trying to improve this message 
as below.

{{the health metrics of capacity scheduler. This information existed from 2.8.x 
to 3.1.x; however, it was constructed in an illegal JSON data format. Hence users 
could not make use of this field cleanly, and it is corrected from 3.2.0 
onwards.}}

 

> CapacityScheduler REST API produces incorrect JSON - JSON object 
> operationsInfo contains deplicate key
> --
>
> Key: YARN-6257
> URL: https://issues.apache.org/jira/browse/YARN-6257
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.8.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Minor
> Attachments: YARN-6257.001.patch, YARN-6257.002.patch, 
> YARN-6257.003.patch
>
>
> In the response string of the CapacityScheduler REST API, the 
> scheduler/schedulerInfo/health/operationsInfo JSON object has duplicate 'entry' 
> keys:
> {code}
> "operationsInfo":{
>   
> "entry":{"key":"last-preemption","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-reservation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-allocation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-release","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}
> }
> {code}
> To solve this problem, I suppose the type of the operationsInfo field in the 
> CapacitySchedulerHealthInfo class should be converted from Map to List.
> After converting to List, the operationsInfo string will be:
> {code}
> "operationInfos":[
>   
> {"operation":"last-allocation","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-release","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-preemption","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-reservation","nodeId":"N/A","containerId":"N/A","queue":"N/A"}
> ]
> {code}
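A minimal sketch of what the converted field could look like (illustrative only; 
the {{OperationInformation}} bean and field names are inferred from the proposed 
JSON above, not taken from the actual patch):
{code:java}
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

// Each scheduler operation becomes its own bean, so the serialized JSON is a
// list of objects instead of a map that repeats the "entry" key.
@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
class OperationInformation {
  String operation;
  String nodeId;
  String containerId;
  String queue;

  OperationInformation() {
    // required by JAXB
  }

  OperationInformation(String operation, String nodeId,
      String containerId, String queue) {
    this.operation = operation;
    this.nodeId = nodeId;
    this.containerId = containerId;
    this.queue = queue;
  }
}

@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
class CapacitySchedulerHealthInfoSketch {
  // Serializes as "operationInfos": [ {...}, {...} ] with no duplicate keys.
  List<OperationInformation> operationInfos = new ArrayList<>();
}
{code}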






[jira] [Assigned] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess

2018-03-28 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang reassigned YARN-7734:
--

Assignee: Tao Yang  (was: Xuan Gong)

> YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
> -
>
> Key: YARN-7734
> URL: https://issues.apache.org/jira/browse/YARN-7734
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7734.001.patch
>
>
> It adds a call to LogAggregationFileControllerFactory, but the mocked context 
> in the unit test is not filled in with the configuration.
> {code}
> [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage
> [ERROR] 
> testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage)
>   Time elapsed: 0.208 s  <<< ERROR!
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.<init>(LogAggregationFileControllerFactory.java:68)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.<init>(ContainerLogsPage.java:100)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}






[jira] [Commented] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains deplicate key

2018-03-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416971#comment-16416971
 ] 

genericqa commented on YARN-6257:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 62 unchanged - 5 fixed = 64 total (was 67) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 10s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
22s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 66m 
40s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
20s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}145m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  Unread field:CapacitySchedulerHealthInfo.java:[line 45] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | 

[jira] [Commented] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess

2018-03-28 Thread Tao Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416898#comment-16416898
 ] 

Tao Yang commented on YARN-7734:


This UT failure is still there. Attached a patch which adds 
{{when(context.getConf()).thenReturn(conf);}} to the mock context to fix this 
failure.
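For reference, a minimal sketch of the stub in question (illustrative only, not 
the actual patch contents; it assumes the Mockito-mocked NM {{Context}} used by 
TestContainerLogsPage):
{code:java}
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.nodemanager.Context;

public class MockNMContextSketch {
  static Context newMockContext() {
    Configuration conf = new YarnConfiguration();
    Context context = mock(Context.class);
    // Without this stub the mock returns null from getConf(), and
    // LogAggregationFileControllerFactory throws a NullPointerException when
    // ContainersLogsBlock constructs it.
    when(context.getConf()).thenReturn(conf);
    return context;
  }
}
{code}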

> YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
> -
>
> Key: YARN-7734
> URL: https://issues.apache.org/jira/browse/YARN-7734
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-7734.001.patch
>
>
> It adds a call to LogAggregationFileControllerFactory, but the mocked context 
> in the unit test is not filled in with the configuration.
> {code}
> [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage
> [ERROR] 
> testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage)
>   Time elapsed: 0.208 s  <<< ERROR!
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.<init>(LogAggregationFileControllerFactory.java:68)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.<init>(ContainerLogsPage.java:100)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}






[jira] [Updated] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess

2018-03-28 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-7734:
--
Affects Version/s: 3.0.1
   3.1.0

> YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
> -
>
> Key: YARN-7734
> URL: https://issues.apache.org/jira/browse/YARN-7734
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Miklos Szegedi
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7734.001.patch
>
>
> It adds a call to LogAggregationFileControllerFactory, but the mocked context 
> in the unit test is not filled in with the configuration.
> {code}
> [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage
> [ERROR] 
> testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage)
>   Time elapsed: 0.208 s  <<< ERROR!
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.<init>(LogAggregationFileControllerFactory.java:68)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.<init>(ContainerLogsPage.java:100)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}






[jira] [Commented] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess

2018-03-28 Thread Tao Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417105#comment-16417105
 ] 

Tao Yang commented on YARN-7734:


Thanks [~cheersyang] for the review and commit.

> YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
> -
>
> Key: YARN-7734
> URL: https://issues.apache.org/jira/browse/YARN-7734
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Miklos Szegedi
>Assignee: Tao Yang
>Priority: Major
> Fix For: 3.0.2, 3.2.0
>
> Attachments: YARN-7734.001.patch
>
>
> It adds a call to LogAggregationFileControllerFactory, but the mocked context 
> in the unit test is not filled in with the configuration.
> {code}
> [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage
> [ERROR] 
> testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage)
>   Time elapsed: 0.208 s  <<< ERROR!
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.<init>(LogAggregationFileControllerFactory.java:68)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.<init>(ContainerLogsPage.java:100)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}






[jira] [Commented] (YARN-7935) Expose container's hostname to applications running within the docker container

2018-03-28 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417175#comment-16417175
 ] 

Shane Kumpf commented on YARN-7935:
---

{quote}Docker embedded DNS will use /etc/resolv.conf from host, and filter out 
local IP addresses (127.0.0.1 etc), if no entries are available, it will route 
to 8.8.8.8
{quote}
[~eyang] this isn't true for overlay networks. You can't assume Registry DNS 
will be in use and it won't be used by some of these network types without 
additional modifications to Hadoop ({{--dns}} for {{docker run}}).

{quote}I am concerned that some end user code will end up invoking InetAddress 
Java class{quote}
This will use the IP of the container and whatever resolver the container is 
configured to use. Adding this environment variable doesn't change that.

I'm not seeing the issue with adding an additional environment variable that is 
set to the same value as --hostname if this solves a problem for a class of 
application. No one is proposing modifying Hadoop IPC code to support NAT here 
or to use the {{--link}} feature, just adding an additional environment 
variable in non-entrypoint mode. Can you elaborate on the exact issue you see 
this new environment variable causing?
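
As a small standalone illustration of the InetAddress point (not tied to any 
Hadoop code): inside the container this resolves through the container's own 
hostname and resolver, regardless of any extra environment variable YARN exports.
{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LocalHostProbe {
  public static void main(String[] args) throws UnknownHostException {
    // Resolves the container's configured hostname/IP using the container's
    // resolver; an additional env variable does not change this behavior.
    InetAddress addr = InetAddress.getLocalHost();
    System.out.println(addr.getHostName() + " / " + addr.getHostAddress());
  }
}
{code}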

> Expose container's hostname to applications running within the docker 
> container
> ---
>
> Key: YARN-7935
> URL: https://issues.apache.org/jira/browse/YARN-7935
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-7935.1.patch, YARN-7935.2.patch, YARN-7935.3.patch
>
>
> Some applications (like Spark) need to bind to the container's hostname, which, 
> when launched through the Docker runtime, is different from the NodeManager's 
> hostname (NM_HOST, available as an env variable during container launch). The 
> container's hostname can be exposed to applications via an env variable 
> CONTAINER_HOSTNAME. Another potential candidate is the container's IP, but this 
> can be addressed in a separate JIRA.
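A hypothetical sketch of how an application could consume such a variable 
({{CONTAINER_HOSTNAME}} is only the name proposed in this JIRA; the NM_HOST 
fallback is purely illustrative):
{code:java}
public class BindAddressSketch {
  public static void main(String[] args) {
    // Prefer the container's own hostname when running under the Docker runtime.
    String bindHost = System.getenv("CONTAINER_HOSTNAME");
    if (bindHost == null) {
      // Fall back to the NodeManager host exported at container launch.
      bindHost = System.getenv("NM_HOST");
    }
    System.out.println("Binding to " + bindHost);
  }
}
{code}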






[jira] [Updated] (YARN-8048) Support auto-spawning of admin configured services during bootstrap of rm/apiserver

2018-03-28 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-8048:

Attachment: YARN-8048.005.patch

> Support auto-spawning of admin configured services during bootstrap of 
> rm/apiserver
> ---
>
> Key: YARN-8048
> URL: https://issues.apache.org/jira/browse/YARN-8048
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-8048.001.patch, YARN-8048.002.patch, 
> YARN-8048.003.patch, YARN-8048.004.patch, YARN-8048.005.patch
>
>
> Goal is to support auto-spawning of admin configured services during 
> bootstrap of resourcemanager/apiserver. 
> *Requirement:* Some services might be required by YARN itself, e.g. HBase for 
> ATSv2. Instead of depending on a user-installed HBase (the user may not need to 
> install HBase at all), running an HBase app on YARN will help ATSv2 in such 
> conditions.
> Before the YARN cluster is started, the admin configures these service specs and 
> places them in a common location in HDFS. At the time of RM/apiserver bootstrap, 
> these services will be submitted.
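A rough sketch of the bootstrap scan step only (the spec directory path is 
hypothetical and the actual submission call is omitted; the real patch may work 
differently):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BootstrapServiceScanSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical common location in HDFS for admin-provided service specs.
    Path specDir = new Path("/yarn/system-services");
    FileSystem fs = specDir.getFileSystem(conf);
    if (fs.exists(specDir)) {
      for (FileStatus spec : fs.listStatus(specDir)) {
        // At RM/apiserver bootstrap each spec found here would be submitted
        // as a YARN service; the submission API itself is not shown.
        System.out.println("Would submit service spec: " + spec.getPath());
      }
    }
  }
}
{code}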






[jira] [Updated] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains deplicate key

2018-03-28 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-6257:
---
Attachment: YARN-6257.004.patch

> CapacityScheduler REST API produces incorrect JSON - JSON object 
> operationsInfo contains deplicate key
> --
>
> Key: YARN-6257
> URL: https://issues.apache.org/jira/browse/YARN-6257
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.8.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Minor
> Attachments: YARN-6257.001.patch, YARN-6257.002.patch, 
> YARN-6257.003.patch, YARN-6257.004.patch
>
>
> In the response string of the CapacityScheduler REST API, the 
> scheduler/schedulerInfo/health/operationsInfo JSON object has duplicate 'entry' 
> keys:
> {code}
> "operationsInfo":{
>   
> "entry":{"key":"last-preemption","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-reservation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-allocation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-release","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}
> }
> {code}
> To solve this problem, I suppose the type of the operationsInfo field in the 
> CapacitySchedulerHealthInfo class should be converted from Map to List.
> After converting to List, the operationsInfo string will be:
> {code}
> "operationInfos":[
>   
> {"operation":"last-allocation","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-release","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-preemption","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-reservation","nodeId":"N/A","containerId":"N/A","queue":"N/A"}
> ]
> {code}






[jira] [Commented] (YARN-7988) Refactor FSNodeLabelStore code for attributes store support

2018-03-28 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417197#comment-16417197
 ] 

Bibin A Chundatt commented on YARN-7988:


[~sunilg]
Attaching a patch after handling the review comments.

Basic testing done from 2.8.3 to current:
*2.8.3*
{noformat}
root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin 
-addToClusterNodeLabels bibin
18/03/28 15:22:13 INFO client.RMProxy: Connecting to ResourceManager at 
/0.0.0.0:8033
root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin 
-replaceLabelsOnNode xxx,bibin
18/03/28 15:22:32 INFO client.RMProxy: Connecting to ResourceManager at 
/0.0.0.0:8033
root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin 
-replaceLabelsOnNode xxy,bibin
18/03/28 15:22:40 INFO client.RMProxy: Connecting to ResourceManager at 
/0.0.0.0:8033
root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin 
-replaceLabelsOnNode xxz,bibin
18/03/28 15:22:49 INFO client.RMProxy: Connecting to ResourceManager at 
/0.0.0.0:8033
root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin 
-replaceLabelsOnNode xxy,
18/03/28 15:23:08 INFO client.RMProxy: Connecting to ResourceManager at 
/0.0.0.0:8033
root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin 
-addToClusterNodeLabels xxy
18/03/28 15:23:39 INFO client.RMProxy: Connecting to ResourceManager at 
/0.0.0.0:8033
root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin 
-removeFromClusterNodeLabels xxy
18/03/28 15:23:51 INFO client.RMProxy: Connecting to ResourceManager at 
/0.0.0.0:8033
{noformat}

Recovered in the branch:
{noformat}

root@bibinpc:/opt/apacheprojects/hadoop/YARN3409/hadoop-dist/target/hadoop-3.1.0-SNAPSHOT/bin#
 ./yarn cluster -lnl
2018-03-28 16:45:53,065 INFO client.RMProxy: Connecting to ResourceManager at 
/0.0.0.0:8032
Node Labels: 

[jira] [Commented] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919

2018-03-28 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417208#comment-16417208
 ] 

Rohith Sharma K S commented on YARN-7946:
-

Overall looks good. Does the change below make sense? However, the two bullet 
points already explain each version.
{code:java}
The version of Apache HBase that is supported with Timeline Service v.2 is 
1.2.6 (default) and 2.0.0-beta1.
{code}
to
{code:java}
The supported versions of Apache HBase are 1.2.6 (default) and 2.0.0-beta1.
{code}

> Update TimelineServerV2 doc as per YARN-7919
> 
>
> Key: YARN-7946
> URL: https://issues.apache.org/jira/browse/YARN-7946
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-7946.00.patch
>
>
> Post YARN-7919, the documentation needs to be updated for the coprocessor jar 
> name and other related details, if any.






[jira] [Commented] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess

2018-03-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417027#comment-16417027
 ] 

genericqa commented on YARN-7734:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m  
9s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 74m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-7734 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916544/YARN-7734.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1f786d6b4fa6 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a71656c |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20117/testReport/ |
| Max. process+thread count | 292 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20117/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> YARN-5418 breaks 

[jira] [Commented] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess

2018-03-28 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417091#comment-16417091
 ] 

Weiwei Yang commented on YARN-7734:
---

+1, will commit this shortly

> YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
> -
>
> Key: YARN-7734
> URL: https://issues.apache.org/jira/browse/YARN-7734
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7734.001.patch
>
>
> It adds a call to LogAggregationFileControllerFactory, but the mocked context 
> in the unit test is not filled in with the configuration.
> {code}
> [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage
> [ERROR] 
> testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage)
>   Time elapsed: 0.208 s  <<< ERROR!
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.<init>(LogAggregationFileControllerFactory.java:68)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.<init>(ContainerLogsPage.java:100)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}






[jira] [Commented] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess

2018-03-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417126#comment-16417126
 ] 

Hudson commented on YARN-7734:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13891 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13891/])
YARN-7734. Fix UT failure (wwei: rev 411993f6e5723c8cba8100bff0269418e46f6367)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java


> YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
> -
>
> Key: YARN-7734
> URL: https://issues.apache.org/jira/browse/YARN-7734
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Miklos Szegedi
>Assignee: Tao Yang
>Priority: Major
> Fix For: 3.0.2, 3.2.0
>
> Attachments: YARN-7734.001.patch
>
>
> It adds a call to LogAggregationFileControllerFactory, but the mocked context 
> in the unit test is not filled in with the configuration.
> {code}
> [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage
> [ERROR] 
> testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage)
>   Time elapsed: 0.208 s  <<< ERROR!
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.<init>(LogAggregationFileControllerFactory.java:68)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.<init>(ContainerLogsPage.java:100)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}






[jira] [Comment Edited] (YARN-7935) Expose container's hostname to applications running within the docker container

2018-03-28 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417175#comment-16417175
 ] 

Shane Kumpf edited comment on YARN-7935 at 3/28/18 11:10 AM:
-

{quote}Docker embedded DNS will use /etc/resolv.conf from host, and filter out 
local IP addresses (127.0.0.1 etc), if no entries are available, it will route 
to 8.8.8.8
{quote}
[~eyang] this isn't true for overlay networks. You can't assume Registry DNS 
will be in use and it won't be used by some of these network types without 
additional modifications to Hadoop ({{--dns}} for {{docker run}}).

{quote}I am concerned that some end user code will end up invoking InetAddress 
Java class{quote}
This will use the IP of the container and whatever resolver the container is 
configured to use. Adding this environment variable doesn't change that.

I'm not seeing the issue with adding an additional environment variable that is 
set to the same value as {{\-\-hostname}} if this solves a problem for a class 
of application. No one is proposing modifying Hadoop IPC code to support NAT 
here or to use the {{--link}} feature, just adding an additional environment 
variable in non-entrypoint mode. Can you elaborate on the exact issue you see 
this new environment variable causing?


was (Author: shaneku...@gmail.com):
{quote}Docker embedded DNS will use /etc/resolv.conf from host, and filter out 
local IP addresses (127.0.0.1 etc), if no entries are available, it will route 
to 8.8.8.8
{quote}
[~eyang] this isn't true for overlay networks. You can't assume Registry DNS 
will be in use and it won't be used by some of these network types without 
additional modifications to Hadoop ({{--dns}} for {{docker run}}).

{quote}I am concerned that some end user code will end up invoking InetAddress 
Java class{quote}
This will use the IP of the container and whatever resolver the container is 
configured to use. Adding this environment variable doesn't change that.

I'm not seeing the issue with adding an additional environment variable that is 
set to the same value as {{--hostname}} if this solves a problem for a class of 
application. No one is proposing modifying Hadoop IPC code to support NAT here 
or to use the {{--link}} feature, just adding an additional environment 
variable in non-entrypoint mode. Can you elaborate on the exact issue you see 
this new environment variable causing?

> Expose container's hostname to applications running within the docker 
> container
> ---
>
> Key: YARN-7935
> URL: https://issues.apache.org/jira/browse/YARN-7935
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-7935.1.patch, YARN-7935.2.patch, YARN-7935.3.patch
>
>
> Some applications (like Spark) need to bind to the container's hostname, which, 
> when launched through the Docker runtime, is different from the NodeManager's 
> hostname (NM_HOST, available as an env variable during container launch). The 
> container's hostname can be exposed to applications via an env variable 
> CONTAINER_HOSTNAME. Another potential candidate is the container's IP, but this 
> can be addressed in a separate JIRA.






[jira] [Comment Edited] (YARN-7935) Expose container's hostname to applications running within the docker container

2018-03-28 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417175#comment-16417175
 ] 

Shane Kumpf edited comment on YARN-7935 at 3/28/18 11:10 AM:
-

{quote}Docker embedded DNS will use /etc/resolv.conf from host, and filter out 
local IP addresses (127.0.0.1 etc), if no entries are available, it will route 
to 8.8.8.8
{quote}
[~eyang] this isn't true for overlay networks. You can't assume Registry DNS 
will be in use and it won't be used by some of these network types without 
additional modifications to Hadoop ({{--dns}} for {{docker run}}).

{quote}I am concerned that some end user code will end up invoking InetAddress 
Java class{quote}
This will use the IP of the container and whatever resolver the container is 
configured to use. Adding this environment variable doesn't change that.

I'm not seeing the issue with adding an additional environment variable that is 
set to the same value as {{--hostname}} if this solves a problem for a class of 
application. No one is proposing modifying Hadoop IPC code to support NAT here 
or to use the {{--link}} feature, just adding an additional environment 
variable in non-entrypoint mode. Can you elaborate on the exact issue you see 
this new environment variable causing?


was (Author: shaneku...@gmail.com):
{quote}Docker embedded DNS will use /etc/resolv.conf from host, and filter out 
local IP addresses (127.0.0.1 etc), if no entries are available, it will route 
to 8.8.8.8
{quote}
[~eyang] this isn't true for overlay networks. You can't assume Registry DNS 
will be in use and it won't be used by some of these network types without 
additional modifications to Hadoop ({{--dns}} for {{docker run}}).

{quote}I am concerned that some end user code will end up invoking InetAddress 
Java class{quote}
This will use the IP of the container and whatever resolver the container is 
configured to use. Adding this environment variable doesn't change that.

I'm not seeing the issue with adding an additional environment variable that is 
set to the same value as --hostname if this solves a problem for a class of 
application. No one is proposing modifying Hadoop IPC code to support NAT here 
or to use the {{--link}} feature, just adding an additional environment 
variable in non-entrypoint mode. Can you elaborate on the exact issue you see 
this new environment variable causing?

> Expose container's hostname to applications running within the docker 
> container
> ---
>
> Key: YARN-7935
> URL: https://issues.apache.org/jira/browse/YARN-7935
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-7935.1.patch, YARN-7935.2.patch, YARN-7935.3.patch
>
>
> Some applications (like Spark) need to bind to the container's hostname, which, 
> when launched through the Docker runtime, is different from the NodeManager's 
> hostname (NM_HOST, available as an env variable during container launch). The 
> container's hostname can be exposed to applications via an env variable 
> CONTAINER_HOSTNAME. Another potential candidate is the container's IP, but this 
> can be addressed in a separate JIRA.






[jira] [Commented] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains deplicate key

2018-03-28 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417016#comment-16417016
 ] 

Weiwei Yang commented on YARN-6257:
---

Hi [~Tao Yang]

Can you please fix the checkstyle and findbugs issues?

About the message, based on [~sunilg]'s comment, how about:
{quote}the health metrics of the capacity scheduler. This metric has existed since 
2.8.0, but the output was not well formatted, so users could not make use of this 
field cleanly; it is optimized from 3.2.0 onwards.
{quote}
Basically I don't want to say it was illegal JSON, since it follows the JSON spec. 
Does that make sense?
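
A small illustration of why the duplicate keys are still a problem even though 
the grammar tolerates them (assuming Jackson, which Hadoop already ships; this is 
not part of any patch here):
{code:java}
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class DuplicateKeyDemo {
  public static void main(String[] args) throws Exception {
    // The duplicate "entry" keys parse without error, but only one value
    // survives in the tree, so clients silently lose the other operations.
    String json = "{\"entry\":{\"key\":\"last-preemption\"},"
        + "\"entry\":{\"key\":\"last-release\"}}";
    JsonNode node = new ObjectMapper().readTree(json);
    System.out.println(node.get("entry"));
  }
}
{code}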

> CapacityScheduler REST API produces incorrect JSON - JSON object 
> operationsInfo contains deplicate key
> --
>
> Key: YARN-6257
> URL: https://issues.apache.org/jira/browse/YARN-6257
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.8.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Minor
> Attachments: YARN-6257.001.patch, YARN-6257.002.patch, 
> YARN-6257.003.patch
>
>
> In the response string of the CapacityScheduler REST API, the 
> scheduler/schedulerInfo/health/operationsInfo JSON object has duplicate 'entry' 
> keys:
> {code}
> "operationsInfo":{
>   
> "entry":{"key":"last-preemption","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-reservation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-allocation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-release","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}
> }
> {code}
> To solve this problem, I suppose the type of the operationsInfo field in the 
> CapacitySchedulerHealthInfo class should be converted from Map to List.
> After converting to List, the operationsInfo string will be:
> {code}
> "operationInfos":[
>   
> {"operation":"last-allocation","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-release","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-preemption","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-reservation","nodeId":"N/A","containerId":"N/A","queue":"N/A"}
> ]
> {code}






[jira] [Commented] (YARN-8048) Support auto-spawning of admin configured services during bootstrap of rm/apiserver

2018-03-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417304#comment-16417304
 ] 

genericqa commented on YARN-8048:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 18s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
17s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
19s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 22s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 3 new + 272 unchanged - 0 fixed = 275 total (was 272) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  7s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
47s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
12s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 65m 
44s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  5m 
24s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
35s{color} | {color:red} The patch generated 4 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}169m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce 

[jira] [Updated] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7946:
-
Attachment: YARN-7946.01.patch

> Update TimelineServerV2 doc as per YARN-7919
> 
>
> Key: YARN-7946
> URL: https://issues.apache.org/jira/browse/YARN-7946
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-7946.00.patch, YARN-7946.01.patch
>
>
> Post YARN-7919, the documentation needs to be updated for the coprocessor jar 
> name and other related details, if any.






[jira] [Updated] (YARN-7988) Refactor FSNodeLabelStore code for attributes store support

2018-03-28 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-7988:
---
Attachment: YARN-7988-YARN-3409.007.patch

> Refactor FSNodeLabelStore code for attributes store support
> ---
>
> Key: YARN-7988
> URL: https://issues.apache.org/jira/browse/YARN-7988
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-7988-YARN-3409.002.patch, 
> YARN-7988-YARN-3409.003.patch, YARN-7988-YARN-3409.004.patch, 
> YARN-7988-YARN-3409.005.patch, YARN-7988-YARN-3409.006.patch, 
> YARN-7988-YARN-3409.007.patch, YARN-7988.001.patch
>
>
> # Abstract out the FileSystemStore operations
> # Define EditLog operations and the mirror operation
> # Support compatibility with the old node label store



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919

2018-03-28 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417319#comment-16417319
 ] 

Haibo Chen commented on YARN-7946:
--

Let me make that change in a new patch.

> Update TimelineServerV2 doc as per YARN-7919
> 
>
> Key: YARN-7946
> URL: https://issues.apache.org/jira/browse/YARN-7946
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-7946.00.patch
>
>
> Post YARN-7919, the documentation needs to be updated for the coprocessor jar 
> name and any other related details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains deplicate key

2018-03-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417247#comment-16417247
 ] 

genericqa commented on YARN-6257:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  9s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  
2s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 62 unchanged - 5 fixed = 64 total (was 67) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 28s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 67m 
13s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
20s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}146m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-6257 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916577/YARN-6257.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | 

[jira] [Updated] (YARN-7221) Add security check for privileged docker container

2018-03-28 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7221:

Attachment: YARN-7221.012.patch

> Add security check for privileged docker container
> --
>
> Key: YARN-7221
> URL: https://issues.apache.org/jira/browse/YARN-7221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7221.001.patch, YARN-7221.002.patch, 
> YARN-7221.003.patch, YARN-7221.004.patch, YARN-7221.005.patch, 
> YARN-7221.006.patch, YARN-7221.007.patch, YARN-7221.008.patch, 
> YARN-7221.009.patch, YARN-7221.010.patch, YARN-7221.011.patch, 
> YARN-7221.012.patch
>
>
> When a docker container runs with privileges, the majority use case is to have 
> a program start as root and then drop privileges to another user, e.g. httpd 
> starting privileged to bind to port 80 and then dropping privileges to the www 
> user.  
> # We should add a security check for submitting users, to verify they have 
> "sudo" access before running a privileged container.  
> # We should remove --user=uid:gid for privileged containers.  
>  
> Docker can be launched with the --privileged=true and --user=uid:gid flags. 
> With this parameter combination, the user will not be able to become root: 
> every docker exec command is dropped to the uid:gid user instead of being 
> granted privileges. A user can still gain root privileges if the container 
> file system contains files that grant extra power, but this type of image is 
> considered dangerous. A non-privileged user can also launch a container with 
> special bits to acquire the same level of root power. Hence, we lose control 
> over which images should run with --privileged and who has sudo rights to use 
> privileged container images. As a result, we should check for sudo access and 
> then decide whether to parameterize --privileged=true OR --user=uid:gid. This 
> will avoid leading developers down the wrong path.
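For illustration, a minimal Java sketch of the decision described above, under stated assumptions: the helper names and the ACL set are hypothetical, and the real check lives in the C container-executor rather than in Java.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Sketch only: decide the docker run flags from the requested privilege level
// and a hypothetical "sudo" ACL for the submitting user.
public class PrivilegedLaunchSketch {

  static boolean hasSudoAccess(String user, Set<String> privilegedUserAcl) {
    return privilegedUserAcl.contains("*") || privilegedUserAcl.contains(user);
  }

  static List<String> dockerRunFlags(String user, int uid, int gid,
      boolean privilegedRequested, Set<String> privilegedUserAcl) {
    List<String> flags = new ArrayList<>();
    if (privilegedRequested && hasSudoAccess(user, privilegedUserAcl)) {
      // Authorized privileged container: no --user pinning, full privileges.
      flags.add("--privileged=true");
    } else {
      // Everything else drops to the submitting user's uid:gid.
      flags.add("--user=" + uid + ":" + gid);
    }
    return flags;
  }
}
{code}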



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1151) Ability to configure auxiliary services from HDFS-based JAR files

2018-03-28 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417819#comment-16417819
 ] 

Xuan Gong commented on YARN-1151:
-

[~rkanter]   Could you review the latest patch, please?

> Ability to configure auxiliary services from HDFS-based JAR files
> -
>
> Key: YARN-1151
> URL: https://issues.apache.org/jira/browse/YARN-1151
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.1.0-beta, 2.9.0
>Reporter: john lilley
>Assignee: Xuan Gong
>Priority: Major
>  Labels: auxiliary-service, yarn
> Attachments: YARN-1151.1.patch, YARN-1151.2.patch, 
> YARN-1151.branch-2.poc.2.patch, YARN-1151.branch-2.poc.3.patch, 
> YARN-1151.branch-2.poc.patch, [YARN-1151] [Design] Configure auxiliary 
> services from HDFS-based JAR files.pdf
>
>
> I would like to install an auxiliary service in Hadoop YARN without actually 
> installing files/services on every node in the system.  Discussions on the 
> user@ list indicate that this is not easily done.  The reason we want an 
> auxiliary service is that our application has some persistent-data components 
> that are not appropriate for HDFS.  In fact, they are somewhat analogous to 
> the mapper output of MapReduce's shuffle, which is what led me to 
> auxiliary-services in the first place.  It would be much easier if we could 
> just place our service's JARs in HDFS.
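As a hedged sketch of what such a setup could look like on the NodeManager side, note that the property name for the remote JAR location below is a placeholder assumption; only the standard aux-service keys are taken as given here.

{code:java}
import org.apache.hadoop.conf.Configuration;

public class AuxServiceFromHdfsSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Standard aux-service registration.
    conf.set("yarn.nodemanager.aux-services", "myaux");
    conf.set("yarn.nodemanager.aux-services.myaux.class",
        "com.example.MyAuxService");            // hypothetical service class
    // Hypothetical key: let the NM localize the service JAR from HDFS instead
    // of requiring it to be installed on every node.
    conf.set("yarn.nodemanager.aux-services.myaux.remote-classpath",
        "hdfs:///apps/aux/myaux.jar");
    System.out.println(conf.get("yarn.nodemanager.aux-services.myaux.class"));
  }
}
{code}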



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8080) YARN native service should support component restart policy

2018-03-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8080:
-
Attachment: (was: YARN-8080.004.patch)

> YARN native service should support component restart policy
> ---
>
> Key: YARN-8080
> URL: https://issues.apache.org/jira/browse/YARN-8080
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8080.001.patch, YARN-8080.002.patch, 
> YARN-8080.003.patch, YARN-8080.005.patch
>
>
> The existing native service assumes the service is long running and never 
> finishes: containers will be restarted even if the exit code == 0. 
> To support broader use cases, we need to allow users to specify the restart 
> policy of a component. Proposed policies:
> 1) Always: containers are always restarted by the framework regardless of 
> container exit status. This is the existing/default behavior.
> 2) Never: do not restart containers in any case after a container finishes, to 
> support job-like workloads (for example a Tensorflow training job). If a task 
> exits with code == 0, we should not restart it. This can be used by services 
> which are not restartable/recoverable.
> 3) On-failure: similar to the above, but only restart tasks with exit code != 0. 
> Behavior after a component *instance* finalizes (Succeeded or Failed when 
> restart_policy != ALWAYS): 
> 1) Single component, single instance: complete the service.
> 2) Single component, multiple instances: other running instances of the same 
> component won't be affected by the finalized instance. The service will be 
> terminated once all instances have finalized. 
> 3) Multiple components: the service will be terminated once all components 
> have finalized.
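To make the three policies concrete, a small self-contained Java sketch of the relaunch decision (illustrative only; the enum and method names are assumptions, not this patch's classes):

{code:java}
// Sketch: whether a framework should relaunch a finished container under each
// of the proposed restart policies.
public class RestartPolicySketch {

  enum RestartPolicy { ALWAYS, NEVER, ON_FAILURE }

  static boolean shouldRelaunch(RestartPolicy policy, int exitCode) {
    switch (policy) {
      case ALWAYS:     return true;             // existing/default behavior
      case NEVER:      return false;            // job-like workloads
      case ON_FAILURE: return exitCode != 0;    // relaunch only on failure
      default:         throw new IllegalStateException("unknown policy");
    }
  }

  public static void main(String[] args) {
    System.out.println(shouldRelaunch(RestartPolicy.ON_FAILURE, 0)); // false
    System.out.println(shouldRelaunch(RestartPolicy.ON_FAILURE, 1)); // true
  }
}
{code}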



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8080) YARN native service should support component restart policy

2018-03-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8080:
-
Attachment: YARN-8080.005.patch

> YARN native service should support component restart policy
> ---
>
> Key: YARN-8080
> URL: https://issues.apache.org/jira/browse/YARN-8080
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8080.001.patch, YARN-8080.002.patch, 
> YARN-8080.003.patch, YARN-8080.005.patch
>
>
> The existing native service assumes the service is long running and never 
> finishes: containers will be restarted even if the exit code == 0. 
> To support broader use cases, we need to allow users to specify the restart 
> policy of a component. Proposed policies:
> 1) Always: containers are always restarted by the framework regardless of 
> container exit status. This is the existing/default behavior.
> 2) Never: do not restart containers in any case after a container finishes, to 
> support job-like workloads (for example a Tensorflow training job). If a task 
> exits with code == 0, we should not restart it. This can be used by services 
> which are not restartable/recoverable.
> 3) On-failure: similar to the above, but only restart tasks with exit code != 0. 
> Behavior after a component *instance* finalizes (Succeeded or Failed when 
> restart_policy != ALWAYS): 
> 1) Single component, single instance: complete the service.
> 2) Single component, multiple instances: other running instances of the same 
> component won't be affected by the finalized instance. The service will be 
> terminated once all instances have finalized. 
> 3) Multiple components: the service will be terminated once all components 
> have finalized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8010) Add config in FederationRMFailoverProxy to not bypass facade cache when failing over

2018-03-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417927#comment-16417927
 ] 

Hudson commented on YARN-8010:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13894 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13894/])
Revert "YARN-8010. Add config in FederationRMFailoverProxy to not bypass 
(subru: rev 725b10e3aee383d049c97f8ed2b0b1ae873d5ae8)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/failover/FederationRMFailoverProxyProvider.java
YARN-8010. Add config in FederationRMFailoverProxy to not bypass facade (subru: 
rev 0d7e014fde717e8b122773b68664f4594106)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfigurationFields.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/failover/FederationRMFailoverProxyProvider.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestFederationRMFailoverProxyProvider.java


> Add config in FederationRMFailoverProxy to not bypass facade cache when 
> failing over
> 
>
> Key: YARN-8010
> URL: https://issues.apache.org/jira/browse/YARN-8010
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Fix For: 2.10.0, 2.9.1, 3.1.1
>
> Attachments: YARN-8010.v1.patch, YARN-8010.v1.patch, 
> YARN-8010.v2.patch, YARN-8010.v3.patch
>
>
> Today when YarnRM is failing over, the FederationRMFailoverProxy running in 
> AMRMProxy performs failover, tries to get the latest subcluster info from 
> FederationStateStore, and then retries connecting to the latest YarnRM master. 
> When calling getSubCluster() on FederationStateStoreFacade, it bypasses the 
> cache with a flush flag. While YarnRM is failing over, every AM heartbeat 
> thread creates a different thread inside FederationInterceptor, each of which 
> keeps performing failover several times. This leads to a big spike of 
> getSubCluster calls to FederationStateStore. 
> Depending on the cluster setup (e.g. putting a VIP in front of all YarnRMs), a 
> YarnRM master/slave change might not result in an RM address change. In other 
> cases, a small delay in getting the latest subcluster information may be 
> acceptable. This patch thus creates a config option that makes it possible to 
> ask the FederationRMFailoverProxy not to flush the cache when calling 
> getSubCluster(). 
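As a hedged illustration of the kind of switch described here, a short Java sketch; the property name is a placeholder assumption, not necessarily the key introduced by this patch:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class FederationFailoverCacheSketch {
  // Hypothetical property name, for illustration only.
  static final String FLUSH_CACHE_ON_FAILOVER =
      "yarn.federation.failover.flush-subcluster-cache";

  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    // Keep serving the cached sub-cluster info during RM failover instead of
    // letting every AM heartbeat thread hit FederationStateStore.
    conf.setBoolean(FLUSH_CACHE_ON_FAILOVER, false);
    System.out.println(conf.getBoolean(FLUSH_CACHE_ON_FAILOVER, true));
  }
}
{code}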



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7623) Fix the CapacityScheduler Queue configuration documentation

2018-03-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417928#comment-16417928
 ] 

Hudson commented on YARN-7623:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13894 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13894/])
YARN-7623. Fix the CapacityScheduler Queue configuration documentation. 
(zezhang: rev 0b1c2b5fe1b5c225d208936ecb1d3e307a535ee6)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md


> Fix the CapacityScheduler Queue configuration documentation
> ---
>
> Key: YARN-7623
> URL: https://issues.apache.org/jira/browse/YARN-7623
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Jonathan Hung
>Priority: Major
> Fix For: 2.10.0, 2.9.1, 3.0.2, 3.1.1
>
> Attachments: Screen Shot 2018-03-27 at 3.02.45 PM.png, 
> YARN-7623.001.patch, YARN-7623.002.patch
>
>
> It looks like the [Changing Queue 
> Configuration|https://hadoop.apache.org/docs/r2.9.0/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Changing_queue_configuration_via_API]
>  section is mis-formatted.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6936) [Atsv2] Retrospect storing entities into sub application table from client perspective

2018-03-28 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417766#comment-16417766
 ] 

Rohith Sharma K S commented on YARN-6936:
-

bq. Let's add the scope of the entities to each of the four methods
OK. Does this modified sentence sound fine? {{Send the information of a number 
of conceptual entities in the scope of a YARN application to the timeline 
service v.2 collector.}} Do all 4 APIs need to be modified in the same way? For 
the newer APIs, the scope should also be outside the application, right?

bq.  Is it intended to extend updateAggregateStatus() so that sub application 
metrics are rolled up?
I vaguely remember we discussed this in the weekly call and decided to 
aggregate for both APIs, because the newer APIs write into both tables, i.e. 
the entity and subapp tables. So the aggregated metrics are also available in 
the application scope. 

bq. The TimelineCollectorContext is bound to an application attempt. Adding a 
subApplicationWrite flag to TimelineCollectorContext may not be the most 
intuitive approach. How about we leave subApplicationWrite as a separate flag 
instead?
I would be inclined to send the required information in a record rather than 
as a parameter. This avoids compatibility issues in the future. Maybe let's 
define a newer record that contains context, ugi and subappwrite.  cc: 
[~vrushalic]



> [Atsv2] Retrospect storing entities into sub application table from client 
> perspective
> --
>
> Key: YARN-6936
> URL: https://issues.apache.org/jira/browse/YARN-6936
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-6936.000.patch, YARN-6936.001.patch
>
>
> Currently YARN-6734 stores entities into the sub application table only if the 
> doAs user and the submitting user are different. This holds good for Tez-like 
> use cases. But applications whose AM runs as the submitting user, like MR, also 
> need to store entities in the sub application table so that they can read 
> entities without an application id. 
> This would be a point of concern at later stages when ATSv2 is deployed into 
> production. This JIRA is to retrospect the decision of storing entities into 
> the sub application table, driving it by client-side configuration rather than 
> by user. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8080) YARN native service should support component restart policy

2018-03-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417861#comment-16417861
 ] 

Wangda Tan commented on YARN-8080:
--

Attached ver.005 patch, which adds tests to cover the single component and 
multi-component cases.

> YARN native service should support component restart policy
> ---
>
> Key: YARN-8080
> URL: https://issues.apache.org/jira/browse/YARN-8080
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8080.001.patch, YARN-8080.002.patch, 
> YARN-8080.003.patch, YARN-8080.005.patch
>
>
> The existing native service assumes the service is long running and never 
> finishes: containers will be restarted even if the exit code == 0. 
> To support broader use cases, we need to allow users to specify the restart 
> policy of a component. Proposed policies:
> 1) Always: containers are always restarted by the framework regardless of 
> container exit status. This is the existing/default behavior.
> 2) Never: do not restart containers in any case after a container finishes, to 
> support job-like workloads (for example a Tensorflow training job). If a task 
> exits with code == 0, we should not restart it. This can be used by services 
> which are not restartable/recoverable.
> 3) On-failure: similar to the above, but only restart tasks with exit code != 0. 
> Behavior after a component *instance* finalizes (Succeeded or Failed when 
> restart_policy != ALWAYS): 
> 1) Single component, single instance: complete the service.
> 2) Single component, multiple instances: other running instances of the same 
> component won't be affected by the finalized instance. The service will be 
> terminated once all instances have finalized. 
> 3) Multiple components: the service will be terminated once all components 
> have finalized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7690) expose reserved memory/Vcores of nodemanager at webUI

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7690:
-
Issue Type: Improvement  (was: New Feature)

> expose reserved memory/Vcores of  nodemanager at webUI
> --
>
> Key: YARN-7690
> URL: https://issues.apache.org/jira/browse/YARN-7690
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Reporter: tianjuan
>Priority: Major
> Attachments: YARN-7690.patch
>
>
> Now only total reserved memory/Vcores are exposed in the RM webUI; the reserved 
> memory/Vcores of a single nodemanager are hard to find out. It confuses users 
> when they observe available memory/Vcores on the nodes page, yet their jobs are 
> stuck waiting for resources to be allocated. For debugging, it is helpful to 
> expose the reserved memory/Vcores of every single nodemanager, as well as the 
> memory/Vcores that can still be allocated (unallocated minus reserved).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8084) Yarn native service rename for easier development?

2018-03-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8084:
-
Environment: (was: There are a couple of classes with the same name in YARN 
native service, such as: 
1) ...service.component.Component and api.records.Component.
This makes development in an IDE harder, since the class name clash forces the 
use of fully qualified class names.

Similarly in the API definition:
...service.api.records:
Container/ContainerState/Resource/ResourceInformation. How about renaming them to:
ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation?)

> Yarn native service rename for easier development?
> --
>
> Key: YARN-8084
> URL: https://issues.apache.org/jira/browse/YARN-8084
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8084) Yarn native service rename for easier development?

2018-03-28 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8084:


 Summary: Yarn native service rename for easier development?
 Key: YARN-8084
 URL: https://issues.apache.org/jira/browse/YARN-8084
 Project: Hadoop YARN
  Issue Type: Task
 Environment: There are a couple of classes with the same name in YARN 
native service, such as: 
1) ...service.component.Component and api.records.Component.
This makes development in an IDE harder, since the class name clash forces the 
use of fully qualified class names.

Similarly in the API definition:
...service.api.records:
Container/ContainerState/Resource/ResourceInformation. How about renaming them to:
ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation?
Reporter: Wangda Tan






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8084) Yarn native service rename for easier development?

2018-03-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8084:
-
Description: 
There are a couple of classes with the same name in YARN native service, such 
as: 
1) ...service.component.Component and api.records.Component.
This makes development in an IDE harder, since the class name clash forces the 
use of fully qualified class names.

Similarly in the API definition:
...service.api.records:
Container/ContainerState/Resource/ResourceInformation. How about renaming them to:
ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation?

> Yarn native service rename for easier development?
> --
>
> Key: YARN-8084
> URL: https://issues.apache.org/jira/browse/YARN-8084
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Priority: Major
>
> There are a couple of classes with the same name in YARN native service, such 
> as: 
> 1) ...service.component.Component and api.records.Component.
> This makes development in an IDE harder, since the class name clash forces the 
> use of fully qualified class names.
> Similarly in the API definition:
> ...service.api.records:
> Container/ContainerState/Resource/ResourceInformation. How about renaming them 
> to:
> ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation?
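To make the clash concrete, a small hedged example of what a file that touches both classes currently looks like (package names as referenced above; only the API record is imported):

{code:java}
import org.apache.hadoop.yarn.service.api.records.Component;

// Sketch only: with two classes named Component, at most one can be imported;
// the other has to be written out with its fully qualified name.
public class ComponentNameClashSketch {
  Component specComponent = new Component();   // api.records.Component

  // The scheduler-side class must stay fully qualified everywhere it is used:
  // org.apache.hadoop.yarn.service.component.Component schedulerComponent;
}
{code}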



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7946:
-
Attachment: YARN-7946.02.patch

> Update TimelineServerV2 doc as per YARN-7919
> 
>
> Key: YARN-7946
> URL: https://issues.apache.org/jira/browse/YARN-7946
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-7946.00.patch, YARN-7946.01.patch, 
> YARN-7946.02.patch
>
>
> Post YARN-7919, the documentation needs to be updated for the coprocessor jar 
> name and any other related details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance

2018-03-28 Thread Gour Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417988#comment-16417988
 ] 

Gour Saha commented on YARN-7939:
-

Yes, we should use UPGRADING instead of UPGRADE (which is an action verb).

> Yarn Service Upgrade: add support to upgrade a component instance 
> --
>
> Key: YARN-7939
> URL: https://issues.apache.org/jira/browse/YARN-7939
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-7939.001.patch
>
>
> Yarn core supports in-place upgrade of containers. A yarn service can 
> leverage that to provide in-place upgrade of component instances. Please see 
> YARN-7512 for details.
> Will add support to upgrade a single component instance first and then 
> iteratively add other APIs and features.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7935) Expose container's hostname to applications running within the docker container

2018-03-28 Thread Mridul Muralidharan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418029#comment-16418029
 ] 

Mridul Muralidharan commented on YARN-7935:
---

[~eyang] I think there is some confusion here.
Spark does not require user defined networks - I don't think it was mentioned 
that this was required.

Taking a step back:

With "host" networking mode, we get it to work without any changes to the 
application code at all - giving us all the benefits of isolation without any 
loss in existing functionality (modulo specifying the required env variables, 
of course).

When used with bridge/overlay/user defined networks/etc., the container hostname 
passed to the Spark AM via the allocation request is that of the nodemanager, 
not the actual container hostname used in the docker container.
This patch exposes the container hostname as an env variable - just as we have 
other container and node specific env variables exposed to the container 
(CONTAINER_ID, NM_HOST, etc.).

Do you see any concern with exposing this variable? I want to make sure I am 
not missing something here.

What Spark (or any other application) does with this variable is its 
implementation detail; I can go into the details of why this is required for 
Spark specifically if needed, but that might digress from the jira.


> Expose container's hostname to applications running within the docker 
> container
> ---
>
> Key: YARN-7935
> URL: https://issues.apache.org/jira/browse/YARN-7935
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-7935.1.patch, YARN-7935.2.patch, YARN-7935.3.patch
>
>
> Some applications (like Spark) need to bind to the container's hostname, which 
> is different from the NodeManager's hostname (NM_HOST, which is available as an 
> env during container launch) when launched through the Docker runtime. The 
> container's hostname can be exposed to applications via an env 
> CONTAINER_HOSTNAME. Another potential candidate is the container's IP, but 
> this can be addressed in a separate jira.
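A minimal hedged sketch of how an application could consume such a variable if it is exposed as proposed, falling back to the existing NM_HOST when it is absent:

{code:java}
public class ContainerHostnameSketch {
  public static void main(String[] args) {
    // CONTAINER_HOSTNAME is the env variable proposed in this JIRA; NM_HOST is
    // the existing NodeManager hostname variable. Fall back when the former is
    // not set (e.g. host networking or a non-Docker runtime).
    String bindHost = System.getenv("CONTAINER_HOSTNAME");
    if (bindHost == null || bindHost.isEmpty()) {
      bindHost = System.getenv("NM_HOST");
    }
    System.out.println("Binding service to " + bindHost);
  }
}
{code}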



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919

2018-03-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418058#comment-16418058
 ] 

genericqa commented on YARN-7946:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 25m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
58m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 27m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  5s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 97m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-7946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916670/YARN-7946.02.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 59798d31bb1c 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0b1c2b5 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 441 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20128/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Update TimelineServerV2 doc as per YARN-7919
> 
>
> Key: YARN-7946
> URL: https://issues.apache.org/jira/browse/YARN-7946
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-7946.00.patch, YARN-7946.01.patch, 
> YARN-7946.02.patch
>
>
> Post YARN-7919, the documentation needs to be updated for the coprocessor jar 
> name and any other related details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-2478) Nested containers should be supported

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen resolved YARN-2478.
--
Resolution: Won't Fix

Closing this as DockerContainerExecutor has been deprecated in branch-2 and 
removed in trunk.

> Nested containers should be supported
> -
>
> Key: YARN-2478
> URL: https://issues.apache.org/jira/browse/YARN-2478
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Abin Shahab
>Priority: Major
>
> Currently DockerContainerExecutor only supports one level of containers. 
> However, YARN's responsibility is to handle resource isolation, and nested 
> containers would allow YARN to delegate handling software isolation to the 
> jobs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7905) Parent directory permission incorrect during public localization

2018-03-28 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-7905:
---
Attachment: YARN-7905-008.patch

> Parent directory permission incorrect during public localization 
> -
>
> Key: YARN-7905
> URL: https://issues.apache.org/jira/browse/YARN-7905
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Critical
> Attachments: YARN-7905-001.patch, YARN-7905-002.patch, 
> YARN-7905-003.patch, YARN-7905-004.patch, YARN-7905-005.patch, 
> YARN-7905-006.patch, YARN-7905-007.patch, YARN-7905-008.patch
>
>
> Similar to YARN-6708, during public localization we also have to take care of 
> the parent directory if the umask is 027 during node manager start up.
> /filecache/0/200
> The directory permission of /filecache/0 is 750, which causes 
> application failure. 
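For reference, a small hedged illustration of why a 027 umask yields 750 directories when the permission is not set explicitly, and of the general fix direction (setting it explicitly); this is not the exact patch code:

{code:java}
import org.apache.hadoop.fs.permission.FsPermission;

public class UmaskSketch {
  public static void main(String[] args) {
    int requested = 0755;                       // what public dirs need
    int umask = 027;
    int effective = requested & ~umask;         // group keeps r-x, others lose all
    System.out.printf("effective = %o%n", effective);   // prints 750

    // Fix direction: create the parent directories with an explicit permission
    // instead of relying on the process umask.
    System.out.println(new FsPermission((short) 0755)); // rwxr-xr-x
  }
}
{code}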



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7905) Parent directory permission incorrect during public localization

2018-03-28 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417752#comment-16417752
 ] 

Bibin A Chundatt commented on YARN-7905:


Uploaded the patch again to trigger Jenkins; I had missed committing this patch.

> Parent directory permission incorrect during public localization 
> -
>
> Key: YARN-7905
> URL: https://issues.apache.org/jira/browse/YARN-7905
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Critical
> Attachments: YARN-7905-001.patch, YARN-7905-002.patch, 
> YARN-7905-003.patch, YARN-7905-004.patch, YARN-7905-005.patch, 
> YARN-7905-006.patch, YARN-7905-007.patch, YARN-7905-008.patch
>
>
> Similar to YARN-6708, during public localization we also have to take care of 
> the parent directory if the umask is 027 during node manager start up.
> /filecache/0/200
> The directory permission of /filecache/0 is 750, which causes 
> application failure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-1151) Ability to configure auxiliary services from HDFS-based JAR files

2018-03-28 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1151:

Attachment: YARN-1151.2.patch

> Ability to configure auxiliary services from HDFS-based JAR files
> -
>
> Key: YARN-1151
> URL: https://issues.apache.org/jira/browse/YARN-1151
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.1.0-beta, 2.9.0
>Reporter: john lilley
>Assignee: Xuan Gong
>Priority: Major
>  Labels: auxiliary-service, yarn
> Attachments: YARN-1151.1.patch, YARN-1151.2.patch, 
> YARN-1151.branch-2.poc.2.patch, YARN-1151.branch-2.poc.3.patch, 
> YARN-1151.branch-2.poc.patch, [YARN-1151] [Design] Configure auxiliary 
> services from HDFS-based JAR files.pdf
>
>
> I would like to install an auxiliary service in Hadoop YARN without actually 
> installing files/services on every node in the system.  Discussions on the 
> user@ list indicate that this is not easily done.  The reason we want an 
> auxiliary service is that our application has some persistent-data components 
> that are not appropriate for HDFS.  In fact, they are somewhat analogous to 
> the mapper output of MapReduce's shuffle, which is what led me to 
> auxiliary-services in the first place.  It would be much easier if we could 
> just place our service's JARs in HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7905) Parent directory permission incorrect during public localization

2018-03-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417873#comment-16417873
 ] 

genericqa commented on YARN-7905:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m  6s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-7905 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916651/YARN-7905-008.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4d4da0e09aea 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / cdee0a4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20124/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20124/testReport/ |
| Max. process+thread count | 407 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 

[jira] [Comment Edited] (YARN-8079) YARN native service should respect source file of ConfigFile inside Service/Component spec

2018-03-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417747#comment-16417747
 ] 

Wangda Tan edited comment on YARN-8079 at 3/28/18 4:58 PM:
---

Thanks [~gsaha], 

Are there any additional suggestions for the patch, or are we good to go?

cc: [~billie.rinaldi]/[~eyang]


was (Author: leftnoteasy):
Thanks [~gsaha], 

Are there any additional suggestions for the patch, or are we good to go?

> YARN native service should respect source file of ConfigFile inside 
> Service/Component spec
> --
>
> Key: YARN-8079
> URL: https://issues.apache.org/jira/browse/YARN-8079
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-8079.001.patch, YARN-8079.002.patch, 
> YARN-8079.003.patch
>
>
> Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly 
> read srcFile; instead it always constructs {{remoteFile}} by using 
> componentDir and the fileName of {{destFile}}:
> {code}
> Path remoteFile = new Path(compInstanceDir, fileName);
> {code} 
> To me it is a common use case in which services have files that already exist 
> in HDFS and need to be localized when components get launched. For example, if 
> we want to serve a Tensorflow model, we need to localize the Tensorflow model 
> (typically not huge, less than a GB) to local disk; otherwise the launched 
> docker container has to access HDFS.
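A hedged sketch of the intended behavior (illustrative only, not the exact ProviderUtils change): honor the spec's srcFile when it is set, and only fall back to composing the path from the component instance dir and the dest file name:

{code:java}
import org.apache.hadoop.fs.Path;

public class ConfigFileSourceSketch {

  // srcFile and destFile are the spec fields discussed above; compInstanceDir
  // is the per-instance directory ProviderUtils already knows about.
  static Path resolveRemoteFile(String srcFile, String destFile,
      Path compInstanceDir) {
    if (srcFile != null && !srcFile.isEmpty()) {
      return new Path(srcFile);                  // user-provided HDFS location
    }
    String fileName = new Path(destFile).getName();
    return new Path(compInstanceDir, fileName);  // current fallback behavior
  }

  public static void main(String[] args) {
    System.out.println(resolveRemoteFile("hdfs:///models/tf-model.pb",
        "/etc/serving/model.pb", new Path("/services/app/comp-0")));
  }
}
{code}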



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919

2018-03-28 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417773#comment-16417773
 ] 

Rohith Sharma K S commented on YARN-7946:
-

Similar changes are required in the Building.txt file as well, i.e. the 2nd 
sentence of the 1st paragraph. 

> Update TimelineServerV2 doc as per YARN-7919
> 
>
> Key: YARN-7946
> URL: https://issues.apache.org/jira/browse/YARN-7946
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-7946.00.patch, YARN-7946.01.patch
>
>
> Post YARN-7919, the documentation needs to be updated for the coprocessor jar 
> name and any other related details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen resolved YARN-7859.
--
Resolution: Won't Do

> New feature: add queue scheduling deadLine in fairScheduler.
> 
>
> Key: YARN-7859
> URL: https://issues.apache.org/jira/browse/YARN-7859
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: wangwj
>Assignee: wangwj
>Priority: Major
>  Labels: fairscheduler, features, patch
> Attachments: YARN-7859-v1.patch, YARN-7859-v2.patch, log, 
> screenshot-1.png, screenshot-3.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> As everyone knows, in FairScheduler queue scheduling starvation often occurs 
> when the number of cluster jobs is large: the apps in one or more queues stay 
> pending. So I have thought of a way to solve this problem: add a queue 
> scheduling deadline in FairScheduler. When a queue has not been scheduled by 
> FairScheduler within a specified time, we schedule it mandatorily!
> On the basis of the above, I propose this issue...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Reopened] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen reopened YARN-7859:
--

> New feature: add queue scheduling deadLine in fairScheduler.
> 
>
> Key: YARN-7859
> URL: https://issues.apache.org/jira/browse/YARN-7859
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: wangwj
>Assignee: wangwj
>Priority: Major
>  Labels: fairscheduler, features, patch
> Attachments: YARN-7859-v1.patch, YARN-7859-v2.patch, log, 
> screenshot-1.png, screenshot-3.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> As everyone knows, in FairScheduler queue scheduling starvation often occurs 
> when the number of cluster jobs is large: the apps in one or more queues stay 
> pending. So I have thought of a way to solve this problem: add a queue 
> scheduling deadline in FairScheduler. When a queue has not been scheduled by 
> FairScheduler within a specified time, we schedule it mandatorily!
> On the basis of the above, I propose this issue...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7859:
-
 Hadoop Flags:   (was: Reviewed)
Fix Version/s: (was: 3.0.0)

> New feature: add queue scheduling deadLine in fairScheduler.
> 
>
> Key: YARN-7859
> URL: https://issues.apache.org/jira/browse/YARN-7859
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: wangwj
>Assignee: wangwj
>Priority: Major
>  Labels: fairscheduler, features, patch
> Attachments: YARN-7859-v1.patch, YARN-7859-v2.patch, log, 
> screenshot-1.png, screenshot-3.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> As everyone knows, in FairScheduler queue scheduling starvation often occurs 
> when the number of cluster jobs is large: the apps in one or more queues stay 
> pending. So I have thought of a way to solve this problem: add a queue 
> scheduling deadline in FairScheduler. When a queue has not been scheduled by 
> FairScheduler within a specified time, we schedule it mandatorily!
> On the basis of the above, I propose this issue...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7221) Add security check for privileged docker container

2018-03-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417963#comment-16417963
 ] 

genericqa commented on YARN-7221:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 37s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 25s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 75m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-7221 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916658/YARN-7221.012.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 6b8784f3fffb 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / cdee0a4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20125/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20125/testReport/ |
| Max. process+thread count | 341 (vs. ulimit of 1) |
| modules | C: 

[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance

2018-03-28 Thread Chandni Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418014#comment-16418014
 ] 

Chandni Singh commented on YARN-7939:
-

[~gsaha]
{quote}Yes, we should use UPGRADING instead of UPGRADE (which is an action 
verb).
{quote}
This is inconsistent with other states. Please see my previous comment.
 # To trigger stop of the service, the ServiceState that is specified is 
{{STOPPED}} instead of {{STOP}}
 # To trigger start of the service, the ServiceState that is specified is 
{{STARTED}} instead of {{START}}

> Yarn Service Upgrade: add support to upgrade a component instance 
> --
>
> Key: YARN-7939
> URL: https://issues.apache.org/jira/browse/YARN-7939
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-7939.001.patch
>
>
> Yarn core supports in-place upgrade of containers. A yarn service can 
> leverage that to provide in-place upgrade of component instances. Please see 
> YARN-7512 for details.
> Will add support to upgrade a single component instance first and then 
> iteratively add other APIs and features.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7142) Support placement policy in yarn native services

2018-03-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418053#comment-16418053
 ] 

Wangda Tan commented on YARN-7142:
--

Thanks [~gsaha], the last patch looks good.

I would prefer to let another set of eyes to look at this patch, [~sunilg] 
could you help with the patch review? I plan to commit the patch by end of 
tomorrow if no objections / additional reviews.

> Support placement policy in yarn native services
> 
>
> Key: YARN-7142
> URL: https://issues.apache.org/jira/browse/YARN-7142
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Billie Rinaldi
>Assignee: Gour Saha
>Priority: Major
> Attachments: YARN-7142.001.patch, YARN-7142.002.patch, 
> YARN-7142.003.patch, YARN-7142.004.patch
>
>
> Placement policy exists in the API but is not implemented yet.
> I have filed YARN-8074 to move the composite constraints implementation out 
> of this phase-1 implementation of placement policy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-2477) DockerContainerExecutor must support secure mode

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen resolved YARN-2477.
--
Resolution: Won't Fix

Closing this as DockerContainerExecutor has been deprecated in branch-2 and 
removed in trunk.

> DockerContainerExecutor must support secure mode
> 
>
> Key: YARN-2477
> URL: https://issues.apache.org/jira/browse/YARN-2477
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Abin Shahab
>Priority: Major
>  Labels: security
>
> DockerContainerExecutor(patch in YARN-1964) does not support Kerberized 
> hadoop clusters yet, as Kerberized hadoop cluster has a strict dependency on 
> the LinuxContainerExecutor. 
> For Docker containers to be used in production environment, they must support 
> secure hadoop. Issues regarding Java's AES encryption library in a 
> containerized environment also need to be worked out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8079) YARN native service should respect source file of ConfigFile inside Service/Component spec

2018-03-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417747#comment-16417747
 ] 

Wangda Tan commented on YARN-8079:
--

Thanks [~gsaha], 

Is there any additional suggestions to the patch or we're good to go?

> YARN native service should respect source file of ConfigFile inside 
> Service/Component spec
> --
>
> Key: YARN-8079
> URL: https://issues.apache.org/jira/browse/YARN-8079
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-8079.001.patch, YARN-8079.002.patch, 
> YARN-8079.003.patch
>
>
> Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly 
> read srcFile, instead it always construct {{remoteFile}} by using 
> componentDir and fileName of {{destFile}}:
> {code}
> Path remoteFile = new Path(compInstanceDir, fileName);
> {code} 
> To me it is a common use case which services have some files existed in HDFS 
> and need to be localized when components get launched. (For example, if we 
> want to serve a Tensorflow model, we need to localize Tensorflow model 
> (typically not huge, less than GB) to local disk. Otherwise launched docker 
> container has to access HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8079) YARN native service should respect source file of ConfigFile inside Service/Component spec

2018-03-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417852#comment-16417852
 ] 

Wangda Tan commented on YARN-8079:
--

[~eyang], thanks for the review!

> YARN native service should respect source file of ConfigFile inside 
> Service/Component spec
> --
>
> Key: YARN-8079
> URL: https://issues.apache.org/jira/browse/YARN-8079
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-8079.001.patch, YARN-8079.002.patch, 
> YARN-8079.003.patch
>
>
> Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly 
> read srcFile, instead it always construct {{remoteFile}} by using 
> componentDir and fileName of {{destFile}}:
> {code}
> Path remoteFile = new Path(compInstanceDir, fileName);
> {code} 
> To me it is a common use case which services have some files existed in HDFS 
> and need to be localized when components get launched. (For example, if we 
> want to serve a Tensorflow model, we need to localize Tensorflow model 
> (typically not huge, less than GB) to local disk. Otherwise launched docker 
> container has to access HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8048) Support auto-spawning of admin configured services during bootstrap of rm/apiserver

2018-03-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417877#comment-16417877
 ] 

Wangda Tan commented on YARN-8048:
--

[~rohithsharma],

Thanks for your responses.

bq. For the 2nd level, it's better to only read files ended with .yarnfile.
What's your opinion for this?

bq. making synchronization will delay RM transition period if more services to 
be started.
I'm not sure about correctness of this behavior. Maybe have two sub folders, 
sync/async under the service root, so admin can choose:
{code}
   service-root
   sync 
user1 
service1.yarnfile
user2
serivce2.yarnfile 
   async 
user3
... 
{code}
I think we should consider this at least in the dir hierarchy otherwise it will 
be very hard to add the new field.

> Support auto-spawning of admin configured services during bootstrap of 
> rm/apiserver
> ---
>
> Key: YARN-8048
> URL: https://issues.apache.org/jira/browse/YARN-8048
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-8048.001.patch, YARN-8048.002.patch, 
> YARN-8048.003.patch, YARN-8048.004.patch, YARN-8048.005.patch
>
>
> Goal is to support auto-spawning of admin configured services during 
> bootstrap of resourcemanager/apiserver. 
> *Requirement:* Some of the  services might required to be consumed by yarn 
> itself ex: Hbase for atsv2. Instead of depending on user installed HBase or 
> sometimes user may not required to install HBase at all, in such conditions 
> running HBase app on YARN will help for ATSv2.
> Before YARN cluster is started, admin configure these services spec and place 
> it in common location in HDFS. At the time of RM/apiServer bootstrap, these 
> services will be submitted.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8012) Support Unmanaged Container Cleanup

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-8012:
-
Target Version/s: 2.7.1, 3.2.0  (was: 2.7.1, 3.0.0)

> Support Unmanaged Container Cleanup
> ---
>
> Key: YARN-8012
> URL: https://issues.apache.org/jira/browse/YARN-8012
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Yuqi Wang
>Assignee: Yuqi Wang
>Priority: Major
> Attachments: YARN-8012 - Unmanaged Container Cleanup.pdf, 
> YARN-8012-branch-2.7.1.001.patch
>
>
> An *unmanaged container / leaked container* is a container which is no longer 
> managed by NM. Thus, it is cannot be managed / leaked by YARN, too.
> *There are many cases a YARN managed container can become unmanaged, such as:*
>  * NM service is disabled or removed on the node.
>  * NM is unable to start up again on the node, such as depended 
> configuration, or resources cannot be ready.
>  * NM local leveldb store is corrupted or lost, such as bad disk sectors.
>  * NM has bugs, such as wrongly mark live container as complete.
> Note, they are caused or things become worse if work-preserving NM restart 
> enabled, see YARN-1336
> *Bad impacts of unmanaged container, such as:*
>  # Resource cannot be managed for YARN on the node:
>  ** Cause YARN on the node resource leak
>  ** Cannot kill the container to release YARN resource on the node to free up 
> resource for other urgent computations on the node.
>  # Container and App killing is not eventually consistent for App user:
>  ** App which has bugs can still produce bad impacts to outside even if the 
> App is killed for a long time



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8079) YARN native service should respect source file of ConfigFile inside Service/Component spec

2018-03-28 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417920#comment-16417920
 ] 

Eric Yang commented on YARN-8079:
-

[~leftnoteasy] Could you update yarn site markdown with the proper syntax?  
Thanks

> YARN native service should respect source file of ConfigFile inside 
> Service/Component spec
> --
>
> Key: YARN-8079
> URL: https://issues.apache.org/jira/browse/YARN-8079
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-8079.001.patch, YARN-8079.002.patch, 
> YARN-8079.003.patch
>
>
> Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly 
> read srcFile, instead it always construct {{remoteFile}} by using 
> componentDir and fileName of {{destFile}}:
> {code}
> Path remoteFile = new Path(compInstanceDir, fileName);
> {code} 
> To me it is a common use case which services have some files existed in HDFS 
> and need to be localized when components get launched. (For example, if we 
> want to serve a Tensorflow model, we need to localize Tensorflow model 
> (typically not huge, less than GB) to local disk. Otherwise launched docker 
> container has to access HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8084) Yarn native service classes renaming for easier development?

2018-03-28 Thread Gour Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417965#comment-16417965
 ] 

Gour Saha commented on YARN-8084:
-

For the service.component.Component I suggest to rename it to ComponentManager 
(similar to ServiceManager).

> Yarn native service classes renaming for easier development? 
> -
>
> Key: YARN-8084
> URL: https://issues.apache.org/jira/browse/YARN-8084
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Priority: Major
>
> There're a couple of classes with same name exists in YARN native service. 
> Such as: 
> 1) ...service.component.Component and api.records.Component.
> This makes harder when development in IDE since clash of class name forces to 
> use full qualified class name.
> Similarly in API definition:
> ...service.api.records:
> Container/ContainerState/Resource/ResourceInformation. How about rename them 
> to:
> ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8077) The vmemLimit parameter in ContainersMonitorImpl#isProcessTreeOverLimit is confusing

2018-03-28 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417735#comment-16417735
 ] 

Miklos Szegedi commented on YARN-8077:
--

The Jenkins failure seems to be unrelated (protoc). Let me look into this.

> The vmemLimit parameter in ContainersMonitorImpl#isProcessTreeOverLimit is 
> confusing
> 
>
> Key: YARN-8077
> URL: https://issues.apache.org/jira/browse/YARN-8077
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0
>Reporter: Sen Zhao
>Assignee: Sen Zhao
>Priority: Trivial
> Fix For: 3.2.0
>
> Attachments: YARN-8077.001.patch
>
>
> The parameter should be memLimit.   It contains the meaning of vmemLimit and 
> pmemLimit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8080) YARN native service should support component restart policy

2018-03-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8080:
-
Attachment: YARN-8080.004.patch

> YARN native service should support component restart policy
> ---
>
> Key: YARN-8080
> URL: https://issues.apache.org/jira/browse/YARN-8080
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8080.001.patch, YARN-8080.002.patch, 
> YARN-8080.003.patch, YARN-8080.004.patch
>
>
> Existing native service assumes the service is long running and never 
> finishes. Containers will be restarted even if exit code == 0. 
> To support boarder use cases, we need to allow restart policy of component 
> specified by users. Propose to have following policies:
> 1) Always: containers always restarted by framework regardless of container 
> exit status. This is existing/default behavior.
> 2) Never: Do not restart containers in any cases after container finishes: To 
> support job-like workload (for example Tensorflow training job). If a task 
> exit with code == 0, we should not restart the task. This can be used by 
> services which is not restart/recovery-able.
> 3) On-failure: Similar to above, only restart task with exitcode != 0. 
> Behaviors after component *instance* finalize (Succeeded or Failed when 
> restart_policy != ALWAYS): 
> 1) For single component, single instance: complete service.
> 2) For single component, multiple instance: other running instances from the 
> same component won't be affected by the finalized component instance. Service 
> will be terminated once all instances finalized. 
> 3) For multiple components: Service will be terminated once all components 
> finalized.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7623) Fix the CapacityScheduler Queue configuration documentation

2018-03-28 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417937#comment-16417937
 ] 

Jonathan Hung commented on YARN-7623:
-

Thanks!

> Fix the CapacityScheduler Queue configuration documentation
> ---
>
> Key: YARN-7623
> URL: https://issues.apache.org/jira/browse/YARN-7623
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Jonathan Hung
>Priority: Major
> Fix For: 2.10.0, 2.9.1, 3.0.2, 3.1.1
>
> Attachments: Screen Shot 2018-03-27 at 3.02.45 PM.png, 
> YARN-7623.001.patch, YARN-7623.002.patch
>
>
> It looks like the [Changing Queue 
> Configuration|https://hadoop.apache.org/docs/r2.9.0/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Changing_queue_configuration_via_API]
>  section is mis-formatted.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8048) Support auto-spawning of admin configured services during bootstrap of rm/apiserver

2018-03-28 Thread Gour Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417980#comment-16417980
 ] 

Gour Saha commented on YARN-8048:
-

I think it is okay to assume that if a service needs to be started as a system 
service, then it needs to be dropped in the system service dir and with 
.yarnfile extension. It shouldn't affect other areas of YARN Service as it will 
continue to allow launch using any file-name as long it is a valid JSON.

> Support auto-spawning of admin configured services during bootstrap of 
> rm/apiserver
> ---
>
> Key: YARN-8048
> URL: https://issues.apache.org/jira/browse/YARN-8048
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-8048.001.patch, YARN-8048.002.patch, 
> YARN-8048.003.patch, YARN-8048.004.patch, YARN-8048.005.patch
>
>
> Goal is to support auto-spawning of admin configured services during 
> bootstrap of resourcemanager/apiserver. 
> *Requirement:* Some of the  services might required to be consumed by yarn 
> itself ex: Hbase for atsv2. Instead of depending on user installed HBase or 
> sometimes user may not required to install HBase at all, in such conditions 
> running HBase app on YARN will help for ATSv2.
> Before YARN cluster is started, admin configure these services spec and place 
> it in common location in HDFS. At the time of RM/apiServer bootstrap, these 
> services will be submitted.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance

2018-03-28 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418070#comment-16418070
 ] 

Eric Yang commented on YARN-7939:
-

[~csingh] It would be nice if we clean up user submitted state to: START, STOP, 
FLEX, UPGRADE.  ServiceClient change it to STARTING, STOPPING, FLEXING, 
UPGRADING.  After action is completed, change it to STABLE.  We keep STOPPED, 
and STARTED keyword for backward compatibility.  Sorry, this part was messed up 
during the original implementation.  Can you show example of JSON that is used 
to trigger REST API upgrade?  I am getting error 500, and no errors in the log 
file.  I am unsure what is wrong in my test samples.  It would be nice to see a 
working example.  Thanks.

> Yarn Service Upgrade: add support to upgrade a component instance 
> --
>
> Key: YARN-7939
> URL: https://issues.apache.org/jira/browse/YARN-7939
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-7939.001.patch
>
>
> Yarn core supports in-place upgrade of containers. A yarn service can 
> leverage that to provide in-place upgrade of component instances. Please see 
> YARN-7512 for details.
> Will add support to upgrade a single component instance first and then 
> iteratively add other APIs and features.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-2480) DockerContainerExecutor must support user namespaces

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen resolved YARN-2480.
--
Resolution: Won't Fix

Closing this as DockerContainerExecutor has been deprecated in branch-2 and 
removed in trunk

> DockerContainerExecutor must support user namespaces
> 
>
> Key: YARN-2480
> URL: https://issues.apache.org/jira/browse/YARN-2480
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Abin Shahab
>Priority: Major
>  Labels: security
>
> When DockerContainerExector launches a container, the root inside that 
> container has root privileges on the host. 
> This is insecure in a mult-tenant environment. The uid of the container's 
> root user must be mapped to a non-privileged user on the host.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-3988) DockerContainerExecutor should allow user specify "docker run" parameters

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen resolved YARN-3988.
--
Resolution: Won't Fix

Closing this as DockerContainerExecutor has been deprecated in branch-2 and 
removed in trunk

> DockerContainerExecutor should allow user specify "docker run" parameters
> -
>
> Key: YARN-3988
> URL: https://issues.apache.org/jira/browse/YARN-3988
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Chen He
>Assignee: Chen He
>Priority: Major
>
> In current DockerContainerExecutor, the "docker run" command has fixed 
> parameters:
> String commandStr = commands.append(dockerExecutor)
>   .append(" ")
>   .append("run")
>   .append(" ")
>   .append("--rm --net=host")
>   .append(" ")
>   .append(" --name " + containerIdStr)
>   .append(localDirMount)
>   .append(logDirMount)
>   .append(containerWorkDirMount)
>   .append(" ")
>   .append(containerImageName)
>   .toString();
> For example, it is not flexible if users want to start a docker container 
> with attaching extra volume(s) and other "docker run" parameters. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7690) expose reserved memory/Vcores of nodemanager at webUI

2018-03-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417883#comment-16417883
 ] 

genericqa commented on YARN-7690:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-7690 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7690 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903982/YARN-7690.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20127/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> expose reserved memory/Vcores of  nodemanager at webUI
> --
>
> Key: YARN-7690
> URL: https://issues.apache.org/jira/browse/YARN-7690
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Reporter: tianjuan
>Priority: Major
> Attachments: YARN-7690.patch
>
>
> now only total reserved memory/Vcores are exposed at RM webUI, reserved 
> memory/Vcores of a single nodemanager is hard to find out. it confuses users 
> that they obeserve that there are available memory/Vcores at nodes page, but 
> their jobs are stuck and waiting for resouce to be allocated. It is helpful 
> for bedug to expose reserved memory/Vcores of every single nodemanager, and 
> memory/Vcores that can be allocated( unallocated minus reserved)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8084) Yarn native service rename for easier development?

2018-03-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417882#comment-16417882
 ] 

Wangda Tan commented on YARN-8084:
--

cc: [~gsaha]/[~billie.rinaldi]/[~eyang]

> Yarn native service rename for easier development?
> --
>
> Key: YARN-8084
> URL: https://issues.apache.org/jira/browse/YARN-8084
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Priority: Major
>
> There're a couple of classes with same name exists in YARN native service. 
> Such as: 
> 1) ...service.component.Component and api.records.Component.
> This makes harder when development in IDE since clash of class name forces to 
> use full qualified class name.
> Similarly in API definition:
> ...service.api.records:
> Container/ContainerState/Resource/ResourceInformation. How about rename them 
> to:
> ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8051) TestRMEmbeddedElector#testCallbackSynchronization is flakey

2018-03-28 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417891#comment-16417891
 ] 

Haibo Chen commented on YARN-8051:
--

Thanks [~rkanter] for the patch. HADOOP-12427 is dedicated to the mockito 
upgrade. There seems to be some incompatibility issues indicated there in the 
discussion?

If there is indeed some issues with upgrade mockito, we can fix the unit test 
without mockito upgrade.   Instead of mocking AdminService, we can create a 
subclass of AdminService in the test that tracks and exposes how many times the 
transitionTo* methods are called.

> TestRMEmbeddedElector#testCallbackSynchronization is flakey
> ---
>
> Key: YARN-8051
> URL: https://issues.apache.org/jira/browse/YARN-8051
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 3.2.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Major
> Attachments: YARN-8051.001.patch
>
>
> We've seen some rare flakey failures in 
> {{TestRMEmbeddedElector#testCallbackSynchronization}}:
> {noformat}
> org.mockito.exceptions.verification.WantedButNotInvoked: 
> Wanted but not invoked:
> adminService.transitionToStandby();
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector.testCallbackSynchronizationNeutral(TestRMEmbeddedElector.java:215)
> Actually, there were zero interactions with this mock.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector.testCallbackSynchronizationNeutral(TestRMEmbeddedElector.java:215)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector.testCallbackSynchronization(TestRMEmbeddedElector.java:146)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector.testCallbackSynchronization(TestRMEmbeddedElector.java:109)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8084) Yarn native service classes renaming for easier development?

2018-03-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8084:
-
Summary: Yarn native service classes renaming for easier development?   
(was: Yarn native service rename for easier development?)

> Yarn native service classes renaming for easier development? 
> -
>
> Key: YARN-8084
> URL: https://issues.apache.org/jira/browse/YARN-8084
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Priority: Major
>
> There're a couple of classes with same name exists in YARN native service. 
> Such as: 
> 1) ...service.component.Component and api.records.Component.
> This makes harder when development in IDE since clash of class name forces to 
> use full qualified class name.
> Similarly in API definition:
> ...service.api.records:
> Container/ContainerState/Resource/ResourceInformation. How about rename them 
> to:
> ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance

2018-03-28 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417907#comment-16417907
 ] 

Eric Yang commented on YARN-7939:
-

[~csingh] Thank you for the patch.  At high level, the patch looks good.  I 
think we need to open a task to fix the CLI command to support upgrade option 
or add to the current patch to call the newly introduced actionUpgrade.  My 
preference would be included here for completeness.  I don't spot much issues 
other than checkstyle reported issues.

> Yarn Service Upgrade: add support to upgrade a component instance 
> --
>
> Key: YARN-7939
> URL: https://issues.apache.org/jira/browse/YARN-7939
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-7939.001.patch
>
>
> Yarn core supports in-place upgrade of containers. A yarn service can 
> leverage that to provide in-place upgrade of component instances. Please see 
> YARN-7512 for details.
> Will add support to upgrade a single component instance first and then 
> iteratively add other APIs and features.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7988) Refactor FSNodeLabelStore code for attributes store support

2018-03-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417939#comment-16417939
 ] 

genericqa commented on YARN-7988:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-3409 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
59s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
48s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
17s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
39s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
41s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
23s{color} | {color:green} YARN-3409 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
59s{color} | {color:green} hadoop-yarn-project_hadoop-yarn generated 0 new + 86 
unchanged - 1 fixed = 86 total (was 87) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  2s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 20 new + 62 unchanged - 22 fixed = 82 total (was 84) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
47s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
45s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common 
generated 2 new + 4183 unchanged - 0 fixed = 4185 total (was 4183) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
7s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m 
35s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}159m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7988 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916611/YARN-7988-YARN-3409.007.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 6e42c7a1c1d9 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919

2018-03-28 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417960#comment-16417960
 ] 

Rohith Sharma K S commented on YARN-7946:
-

+1 lgtm.. [~vrushalic] would you take a look at the patch?

> Update TimelineServerV2 doc as per YARN-7919
> 
>
> Key: YARN-7946
> URL: https://issues.apache.org/jira/browse/YARN-7946
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-7946.00.patch, YARN-7946.01.patch, 
> YARN-7946.02.patch
>
>
> Post YARN-7919, document need to be updated for co processor jar name and 
> other related details if any.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8084) Yarn native service classes renaming for easier development?

2018-03-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418059#comment-16418059
 ] 

Wangda Tan commented on YARN-8084:
--

[~gsaha], 

ComponentManager sounds like a manager of a bunch of components, if you don't 
like "ServiceComponent", how about call it "RuntimeComponent"? 

> Yarn native service classes renaming for easier development? 
> -
>
> Key: YARN-8084
> URL: https://issues.apache.org/jira/browse/YARN-8084
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Priority: Major
>
> There're a couple of classes with same name exists in YARN native service. 
> Such as: 
> 1) ...service.component.Component and api.records.Component.
> This makes harder when development in IDE since clash of class name forces to 
> use full qualified class name.
> Similarly in API definition:
> ...service.api.records:
> Container/ContainerState/Resource/ResourceInformation. How about rename them 
> to:
> ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8082) Include LocalizedResource size information in the NM download log for localization

2018-03-28 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418060#comment-16418060
 ] 

Jason Lowe commented on YARN-8082:
--

Thanks for the patch!  The line-length checkstyle issue should be fixed 
otherwise looks good.


> Include LocalizedResource size information in the NM download log for 
> localization
> --
>
> Key: YARN-8082
> URL: https://issues.apache.org/jira/browse/YARN-8082
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
>Priority: Minor
> Attachments: YARN-8082.001.patch
>
>
> The size of the resource that finished downloading helps with debugging 
> localization delays and failures. A close approximate local size of the 
> resource is available in the LocalizedResource object which can be used to 
> address this minor change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-2482) DockerContainerExecutor configuration

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen resolved YARN-2482.
--
Resolution: Won't Fix

Closing this as DockerContainerExecutor has been deprecated in branch-2 and 
removed in trunk

> DockerContainerExecutor configuration
> -
>
> Key: YARN-2482
> URL: https://issues.apache.org/jira/browse/YARN-2482
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Abin Shahab
>Priority: Major
>  Labels: security
>
> Currently DockerContainerExecutor can be configured from yarn-site.xml, and 
> users can add arbtrary arguments to the container launch command. This should 
> be fixed so that the cluster and other jobs are protected from malicious 
> string injections.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-2479) DockerContainerExecutor must support handling of distributed cache

2018-03-28 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen resolved YARN-2479.
--
Resolution: Won't Fix

Closing this as DockerContainerExecutor has been deprecated in branch-2 and 
removed in trunk

> DockerContainerExecutor must support handling of distributed cache
> --
>
> Key: YARN-2479
> URL: https://issues.apache.org/jira/browse/YARN-2479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Abin Shahab
>Priority: Major
>  Labels: security
>
> Interaction between Docker containers and distributed cache has not yet been 
> worked out. There should be a way to securely access distributed cache 
> without compromising the isolation Docker provides.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8010) Add config in FederationRMFailoverProxy to not bypass facade cache when failing over

2018-03-28 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417768#comment-16417768
 ] 

Botong Huang commented on YARN-8010:


Thanks [~subru] and [~giovanni.fumarola]!

> Add config in FederationRMFailoverProxy to not bypass facade cache when 
> failing over
> 
>
> Key: YARN-8010
> URL: https://issues.apache.org/jira/browse/YARN-8010
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Fix For: 2.10.0, 2.9.1, 3.1.1
>
> Attachments: YARN-8010.v1.patch, YARN-8010.v1.patch, 
> YARN-8010.v2.patch, YARN-8010.v3.patch
>
>
> Today when YarnRM is failing over, the FederationRMFailoverProxy running in 
> AMRMProxy will perform failover, try to get latest subcluster info from 
> FederationStateStore and then retry connect to the latest YarnRM master. When 
> calling getSubCluster() to FederationStateStoreFacade, it bypasses the cache 
> with a flush flag. When YarnRM is failing over, every AM heartbeat thread 
> creates a different thread inside FederationInterceptor, each of which keeps 
> performing failover several times. This leads to a big spike of getSubCluster 
> call to FederationStateStore. 
> Depending on the cluster setup (e.g. putting a VIP before all YarnRMs), 
> YarnRM master slave change might not result in RM addr change. In other 
> cases, a small delay of getting latest subcluster information may be 
> acceptable. This patch thus creates a config option, so that it is possible 
> to ask the FederationRMFailoverProxy to not flush cache when calling 
> getSubCluster(). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8079) YARN native service should respect source file of ConfigFile inside Service/Component spec

2018-03-28 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417816#comment-16417816
 ] 

Eric Yang commented on YARN-8079:
-

[~leftnoteasy] For accessing remote HDFS, it requires username + password of 
the remote cluster, and the cluster has a way to contact to remote cluster KDC 
server to verify the user.  I don't think Hadoop supports 
hdfs://user:pass@cluster:port/path.  I think remoteFile throw me off in 
thinking to access another HDFS other than current cluster.  Sorry for the 
confusion.  For S3, s3://ID:SECRET@BUCKET/ maybe this works.  +1 for patch 3.

> YARN native service should respect source file of ConfigFile inside 
> Service/Component spec
> --
>
> Key: YARN-8079
> URL: https://issues.apache.org/jira/browse/YARN-8079
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-8079.001.patch, YARN-8079.002.patch, 
> YARN-8079.003.patch
>
>
> Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly 
> read srcFile, instead it always construct {{remoteFile}} by using 
> componentDir and fileName of {{destFile}}:
> {code}
> Path remoteFile = new Path(compInstanceDir, fileName);
> {code} 
> To me it is a common use case which services have some files existed in HDFS 
> and need to be localized when components get launched. (For example, if we 
> want to serve a Tensorflow model, we need to localize Tensorflow model 
> (typically not huge, less than GB) to local disk. Otherwise launched docker 
> container has to access HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6629) NPE occurred when container allocation proposal is applied but its resource requests are removed before

2018-03-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417905#comment-16417905
 ] 

Hudson commented on YARN-6629:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13893 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13893/])
YARN-6629. NPE occurred when container allocation proposal is applied (wangda: 
rev 47f711eebca315804c80012eea5f31275ac25518)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java


> NPE occurred when container allocation proposal is applied but its resource 
> requests are removed before
> ---
>
> Key: YARN-6629
> URL: https://issues.apache.org/jira/browse/YARN-6629
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-6629.001.patch, YARN-6629.002.patch, 
> YARN-6629.003.patch, YARN-6629.004.patch, YARN-6629.005.patch, 
> YARN-6629.006.patch
>
>
> I wrote a test case to reproduce another problem for branch-2 and found new 
> NPE error,  log: 
> {code}
> FATAL event.EventDispatcher (EventDispatcher.java:run(75)) - Error in 
> handling event type NODE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:446)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:516)
> at 
> org.apache.hadoop.yarn.client.TestNegativePendingResource$1.answer(TestNegativePendingResource.java:225)
> at 
> org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:31)
> at org.mockito.internal.MockHandler.handle(MockHandler.java:97)
> at 
> org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:47)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp$$EnhancerByMockitoWithCGLIB$$29eb8afc.apply()
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2281)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1247)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1236)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1325)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1112)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:987)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1367)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:143)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> Reproduce this error in chronological order:
> 1. AM started and requested 1 container with schedulerRequestKey#1 : 
> ApplicationMasterService#allocate -->  CapacityScheduler#allocate --> 
> SchedulerApplicationAttempt#updateResourceRequests --> 
> AppSchedulingInfo#updateResourceRequests 
> Added schedulerRequestKey#1 into schedulerKeyToPlacementSets
> 2. Scheduler allocated 1 container for this request and accepted the proposal
> 3. AM removed this request
> ApplicationMasterService#allocate -->  CapacityScheduler#allocate --> 
> SchedulerApplicationAttempt#updateResourceRequests --> 
> 

[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance

2018-03-28 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417916#comment-16417916
 ] 

Eric Yang commented on YARN-7939:
-

[~csingh] When a user submits a curl PUT request, does the ServiceState need to be 
specified as UPGRADING to trigger the upgrade?  It seems more intuitive if the 
requested state is called UPGRADE, and UPGRADING is set by ServiceClient once the 
request is received.
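
For context, a hypothetical sketch of the PUT under discussion; the endpoint path and 
JSON field names here are assumptions made for illustration, not quoted from the 
service API docs:

{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ServiceUpgradePutSketch {
  public static void main(String[] args) throws Exception {
    // Desired state in the body: "UPGRADING" today, or "UPGRADE" per the
    // suggestion above, with ServiceClient flipping it to UPGRADING internally.
    String body = "{\"state\": \"UPGRADING\", \"version\": \"v2\"}";
    URL url = new URL("http://rm-host:8088/app/v1/services/my-service");

    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setDoOutput(true);
    conn.setRequestProperty("Content-Type", "application/json");
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body.getBytes(StandardCharsets.UTF_8));
    }
    System.out.println("HTTP status: " + conn.getResponseCode());
    conn.disconnect();
  }
}
{code}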

> Yarn Service Upgrade: add support to upgrade a component instance 
> --
>
> Key: YARN-7939
> URL: https://issues.apache.org/jira/browse/YARN-7939
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-7939.001.patch
>
>
> Yarn core supports in-place upgrade of containers. A yarn service can 
> leverage that to provide in-place upgrade of component instances. Please see 
> YARN-7512 for details.
> Will add support to upgrade a single component instance first and then 
> iteratively add other APIs and features.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8048) Support auto-spawning of admin configured services during bootstrap of rm/apiserver

2018-03-28 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417956#comment-16417956
 ] 

Rohith Sharma K S commented on YARN-8048:
-

bq. For the 2nd level, it's better to only read files ended with .yarnfile
I am in a dilemma! Given that the native service framework enforces all spec files to 
end with .yarnfile, i.e. it standardizes the spec file extension, it makes sense to 
check the file extension. Otherwise it is a normal JSON file and no extension check is 
required. In my last patch I incorporated this file extension check, but I am _not 
sure_ whether native services are going to make this a standard. I see that the CLI 
command which submits a service reads a normal JSON file, i.e. 
ApiServiceClient#loadAppJsonFromLocalFS, and it doesn't check for the .yarnfile 
extension (a sketch of such a filter follows after this comment).

bq. I'm not sure about correctness of this behavior.
Though it is called async mode, it is not executed after some delay. It is started in 
a separate thread which in turn launches the services almost immediately. This lets 
the admin service release the lock faster while transitioning to the active state. 
However, going with a sync/async folder hierarchy also makes sense to me. I will 
update the patch with this change.
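
A minimal sketch of such an extension filter, assuming the specs sit under some 
configurable HDFS directory; the class and constant names here are illustrative, not 
taken from the patch:

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

public class YarnfileSpecLister {
  // Hypothetical constant; whether to enforce this extension is the open question above.
  private static final String SPEC_FILE_EXTENSION = ".yarnfile";

  /** Returns only the spec files ending with .yarnfile under the given directory. */
  public static List<Path> listSpecFiles(Configuration conf, Path specDir)
      throws IOException {
    FileSystem fs = specDir.getFileSystem(conf);
    PathFilter yarnfileFilter = new PathFilter() {
      @Override
      public boolean accept(Path path) {
        return path.getName().endsWith(SPEC_FILE_EXTENSION);
      }
    };
    List<Path> specs = new ArrayList<>();
    // listStatus applies the filter to the listing, so plain .json files
    // in the same directory are simply skipped.
    for (FileStatus status : fs.listStatus(specDir, yarnfileFilter)) {
      if (status.isFile()) {
        specs.add(status.getPath());
      }
    }
    return specs;
  }
}
{code}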

> Support auto-spawning of admin configured services during bootstrap of 
> rm/apiserver
> ---
>
> Key: YARN-8048
> URL: https://issues.apache.org/jira/browse/YARN-8048
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-8048.001.patch, YARN-8048.002.patch, 
> YARN-8048.003.patch, YARN-8048.004.patch, YARN-8048.005.patch
>
>
> The goal is to support auto-spawning of admin-configured services during 
> bootstrap of the resourcemanager/apiserver. 
> *Requirement:* Some of the services might be required to be consumed by YARN 
> itself, e.g. HBase for ATSv2. Instead of depending on a user-installed HBase, or when 
> the user does not need to install HBase at all, running an HBase app on YARN will 
> help ATSv2.
> Before the YARN cluster is started, the admin configures these service specs and 
> places them in a common location in HDFS. At the time of RM/apiServer bootstrap, 
> these services will be submitted.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8080) YARN native service should support component restart policy

2018-03-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417990#comment-16417990
 ] 

genericqa commented on YARN-8080:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
55s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 12s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 30 new + 70 unchanged - 0 fixed = 100 total (was 70) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  6m 
14s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
43s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-8080 |
| JIRA Patch URL | 

[jira] [Comment Edited] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active

2018-03-28 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418450#comment-16418450
 ] 

Weiwei Yang edited comment on YARN-8085 at 3/29/18 5:18 AM:


Thanks [~Tao Yang], nice catch. ResourceProfilesManager was added in YARN-5707, 
so the only affected version is 3.1.0. Since this would cause an NPE during fail-over, 
I am not sure if we can get this into 3.1.0 as it has already entered RC0. + [~wangda] 
to the loop.

Regarding the fix, can we move {{ResourceProfilesManager}} into the 
{{RMServiceContext}}? As ResourceManager#resetRMContext is supposed to get the 
reset done by
{code:java}
rmContextImpl.setServiceContext(rmContext.getServiceContext());
{code}
I don't think we need an extra set here. Does that make sense?

Thanks


was (Author: cheersyang):
Thanks [~Tao Yang], nice catch. ResourceProfilesManager was added in YARN-5707, 
so the only affected version is 3.1.0. Since this would cause an NPE during fail-over, 
I am not sure if we can get this into 3.1.0 as it has already entered RC0. + [~wangda] 
to the loop.

Regarding the fix, can we move \{{ResourceProfilesManager}} into the 
\{{RMServiceContext}}? As ResourceManager#resetRMContext is supposed to get the 
reset done by

{code}
rmContextImpl.setServiceContext(rmContext.getServiceContext());
{code}

I don't think we need an extra set here. Does that make sense?

Thanks

> RMContext#resourceProfilesManager is lost after RM went standby then back to 
> active
> ---
>
> Key: YARN-8085
> URL: https://issues.apache.org/jira/browse/YARN-8085
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8085.001.patch
>
>
> We submitted a distributed shell application after RM failover and back to 
> active, then got an NPE error in the RM log:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {noformat}
> The cause is that currently resourceProfilesManager is not transferred to new 
> RMContext instance in RMContext#resetRMContext. We should do this transfer to 
> fix this error.
> {code:java}
> @@ -1488,6 +1488,10 @@ private void resetRMContext() {
>  // transfer service context to new RM service Context
>  rmContextImpl.setServiceContext(rmContext.getServiceContext());
> +// transfer resource profiles manager
> +rmContextImpl
> +.setResourceProfilesManager(rmContext.getResourceProfilesManager());
> +
>  // reset dispatcher
>  Dispatcher dispatcher = setupDispatcher();
>  ((Service) dispatcher).init(this.conf);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active

2018-03-28 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8085:
--
Affects Version/s: (was: 3.2.0)
   3.1.0

> RMContext#resourceProfilesManager is lost after RM went standby then back to 
> active
> ---
>
> Key: YARN-8085
> URL: https://issues.apache.org/jira/browse/YARN-8085
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8085.001.patch
>
>
> We submitted a distributed shell application after RM failover and back to 
> active, then got an NPE error in the RM log:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {noformat}
> The cause is that currently resourceProfilesManager is not transferred to new 
> RMContext instance in RMContext#resetRMContext. We should do this transfer to 
> fix this error.
> {code:java}
> @@ -1488,6 +1488,10 @@ private void resetRMContext() {
>  // transfer service context to new RM service Context
>  rmContextImpl.setServiceContext(rmContext.getServiceContext());
> +// transfer resource profiles manager
> +rmContextImpl
> +.setResourceProfilesManager(rmContext.getResourceProfilesManager());
> +
>  // reset dispatcher
>  Dispatcher dispatcher = setupDispatcher();
>  ((Service) dispatcher).init(this.conf);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active

2018-03-28 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418450#comment-16418450
 ] 

Weiwei Yang commented on YARN-8085:
---

Thanks [~Tao Yang], nice catch. ResourceProfilesManager was added in YARN-5707, 
so the only affected version is 3.1.0. Since this would cause an NPE during fail-over, 
I am not sure if we can get this into 3.1.0 as it has already entered RC0. + [~wangda] 
to the loop.

Regarding the fix, can we move \{{ResourceProfilesManager}} into the 
\{{RMServiceContext}}? As ResourceManager#resetRMContext is supposed to get the 
reset done by

{code}
rmContextImpl.setServiceContext(rmContext.getServiceContext());
{code}

I don't think we need an extra set here. Does that make sense?

Thanks

> RMContext#resourceProfilesManager is lost after RM went standby then back to 
> active
> ---
>
> Key: YARN-8085
> URL: https://issues.apache.org/jira/browse/YARN-8085
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8085.001.patch
>
>
> We submitted a distributed shell application after RM failover and back to 
> active, then got an NPE error in the RM log:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {noformat}
> The cause is that currently resourceProfilesManager is not transferred to new 
> RMContext instance in RMContext#resetRMContext. We should do this transfer to 
> fix this error.
> {code:java}
> @@ -1488,6 +1488,10 @@ private void resetRMContext() {
>  // transfer service context to new RM service Context
>  rmContextImpl.setServiceContext(rmContext.getServiceContext());
> +// transfer resource profiles manager
> +rmContextImpl
> +.setResourceProfilesManager(rmContext.getResourceProfilesManager());
> +
>  // reset dispatcher
>  Dispatcher dispatcher = setupDispatcher();
>  ((Service) dispatcher).init(this.conf);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA

2018-03-28 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418458#comment-16418458
 ] 

Weiwei Yang edited comment on YARN-7497 at 3/29/18 5:34 AM:


Hi [~yangjiandan]

Thanks for the updates. Please see my comments below:
 # YarnConfiguration.HDFS_CONFIGURATION_STORE also needs to be renamed to 
YarnConfiguration.FS_CONFIGURATION_STORE
 # public static final String HDFS_CONFIGURATION_STORE = "hdfs";  >> let's 
rename this to "fs" to be more general
 # FSSchedulerConfigurationStore: I don't see any place that closes the fileSystem. We 
need to ensure logMutation, confirmMutation and retrieve all close the fileSystem 
after they are done using it.

 Thanks


was (Author: cheersyang):
Hi [~yangjiandan]

Thanks for the updates. Please see my comments below:
 # YarnConfiguration.HDFS_CONFIGURATION_STORE also needs to be renamed to 
\{{YarnConfiguration.FS_CONFIGURATION_STORE}}
 # public static final String HDFS_CONFIGURATION_STORE = "hdfs";  >> let's 
rename this to "fs" to be more general
 # FSSchedulerConfigurationStore: I don't see any place that closes the fileSystem. We 
need to ensure logMutation, confirmMutation and retrieve all close the fileSystem 
after they are done using it.

 Thanks

> Add HDFSSchedulerConfigurationStore for RM HA
> -
>
> Key: YARN-7497
> URL: https://issues.apache.org/jira/browse/YARN-7497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Jiandan Yang 
>Assignee: Jiandan Yang 
>Priority: Major
> Attachments: YARN-7497.001.patch, YARN-7497.002.patch, 
> YARN-7497.003.patch, YARN-7497.004.patch, YARN-7497.005.patch, 
> YARN-7497.006.patch, YARN-7497.007.patch, YARN-7497.008.patch, 
> YARN-7497.009.patch
>
>
> YARN-5947 added LeveldbConfigurationStore using Leveldb as the backing store, but 
> it does not support Yarn RM HA. 
> YARN-6840 supports RM HA, but too many scheduler configurations may exceed the 
> znode limit, for example 10 thousand queues.
> HDFSSchedulerConfigurationStore stores the conf file in HDFS; when the RM fails over, 
> the new active RM can load the scheduler configuration from HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA

2018-03-28 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418458#comment-16418458
 ] 

Weiwei Yang commented on YARN-7497:
---

Hi [~yangjiandan]

Thanks for the updates. Please see my comments below:
 # YarnConfiguration.HDFS_CONFIGURATION_STORE also needs to be renamed to 
\{{YarnConfiguration.FS_CONFIGURATION_STORE}}
 # public static final String HDFS_CONFIGURATION_STORE = "hdfs";  >> let's 
rename this to "fs" to be more general
 # FSSchedulerConfigurationStore: I don't see any place that closes the fileSystem. We 
need to ensure logMutation, confirmMutation and retrieve all close the fileSystem 
after they are done using it (see the sketch after this comment).

 Thanks
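
A minimal sketch of the close pattern being asked for, assuming the store obtains its 
own FileSystem via newInstance (so closing it does not disturb the cached instance 
other components share); the class and method names here are illustrative, not the 
actual FSSchedulerConfigurationStore code:

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ConfStoreWriteSketch {
  /** Writes one serialized mutation and closes the FileSystem when done. */
  public static void writeMutation(Configuration conf, Path mutationFile,
      byte[] serializedMutation) throws IOException {
    // try-with-resources closes both the stream and the FileSystem,
    // even if the write throws.
    try (FileSystem fs = FileSystem.newInstance(mutationFile.toUri(), conf);
         FSDataOutputStream out = fs.create(mutationFile, true)) {
      out.write(serializedMutation);
      out.hflush();
    }
  }
}
{code}

An alternative is to keep a single FileSystem for the store's lifetime and close it 
once in the store's own close/stop hook; either way the instance should not leak.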

> Add HDFSSchedulerConfigurationStore for RM HA
> -
>
> Key: YARN-7497
> URL: https://issues.apache.org/jira/browse/YARN-7497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Jiandan Yang 
>Assignee: Jiandan Yang 
>Priority: Major
> Attachments: YARN-7497.001.patch, YARN-7497.002.patch, 
> YARN-7497.003.patch, YARN-7497.004.patch, YARN-7497.005.patch, 
> YARN-7497.006.patch, YARN-7497.007.patch, YARN-7497.008.patch, 
> YARN-7497.009.patch
>
>
> YARN-5947 added LeveldbConfigurationStore using Leveldb as the backing store, but 
> it does not support Yarn RM HA. 
> YARN-6840 supports RM HA, but too many scheduler configurations may exceed the 
> znode limit, for example 10 thousand queues.
> HDFSSchedulerConfigurationStore stores the conf file in HDFS; when the RM fails over, 
> the new active RM can load the scheduler configuration from HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA

2018-03-28 Thread Jiandan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiandan Yang  updated YARN-7497:

Attachment: YARN-7497.010.patch

> Add HDFSSchedulerConfigurationStore for RM HA
> -
>
> Key: YARN-7497
> URL: https://issues.apache.org/jira/browse/YARN-7497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Jiandan Yang 
>Assignee: Jiandan Yang 
>Priority: Major
> Attachments: YARN-7497.001.patch, YARN-7497.002.patch, 
> YARN-7497.003.patch, YARN-7497.004.patch, YARN-7497.005.patch, 
> YARN-7497.006.patch, YARN-7497.007.patch, YARN-7497.008.patch, 
> YARN-7497.009.patch, YARN-7497.010.patch
>
>
> YARN-5947 added LeveldbConfigurationStore using Leveldb as the backing store, but 
> it does not support Yarn RM HA. 
> YARN-6840 supports RM HA, but too many scheduler configurations may exceed the 
> znode limit, for example 10 thousand queues.
> HDFSSchedulerConfigurationStore stores the conf file in HDFS; when the RM fails over, 
> the new active RM can load the scheduler configuration from HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active

2018-03-28 Thread Tao Yang (JIRA)
Tao Yang created YARN-8085:
--

 Summary: RMContext#resourceProfilesManager is lost after RM went 
standby then back to active
 Key: YARN-8085
 URL: https://issues.apache.org/jira/browse/YARN-8085
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 3.2.0
Reporter: Tao Yang
Assignee: Tao Yang


We submitted a distributed shell application after RM failover and back to 
active, then got an NPE error in the RM log:
{noformat}
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
{noformat}

The cause is that currently resourceProfilesManager is not transferred to new 
RMContext instance in RMContext#resetRMContext. We should do this transfer to 
fix this error.
{code:java}
@@ -1488,6 +1488,10 @@ private void resetRMContext() {
 // transfer service context to new RM service Context
 rmContextImpl.setServiceContext(rmContext.getServiceContext());

+// transfer resource profiles manager
+rmContextImpl
+.setResourceProfilesManager(rmContext.getResourceProfilesManager());
+
 // reset dispatcher
 Dispatcher dispatcher = setupDispatcher();
 ((Service) dispatcher).init(this.conf);
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA

2018-03-28 Thread Jiandan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiandan Yang  updated YARN-7497:

Attachment: (was: YARN-7497.010.patch)

> Add HDFSSchedulerConfigurationStore for RM HA
> -
>
> Key: YARN-7497
> URL: https://issues.apache.org/jira/browse/YARN-7497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Jiandan Yang 
>Assignee: Jiandan Yang 
>Priority: Major
> Attachments: YARN-7497.001.patch, YARN-7497.002.patch, 
> YARN-7497.003.patch, YARN-7497.004.patch, YARN-7497.005.patch, 
> YARN-7497.006.patch, YARN-7497.007.patch, YARN-7497.008.patch, 
> YARN-7497.009.patch
>
>
> YARN-5947 added LeveldbConfigurationStore using Leveldb as the backing store, but 
> it does not support Yarn RM HA. 
> YARN-6840 supports RM HA, but too many scheduler configurations may exceed the 
> znode limit, for example 10 thousand queues.
> HDFSSchedulerConfigurationStore stores the conf file in HDFS; when the RM fails over, 
> the new active RM can load the scheduler configuration from HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA

2018-03-28 Thread Jiandan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418484#comment-16418484
 ] 

Jiandan Yang  commented on YARN-7497:
-

Hi, [~cheersyang]
Thanks for your review.
I will rename the variables to more general names and close the fileSystem in the v10 
patch.

> Add HDFSSchedulerConfigurationStore for RM HA
> -
>
> Key: YARN-7497
> URL: https://issues.apache.org/jira/browse/YARN-7497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Jiandan Yang 
>Assignee: Jiandan Yang 
>Priority: Major
> Attachments: YARN-7497.001.patch, YARN-7497.002.patch, 
> YARN-7497.003.patch, YARN-7497.004.patch, YARN-7497.005.patch, 
> YARN-7497.006.patch, YARN-7497.007.patch, YARN-7497.008.patch, 
> YARN-7497.009.patch, YARN-7497.010.patch
>
>
> YARN-5947 added LeveldbConfigurationStore using Leveldb as the backing store, but 
> it does not support Yarn RM HA. 
> YARN-6840 supports RM HA, but too many scheduler configurations may exceed the 
> znode limit, for example 10 thousand queues.
> HDFSSchedulerConfigurationStore stores the conf file in HDFS; when the RM fails over, 
> the new active RM can load the scheduler configuration from HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement

2018-03-28 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418411#comment-16418411
 ] 

Weiwei Yang commented on YARN-7494:
---

Hi [~sunilg]

I was proposing to have a configuration like the following:

{noformat}
<property>
  <name>yarn.scheduler.capacity.multi-node-sorting.policies</name>
  <value>resource-usage</value>
</property>

<property>
  <name>yarn.scheduler.capacity.multi-node-sorting.policy.resource-usage.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.ResourceUsageSortingNodesPolicy</value>
</property>

<property>
  <name>yarn.scheduler.capacity.multi-node-sorting.policy.resource-usage.sorting-task.interval.ms</name>
  <value>3000</value>
</property>
{noformat}

Such a style would make it easy for users to plug in their own policies, and it mirrors 
how queues are configured, so it should be easy for users to understand. What do you 
think? I am also open to your approach if you can elaborate your config details 
(currently they cannot be cleanly seen from the patch); I can vote yes as long as it 
addresses these problems. 

Thanks
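
For illustration only, a hypothetical policy class of the kind such a config key could 
point to; the interface shape and accessor below are assumptions for this sketch, not 
the actual YARN-7494 classes:

{code:java}
import java.util.Comparator;
import java.util.List;

import org.apache.hadoop.yarn.api.records.Resource;

/**
 * Hypothetical sketch of a pluggable node-sorting policy: order candidate
 * nodes by allocated memory, least-used first. The CandidateNode abstraction
 * stands in for whatever node type the real scheduler interface exposes.
 */
public class ResourceUsageSortingSketch {

  /** Minimal stand-in for the scheduler's node abstraction. */
  public interface CandidateNode {
    Resource getAllocatedResource();
  }

  /** Sorts the candidate list in place, least allocated memory first. */
  public void sortNodes(List<CandidateNode> nodes) {
    nodes.sort(Comparator.comparingLong(
        (CandidateNode n) -> n.getAllocatedResource().getMemorySize()));
  }
}
{code}

The sorting-task.interval.ms value above presumably controls how often such a policy 
re-sorts its view of the nodes.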

 

> Add muti node lookup support for better placement
> -
>
> Key: YARN-7494
> URL: https://issues.apache.org/jira/browse/YARN-7494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Major
> Attachments: YARN-7494.001.patch, YARN-7494.002.patch, 
> YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, 
> YARN-7494.v0.patch, YARN-7494.v1.patch, multi-node-designProposal.png
>
>
> Instead of a single node, for effectiveness we can consider a multi-node lookup 
> based on partition to start with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active

2018-03-28 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8085:
---
Attachment: YARN-8085.001.patch

> RMContext#resourceProfilesManager is lost after RM went standby then back to 
> active
> ---
>
> Key: YARN-8085
> URL: https://issues.apache.org/jira/browse/YARN-8085
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8085.001.patch
>
>
> We submitted a distributed shell application after RM failover and back to 
> active, then got an NPE error in the RM log:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {noformat}
> The cause is that currently resourceProfilesManager is not transferred to new 
> RMContext instance in RMContext#resetRMContext. We should do this transfer to 
> fix this error.
> {code:java}
> @@ -1488,6 +1488,10 @@ private void resetRMContext() {
>  // transfer service context to new RM service Context
>  rmContextImpl.setServiceContext(rmContext.getServiceContext());
> +// transfer resource profiles manager
> +rmContextImpl
> +.setResourceProfilesManager(rmContext.getResourceProfilesManager());
> +
>  // reset dispatcher
>  Dispatcher dispatcher = setupDispatcher();
>  ((Service) dispatcher).init(this.conf);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active

2018-03-28 Thread Tao Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418461#comment-16418461
 ] 

Tao Yang commented on YARN-8085:


Thanks [~cheersyang] for your suggestion.

Yes, RMServiceContext contains services that are always running, irrespective of the 
HA state of the RM. It's better to move ResourceProfilesManager into 
RMServiceContext.

Attached v2 patch for review.
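
Not the v2 patch itself, but a rough sketch of what moving the manager into the 
service context could look like; the field and method names below are assumed for 
illustration and may differ from the real RMServiceContext / RMContextImpl members:

{code:java}
/** Stand-in for the real ResourceProfilesManager interface. */
interface ResourceProfilesManagerSketch { }

/** Holds services that stay up across RM HA transitions. */
class RMServiceContextSketch {
  private ResourceProfilesManagerSketch resourceProfilesManager;

  public ResourceProfilesManagerSketch getResourceProfilesManager() {
    return resourceProfilesManager;
  }

  public void setResourceProfilesManager(ResourceProfilesManagerSketch rpm) {
    this.resourceProfilesManager = rpm;
  }
}

/** RMContext delegates to the service context, so resetRMContext() needs no extra transfer. */
class RMContextImplSketch {
  private final RMServiceContextSketch serviceContext = new RMServiceContextSketch();

  public ResourceProfilesManagerSketch getResourceProfilesManager() {
    return serviceContext.getResourceProfilesManager();
  }

  public void setResourceProfilesManager(ResourceProfilesManagerSketch rpm) {
    serviceContext.setResourceProfilesManager(rpm);
  }
}
{code}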

> RMContext#resourceProfilesManager is lost after RM went standby then back to 
> active
> ---
>
> Key: YARN-8085
> URL: https://issues.apache.org/jira/browse/YARN-8085
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8085.001.patch, YARN-8085.002.patch
>
>
> We submitted a distributed shell application after RM failover and back to 
> active, then got an NPE error in the RM log:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {noformat}
> The cause is that currently resourceProfilesManager is not transferred to new 
> RMContext instance in RMContext#resetRMContext. We should do this transfer to 
> fix this error.
> {code:java}
> @@ -1488,6 +1488,10 @@ private void resetRMContext() {
>  // transfer service context to new RM service Context
>  rmContextImpl.setServiceContext(rmContext.getServiceContext());
> +// transfer resource profiles manager
> +rmContextImpl
> +.setResourceProfilesManager(rmContext.getResourceProfilesManager());
> +
>  // reset dispatcher
>  Dispatcher dispatcher = setupDispatcher();
>  ((Service) dispatcher).init(this.conf);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active

2018-03-28 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8085:
---
Attachment: YARN-8085.002.patch

> RMContext#resourceProfilesManager is lost after RM went standby then back to 
> active
> ---
>
> Key: YARN-8085
> URL: https://issues.apache.org/jira/browse/YARN-8085
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8085.001.patch, YARN-8085.002.patch
>
>
> We submitted a distributed shell application after RM failover and back to 
> active, then got an NPE error in the RM log:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {noformat}
> The cause is that currently resourceProfilesManager is not transferred to new 
> RMContext instance in RMContext#resetRMContext. We should do this transfer to 
> fix this error.
> {code:java}
> @@ -1488,6 +1488,10 @@ private void resetRMContext() {
>  // transfer service context to new RM service Context
>  rmContextImpl.setServiceContext(rmContext.getServiceContext());
> +// transfer resource profiles manager
> +rmContextImpl
> +.setResourceProfilesManager(rmContext.getResourceProfilesManager());
> +
>  // reset dispatcher
>  Dispatcher dispatcher = setupDispatcher();
>  ((Service) dispatcher).init(this.conf);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources

2018-03-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reassigned YARN-5881:


Assignee: Sunil G  (was: Wangda Tan)

> Enable configuration of queue capacity in terms of absolute resources
> -
>
> Key: YARN-5881
> URL: https://issues.apache.org/jira/browse/YARN-5881
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sean Po
>Assignee: Sunil G
>Priority: Major
> Attachments: 
> YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf,
>  YARN-5881.v0.patch, YARN-5881.v1.patch
>
>
> Currently, Yarn RM supports the configuration of queue capacity in terms of a 
> proportion of cluster capacity. In the context of Yarn being used as a public 
> cloud service, it makes more sense if queues can be configured absolutely. 
> This will allow administrators to set usage limits more concretely and 
> simplify customer expectations for cluster allocation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources

2018-03-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan resolved YARN-5881.
--
Resolution: Done

> Enable configuration of queue capacity in terms of absolute resources
> -
>
> Key: YARN-5881
> URL: https://issues.apache.org/jira/browse/YARN-5881
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sean Po
>Assignee: Sunil G
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: 
> YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf,
>  YARN-5881.v0.patch, YARN-5881.v1.patch
>
>
> Currently, Yarn RM supports the configuration of queue capacity in terms of a 
> proportion of cluster capacity. In the context of Yarn being used as a public 
> cloud service, it makes more sense if queues can be configured absolutely. 
> This will allow administrators to set usage limits more concretely and 
> simplify customer expectations for cluster allocation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources

2018-03-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418424#comment-16418424
 ] 

Wangda Tan commented on YARN-5881:
--

Closing this JIRA as all sub-JIRAs are completed.

> Enable configuration of queue capacity in terms of absolute resources
> -
>
> Key: YARN-5881
> URL: https://issues.apache.org/jira/browse/YARN-5881
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sean Po
>Assignee: Sunil G
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: 
> YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf,
>  YARN-5881.v0.patch, YARN-5881.v1.patch
>
>
> Currently, Yarn RM supports the configuration of queue capacity in terms of a 
> proportion of cluster capacity. In the context of Yarn being used as a public 
> cloud service, it makes more sense if queues can be configured absolutely. 
> This will allow administrators to set usage limits more concretely and 
> simplify customer expectations for cluster allocation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources

2018-03-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-5881:
-
Fix Version/s: 3.1.0

> Enable configuration of queue capacity in terms of absolute resources
> -
>
> Key: YARN-5881
> URL: https://issues.apache.org/jira/browse/YARN-5881
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sean Po
>Assignee: Sunil G
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: 
> YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf,
>  YARN-5881.v0.patch, YARN-5881.v1.patch
>
>
> Currently, Yarn RM supports the configuration of queue capacity in terms of a 
> proportion of cluster capacity. In the context of Yarn being used as a public 
> cloud service, it makes more sense if queues can be configured absolutely. 
> This will allow administrators to set usage limits more concretely and 
> simplify customer expectations for cluster allocation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


