[jira] [Updated] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
[ https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-7734: --- Attachment: YARN-7734.001.patch > YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess > - > > Key: YARN-7734 > URL: https://issues.apache.org/jira/browse/YARN-7734 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Xuan Gong >Priority: Major > Attachments: YARN-7734.001.patch > > > It adds a call to LogAggregationFileControllerFactory where the context is > not filled in with the configuration in the mock in the unit test. > {code} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage > [ERROR] > testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage) > Time elapsed: 0.208 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.(LogAggregationFileControllerFactory.java:68) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.(ContainerLogsPage.java:100) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains duplicate key
[ https://issues.apache.org/jira/browse/YARN-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416921#comment-16416921 ] Sunil G commented on YARN-6257: --- Thanks [~Tao Yang] {{the health metrics of capacity scheduler. This object existed but can't be actually used as the operationsInfo made illegal JSON data from 2.8.x to 3.1.x, and was corrected from 3.2.0}} I think such an explanation seems better. I am trying to improve this message like below. {{the health metrics of capacity scheduler. This information existed from 2.8.x to 3.1.x however this information is constructed with illegal JSON data format. Hence users can not make use of this field cleanly and is corrected from 3.2.0 onwards.}} > CapacityScheduler REST API produces incorrect JSON - JSON object > operationsInfo contains deplicate key > -- > > Key: YARN-6257 > URL: https://issues.apache.org/jira/browse/YARN-6257 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.8.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Minor > Attachments: YARN-6257.001.patch, YARN-6257.002.patch, > YARN-6257.003.patch > > > In response string of CapacityScheduler REST API, > scheduler/schedulerInfo/health/operationsInfo have duplicate key 'entry' as a > JSON object : > {code} > "operationsInfo":{ > > "entry":{"key":"last-preemption","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}, > > "entry":{"key":"last-reservation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}, > > "entry":{"key":"last-allocation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}, > > "entry":{"key":"last-release","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}} > } > {code} > To solve this problem, I suppose the type of operationsInfo field in > CapacitySchedulerHealthInfo class should be converted from Map to List. > After convert to List, The operationsInfo string will be: > {code} > "operationInfos":[ > > {"operation":"last-allocation","nodeId":"N/A","containerId":"N/A","queue":"N/A"}, > > {"operation":"last-release","nodeId":"N/A","containerId":"N/A","queue":"N/A"}, > > {"operation":"last-preemption","nodeId":"N/A","containerId":"N/A","queue":"N/A"}, > > {"operation":"last-reservation","nodeId":"N/A","containerId":"N/A","queue":"N/A"} > ] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
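To make the Map-to-List conversion discussed in the YARN-6257 description above concrete, here is a minimal, hypothetical JAXB-style sketch. The class and field names are illustrative assumptions, not the actual classes touched by the YARN-6257 patches: a Map-typed field marshals as repeated {{entry}} keys, while a list of per-operation beans marshals as a proper JSON array.
{code:java}
// Illustrative sketch only; names are assumptions, not the real patch contents.
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
class OperationInfoSketch {
  String operation;   // e.g. "last-allocation"
  String nodeId;
  String containerId;
  String queue;

  OperationInfoSketch() {
    // no-arg constructor required by JAXB
  }

  OperationInfoSketch(String operation, String nodeId, String containerId,
      String queue) {
    this.operation = operation;
    this.nodeId = nodeId;
    this.containerId = containerId;
    this.queue = queue;
  }
}

@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
class SchedulerHealthSketch {
  // Before: a Map<String, ...> field serializes as repeated "entry" keys.
  // After: a List serializes as a JSON array with one element per operation.
  List<OperationInfoSketch> operationInfos = new ArrayList<>();
}
{code}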
[jira] [Assigned] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
[ https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang reassigned YARN-7734: -- Assignee: Tao Yang (was: Xuan Gong) > YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess > - > > Key: YARN-7734 > URL: https://issues.apache.org/jira/browse/YARN-7734 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Tao Yang >Priority: Major > Attachments: YARN-7734.001.patch > > > It adds a call to LogAggregationFileControllerFactory where the context is > not filled in with the configuration in the mock in the unit test. > {code} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage > [ERROR] > testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage) > Time elapsed: 0.208 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.(LogAggregationFileControllerFactory.java:68) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.(ContainerLogsPage.java:100) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains duplicate key
[ https://issues.apache.org/jira/browse/YARN-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416971#comment-16416971 ] genericqa commented on YARN-6257: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 62 unchanged - 5 fixed = 64 total (was 67) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 10s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 22s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 66m 40s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 20s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}145m 0s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | | Unread field:CapacitySchedulerHealthInfo.java:[line 45] | \\ \\ || Subsystem || Report/Notes || | Docker |
[jira] [Commented] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
[ https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416898#comment-16416898 ] Tao Yang commented on YARN-7734: This UT failure is still there. Attached patch which adds {{when(context.getConf()).thenReturn(conf);}} for mock context to solve this failure. > YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess > - > > Key: YARN-7734 > URL: https://issues.apache.org/jira/browse/YARN-7734 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Xuan Gong >Priority: Major > Attachments: YARN-7734.001.patch > > > It adds a call to LogAggregationFileControllerFactory where the context is > not filled in with the configuration in the mock in the unit test. > {code} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage > [ERROR] > testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage) > Time elapsed: 0.208 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.(LogAggregationFileControllerFactory.java:68) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.(ContainerLogsPage.java:100) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
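For readers following along, a minimal, self-contained sketch of the stubbing described in the comment above. Only the {{when(context.getConf()).thenReturn(conf);}} line comes from the comment; the class and method names around it are illustrative assumptions, and it presumes the NodeManager {{Context}} exposes {{getConf()}} as the comment implies.
{code:java}
// Illustrative sketch, not the actual TestContainerLogsPage change.
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.nodemanager.Context;

public class MockContextConfSketch {
  static Context newMockContext() {
    Configuration conf = new YarnConfiguration();
    Context context = mock(Context.class);
    // Without this stub the mocked context returns a null configuration, and
    // LogAggregationFileControllerFactory's constructor throws the
    // NullPointerException shown in the stack trace above.
    when(context.getConf()).thenReturn(conf);
    return context;
  }
}
{code}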
[jira] [Updated] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
[ https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-7734: -- Affects Version/s: 3.0.1 3.1.0 > YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess > - > > Key: YARN-7734 > URL: https://issues.apache.org/jira/browse/YARN-7734 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.1 >Reporter: Miklos Szegedi >Assignee: Tao Yang >Priority: Major > Attachments: YARN-7734.001.patch > > > It adds a call to LogAggregationFileControllerFactory where the context is > not filled in with the configuration in the mock in the unit test. > {code} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage > [ERROR] > testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage) > Time elapsed: 0.208 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.(LogAggregationFileControllerFactory.java:68) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.(ContainerLogsPage.java:100) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
[ https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417105#comment-16417105 ] Tao Yang commented on YARN-7734: Thanks [~cheersyang] for review and committing. > YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess > - > > Key: YARN-7734 > URL: https://issues.apache.org/jira/browse/YARN-7734 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.1 >Reporter: Miklos Szegedi >Assignee: Tao Yang >Priority: Major > Fix For: 3.0.2, 3.2.0 > > Attachments: YARN-7734.001.patch > > > It adds a call to LogAggregationFileControllerFactory where the context is > not filled in with the configuration in the mock in the unit test. > {code} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage > [ERROR] > testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage) > Time elapsed: 0.208 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.(LogAggregationFileControllerFactory.java:68) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.(ContainerLogsPage.java:100) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7935) Expose container's hostname to applications running within the docker container
[ https://issues.apache.org/jira/browse/YARN-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417175#comment-16417175 ] Shane Kumpf commented on YARN-7935: --- {quote}Docker embedded DNS will use /etc/resolv.conf from host, and filter out local IP addresses (127.0.0.1 etc), if no entires are available, it will route to 8.8.8.8 {quote} [~eyang] this isn't true for overlay networks. You can't assume Registry DNS will be in use and it won't be used by some of these network types without additional modifications to Hadoop ({{--dns}} for {{docker run}}). {quote}I am concerned that some end user code will end up invoking InetAddress Java class{quote} This will use the IP of the container and whatever resolver the container is configured to use. Adding this environment variable doesn't change that. I'm not seeing the issue with adding an additional environment variable that is set to the same value as --hostname if this solves a problem for a class of application. No one is proposing modifying Hadoop IPC code to support NAT here or to use the {{--link}} feature, just adding an additional environment variable in non-entrypoint mode. Can you elaborate on the exact issue you see this new environment variable causing? > Expose container's hostname to applications running within the docker > container > --- > > Key: YARN-7935 > URL: https://issues.apache.org/jira/browse/YARN-7935 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad >Priority: Major > Attachments: YARN-7935.1.patch, YARN-7935.2.patch, YARN-7935.3.patch > > > Some applications have a need to bind to the container's hostname (like > Spark) which is different from the NodeManager's hostname(NM_HOST which is > available as an env during container launch) when launched through Docker > runtime. The container's hostname can be exposed to applications via an env > CONTAINER_HOSTNAME. Another potential candidate is the container's IP but > this can be addressed in a separate jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
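As an aside on the YARN-7935 discussion above, a small sketch of how an application inside the container might consume the proposed variable. The {{CONTAINER_HOSTNAME}} name comes from the issue description; the fallback logic is an assumption for illustration, not part of any patch.
{code:java}
// Illustrative only: prefer the hostname the container was launched with,
// otherwise fall back to whatever the container's resolver reports.
import java.net.InetAddress;
import java.net.UnknownHostException;

public class BindHostSketch {
  static String resolveBindHost() throws UnknownHostException {
    String containerHostname = System.getenv("CONTAINER_HOSTNAME");
    if (containerHostname != null && !containerHostname.isEmpty()) {
      return containerHostname;
    }
    return InetAddress.getLocalHost().getHostName();
  }
}
{code}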
[jira] [Updated] (YARN-8048) Support auto-spawning of admin configured services during bootstrap of rm/apiserver
[ https://issues.apache.org/jira/browse/YARN-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-8048: Attachment: YARN-8048.005.patch > Support auto-spawning of admin configured services during bootstrap of > rm/apiserver > --- > > Key: YARN-8048 > URL: https://issues.apache.org/jira/browse/YARN-8048 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Major > Attachments: YARN-8048.001.patch, YARN-8048.002.patch, > YARN-8048.003.patch, YARN-8048.004.patch, YARN-8048.005.patch > > > The goal is to support auto-spawning of admin-configured services during > bootstrap of the resourcemanager/apiserver. > *Requirement:* Some services may need to be consumed by YARN itself, e.g. HBase > for ATSv2. Instead of depending on a user-installed HBase (or requiring users to > install HBase at all), running an HBase app on YARN helps ATSv2 in such cases. > Before the YARN cluster is started, the admin configures these service specs and > places them in a common location in HDFS. At RM/apiserver bootstrap time, these > services will be submitted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
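A rough, hypothetical sketch of the bootstrap flow the YARN-8048 description above outlines: scan an admin-provided spec directory on HDFS and hand each spec to a submitter. The directory layout and the submitter callback are assumptions for illustration; the actual patch wires this into the RM/apiserver differently.
{code:java}
// Illustrative sketch only; not the YARN-8048 patch.
import java.io.IOException;
import java.util.function.Consumer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BootstrapServiceSubmitterSketch {
  /**
   * Scans a well-known HDFS directory for admin-provided service specs and
   * hands each spec path to a submitter (e.g. something backed by the YARN
   * service client) during RM/apiserver bootstrap.
   */
  static void submitConfiguredServices(Configuration conf, Path specDir,
      Consumer<Path> submitter) throws IOException {
    FileSystem fs = specDir.getFileSystem(conf);
    if (!fs.exists(specDir)) {
      return; // nothing configured by the admin
    }
    for (FileStatus status : fs.listStatus(specDir)) {
      if (status.isFile()) {
        submitter.accept(status.getPath());
      }
    }
  }
}
{code}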
[jira] [Updated] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains duplicate key
[ https://issues.apache.org/jira/browse/YARN-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-6257: --- Attachment: YARN-6257.004.patch > CapacityScheduler REST API produces incorrect JSON - JSON object > operationsInfo contains deplicate key > -- > > Key: YARN-6257 > URL: https://issues.apache.org/jira/browse/YARN-6257 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.8.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Minor > Attachments: YARN-6257.001.patch, YARN-6257.002.patch, > YARN-6257.003.patch, YARN-6257.004.patch > > > In response string of CapacityScheduler REST API, > scheduler/schedulerInfo/health/operationsInfo have duplicate key 'entry' as a > JSON object : > {code} > "operationsInfo":{ > > "entry":{"key":"last-preemption","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}, > > "entry":{"key":"last-reservation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}, > > "entry":{"key":"last-allocation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}, > > "entry":{"key":"last-release","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}} > } > {code} > To solve this problem, I suppose the type of operationsInfo field in > CapacitySchedulerHealthInfo class should be converted from Map to List. > After convert to List, The operationsInfo string will be: > {code} > "operationInfos":[ > > {"operation":"last-allocation","nodeId":"N/A","containerId":"N/A","queue":"N/A"}, > > {"operation":"last-release","nodeId":"N/A","containerId":"N/A","queue":"N/A"}, > > {"operation":"last-preemption","nodeId":"N/A","containerId":"N/A","queue":"N/A"}, > > {"operation":"last-reservation","nodeId":"N/A","containerId":"N/A","queue":"N/A"} > ] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7988) Refactor FSNodeLabelStore code for attributes store support
[ https://issues.apache.org/jira/browse/YARN-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417197#comment-16417197 ] Bibin A Chundatt commented on YARN-7988: [~sunilg] Attaching patch after handling review comments. Basic test done from 2.8.3 to current *2.8.3* {noformat} root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin -addToClusterNodeLabels bibin 18/03/28 15:22:13 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8033 root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin -replaceLabelsOnNode xxx,bibin 18/03/28 15:22:32 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8033 root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin -replaceLabelsOnNode xxy,bibin 18/03/28 15:22:40 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8033 root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin -replaceLabelsOnNode xxz,bibin 18/03/28 15:22:49 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8033 root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin -replaceLabelsOnNode xxy, 18/03/28 15:23:08 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8033 root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin -addToClusterNodeLabels xxy 18/03/28 15:23:39 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8033 root@bibinpc:/opt/apacheprojects/hadoop/apache/hadoop-2.8.3/bin# ./yarn rmadmin -removeFromClusterNodeLabels xxy 18/03/28 15:23:51 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8033 {noformat} recovered in BRANCH {noformat} root@bibinpc:/opt/apacheprojects/hadoop/YARN3409/hadoop-dist/target/hadoop-3.1.0-SNAPSHOT/bin# ./yarn cluster -lnl 2018-03-28 16:45:53,065 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 Node Labels:
[jira] [Commented] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919
[ https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417208#comment-16417208 ] Rohith Sharma K S commented on YARN-7946: - Overall looks good. Does the change below make sense? The two bullet points that follow already explain each version. {code:java} The version of Apache HBase that is supported with Timeline Service v.2 is 1.2.6 (default) and 2.0.0-beta1. {code} to {code:java} The supported versions of Apache HBase are 1.2.6 (default) and 2.0.0-beta1. {code} > Update TimelineServerV2 doc as per YARN-7919 > > > Key: YARN-7946 > URL: https://issues.apache.org/jira/browse/YARN-7946 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-7946.00.patch > > > Post YARN-7919, document need to be updated for co processor jar name and > other related details if any. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
[ https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417027#comment-16417027 ] genericqa commented on YARN-7734: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 0s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 9s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 74m 35s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b | | JIRA Issue | YARN-7734 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12916544/YARN-7734.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 1f786d6b4fa6 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a71656c | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20117/testReport/ | | Max. process+thread count | 292 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20117/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > YARN-5418 breaks
[jira] [Commented] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
[ https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417091#comment-16417091 ] Weiwei Yang commented on YARN-7734: --- +1, will commit this shortly > YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess > - > > Key: YARN-7734 > URL: https://issues.apache.org/jira/browse/YARN-7734 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Tao Yang >Priority: Major > Attachments: YARN-7734.001.patch > > > It adds a call to LogAggregationFileControllerFactory where the context is > not filled in with the configuration in the mock in the unit test. > {code} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage > [ERROR] > testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage) > Time elapsed: 0.208 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.(LogAggregationFileControllerFactory.java:68) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.(ContainerLogsPage.java:100) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7734) YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess
[ https://issues.apache.org/jira/browse/YARN-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417126#comment-16417126 ] Hudson commented on YARN-7734: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13891 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13891/]) YARN-7734. Fix UT failure (wwei: rev 411993f6e5723c8cba8100bff0269418e46f6367) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java > YARN-5418 breaks TestContainerLogsPage.testContainerLogPageAccess > - > > Key: YARN-7734 > URL: https://issues.apache.org/jira/browse/YARN-7734 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.1 >Reporter: Miklos Szegedi >Assignee: Tao Yang >Priority: Major > Fix For: 3.0.2, 3.2.0 > > Attachments: YARN-7734.001.patch > > > It adds a call to LogAggregationFileControllerFactory where the context is > not filled in with the configuration in the mock in the unit test. > {code} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.492 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage > [ERROR] > testContainerLogPageAccess(org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage) > Time elapsed: 0.208 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.(LogAggregationFileControllerFactory.java:68) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.ContainerLogsPage$ContainersLogsBlock.(ContainerLogsPage.java:100) > at > org.apache.hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage.testContainerLogPageAccess(TestContainerLogsPage.java:268) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7935) Expose container's hostname to applications running within the docker container
[ https://issues.apache.org/jira/browse/YARN-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417175#comment-16417175 ] Shane Kumpf edited comment on YARN-7935 at 3/28/18 11:10 AM: - {quote}Docker embedded DNS will use /etc/resolv.conf from host, and filter out local IP addresses (127.0.0.1 etc), if no entires are available, it will route to 8.8.8.8 {quote} [~eyang] this isn't true for overlay networks. You can't assume Registry DNS will be in use and it won't be used by some of these network types without additional modifications to Hadoop ({{--dns}} for {{docker run}}). {quote}I am concerned that some end user code will end up invoking InetAddress Java class{quote} This will use the IP of the container and whatever resolver the container is configured to use. Adding this environment variable doesn't change that. I'm not seeing the issue with adding an additional environment variable that is set to the same value as {{\-\-hostname}} if this solves a problem for a class of application. No one is proposing modifying Hadoop IPC code to support NAT here or to use the {{--link}} feature, just adding an additional environment variable in non-entrypoint mode. Can you elaborate on the exact issue you see this new environment variable causing? was (Author: shaneku...@gmail.com): {quote}Docker embedded DNS will use /etc/resolv.conf from host, and filter out local IP addresses (127.0.0.1 etc), if no entires are available, it will route to 8.8.8.8 {quote} [~eyang] this isn't true for overlay networks. You can't assume Registry DNS will be in use and it won't be used by some of these network types without additional modifications to Hadoop ({{--dns}} for {{docker run}}). {quote}I am concerned that some end user code will end up invoking InetAddress Java class{quote} This will use the IP of the container and whatever resolver the container is configured to use. Adding this environment variable doesn't change that. I'm not seeing the issue with adding an additional environment variable that is set to the same value as {{--hostname}} if this solves a problem for a class of application. No one is proposing modifying Hadoop IPC code to support NAT here or to use the {{--link}} feature, just adding an additional environment variable in non-entrypoint mode. Can you elaborate on the exact issue you see this new environment variable causing? > Expose container's hostname to applications running within the docker > container > --- > > Key: YARN-7935 > URL: https://issues.apache.org/jira/browse/YARN-7935 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad >Priority: Major > Attachments: YARN-7935.1.patch, YARN-7935.2.patch, YARN-7935.3.patch > > > Some applications have a need to bind to the container's hostname (like > Spark) which is different from the NodeManager's hostname(NM_HOST which is > available as an env during container launch) when launched through Docker > runtime. The container's hostname can be exposed to applications via an env > CONTAINER_HOSTNAME. Another potential candidate is the container's IP but > this can be addressed in a separate jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains duplicate key
[ https://issues.apache.org/jira/browse/YARN-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417016#comment-16417016 ] Weiwei Yang commented on YARN-6257: --- Hi [~Tao Yang] Can you please fix the checkstyle and findbugs issues? About the message, based on [~sunilg]'s comment, how about {quote}the health metrics of capacity scheduler. This metrics existed since 2.8.0, but the output was not well formatted. Hence users can not make use of this field cleanly, this is optimized from 3.2.0 onwards. {quote} Basically I don't want to say it was an illegal JSON as it follows JSON spec. Does that make sense? > CapacityScheduler REST API produces incorrect JSON - JSON object > operationsInfo contains deplicate key > -- > > Key: YARN-6257 > URL: https://issues.apache.org/jira/browse/YARN-6257 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.8.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Minor > Attachments: YARN-6257.001.patch, YARN-6257.002.patch, > YARN-6257.003.patch > > > In response string of CapacityScheduler REST API, > scheduler/schedulerInfo/health/operationsInfo have duplicate key 'entry' as a > JSON object : > {code} > "operationsInfo":{ > > "entry":{"key":"last-preemption","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}, > > "entry":{"key":"last-reservation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}, > > "entry":{"key":"last-allocation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}, > > "entry":{"key":"last-release","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}} > } > {code} > To solve this problem, I suppose the type of operationsInfo field in > CapacitySchedulerHealthInfo class should be converted from Map to List. > After convert to List, The operationsInfo string will be: > {code} > "operationInfos":[ > > {"operation":"last-allocation","nodeId":"N/A","containerId":"N/A","queue":"N/A"}, > > {"operation":"last-release","nodeId":"N/A","containerId":"N/A","queue":"N/A"}, > > {"operation":"last-preemption","nodeId":"N/A","containerId":"N/A","queue":"N/A"}, > > {"operation":"last-reservation","nodeId":"N/A","containerId":"N/A","queue":"N/A"} > ] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8048) Support auto-spawning of admin configured services during bootstrap of rm/apiserver
[ https://issues.apache.org/jira/browse/YARN-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417304#comment-16417304 ] genericqa commented on YARN-8048: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 18s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 17s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 19s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 22s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 3 new + 272 unchanged - 0 fixed = 275 total (was 272) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 7s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 12s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 65m 44s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 24s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 36s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 35s{color} | {color:red} The patch generated 4 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}169m 54s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce
[jira] [Updated] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919
[ https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-7946: - Attachment: YARN-7946.01.patch > Update TimelineServerV2 doc as per YARN-7919 > > > Key: YARN-7946 > URL: https://issues.apache.org/jira/browse/YARN-7946 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-7946.00.patch, YARN-7946.01.patch > > > Post YARN-7919, document need to be updated for co processor jar name and > other related details if any. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7988) Refactor FSNodeLabelStore code for attributes store support
[ https://issues.apache.org/jira/browse/YARN-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-7988: --- Attachment: YARN-7988-YARN-3409.007.patch > Refactor FSNodeLabelStore code for attributes store support > --- > > Key: YARN-7988 > URL: https://issues.apache.org/jira/browse/YARN-7988 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Major > Attachments: YARN-7988-YARN-3409.002.patch, > YARN-7988-YARN-3409.003.patch, YARN-7988-YARN-3409.004.patch, > YARN-7988-YARN-3409.005.patch, YARN-7988-YARN-3409.006.patch, > YARN-7988-YARN-3409.007.patch, YARN-7988.001.patch > > > # Abstract out file FileSystemStore operation > # Define EditLog Operartions and Mirror operation > # Support compatibility with old nodelabel store -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919
[ https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417319#comment-16417319 ] Haibo Chen commented on YARN-7946: -- Let me make that change in a new patch. > Update TimelineServerV2 doc as per YARN-7919 > > > Key: YARN-7946 > URL: https://issues.apache.org/jira/browse/YARN-7946 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-7946.00.patch > > > Post YARN-7919, document need to be updated for co processor jar name and > other related details if any. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains deplicate key
[ https://issues.apache.org/jira/browse/YARN-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417247#comment-16417247 ] genericqa commented on YARN-6257: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 9s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 2s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 62 unchanged - 5 fixed = 64 total (was 67) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 28s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 67m 13s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 20s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}146m 35s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b | | JIRA Issue | YARN-6257 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12916577/YARN-6257.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname |
[jira] [Updated] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7221: Attachment: YARN-7221.012.patch > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task > Components: security >Affects Versions: 3.0.0, 3.1.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-7221.001.patch, YARN-7221.002.patch, > YARN-7221.003.patch, YARN-7221.004.patch, YARN-7221.005.patch, > YARN-7221.006.patch, YARN-7221.007.patch, YARN-7221.008.patch, > YARN-7221.009.patch, YARN-7221.010.patch, YARN-7221.011.patch, > YARN-7221.012.patch > > > When a docker container is running with privileges, the majority of use cases involve > some program starting as root and then dropping privileges to another user, e.g. > httpd starts privileged to bind to port 80, then drops privileges to > the www user. > # We should add a security check for submitting users, to verify they have > "sudo" access to run privileged containers. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with both the --privileged=true and --user=uid:gid flags. With > this parameter combination, the user will not have access to become the root user. > All docker exec commands will be dropped to the uid:gid user instead of > being granted privileges. A user can gain root privileges if the container file system > contains files that give the user extra power, but this type of image is > considered dangerous. A non-privileged user can launch a container with > special bits to acquire the same level of root power. Hence, we lose control of > which images should be run with --privileged, and of who has sudo rights to use > privileged container images. As a result, we should check for sudo access and > then decide whether to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developers down the wrong path. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
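For YARN-7221 above, a minimal sketch of the decision the description asks for (helper and flag handling are illustrative; the real check lives in the Docker runtime and container-executor):
{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: pick docker run flags based on whether the submitting
// user is allowed (e.g. via a sudo/ACL check) to run privileged containers.
public class PrivilegedFlagSketch {
  static List<String> chooseFlags(boolean requestedPrivileged,
      boolean userHasSudoAccess, int uid, int gid) {
    List<String> args = new ArrayList<>();
    if (requestedPrivileged) {
      if (!userHasSudoAccess) {
        throw new SecurityException("user is not allowed to run privileged containers");
      }
      args.add("--privileged=true");          // trusted user: keep root inside the container
    } else {
      args.add("--user=" + uid + ":" + gid);  // otherwise always drop to uid:gid
    }
    return args;
  }
}
{code}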
[jira] [Commented] (YARN-1151) Ability to configure auxiliary services from HDFS-based JAR files
[ https://issues.apache.org/jira/browse/YARN-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417819#comment-16417819 ] Xuan Gong commented on YARN-1151: - [~rkanter] Could you review the latest patch, please? > Ability to configure auxiliary services from HDFS-based JAR files > - > > Key: YARN-1151 > URL: https://issues.apache.org/jira/browse/YARN-1151 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.1.0-beta, 2.9.0 >Reporter: john lilley >Assignee: Xuan Gong >Priority: Major > Labels: auxiliary-service, yarn > Attachments: YARN-1151.1.patch, YARN-1151.2.patch, > YARN-1151.branch-2.poc.2.patch, YARN-1151.branch-2.poc.3.patch, > YARN-1151.branch-2.poc.patch, [YARN-1151] [Design] Configure auxiliary > services from HDFS-based JAR files.pdf > > > I would like to install an auxiliary service in Hadoop YARN without actually > installing files/services on every node in the system. Discussions on the > user@ list indicate that this is not easily done. The reason we want an > auxiliary service is that our application has some persistent-data components > that are not appropriate for HDFS. In fact, they are somewhat analogous to > the mapper output of MapReduce's shuffle, which is what led me to > auxiliary-services in the first place. It would be much easier if we could > just place our service's JARs in HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
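For YARN-1151 above, the core mechanism is loading the auxiliary service class from a jar that has already been localized from HDFS; a framework-agnostic sketch using plain JDK class loading (paths and class names are made up, and the real patch wires this into the NodeManager):
{code}
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch: instantiate an aux service implementation from a localized jar.
public class AuxServiceLoaderSketch {
  public static Object loadService(File localizedJar, String className) throws Exception {
    URLClassLoader loader = new URLClassLoader(
        new URL[] { localizedJar.toURI().toURL() },
        AuxServiceLoaderSketch.class.getClassLoader());
    // In the real NodeManager the class would be expected to extend AuxiliaryService.
    Class<?> clazz = Class.forName(className, true, loader);
    return clazz.getDeclaredConstructor().newInstance();
  }
}
{code}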
[jira] [Updated] (YARN-8080) YARN native service should support component restart policy
[ https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8080: - Attachment: (was: YARN-8080.004.patch) > YARN native service should support component restart policy > --- > > Key: YARN-8080 > URL: https://issues.apache.org/jira/browse/YARN-8080 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-8080.001.patch, YARN-8080.002.patch, > YARN-8080.003.patch, YARN-8080.005.patch > > > Existing native service assumes the service is long running and never > finishes. Containers will be restarted even if exit code == 0. > To support broader use cases, we need to allow users to specify the restart > policy of a component. We propose the following policies: > 1) Always: containers are always restarted by the framework regardless of container > exit status. This is the existing/default behavior. > 2) Never: do not restart containers in any case after a container finishes, to > support job-like workloads (for example a Tensorflow training job). If a task > exits with code == 0, we should not restart the task. This can be used by > services which are not restart/recovery-able. > 3) On-failure: similar to the above, only restart tasks with exit code != 0. > Behaviors after a component *instance* finalizes (Succeeded or Failed when > restart_policy != ALWAYS): > 1) For a single component, single instance: complete the service. > 2) For a single component, multiple instances: other running instances from the > same component won't be affected by the finalized component instance. The service > will be terminated once all instances are finalized. > 3) For multiple components: the service will be terminated once all components are > finalized. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8080) YARN native service should support component restart policy
[ https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8080: - Attachment: YARN-8080.005.patch > YARN native service should support component restart policy > --- > > Key: YARN-8080 > URL: https://issues.apache.org/jira/browse/YARN-8080 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-8080.001.patch, YARN-8080.002.patch, > YARN-8080.003.patch, YARN-8080.005.patch > > > Existing native service assumes the service is long running and never > finishes. Containers will be restarted even if exit code == 0. > To support broader use cases, we need to allow users to specify the restart > policy of a component. We propose the following policies: > 1) Always: containers are always restarted by the framework regardless of container > exit status. This is the existing/default behavior. > 2) Never: do not restart containers in any case after a container finishes, to > support job-like workloads (for example a Tensorflow training job). If a task > exits with code == 0, we should not restart the task. This can be used by > services which are not restart/recovery-able. > 3) On-failure: similar to the above, only restart tasks with exit code != 0. > Behaviors after a component *instance* finalizes (Succeeded or Failed when > restart_policy != ALWAYS): > 1) For a single component, single instance: complete the service. > 2) For a single component, multiple instances: other running instances from the > same component won't be affected by the finalized component instance. The service > will be terminated once all instances are finalized. > 3) For multiple components: the service will be terminated once all components are > finalized. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8010) Add config in FederationRMFailoverProxy to not bypass facade cache when failing over
[ https://issues.apache.org/jira/browse/YARN-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417927#comment-16417927 ] Hudson commented on YARN-8010: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13894 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13894/]) Revert "YARN-8010. Add config in FederationRMFailoverProxy to not bypass (subru: rev 725b10e3aee383d049c97f8ed2b0b1ae873d5ae8) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/failover/FederationRMFailoverProxyProvider.java YARN-8010. Add config in FederationRMFailoverProxy to not bypass facade (subru: rev 0d7e014fde717e8b122773b68664f4594106) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfigurationFields.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/failover/FederationRMFailoverProxyProvider.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestFederationRMFailoverProxyProvider.java > Add config in FederationRMFailoverProxy to not bypass facade cache when > failing over > > > Key: YARN-8010 > URL: https://issues.apache.org/jira/browse/YARN-8010 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Fix For: 2.10.0, 2.9.1, 3.1.1 > > Attachments: YARN-8010.v1.patch, YARN-8010.v1.patch, > YARN-8010.v2.patch, YARN-8010.v3.patch > > > Today when YarnRM is failing over, the FederationRMFailoverProxy running in > AMRMProxy will perform failover, try to get latest subcluster info from > FederationStateStore and then retry connect to the latest YarnRM master. When > calling getSubCluster() to FederationStateStoreFacade, it bypasses the cache > with a flush flag. When YarnRM is failing over, every AM heartbeat thread > creates a different thread inside FederationInterceptor, each of which keeps > performing failover several times. This leads to a big spike of getSubCluster > call to FederationStateStore. > Depending on the cluster setup (e.g. putting a VIP before all YarnRMs), > YarnRM master slave change might not result in RM addr change. In other > cases, a small delay of getting latest subcluster information may be > acceptable. This patch thus creates a config option, so that it is possible > to ask the FederationRMFailoverProxy to not flush cache when calling > getSubCluster(). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
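For YARN-8010 above, the behavior being toggled can be summarized in a few lines (the config key below is an assumption for illustration only; the real property is defined in YarnConfiguration/yarn-default.xml in the patch):
{code}
import org.apache.hadoop.conf.Configuration;

// Hypothetical, simplified sketch of the failover proxy's cache decision.
public class FailoverCachePolicySketch {
  // Assumed key name; see the patch for the real property.
  static final String FLUSH_CACHE_ON_FAILOVER = "yarn.federation.failover.flush-cache";

  // When false, a failover reuses the cached subcluster info instead of hitting
  // FederationStateStore from every AM heartbeat thread.
  static boolean shouldFlushCache(Configuration conf) {
    return conf.getBoolean(FLUSH_CACHE_ON_FAILOVER, true);
  }
}
{code}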
[jira] [Commented] (YARN-7623) Fix the CapacityScheduler Queue configuration documentation
[ https://issues.apache.org/jira/browse/YARN-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417928#comment-16417928 ] Hudson commented on YARN-7623: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13894 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13894/]) YARN-7623. Fix the CapacityScheduler Queue configuration documentation. (zezhang: rev 0b1c2b5fe1b5c225d208936ecb1d3e307a535ee6) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md > Fix the CapacityScheduler Queue configuration documentation > --- > > Key: YARN-7623 > URL: https://issues.apache.org/jira/browse/YARN-7623 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Jonathan Hung >Priority: Major > Fix For: 2.10.0, 2.9.1, 3.0.2, 3.1.1 > > Attachments: Screen Shot 2018-03-27 at 3.02.45 PM.png, > YARN-7623.001.patch, YARN-7623.002.patch > > > It looks like the [Changing Queue > Configuration|https://hadoop.apache.org/docs/r2.9.0/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Changing_queue_configuration_via_API] > section is mis-formatted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6936) [Atsv2] Retrospect storing entities into sub application table from client perspective
[ https://issues.apache.org/jira/browse/YARN-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417766#comment-16417766 ] Rohith Sharma K S commented on YARN-6936: - bq. Let's add the scope of the entities to each of the four methods OK, does this modified sentence look fine? {{Send the information of a number of conceptual entities in the scope of a YARN application to the timeline service v.2 collector.}}. Do all 4 APIs need to be modified in the same way? For the newer API, it should be outside the scope of an application as well, right? bq. Is it intended to extend updateAggregateStatus() so that sub application metrics are rolled up? I vaguely remember we discussed this in the weekly call and decided to aggregate for both APIs, because the newer APIs write into both tables, i.e. the entity and subapp tables. So aggregated metrics can also be available in the application scope as well. bq. The TimelineCollectorContext is bound to an application attempt. Adding a subApplicationWrite flag to TimelineCollectorContext may not be the most intuitive approach. How about we leave subApplicationWrite as a separate flag instead? I would be inclined to send the required information in a record rather than passing it as a parameter. This avoids compatibility issues in the future. Maybe let's define a newer record that contains the context, ugi and subAppWrite. cc: [~vrushalic] > [Atsv2] Retrospect storing entities into sub application table from client > perspective > -- > > Key: YARN-6936 > URL: https://issues.apache.org/jira/browse/YARN-6936 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Major > Attachments: YARN-6936.000.patch, YARN-6936.001.patch > > > Currently YARN-6734 stores entities into the sub application table only if the doAs > user and the submitting user are different. This holds good for Tez-like use > cases. But frameworks whose AM runs as the submitting user, like MR, also need to store > entities in the sub application table so that they can read entities without an > application id. > This would be a point of concern at later stages when ATSv2 is deployed into > production. This JIRA is to retrospect the decision of storing entities into the sub > application table, making it driven by client side configuration rather than by > user. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
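For YARN-6936 above, a rough sketch of the wrapper-record idea mentioned in the comment (all names are placeholders; no such class exists yet):
{code}
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical sketch: bundle the write context into one record so that new fields
// such as subApplicationWrite can be added later without changing method signatures.
public class TimelineWriteContextSketch {
  private final Object collectorContext;   // stands in for TimelineCollectorContext
  private final UserGroupInformation callerUgi;
  private final boolean subApplicationWrite;

  public TimelineWriteContextSketch(Object collectorContext,
      UserGroupInformation callerUgi, boolean subApplicationWrite) {
    this.collectorContext = collectorContext;
    this.callerUgi = callerUgi;
    this.subApplicationWrite = subApplicationWrite;
  }

  public boolean isSubApplicationWrite() {
    return subApplicationWrite;
  }
}
{code}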
[jira] [Commented] (YARN-8080) YARN native service should support component restart policy
[ https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417861#comment-16417861 ] Wangda Tan commented on YARN-8080: -- Attached ver.005 patch, which adds tests to cover the single component / multiple component cases. > YARN native service should support component restart policy > --- > > Key: YARN-8080 > URL: https://issues.apache.org/jira/browse/YARN-8080 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-8080.001.patch, YARN-8080.002.patch, > YARN-8080.003.patch, YARN-8080.005.patch > > > Existing native service assumes the service is long running and never > finishes. Containers will be restarted even if exit code == 0. > To support broader use cases, we need to allow users to specify the restart > policy of a component. We propose the following policies: > 1) Always: containers are always restarted by the framework regardless of container > exit status. This is the existing/default behavior. > 2) Never: do not restart containers in any case after a container finishes, to > support job-like workloads (for example a Tensorflow training job). If a task > exits with code == 0, we should not restart the task. This can be used by > services which are not restart/recovery-able. > 3) On-failure: similar to the above, only restart tasks with exit code != 0. > Behaviors after a component *instance* finalizes (Succeeded or Failed when > restart_policy != ALWAYS): > 1) For a single component, single instance: complete the service. > 2) For a single component, multiple instances: other running instances from the > same component won't be affected by the finalized component instance. The service > will be terminated once all instances are finalized. > 3) For multiple components: the service will be terminated once all components are > finalized. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
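The three policies described in YARN-8080 reduce to a simple per-exit-code decision; a minimal sketch (the enum and method below are illustrative only, not the API in the patch):
{code}
// Hypothetical sketch of the restart decision for each proposed policy.
public enum RestartPolicySketch {
  ALWAYS, NEVER, ON_FAILURE;

  /** Whether the framework should restart a finished container with this exit code. */
  public boolean shouldRestart(int exitCode) {
    switch (this) {
      case ALWAYS:     return true;            // existing/default behavior
      case NEVER:      return false;           // job-like workloads are never restarted
      case ON_FAILURE: return exitCode != 0;   // only failed tasks are restarted
      default:         return true;
    }
  }
}
{code}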
[jira] [Updated] (YARN-7690) expose reserved memory/Vcores of nodemanager at webUI
[ https://issues.apache.org/jira/browse/YARN-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-7690: - Issue Type: Improvement (was: New Feature) > expose reserved memory/Vcores of nodemanager at webUI > -- > > Key: YARN-7690 > URL: https://issues.apache.org/jira/browse/YARN-7690 > Project: Hadoop YARN > Issue Type: Improvement > Components: webapp >Reporter: tianjuan >Priority: Major > Attachments: YARN-7690.patch > > > Now only the total reserved memory/Vcores are exposed at the RM webUI; the reserved > memory/Vcores of a single nodemanager are hard to find out. It confuses users > that they observe available memory/Vcores on the nodes page, while > their jobs are stuck waiting for resources to be allocated. It is helpful > for debugging to expose the reserved memory/Vcores of every single nodemanager, and > the memory/Vcores that can still be allocated (unallocated minus reserved) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8084) Yarn native service rename for easier development?
[ https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8084: - Environment: (was: There're a couple of classes with same name exists in YARN native service. Such as: 1) ...service.component.Component and api.records.Component. This makes harder when development in IDE since clash of class name forces to use full qualified class name. Similarly in API definition: ...service.api.records: Container/ContainerState/Resource/ResourceInformation. How about rename them to: ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation?) > Yarn native service rename for easier development? > -- > > Key: YARN-8084 > URL: https://issues.apache.org/jira/browse/YARN-8084 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8084) Yarn native service rename for easier development?
Wangda Tan created YARN-8084: Summary: Yarn native service rename for easier development? Key: YARN-8084 URL: https://issues.apache.org/jira/browse/YARN-8084 Project: Hadoop YARN Issue Type: Task Environment: There are a couple of classes with the same name in YARN native service. Such as: 1) ...service.component.Component and api.records.Component. This makes development in an IDE harder since the class name clash forces use of the fully qualified class name. Similarly in the API definition: ...service.api.records: Container/ContainerState/Resource/ResourceInformation. How about renaming them to: ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation? Reporter: Wangda Tan -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8084) Yarn native service rename for easier development?
[ https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8084: - Description: There are a couple of classes with the same name in YARN native service. Such as: 1) ...service.component.Component and api.records.Component. This makes development in an IDE harder since the class name clash forces use of the fully qualified class name. Similarly in the API definition: ...service.api.records: Container/ContainerState/Resource/ResourceInformation. How about renaming them to: ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation? > Yarn native service rename for easier development? > -- > > Key: YARN-8084 > URL: https://issues.apache.org/jira/browse/YARN-8084 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Priority: Major > > There are a couple of classes with the same name in YARN native service. > Such as: > 1) ...service.component.Component and api.records.Component. > This makes development in an IDE harder since the class name clash forces use of > the fully qualified class name. > Similarly in the API definition: > ...service.api.records: > Container/ContainerState/Resource/ResourceInformation. How about renaming them > to: > ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
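For YARN-8084 above, the kind of collision being described looks like this in practice (package names are the ones quoted in the issue):
{code}
// With two classes both named Component, one of them must be written out fully:
import org.apache.hadoop.yarn.service.component.Component;

public class NameClashExample {
  Component liveComponent;                                    // scheduler-side component
  org.apache.hadoop.yarn.service.api.records.Component spec;  // API record with the same simple name
}
{code}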
[jira] [Updated] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919
[ https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-7946: - Attachment: YARN-7946.02.patch > Update TimelineServerV2 doc as per YARN-7919 > > > Key: YARN-7946 > URL: https://issues.apache.org/jira/browse/YARN-7946 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-7946.00.patch, YARN-7946.01.patch, > YARN-7946.02.patch > > > Post YARN-7919, document need to be updated for co processor jar name and > other related details if any. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance
[ https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417988#comment-16417988 ] Gour Saha commented on YARN-7939: - Yes, we should use UPGRADING instead of UPGRADE (which is an action verb). > Yarn Service Upgrade: add support to upgrade a component instance > -- > > Key: YARN-7939 > URL: https://issues.apache.org/jira/browse/YARN-7939 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-7939.001.patch > > > Yarn core supports in-place upgrade of containers. A yarn service can > leverage that to provide in-place upgrade of component instances. Please see > YARN-7512 for details. > Will add support to upgrade a single component instance first and then > iteratively add other APIs and features. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7935) Expose container's hostname to applications running within the docker container
[ https://issues.apache.org/jira/browse/YARN-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418029#comment-16418029 ] Mridul Muralidharan commented on YARN-7935: --- [~eyang] I think there is some confusion here. Spark does not require user defined networks - I dont think it was mentioned that this was required. Taking a step back: With "host" networking mode, we get it to work without any changes to the application code at all - giving us all the benefits of isolation without any loss in existing functionality (modulo specifying the env variables required ofcourse). When used with bridge/overlay/user defined networks/etc, the container hostname passed to spark AM via allocation request is that of nodemanager, and not the actual container hostname used in the docker container. This patch exposes the container hostname as an env variable - just as we have other container and node specific env variables exposed to the container (CONTAINER_ID, NM_HOST, etc). Do you see any concern with exposing this variable ? I want to make sure I am not missing something here. What spark (or any other application) does with this variable is their implementation detail; I can go into details of why this is required in the case of spark specifically if required, but that might digress from the jira. > Expose container's hostname to applications running within the docker > container > --- > > Key: YARN-7935 > URL: https://issues.apache.org/jira/browse/YARN-7935 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad >Priority: Major > Attachments: YARN-7935.1.patch, YARN-7935.2.patch, YARN-7935.3.patch > > > Some applications have a need to bind to the container's hostname (like > Spark) which is different from the NodeManager's hostname(NM_HOST which is > available as an env during container launch) when launched through Docker > runtime. The container's hostname can be exposed to applications via an env > CONTAINER_HOSTNAME. Another potential candidate is the container's IP but > this can be addressed in a separate jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
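For YARN-7935 above, from the application's point of view the proposal is just one more environment variable to consult; a hedged sketch of a client-side fallback (the fallback ordering is an assumption, only the variable names come from the discussion):
{code}
// Hypothetical sketch: prefer the container's own hostname when the runtime exposes it,
// otherwise fall back to the NodeManager host as applications do today.
public class BindHostSketch {
  static String resolveBindHost() {
    String containerHost = System.getenv("CONTAINER_HOSTNAME"); // proposed in this JIRA
    if (containerHost != null && !containerHost.isEmpty()) {
      return containerHost;            // bridge/overlay/user-defined docker networks
    }
    return System.getenv("NM_HOST");   // host networking or non-docker runtimes
  }
}
{code}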
[jira] [Commented] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919
[ https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418058#comment-16418058 ] genericqa commented on YARN-7946: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 28s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 47s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 25m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 58m 29s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 27m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 97m 7s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b | | JIRA Issue | YARN-7946 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12916670/YARN-7946.02.patch | | Optional Tests | asflicense mvnsite | | uname | Linux 59798d31bb1c 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0b1c2b5 | | maven | version: Apache Maven 3.3.9 | | Max. process+thread count | 441 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site . U: . | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20128/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. 
> Update TimelineServerV2 doc as per YARN-7919 > > > Key: YARN-7946 > URL: https://issues.apache.org/jira/browse/YARN-7946 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-7946.00.patch, YARN-7946.01.patch, > YARN-7946.02.patch > > > Post YARN-7919, document need to be updated for co processor jar name and > other related details if any. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-2478) Nested containers should be supported
[ https://issues.apache.org/jira/browse/YARN-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen resolved YARN-2478. -- Resolution: Won't Fix Closing this as DockerContainerExecutor has been deprecated in branch-2 and removed in trunk > Nested containers should be supported > - > > Key: YARN-2478 > URL: https://issues.apache.org/jira/browse/YARN-2478 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abin Shahab >Priority: Major > > Currently DockerContainerExecutor only supports one level of containers. > However, YARN's responsibility is to handle resource isolation, and nested > containers would allow YARN to delegate handling software isolation to the > jobs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7905) Parent directory permission incorrect during public localization
[ https://issues.apache.org/jira/browse/YARN-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-7905: --- Attachment: YARN-7905-008.patch > Parent directory permission incorrect during public localization > - > > Key: YARN-7905 > URL: https://issues.apache.org/jira/browse/YARN-7905 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Critical > Attachments: YARN-7905-001.patch, YARN-7905-002.patch, > YARN-7905-003.patch, YARN-7905-004.patch, YARN-7905-005.patch, > YARN-7905-006.patch, YARN-7905-007.patch, YARN-7905-008.patch > > > Similar to YARN-6708, during public localization we also have to take care of the > parent directory if the umask is 027 during node manager start up. > /filecache/0/200 > the directory permission of /filecache/0 is 750, which causes > application failure -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7905) Parent directory permission incorrect during public localization
[ https://issues.apache.org/jira/browse/YARN-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417752#comment-16417752 ] Bibin A Chundatt commented on YARN-7905: Uploaded the patch again to trigger jenkins. I had missed committing this patch. > Parent directory permission incorrect during public localization > - > > Key: YARN-7905 > URL: https://issues.apache.org/jira/browse/YARN-7905 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Critical > Attachments: YARN-7905-001.patch, YARN-7905-002.patch, > YARN-7905-003.patch, YARN-7905-004.patch, YARN-7905-005.patch, > YARN-7905-006.patch, YARN-7905-007.patch, YARN-7905-008.patch > > > Similar to YARN-6708, during public localization we also have to take care of the > parent directory if the umask is 027 during node manager start up. > /filecache/0/200 > the directory permission of /filecache/0 is 750, which causes > application failure -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
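For YARN-7905 above, a minimal sketch of the kind of fix under discussion, using plain Hadoop FileSystem calls (the directory layout and the 0755 constant are assumptions, not the attached patch):
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

// Hypothetical sketch: make sure intermediate public filecache directories stay
// traversable even when the NodeManager process runs with a restrictive umask such as 027.
public class PublicCacheDirSketch {
  static void ensureTraversable(FileSystem lfs, Path filecacheRoot, Path leafDir)
      throws IOException {
    FsPermission dirPerm = new FsPermission((short) 0755);
    lfs.mkdirs(leafDir, dirPerm);
    // mkdirs applies the process umask, so reset permissions explicitly on every
    // directory between the leaf and the filecache root.
    for (Path p = leafDir; p != null; p = p.getParent()) {
      lfs.setPermission(p, dirPerm);
      if (p.equals(filecacheRoot)) {
        break;  // never walk above the filecache root
      }
    }
  }
}
{code}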
[jira] [Updated] (YARN-1151) Ability to configure auxiliary services from HDFS-based JAR files
[ https://issues.apache.org/jira/browse/YARN-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1151: Attachment: YARN-1151.2.patch > Ability to configure auxiliary services from HDFS-based JAR files > - > > Key: YARN-1151 > URL: https://issues.apache.org/jira/browse/YARN-1151 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.1.0-beta, 2.9.0 >Reporter: john lilley >Assignee: Xuan Gong >Priority: Major > Labels: auxiliary-service, yarn > Attachments: YARN-1151.1.patch, YARN-1151.2.patch, > YARN-1151.branch-2.poc.2.patch, YARN-1151.branch-2.poc.3.patch, > YARN-1151.branch-2.poc.patch, [YARN-1151] [Design] Configure auxiliary > services from HDFS-based JAR files.pdf > > > I would like to install an auxiliary service in Hadoop YARN without actually > installing files/services on every node in the system. Discussions on the > user@ list indicate that this is not easily done. The reason we want an > auxiliary service is that our application has some persistent-data components > that are not appropriate for HDFS. In fact, they are somewhat analogous to > the mapper output of MapReduce's shuffle, which is what led me to > auxiliary-services in the first place. It would be much easier if we could > just place our service's JARs in HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7905) Parent directory permission incorrect during public localization
[ https://issues.apache.org/jira/browse/YARN-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417873#comment-16417873 ] genericqa commented on YARN-7905: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 26s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 6s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 71m 24s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b | | JIRA Issue | YARN-7905 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12916651/YARN-7905-008.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4d4da0e09aea 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cdee0a4 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/20124/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20124/testReport/ | | Max. process+thread count | 407 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U:
[jira] [Comment Edited] (YARN-8079) YARN native service should respect source file of ConfigFile inside Service/Component spec
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417747#comment-16417747 ] Wangda Tan edited comment on YARN-8079 at 3/28/18 4:58 PM: --- Thanks [~gsaha], are there any additional suggestions on the patch, or are we good to go? cc: [~billie.rinaldi]/[~eyang] was (Author: leftnoteasy): Thanks [~gsaha], Is there any additional suggestions to the patch or we're good to go? > YARN native service should respect source file of ConfigFile inside > Service/Component spec > -- > > Key: YARN-8079 > URL: https://issues.apache.org/jira/browse/YARN-8079 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-8079.001.patch, YARN-8079.002.patch, > YARN-8079.003.patch > > > Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly > read srcFile; instead it always constructs {{remoteFile}} by using > componentDir and the fileName of {{destFile}}: > {code} > Path remoteFile = new Path(compInstanceDir, fileName); > {code} > To me it is a common use case that services have some files already existing in HDFS > which need to be localized when components get launched. (For example, if we > want to serve a Tensorflow model, we need to localize the Tensorflow model > (typically not huge, less than a GB) to local disk. Otherwise the launched docker > container has to access HDFS.) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
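For YARN-8079 above, the asked-for behavior amounts to preferring the user-supplied srcFile over the constructed per-instance path; a fragment in the style of the snippet quoted in the description (the getSrcFile() accessor is assumed from the Service API records):
{code}
// Hypothetical sketch: respect ConfigFile.srcFile when it is set.
Path remoteFile;
String srcFile = configFile.getSrcFile();
if (srcFile != null && !srcFile.isEmpty()) {
  remoteFile = new Path(srcFile);                   // file already exists in HDFS, localize as-is
} else {
  remoteFile = new Path(compInstanceDir, fileName); // current fallback behavior
}
{code}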
[jira] [Commented] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919
[ https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417773#comment-16417773 ] Rohith Sharma K S commented on YARN-7946: - The similar changes required in Building.txt file also i.e 2nd sentence in 1st paragraph. > Update TimelineServerV2 doc as per YARN-7919 > > > Key: YARN-7946 > URL: https://issues.apache.org/jira/browse/YARN-7946 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-7946.00.patch, YARN-7946.01.patch > > > Post YARN-7919, document need to be updated for co processor jar name and > other related details if any. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.
[ https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen resolved YARN-7859. -- Resolution: Won't Do > New feature: add queue scheduling deadLine in fairScheduler. > > > Key: YARN-7859 > URL: https://issues.apache.org/jira/browse/YARN-7859 > Project: Hadoop YARN > Issue Type: New Feature > Components: fairscheduler >Affects Versions: 3.0.0 >Reporter: wangwj >Assignee: wangwj >Priority: Major > Labels: fairscheduler, features, patch > Attachments: YARN-7859-v1.patch, YARN-7859-v2.patch, log, > screenshot-1.png, screenshot-3.png > > Original Estimate: 24h > Remaining Estimate: 24h > > As everyone knows, in FairScheduler the phenomenon of queue scheduling > starvation often occurs when the number of cluster jobs is large: the apps in > one or more queues are left pending. So I have thought of a way to solve this > problem: add a queue scheduling deadline in FairScheduler. When a queue has not been > scheduled by FairScheduler within a specified time, we forcibly schedule it! > On the basis of the above, I propose this issue... -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Reopened] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.
[ https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen reopened YARN-7859: -- > New feature: add queue scheduling deadLine in fairScheduler. > > > Key: YARN-7859 > URL: https://issues.apache.org/jira/browse/YARN-7859 > Project: Hadoop YARN > Issue Type: New Feature > Components: fairscheduler >Affects Versions: 3.0.0 >Reporter: wangwj >Assignee: wangwj >Priority: Major > Labels: fairscheduler, features, patch > Attachments: YARN-7859-v1.patch, YARN-7859-v2.patch, log, > screenshot-1.png, screenshot-3.png > > Original Estimate: 24h > Remaining Estimate: 24h > > As everyone knows, in FairScheduler the phenomenon of queue scheduling > starvation often occurs when the number of cluster jobs is large: the apps in > one or more queues are left pending. So I have thought of a way to solve this > problem: add a queue scheduling deadline in FairScheduler. When a queue has not been > scheduled by FairScheduler within a specified time, we forcibly schedule it! > On the basis of the above, I propose this issue... -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7859) New feature: add queue scheduling deadLine in fairScheduler.
[ https://issues.apache.org/jira/browse/YARN-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-7859: - Hadoop Flags: (was: Reviewed) Fix Version/s: (was: 3.0.0) > New feature: add queue scheduling deadLine in fairScheduler. > > > Key: YARN-7859 > URL: https://issues.apache.org/jira/browse/YARN-7859 > Project: Hadoop YARN > Issue Type: New Feature > Components: fairscheduler >Affects Versions: 3.0.0 >Reporter: wangwj >Assignee: wangwj >Priority: Major > Labels: fairscheduler, features, patch > Attachments: YARN-7859-v1.patch, YARN-7859-v2.patch, log, > screenshot-1.png, screenshot-3.png > > Original Estimate: 24h > Remaining Estimate: 24h > > As everyone knows, in FairScheduler the phenomenon of queue scheduling > starvation often occurs when the number of cluster jobs is large: the apps in > one or more queues are left pending. So I have thought of a way to solve this > problem: add a queue scheduling deadline in FairScheduler. When a queue has not been > scheduled by FairScheduler within a specified time, we forcibly schedule it! > On the basis of the above, I propose this issue... -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417963#comment-16417963 ] genericqa commented on YARN-7221: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 1s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 37s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 25s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 75m 24s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b | | JIRA Issue | YARN-7221 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12916658/YARN-7221.012.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux 6b8784f3fffb 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cdee0a4 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/20125/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20125/testReport/ | | Max. process+thread count | 341 (vs. ulimit of 1) | | modules | C:
[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance
[ https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418014#comment-16418014 ] Chandni Singh commented on YARN-7939: - [~gsaha] {quote}Yes, we should use UPGRADING instead of UPGRADE (which is an action verb). {quote} This is inconsistent with other states. Please see my previous comment. # To trigger stop of the service, the ServiceState that is specified is {{STOPPED}} instead of {{STOP}} # To trigger start of the service, the ServiceState that is specified is {{STARTED}} instead of {{START}} > Yarn Service Upgrade: add support to upgrade a component instance > -- > > Key: YARN-7939 > URL: https://issues.apache.org/jira/browse/YARN-7939 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-7939.001.patch > > > Yarn core supports in-place upgrade of containers. A yarn service can > leverage that to provide in-place upgrade of component instances. Please see > YARN-7512 for details. > Will add support to upgrade a single component instance first and then > iteratively add other APIs and features. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
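For readers following the state discussion above, a minimal sketch of what a state-driven request body amounts to, assuming the api.records.Service/ServiceState classes referenced elsewhere in these threads; UPGRADING is the value being proposed here (not an existing one), and setVersion as the carrier of the target version is an assumption.
{code}
// Hedged sketch only: a state-driven action is expressed by PUTting a Service
// record whose desired state names the end state (STOPPED/STARTED today), so
// the proposed upgrade trigger would follow the same pattern with UPGRADING.
import org.apache.hadoop.yarn.service.api.records.Service;
import org.apache.hadoop.yarn.service.api.records.ServiceState;

public final class StateActionSketch {
  public static Service stopRequest(String serviceName) {
    Service service = new Service();
    service.setName(serviceName);
    service.setState(ServiceState.STOPPED);    // existing convention for "stop"
    return service;
  }

  public static Service upgradeRequest(String serviceName, String targetVersion) {
    Service service = new Service();
    service.setName(serviceName);
    service.setVersion(targetVersion);         // assumed field for the target version
    service.setState(ServiceState.UPGRADING);  // proposed value from this discussion
    return service;
  }
}
{code}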
[jira] [Commented] (YARN-7142) Support placement policy in yarn native services
[ https://issues.apache.org/jira/browse/YARN-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418053#comment-16418053 ] Wangda Tan commented on YARN-7142: -- Thanks [~gsaha], the last patch looks good. I would prefer to let another set of eyes to look at this patch, [~sunilg] could you help with the patch review? I plan to commit the patch by end of tomorrow if no objections / additional reviews. > Support placement policy in yarn native services > > > Key: YARN-7142 > URL: https://issues.apache.org/jira/browse/YARN-7142 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-native-services >Reporter: Billie Rinaldi >Assignee: Gour Saha >Priority: Major > Attachments: YARN-7142.001.patch, YARN-7142.002.patch, > YARN-7142.003.patch, YARN-7142.004.patch > > > Placement policy exists in the API but is not implemented yet. > I have filed YARN-8074 to move the composite constraints implementation out > of this phase-1 implementation of placement policy. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-2477) DockerContainerExecutor must support secure mode
[ https://issues.apache.org/jira/browse/YARN-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen resolved YARN-2477. -- Resolution: Won't Fix Closing this as DockerContainerExecutor has been deprecated in branch-2 and removed in trunk. > DockerContainerExecutor must support secure mode > > > Key: YARN-2477 > URL: https://issues.apache.org/jira/browse/YARN-2477 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Abin Shahab >Priority: Major > Labels: security > > DockerContainerExecutor(patch in YARN-1964) does not support Kerberized > hadoop clusters yet, as Kerberized hadoop cluster has a strict dependency on > the LinuxContainerExecutor. > For Docker containers to be used in production environment, they must support > secure hadoop. Issues regarding Java's AES encryption library in a > containerized environment also need to be worked out. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8079) YARN native service should respect source file of ConfigFile inside Service/Component spec
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417747#comment-16417747 ] Wangda Tan commented on YARN-8079: -- Thanks [~gsaha]. Are there any additional suggestions on the patch, or are we good to go? > YARN native service should respect source file of ConfigFile inside > Service/Component spec > -- > > Key: YARN-8079 > URL: https://issues.apache.org/jira/browse/YARN-8079 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-8079.001.patch, YARN-8079.002.patch, > YARN-8079.003.patch > > > Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly > read srcFile, instead it always construct {{remoteFile}} by using > componentDir and fileName of {{destFile}}: > {code} > Path remoteFile = new Path(compInstanceDir, fileName); > {code} > To me it is a common use case which services have some files existed in HDFS > and need to be localized when components get launched. (For example, if we > want to serve a Tensorflow model, we need to localize Tensorflow model > (typically not huge, less than GB) to local disk. Otherwise launched docker > container has to access HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
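A rough sketch of the behavior the description asks for, i.e. prefer a user-supplied srcFile and only fall back to resolving under the component instance directory; all names here are illustrative, not the actual patch.
{code}
import org.apache.hadoop.fs.Path;

// Hedged sketch: srcFile is assumed to carry the full HDFS path from
// ConfigFile when the user sets it; otherwise keep today's behavior.
public final class RemoteFileResolution {
  public static Path resolveRemoteFile(String srcFile, Path compInstanceDir,
      String fileName) {
    if (srcFile != null && !srcFile.isEmpty()) {
      return new Path(srcFile);                  // localize the existing HDFS file as-is
    }
    return new Path(compInstanceDir, fileName);  // current behavior
  }
}
{code}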
[jira] [Commented] (YARN-8079) YARN native service should respect source file of ConfigFile inside Service/Component spec
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417852#comment-16417852 ] Wangda Tan commented on YARN-8079: -- [~eyang], thanks for the review! > YARN native service should respect source file of ConfigFile inside > Service/Component spec > -- > > Key: YARN-8079 > URL: https://issues.apache.org/jira/browse/YARN-8079 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-8079.001.patch, YARN-8079.002.patch, > YARN-8079.003.patch > > > Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly > read srcFile, instead it always construct {{remoteFile}} by using > componentDir and fileName of {{destFile}}: > {code} > Path remoteFile = new Path(compInstanceDir, fileName); > {code} > To me it is a common use case which services have some files existed in HDFS > and need to be localized when components get launched. (For example, if we > want to serve a Tensorflow model, we need to localize Tensorflow model > (typically not huge, less than GB) to local disk. Otherwise launched docker > container has to access HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8048) Support auto-spawning of admin configured services during bootstrap of rm/apiserver
[ https://issues.apache.org/jira/browse/YARN-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417877#comment-16417877 ] Wangda Tan commented on YARN-8048: -- [~rohithsharma], Thanks for your responses. bq. For the 2nd level, it's better to only read files ended with .yarnfile. What's your opinion for this? bq. making synchronization will delay RM transition period if more services to be started. I'm not sure about correctness of this behavior. Maybe have two sub folders, sync/async under the service root, so admin can choose: {code} service-root sync user1 service1.yarnfile user2 serivce2.yarnfile async user3 ... {code} I think we should consider this at least in the dir hierarchy otherwise it will be very hard to add the new field. > Support auto-spawning of admin configured services during bootstrap of > rm/apiserver > --- > > Key: YARN-8048 > URL: https://issues.apache.org/jira/browse/YARN-8048 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Major > Attachments: YARN-8048.001.patch, YARN-8048.002.patch, > YARN-8048.003.patch, YARN-8048.004.patch, YARN-8048.005.patch > > > Goal is to support auto-spawning of admin configured services during > bootstrap of resourcemanager/apiserver. > *Requirement:* Some of the services might required to be consumed by yarn > itself ex: Hbase for atsv2. Instead of depending on user installed HBase or > sometimes user may not required to install HBase at all, in such conditions > running HBase app on YARN will help for ATSv2. > Before YARN cluster is started, admin configure these services spec and place > it in common location in HDFS. At the time of RM/apiServer bootstrap, these > services will be submitted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
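A hedged sketch of how the proposed layout could be scanned; the sync/async directory names and the .yarnfile filter follow the comment above, while the class and method names are illustrative only.
{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class SystemServiceSpecScanner {
  // Collect <user>/<service>.yarnfile specs under one mode directory
  // (service-root/sync or service-root/async) of the proposed hierarchy.
  public static List<Path> listServiceSpecs(FileSystem fs, Path modeDir)
      throws IOException {
    List<Path> specs = new ArrayList<>();
    for (FileStatus userDir : fs.listStatus(modeDir)) {
      if (!userDir.isDirectory()) {
        continue;                    // layout is <mode>/<user>/*.yarnfile
      }
      for (FileStatus spec : fs.listStatus(userDir.getPath())) {
        if (spec.isFile() && spec.getPath().getName().endsWith(".yarnfile")) {
          specs.add(spec.getPath()); // only standardized spec files are picked up
        }
      }
    }
    return specs;
  }
}
{code}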
[jira] [Updated] (YARN-8012) Support Unmanaged Container Cleanup
[ https://issues.apache.org/jira/browse/YARN-8012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8012: - Target Version/s: 2.7.1, 3.2.0 (was: 2.7.1, 3.0.0) > Support Unmanaged Container Cleanup > --- > > Key: YARN-8012 > URL: https://issues.apache.org/jira/browse/YARN-8012 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Affects Versions: 2.7.1 >Reporter: Yuqi Wang >Assignee: Yuqi Wang >Priority: Major > Attachments: YARN-8012 - Unmanaged Container Cleanup.pdf, > YARN-8012-branch-2.7.1.001.patch > > > An *unmanaged container / leaked container* is a container which is no longer > managed by the NM, and which therefore can no longer be managed by (and is leaked from) YARN either. > *There are many cases in which a YARN-managed container can become unmanaged, such as:* > * The NM service is disabled or removed on the node. > * The NM is unable to start up again on the node, for example because a required > configuration or resource cannot be made ready. > * The NM local leveldb store is corrupted or lost, for example due to bad disk sectors. > * The NM has bugs, such as wrongly marking a live container as complete. > Note that these cases arise, or get worse, when work-preserving NM restart is > enabled; see YARN-1336. > *Bad impacts of an unmanaged container include:* > # Resources cannot be managed for YARN on the node: > ** YARN leaks resources on the node. > ** The container cannot be killed to release YARN resources on the node and free up > resources for other urgent computations on the node. > # Container and app killing is not eventually consistent for the app user: > ** An app with bugs can still produce bad external impacts even if the > app was killed long ago -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8079) YARN native service should respect source file of ConfigFile inside Service/Component spec
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417920#comment-16417920 ] Eric Yang commented on YARN-8079: - [~leftnoteasy] Could you update yarn site markdown with the proper syntax? Thanks > YARN native service should respect source file of ConfigFile inside > Service/Component spec > -- > > Key: YARN-8079 > URL: https://issues.apache.org/jira/browse/YARN-8079 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-8079.001.patch, YARN-8079.002.patch, > YARN-8079.003.patch > > > Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly > read srcFile, instead it always construct {{remoteFile}} by using > componentDir and fileName of {{destFile}}: > {code} > Path remoteFile = new Path(compInstanceDir, fileName); > {code} > To me it is a common use case which services have some files existed in HDFS > and need to be localized when components get launched. (For example, if we > want to serve a Tensorflow model, we need to localize Tensorflow model > (typically not huge, less than GB) to local disk. Otherwise launched docker > container has to access HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8084) Yarn native service classes renaming for easier development?
[ https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417965#comment-16417965 ] Gour Saha commented on YARN-8084: - For the service.component.Component I suggest to rename it to ComponentManager (similar to ServiceManager). > Yarn native service classes renaming for easier development? > - > > Key: YARN-8084 > URL: https://issues.apache.org/jira/browse/YARN-8084 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Priority: Major > > There're a couple of classes with same name exists in YARN native service. > Such as: > 1) ...service.component.Component and api.records.Component. > This makes harder when development in IDE since clash of class name forces to > use full qualified class name. > Similarly in API definition: > ...service.api.records: > Container/ContainerState/Resource/ResourceInformation. How about rename them > to: > ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
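To make the clash concrete, a small hedged illustration: once both Component classes are touched in the same file, only one can be imported and the other must be spelled out fully (the accessors below are assumptions).
{code}
import org.apache.hadoop.yarn.service.component.Component;

public class ComponentNameClashExample {
  // The API record cannot also be imported as "Component", so its full
  // package name shows up at every use site in this file.
  public org.apache.hadoop.yarn.service.api.records.Component toRecord(
      Component runtimeComponent) {
    org.apache.hadoop.yarn.service.api.records.Component record =
        new org.apache.hadoop.yarn.service.api.records.Component();
    record.setName(runtimeComponent.getName());  // assumed accessors
    return record;
  }
}
{code}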
[jira] [Commented] (YARN-8077) The vmemLimit parameter in ContainersMonitorImpl#isProcessTreeOverLimit is confusing
[ https://issues.apache.org/jira/browse/YARN-8077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417735#comment-16417735 ] Miklos Szegedi commented on YARN-8077: -- The Jenkins failure seems to be unrelated (protoc). Let me look into this. > The vmemLimit parameter in ContainersMonitorImpl#isProcessTreeOverLimit is > confusing > > > Key: YARN-8077 > URL: https://issues.apache.org/jira/browse/YARN-8077 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: Sen Zhao >Assignee: Sen Zhao >Priority: Trivial > Fix For: 3.2.0 > > Attachments: YARN-8077.001.patch > > > The parameter should be memLimit. It contains the meaning of vmemLimit and > pmemLimit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8080) YARN native service should support component restart policy
[ https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8080: - Attachment: YARN-8080.004.patch > YARN native service should support component restart policy > --- > > Key: YARN-8080 > URL: https://issues.apache.org/jira/browse/YARN-8080 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-8080.001.patch, YARN-8080.002.patch, > YARN-8080.003.patch, YARN-8080.004.patch > > > The existing native service assumes the service is long running and never > finishes; containers are restarted even if the exit code == 0. > To support broader use cases, we need to allow the restart policy of a component to be > specified by users. Propose the following policies: > 1) Always: containers are always restarted by the framework regardless of the container > exit status. This is the existing/default behavior. > 2) Never: do not restart containers in any case after the container finishes, to > support job-like workloads (for example a Tensorflow training job). If a task > exits with code == 0, we should not restart the task. This can be used by > services which are not restartable/recoverable. > 3) On-failure: similar to the above, but only restart tasks with exit code != 0. > Behavior after a component *instance* finalizes (Succeeded or Failed when > restart_policy != ALWAYS): > 1) For a single component with a single instance: complete the service. > 2) For a single component with multiple instances: other running instances of the > same component are not affected by the finalized instance. The service > will be terminated once all instances have finalized. > 3) For multiple components: the service will be terminated once all components > have finalized. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
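A compact sketch of the relaunch decision implied by the three policies in the description; the enum and helper are illustrative and not the shape of the attached patch.
{code}
public final class RestartDecision {
  // Policy names follow the description above.
  public enum RestartPolicy { ALWAYS, NEVER, ON_FAILURE }

  // Whether a finished container instance should be relaunched.
  public static boolean shouldRelaunch(RestartPolicy policy, int exitCode) {
    switch (policy) {
      case NEVER:
        return false;          // job-like workloads: never restart
      case ON_FAILURE:
        return exitCode != 0;  // restart only when the task failed
      case ALWAYS:
      default:
        return true;           // existing/default long-running behavior
    }
  }
}
{code}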
[jira] [Commented] (YARN-7623) Fix the CapacityScheduler Queue configuration documentation
[ https://issues.apache.org/jira/browse/YARN-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417937#comment-16417937 ] Jonathan Hung commented on YARN-7623: - Thanks! > Fix the CapacityScheduler Queue configuration documentation > --- > > Key: YARN-7623 > URL: https://issues.apache.org/jira/browse/YARN-7623 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Jonathan Hung >Priority: Major > Fix For: 2.10.0, 2.9.1, 3.0.2, 3.1.1 > > Attachments: Screen Shot 2018-03-27 at 3.02.45 PM.png, > YARN-7623.001.patch, YARN-7623.002.patch > > > It looks like the [Changing Queue > Configuration|https://hadoop.apache.org/docs/r2.9.0/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Changing_queue_configuration_via_API] > section is mis-formatted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8048) Support auto-spawning of admin configured services during bootstrap of rm/apiserver
[ https://issues.apache.org/jira/browse/YARN-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417980#comment-16417980 ] Gour Saha commented on YARN-8048: - I think it is okay to assume that if a service needs to be started as a system service, then it needs to be dropped in the system service dir and with .yarnfile extension. It shouldn't affect other areas of YARN Service as it will continue to allow launch using any file-name as long it is a valid JSON. > Support auto-spawning of admin configured services during bootstrap of > rm/apiserver > --- > > Key: YARN-8048 > URL: https://issues.apache.org/jira/browse/YARN-8048 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Major > Attachments: YARN-8048.001.patch, YARN-8048.002.patch, > YARN-8048.003.patch, YARN-8048.004.patch, YARN-8048.005.patch > > > Goal is to support auto-spawning of admin configured services during > bootstrap of resourcemanager/apiserver. > *Requirement:* Some of the services might required to be consumed by yarn > itself ex: Hbase for atsv2. Instead of depending on user installed HBase or > sometimes user may not required to install HBase at all, in such conditions > running HBase app on YARN will help for ATSv2. > Before YARN cluster is started, admin configure these services spec and place > it in common location in HDFS. At the time of RM/apiServer bootstrap, these > services will be submitted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance
[ https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418070#comment-16418070 ] Eric Yang commented on YARN-7939: - [~csingh] It would be nice if we cleaned up the user-submitted states to: START, STOP, FLEX, UPGRADE. ServiceClient would change them to STARTING, STOPPING, FLEXING, UPGRADING, and after the action is completed, change them to STABLE. We keep the STOPPED and STARTED keywords for backward compatibility. Sorry, this part was messed up during the original implementation. Can you show an example of the JSON that is used to trigger the REST API upgrade? I am getting error 500 and no errors in the log file, so I am unsure what is wrong in my test samples. It would be nice to see a working example. Thanks. > Yarn Service Upgrade: add support to upgrade a component instance > -- > > Key: YARN-7939 > URL: https://issues.apache.org/jira/browse/YARN-7939 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-7939.001.patch > > > Yarn core supports in-place upgrade of containers. A yarn service can > leverage that to provide in-place upgrade of component instances. Please see > YARN-7512 for details. > Will add support to upgrade a single component instance first and then > iteratively add other APIs and features. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
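A hedged sketch of the naming cleanup proposed in the comment above: user-submitted verbs, transitional -ING states set by ServiceClient, and STABLE once the action completes. Apart from STOPPED/STARTED, every name below is a proposal rather than an existing enum value.
{code}
public final class DesiredStateMappingSketch {
  // Proposed user-facing verbs (what a PUT body would carry).
  public enum RequestedAction { START, STOP, FLEX, UPGRADE }

  // Proposed states reported while the action runs; STABLE marks completion.
  // STOPPED/STARTED are kept for backward compatibility.
  public enum ReportedState {
    STARTING, STOPPING, FLEXING, UPGRADING, STABLE, STOPPED, STARTED
  }

  public static ReportedState transitionalStateFor(RequestedAction action) {
    switch (action) {
      case START:   return ReportedState.STARTING;
      case STOP:    return ReportedState.STOPPING;
      case FLEX:    return ReportedState.FLEXING;
      case UPGRADE: return ReportedState.UPGRADING;
      default:
        throw new IllegalArgumentException("Unknown action " + action);
    }
  }
}
{code}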
[jira] [Resolved] (YARN-2480) DockerContainerExecutor must support user namespaces
[ https://issues.apache.org/jira/browse/YARN-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen resolved YARN-2480. -- Resolution: Won't Fix Closing this as DockerContainerExecutor has been deprecated in branch-2 and removed in trunk > DockerContainerExecutor must support user namespaces > > > Key: YARN-2480 > URL: https://issues.apache.org/jira/browse/YARN-2480 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Abin Shahab >Priority: Major > Labels: security > > When DockerContainerExector launches a container, the root inside that > container has root privileges on the host. > This is insecure in a mult-tenant environment. The uid of the container's > root user must be mapped to a non-privileged user on the host. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-3988) DockerContainerExecutor should allow user specify "docker run" parameters
[ https://issues.apache.org/jira/browse/YARN-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen resolved YARN-3988. -- Resolution: Won't Fix Closing this as DockerContainerExecutor has been deprecated in branch-2 and removed in trunk > DockerContainerExecutor should allow user specify "docker run" parameters > - > > Key: YARN-3988 > URL: https://issues.apache.org/jira/browse/YARN-3988 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.7.1 >Reporter: Chen He >Assignee: Chen He >Priority: Major > > In current DockerContainerExecutor, the "docker run" command has fixed > parameters: > String commandStr = commands.append(dockerExecutor) > .append(" ") > .append("run") > .append(" ") > .append("--rm --net=host") > .append(" ") > .append(" --name " + containerIdStr) > .append(localDirMount) > .append(logDirMount) > .append(containerWorkDirMount) > .append(" ") > .append(containerImageName) > .toString(); > For example, it is not flexible if users want to start a docker container > with attaching extra volume(s) and other "docker run" parameters. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7690) expose reserved memory/Vcores of nodemanager at webUI
[ https://issues.apache.org/jira/browse/YARN-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417883#comment-16417883 ] genericqa commented on YARN-7690: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} YARN-7690 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-7690 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903982/YARN-7690.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20127/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > expose reserved memory/Vcores of nodemanager at webUI > -- > > Key: YARN-7690 > URL: https://issues.apache.org/jira/browse/YARN-7690 > Project: Hadoop YARN > Issue Type: Improvement > Components: webapp >Reporter: tianjuan >Priority: Major > Attachments: YARN-7690.patch > > > Currently only the total reserved memory/Vcores are exposed in the RM web UI; the reserved > memory/Vcores of a single nodemanager are hard to find out. It confuses users > when they observe available memory/Vcores on the Nodes page, yet > their jobs are stuck waiting for resources to be allocated. For debugging, it is helpful > to expose the reserved memory/Vcores of every single nodemanager, and the > memory/Vcores that can still be allocated (unallocated minus reserved) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8084) Yarn native service rename for easier development?
[ https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417882#comment-16417882 ] Wangda Tan commented on YARN-8084: -- cc: [~gsaha]/[~billie.rinaldi]/[~eyang] > Yarn native service rename for easier development? > -- > > Key: YARN-8084 > URL: https://issues.apache.org/jira/browse/YARN-8084 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Priority: Major > > There're a couple of classes with same name exists in YARN native service. > Such as: > 1) ...service.component.Component and api.records.Component. > This makes harder when development in IDE since clash of class name forces to > use full qualified class name. > Similarly in API definition: > ...service.api.records: > Container/ContainerState/Resource/ResourceInformation. How about rename them > to: > ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8051) TestRMEmbeddedElector#testCallbackSynchronization is flakey
[ https://issues.apache.org/jira/browse/YARN-8051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417891#comment-16417891 ] Haibo Chen commented on YARN-8051: -- Thanks [~rkanter] for the patch. HADOOP-12427 is dedicated to the mockito upgrade. There seem to be some incompatibility issues indicated in the discussion there. If there are indeed issues with upgrading mockito, we can fix the unit test without the mockito upgrade. Instead of mocking AdminService, we can create a subclass of AdminService in the test that tracks and exposes how many times the transitionTo* methods are called. > TestRMEmbeddedElector#testCallbackSynchronization is flakey > --- > > Key: YARN-8051 > URL: https://issues.apache.org/jira/browse/YARN-8051 > Project: Hadoop YARN > Issue Type: Improvement > Components: test >Affects Versions: 3.2.0 >Reporter: Robert Kanter >Assignee: Robert Kanter >Priority: Major > Attachments: YARN-8051.001.patch > > > We've seen some rare flakey failures in > {{TestRMEmbeddedElector#testCallbackSynchronization}}: > {noformat} > org.mockito.exceptions.verification.WantedButNotInvoked: > Wanted but not invoked: > adminService.transitionToStandby(); > -> at > org.apache.hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector.testCallbackSynchronizationNeutral(TestRMEmbeddedElector.java:215) > Actually, there were zero interactions with this mock. > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector.testCallbackSynchronizationNeutral(TestRMEmbeddedElector.java:215) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector.testCallbackSynchronization(TestRMEmbeddedElector.java:146) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector.testCallbackSynchronization(TestRMEmbeddedElector.java:109) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
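A hedged sketch of the test-only subclass suggested above; the constructor and transitionTo* signatures are recalled from memory and may not match the branch, so treat this as an outline rather than the fix.
{code}
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.hadoop.ha.HAServiceProtocol.StateChangeRequestInfo;
import org.apache.hadoop.yarn.server.resourcemanager.AdminService;
import org.apache.hadoop.yarn.server.resourcemanager.ResourceManager;

// Records transitions instead of relying on Mockito verification.
class TrackingAdminService extends AdminService {
  private final AtomicInteger toActive = new AtomicInteger();
  private final AtomicInteger toStandby = new AtomicInteger();

  TrackingAdminService(ResourceManager rm) {
    super(rm);  // assumed constructor shape
  }

  @Override
  public synchronized void transitionToActive(StateChangeRequestInfo req) {
    toActive.incrementAndGet();
  }

  @Override
  public synchronized void transitionToStandby(StateChangeRequestInfo req) {
    toStandby.incrementAndGet();
  }

  int activeTransitions() { return toActive.get(); }

  int standbyTransitions() { return toStandby.get(); }
}
{code}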
[jira] [Updated] (YARN-8084) Yarn native service classes renaming for easier development?
[ https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8084: - Summary: Yarn native service classes renaming for easier development? (was: Yarn native service rename for easier development?) > Yarn native service classes renaming for easier development? > - > > Key: YARN-8084 > URL: https://issues.apache.org/jira/browse/YARN-8084 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Priority: Major > > There're a couple of classes with same name exists in YARN native service. > Such as: > 1) ...service.component.Component and api.records.Component. > This makes harder when development in IDE since clash of class name forces to > use full qualified class name. > Similarly in API definition: > ...service.api.records: > Container/ContainerState/Resource/ResourceInformation. How about rename them > to: > ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance
[ https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417907#comment-16417907 ] Eric Yang commented on YARN-7939: - [~csingh] Thank you for the patch. At a high level, the patch looks good. I think we need to open a task to fix the CLI command to support the upgrade option, or add that to the current patch so it calls the newly introduced actionUpgrade; my preference would be to include it here for completeness. I don't spot many issues other than those reported by checkstyle. > Yarn Service Upgrade: add support to upgrade a component instance > -- > > Key: YARN-7939 > URL: https://issues.apache.org/jira/browse/YARN-7939 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-7939.001.patch > > > Yarn core supports in-place upgrade of containers. A yarn service can > leverage that to provide in-place upgrade of component instances. Please see > YARN-7512 for details. > Will add support to upgrade a single component instance first and then > iteratively add other APIs and features. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7988) Refactor FSNodeLabelStore code for attributes store support
[ https://issues.apache.org/jira/browse/YARN-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417939#comment-16417939 ] genericqa commented on YARN-7988: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} YARN-3409 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 59s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 48s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 17s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 39s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 21s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 41s{color} | {color:green} YARN-3409 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s{color} | {color:green} YARN-3409 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 59s{color} | {color:green} hadoop-yarn-project_hadoop-yarn generated 0 new + 86 unchanged - 1 fixed = 86 total (was 87) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 2s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 20 new + 62 unchanged - 22 fixed = 82 total (was 84) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 47s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 45s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common generated 2 new + 4183 unchanged - 0 fixed = 4185 total (was 4183) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 7s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m 35s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}159m 24s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7988 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12916611/YARN-7988-YARN-3409.007.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6e42c7a1c1d9 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (YARN-7946) Update TimelineServerV2 doc as per YARN-7919
[ https://issues.apache.org/jira/browse/YARN-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417960#comment-16417960 ] Rohith Sharma K S commented on YARN-7946: - +1 lgtm.. [~vrushalic] would you take a look at the patch? > Update TimelineServerV2 doc as per YARN-7919 > > > Key: YARN-7946 > URL: https://issues.apache.org/jira/browse/YARN-7946 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-7946.00.patch, YARN-7946.01.patch, > YARN-7946.02.patch > > > Post YARN-7919, document need to be updated for co processor jar name and > other related details if any. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8084) Yarn native service classes renaming for easier development?
[ https://issues.apache.org/jira/browse/YARN-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418059#comment-16418059 ] Wangda Tan commented on YARN-8084: -- [~gsaha], ComponentManager sounds like a manager of a bunch of components, if you don't like "ServiceComponent", how about call it "RuntimeComponent"? > Yarn native service classes renaming for easier development? > - > > Key: YARN-8084 > URL: https://issues.apache.org/jira/browse/YARN-8084 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Priority: Major > > There're a couple of classes with same name exists in YARN native service. > Such as: > 1) ...service.component.Component and api.records.Component. > This makes harder when development in IDE since clash of class name forces to > use full qualified class name. > Similarly in API definition: > ...service.api.records: > Container/ContainerState/Resource/ResourceInformation. How about rename them > to: > ServiceContainer/ServiceContainerState/ServiceResource/ServiceResourceInformation? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8082) Include LocalizedResource size information in the NM download log for localization
[ https://issues.apache.org/jira/browse/YARN-8082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418060#comment-16418060 ] Jason Lowe commented on YARN-8082: -- Thanks for the patch! The line-length checkstyle issue should be fixed otherwise looks good. > Include LocalizedResource size information in the NM download log for > localization > -- > > Key: YARN-8082 > URL: https://issues.apache.org/jira/browse/YARN-8082 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Minor > Attachments: YARN-8082.001.patch > > > The size of the resource that finished downloading helps with debugging > localization delays and failures. A close approximate local size of the > resource is available in the LocalizedResource object which can be used to > address this minor change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
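A hedged illustration of the kind of log line the change describes; the helper and parameter names are made up for illustration, since the actual patch edits an existing NM log statement.
{code}
// Hedged sketch: include the (approximate) local size recorded on the
// LocalizedResource in the existing "downloaded" message.
static String localizedMessage(String remotePath, String localPath, long sizeBytes) {
  return "Resource " + remotePath + " (size " + sizeBytes + " bytes)"
      + " transitioned from DOWNLOADING to LOCALIZED at " + localPath;
}
{code}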
[jira] [Resolved] (YARN-2482) DockerContainerExecutor configuration
[ https://issues.apache.org/jira/browse/YARN-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen resolved YARN-2482. -- Resolution: Won't Fix Closing this as DockerContainerExecutor has been deprecated in branch-2 and removed in trunk > DockerContainerExecutor configuration > - > > Key: YARN-2482 > URL: https://issues.apache.org/jira/browse/YARN-2482 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abin Shahab >Priority: Major > Labels: security > > Currently DockerContainerExecutor can be configured from yarn-site.xml, and > users can add arbtrary arguments to the container launch command. This should > be fixed so that the cluster and other jobs are protected from malicious > string injections. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-2479) DockerContainerExecutor must support handling of distributed cache
[ https://issues.apache.org/jira/browse/YARN-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen resolved YARN-2479. -- Resolution: Won't Fix Closing this as DockerContainerExecutor has been deprecated in branch-2 and removed in trunk > DockerContainerExecutor must support handling of distributed cache > -- > > Key: YARN-2479 > URL: https://issues.apache.org/jira/browse/YARN-2479 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abin Shahab >Priority: Major > Labels: security > > Interaction between Docker containers and distributed cache has not yet been > worked out. There should be a way to securely access distributed cache > without compromising the isolation Docker provides. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8010) Add config in FederationRMFailoverProxy to not bypass facade cache when failing over
[ https://issues.apache.org/jira/browse/YARN-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417768#comment-16417768 ] Botong Huang commented on YARN-8010: Thanks [~subru] and [~giovanni.fumarola]! > Add config in FederationRMFailoverProxy to not bypass facade cache when > failing over > > > Key: YARN-8010 > URL: https://issues.apache.org/jira/browse/YARN-8010 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Fix For: 2.10.0, 2.9.1, 3.1.1 > > Attachments: YARN-8010.v1.patch, YARN-8010.v1.patch, > YARN-8010.v2.patch, YARN-8010.v3.patch > > > Today when YarnRM is failing over, the FederationRMFailoverProxy running in > AMRMProxy will perform failover, try to get latest subcluster info from > FederationStateStore and then retry connect to the latest YarnRM master. When > calling getSubCluster() to FederationStateStoreFacade, it bypasses the cache > with a flush flag. When YarnRM is failing over, every AM heartbeat thread > creates a different thread inside FederationInterceptor, each of which keeps > performing failover several times. This leads to a big spike of getSubCluster > call to FederationStateStore. > Depending on the cluster setup (e.g. putting a VIP before all YarnRMs), > YarnRM master slave change might not result in RM addr change. In other > cases, a small delay of getting latest subcluster information may be > acceptable. This patch thus creates a config option, so that it is possible > to ask the FederationRMFailoverProxy to not flush cache when calling > getSubCluster(). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
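A hedged sketch of how such a switch is typically wired; the property name below is invented for illustration and is not necessarily the key the patch introduces.
{code}
import org.apache.hadoop.conf.Configuration;

// Read the knob once, then pass it as the existing "flush cache" flag when
// the failover proxy resolves the home subcluster.
public final class FailoverCacheFlushFlag {
  public static final String FLUSH_CACHE_KEY =
      "yarn.federation.failover.flush-subcluster-cache";  // hypothetical key
  public static final boolean DEFAULT_FLUSH_CACHE = true; // preserves old behavior

  public static boolean shouldFlushCache(Configuration conf) {
    return conf.getBoolean(FLUSH_CACHE_KEY, DEFAULT_FLUSH_CACHE);
  }
}
{code}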
[jira] [Commented] (YARN-8079) YARN native service should respect source file of ConfigFile inside Service/Component spec
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417816#comment-16417816 ] Eric Yang commented on YARN-8079: - [~leftnoteasy] Accessing a remote HDFS requires the username + password of the remote cluster, and the cluster must have a way to contact the remote cluster's KDC server to verify the user. I don't think Hadoop supports hdfs://user:pass@cluster:port/path. I think remoteFile threw me off into thinking it accesses an HDFS other than the current cluster. Sorry for the confusion. For S3, s3://ID:SECRET@BUCKET/ may work. +1 for patch 3. > YARN native service should respect source file of ConfigFile inside > Service/Component spec > -- > > Key: YARN-8079 > URL: https://issues.apache.org/jira/browse/YARN-8079 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-8079.001.patch, YARN-8079.002.patch, > YARN-8079.003.patch > > > Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly > read srcFile, instead it always construct {{remoteFile}} by using > componentDir and fileName of {{destFile}}: > {code} > Path remoteFile = new Path(compInstanceDir, fileName); > {code} > To me it is a common use case which services have some files existed in HDFS > and need to be localized when components get launched. (For example, if we > want to serve a Tensorflow model, we need to localize Tensorflow model > (typically not huge, less than GB) to local disk. Otherwise launched docker > container has to access HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6629) NPE occurred when container allocation proposal is applied but its resource requests are removed before
[ https://issues.apache.org/jira/browse/YARN-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417905#comment-16417905 ] Hudson commented on YARN-6629: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13893 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13893/]) YARN-6629. NPE occurred when container allocation proposal is applied (wangda: rev 47f711eebca315804c80012eea5f31275ac25518) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java > NPE occurred when container allocation proposal is applied but its resource > requests are removed before > --- > > Key: YARN-6629 > URL: https://issues.apache.org/jira/browse/YARN-6629 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-6629.001.patch, YARN-6629.002.patch, > YARN-6629.003.patch, YARN-6629.004.patch, YARN-6629.005.patch, > YARN-6629.006.patch > > > I wrote a test case to reproduce another problem for branch-2 and found new > NPE error, log: > {code} > FATAL event.EventDispatcher (EventDispatcher.java:run(75)) - Error in > handling event type NODE_UPDATE to the Event Dispatcher > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:446) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:516) > at > org.apache.hadoop.yarn.client.TestNegativePendingResource$1.answer(TestNegativePendingResource.java:225) > at > org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:31) > at org.mockito.internal.MockHandler.handle(MockHandler.java:97) > at > org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:47) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp$$EnhancerByMockitoWithCGLIB$$29eb8afc.apply() > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2396) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2281) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1247) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1236) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1325) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1112) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:987) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1367) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:143) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:745) > {code} > Reproduce this error in chronological order: > 1. AM started and requested 1 container with schedulerRequestKey#1 : > ApplicationMasterService#allocate --> CapacityScheduler#allocate --> > SchedulerApplicationAttempt#updateResourceRequests --> > AppSchedulingInfo#updateResourceRequests > Added schedulerRequestKey#1 into schedulerKeyToPlacementSets > 2. Scheduler allocatd 1 container for this request and accepted the proposal > 3. AM removed this request > ApplicationMasterService#allocate --> CapacityScheduler#allocate --> > SchedulerApplicationAttempt#updateResourceRequests --> >
[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance
[ https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417916#comment-16417916 ] Eric Yang commented on YARN-7939: - [~csingh] When user submit a curl PUT, the ServiceState needs to be specified as UPGRADING to trigger upgrade? It seems to be more intuitive if the state is called UPGRADE, and UPGRADING is updated by ServiceClient when request is received. > Yarn Service Upgrade: add support to upgrade a component instance > -- > > Key: YARN-7939 > URL: https://issues.apache.org/jira/browse/YARN-7939 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-7939.001.patch > > > Yarn core supports in-place upgrade of containers. A yarn service can > leverage that to provide in-place upgrade of component instances. Please see > YARN-7512 for details. > Will add support to upgrade a single component instance first and then > iteratively add other APIs and features. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8048) Support auto-spawning of admin configured services during bootstrap of rm/apiserver
[ https://issues.apache.org/jira/browse/YARN-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417956#comment-16417956 ] Rohith Sharma K S commented on YARN-8048: - bq. For the 2nd level, it's better to only read files ended with .yarnfile I am in a dilemma! Given that the native service framework enforces that all spec files end with .yarnfile, i.e. a standardized spec file extension, it makes sense to check the file extension. Otherwise it is a normal JSON file for which no extension check is required. However, I incorporated this file extension check in my last patch, but I am _not sure_ whether native services are going to make it a standard. I see that the CLI command which submits a service reads a normal JSON file, i.e. ApiServiceClient#loadAppJsonFromLocalFS; it doesn't check for the .yarnfile extension. bq. I'm not sure about correctness of this behavior. Though it is called async mode, it is not executed after some delay; it is started in a separate thread which in turn launches the services in no time. This lets the admin service release its lock faster while transitioning to the active state. However, going with a sync and async folder hierarchy also makes sense to me. I will update the patch with this change. > Support auto-spawning of admin configured services during bootstrap of > rm/apiserver > --- > > Key: YARN-8048 > URL: https://issues.apache.org/jira/browse/YARN-8048 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Major > Attachments: YARN-8048.001.patch, YARN-8048.002.patch, > YARN-8048.003.patch, YARN-8048.004.patch, YARN-8048.005.patch > > > Goal is to support auto-spawning of admin configured services during > bootstrap of resourcemanager/apiserver. > *Requirement:* Some of the services might required to be consumed by yarn > itself ex: Hbase for atsv2. Instead of depending on user installed HBase or > sometimes user may not required to install HBase at all, in such conditions > running HBase app on YARN will help for ATSv2. > Before YARN cluster is started, admin configure these services spec and place > it in common location in HDFS. At the time of RM/apiServer bootstrap, these > services will be submitted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8080) YARN native service should support component restart policy
[ https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417990#comment-16417990 ] genericqa commented on YARN-8080: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 39s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 55s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 12s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 30 new + 70 unchanged - 0 fixed = 100 total (was 70) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 14s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 43s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 80m 58s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b | | JIRA Issue | YARN-8080 | | JIRA Patch URL |
[jira] [Comment Edited] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active
[ https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418450#comment-16418450 ] Weiwei Yang edited comment on YARN-8085 at 3/29/18 5:18 AM: Thanks [~Tao Yang], nice catch. ResourceProfilesManager was added in YARN-5707, so only affected version is 3.1.0. Since this one would cause NPE in fail-over, not sure if we can get this into 3.1.0 as it already enters RC0. + [~wangda] to the loop. Regarding to the fix, can we move {{ResourceProfilesManager}} into the {{RMServiceContext}} ? As ResourceManager#resetRMContext is supposedly to get the reset done by {code:java} rmContextImpl.setServiceContext(rmContext.getServiceContext()); {code} don't think we need an extra set here. Does that make sense? Thanks was (Author: cheersyang): Thanks [~Tao Yang], nice catch. ResourceProfilesManager was added in YARN-5707, so only affected version is 3.1.0. Since this one would cause NPE in fail-over, not sure if we can get this into 3.1.0 as it already enters RC0. + [~wangda] to the loop. Regarding to the fix, can we move \{{ResourceProfilesManager}} into the \{{RMServiceContext}} ? As ResourceManager#resetRMContext is supposedly to get the reset done by {code} rmContextImpl.setServiceContext(rmContext.getServiceContext()); {code} don't think we need an extra set here. Does that make sense? Thanks > RMContext#resourceProfilesManager is lost after RM went standby then back to > active > --- > > Key: YARN-8085 > URL: https://issues.apache.org/jira/browse/YARN-8085 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.1.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8085.001.patch > > > We submited a distributed shell application after RM failover and back to > active, then got NPE error in RM log: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {noformat} > The cause is that currently resourceProfilesManager is not transferred to new > RMContext instance in RMContext#resetRMContext. We should do this transfer to > fix this error. 
> {code:java} > @@ -1488,6 +1488,10 @@ private void resetRMContext() { > // transfer service context to new RM service Context > rmContextImpl.setServiceContext(rmContext.getServiceContext()); > +// transfer resource profiles manager > +rmContextImpl > +.setResourceProfilesManager(rmContext.getResourceProfilesManager()); > + > // reset dispatcher > Dispatcher dispatcher = setupDispatcher(); > ((Service) dispatcher).init(this.conf); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
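[Editor's note] A minimal sketch of the alternative suggested in the comment above: holding the ResourceProfilesManager inside the service context so that the existing setServiceContext transfer in resetRMContext carries it across a failover without an extra setter call. The field and accessor names follow the comment; the real RMServiceContext and RMContextImpl classes differ in detail.
{code:java}
// Illustrative only; the real classes live in
// org.apache.hadoop.yarn.server.resourcemanager and are more elaborate.
interface ResourceProfilesManager { }

class RMServiceContext {
  // Services and managers that stay alive regardless of the RM's HA state.
  private ResourceProfilesManager resourceProfilesManager;

  ResourceProfilesManager getResourceProfilesManager() {
    return resourceProfilesManager;
  }

  void setResourceProfilesManager(ResourceProfilesManager manager) {
    this.resourceProfilesManager = manager;
  }
}

class RMContextImpl {
  private RMServiceContext serviceContext = new RMServiceContext();

  void setServiceContext(RMServiceContext context) {
    this.serviceContext = context;
  }

  // Delegating to the service context means resetRMContext() only needs the
  // existing rmContextImpl.setServiceContext(rmContext.getServiceContext())
  // call; no separate resource-profiles-manager transfer is required.
  ResourceProfilesManager getResourceProfilesManager() {
    return serviceContext.getResourceProfilesManager();
  }
}
{code}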
[jira] [Updated] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active
[ https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8085: -- Affects Version/s: (was: 3.2.0) 3.1.0 > RMContext#resourceProfilesManager is lost after RM went standby then back to > active > --- > > Key: YARN-8085 > URL: https://issues.apache.org/jira/browse/YARN-8085 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.1.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8085.001.patch > > > We submited a distributed shell application after RM failover and back to > active, then got NPE error in RM log: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {noformat} > The cause is that currently resourceProfilesManager is not transferred to new > RMContext instance in RMContext#resetRMContext. We should do this transfer to > fix this error. > {code:java} > @@ -1488,6 +1488,10 @@ private void resetRMContext() { > // transfer service context to new RM service Context > rmContextImpl.setServiceContext(rmContext.getServiceContext()); > +// transfer resource profiles manager > +rmContextImpl > +.setResourceProfilesManager(rmContext.getResourceProfilesManager()); > + > // reset dispatcher > Dispatcher dispatcher = setupDispatcher(); > ((Service) dispatcher).init(this.conf); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active
[ https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418450#comment-16418450 ] Weiwei Yang commented on YARN-8085: --- Thanks [~Tao Yang], nice catch. ResourceProfilesManager was added in YARN-5707, so only affected version is 3.1.0. Since this one would cause NPE in fail-over, not sure if we can get this into 3.1.0 as it already enters RC0. + [~wangda] to the loop. Regarding to the fix, can we move \{{ResourceProfilesManager}} into the \{{RMServiceContext}} ? As ResourceManager#resetRMContext is supposedly to get the reset done by {code} rmContextImpl.setServiceContext(rmContext.getServiceContext()); {code} don't think we need an extra set here. Does that make sense? Thanks > RMContext#resourceProfilesManager is lost after RM went standby then back to > active > --- > > Key: YARN-8085 > URL: https://issues.apache.org/jira/browse/YARN-8085 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.1.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8085.001.patch > > > We submited a distributed shell application after RM failover and back to > active, then got NPE error in RM log: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {noformat} > The cause is that currently resourceProfilesManager is not transferred to new > RMContext instance in RMContext#resetRMContext. We should do this transfer to > fix this error. > {code:java} > @@ -1488,6 +1488,10 @@ private void resetRMContext() { > // transfer service context to new RM service Context > rmContextImpl.setServiceContext(rmContext.getServiceContext()); > +// transfer resource profiles manager > +rmContextImpl > +.setResourceProfilesManager(rmContext.getResourceProfilesManager()); > + > // reset dispatcher > Dispatcher dispatcher = setupDispatcher(); > ((Service) dispatcher).init(this.conf); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA
[ https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418458#comment-16418458 ] Weiwei Yang edited comment on YARN-7497 at 3/29/18 5:34 AM: Hi [~yangjiandan] Thanks for the updates. Please see my comments below # YarnConfiguration.HDFS_CONFIGURATION_STORE also needs to be renamed to YarnConfiguration.FS_CONFIGURATION_STORE # public static final String HDFS_CONFIGURATION_STORE = "hdfs"; >> lets rename this to "fs" to be more general # FSSchedulerConfigurationStore: I don't see any place to close fileSystem. We need to ensure logMutation, confirmMutation and retrieve both closes fileSystem after they have done using it. Thanks was (Author: cheersyang): Hi [~yangjiandan] Thanks for the updates. Please see my comments below # YarnConfiguration.HDFS_CONFIGURATION_STORE also needs to be renamed to \{{YarnConfiguration.FS_CONFIGURATION_STORE}} # public static final String HDFS_CONFIGURATION_STORE = "hdfs"; >> lets rename this to "fs" to be more general # FSSchedulerConfigurationStore: I don't see any place to close fileSystem. We need to ensure logMutation, confirmMutation and retrieve both closes fileSystem after they have done using it. Thanks > Add HDFSSchedulerConfigurationStore for RM HA > - > > Key: YARN-7497 > URL: https://issues.apache.org/jira/browse/YARN-7497 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Jiandan Yang >Assignee: Jiandan Yang >Priority: Major > Attachments: YARN-7497.001.patch, YARN-7497.002.patch, > YARN-7497.003.patch, YARN-7497.004.patch, YARN-7497.005.patch, > YARN-7497.006.patch, YARN-7497.007.patch, YARN-7497.008.patch, > YARN-7497.009.patch > > > YARN-5947 add LeveldbConfigurationStore using Leveldb as backing store, but > it does not support Yarn RM HA. > YARN-6840 supports RM HA, but too many scheduler configurations may exceed > znode limit, for example 10 thousand queues. > HDFSSchedulerConfigurationStore store conf file in HDFS, when RM failover, > new active RM can load scheduler configuration from HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA
[ https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418458#comment-16418458 ] Weiwei Yang commented on YARN-7497: --- Hi [~yangjiandan] Thanks for the updates. Please see my comments below # YarnConfiguration.HDFS_CONFIGURATION_STORE also needs to be renamed to \{{YarnConfiguration.FS_CONFIGURATION_STORE}} # public static final String HDFS_CONFIGURATION_STORE = "hdfs"; >> lets rename this to "fs" to be more general # FSSchedulerConfigurationStore: I don't see any place to close fileSystem. We need to ensure logMutation, confirmMutation and retrieve both closes fileSystem after they have done using it. Thanks > Add HDFSSchedulerConfigurationStore for RM HA > - > > Key: YARN-7497 > URL: https://issues.apache.org/jira/browse/YARN-7497 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Jiandan Yang >Assignee: Jiandan Yang >Priority: Major > Attachments: YARN-7497.001.patch, YARN-7497.002.patch, > YARN-7497.003.patch, YARN-7497.004.patch, YARN-7497.005.patch, > YARN-7497.006.patch, YARN-7497.007.patch, YARN-7497.008.patch, > YARN-7497.009.patch > > > YARN-5947 add LeveldbConfigurationStore using Leveldb as backing store, but > it does not support Yarn RM HA. > YARN-6840 supports RM HA, but too many scheduler configurations may exceed > znode limit, for example 10 thousand queues. > HDFSSchedulerConfigurationStore store conf file in HDFS, when RM failover, > new active RM can load scheduler configuration from HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
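[Editor's note] On the third review point above (releasing the FileSystem handle), one common pattern is try-with-resources around a non-cached handle. This is only a sketch under that assumption; the class and method shown are illustrative rather than the code in the attached patches.
{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsConfigStoreSketch {

  // Writes one pending-mutation record and always releases the handle.
  // FileSystem.newInstance returns a non-cached instance, so closing it here
  // does not disturb other users of the shared FileSystem cache.
  public static void logMutation(Configuration conf, Path mutationFile,
      byte[] serializedMutation) throws IOException {
    try (FileSystem fs = FileSystem.newInstance(mutationFile.toUri(), conf);
         FSDataOutputStream out = fs.create(mutationFile, true)) {
      out.write(serializedMutation);
    } // stream and FileSystem are closed even if the write fails
  }
}
{code}
confirmMutation and retrieve could follow the same pattern around their own read and rename operations.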
[jira] [Updated] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA
[ https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated YARN-7497: Attachment: YARN-7497.010.patch > Add HDFSSchedulerConfigurationStore for RM HA > - > > Key: YARN-7497 > URL: https://issues.apache.org/jira/browse/YARN-7497 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Jiandan Yang >Assignee: Jiandan Yang >Priority: Major > Attachments: YARN-7497.001.patch, YARN-7497.002.patch, > YARN-7497.003.patch, YARN-7497.004.patch, YARN-7497.005.patch, > YARN-7497.006.patch, YARN-7497.007.patch, YARN-7497.008.patch, > YARN-7497.009.patch, YARN-7497.010.patch > > > YARN-5947 add LeveldbConfigurationStore using Leveldb as backing store, but > it does not support Yarn RM HA. > YARN-6840 supports RM HA, but too many scheduler configurations may exceed > znode limit, for example 10 thousand queues. > HDFSSchedulerConfigurationStore store conf file in HDFS, when RM failover, > new active RM can load scheduler configuration from HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active
Tao Yang created YARN-8085: -- Summary: RMContext#resourceProfilesManager is lost after RM went standby then back to active Key: YARN-8085 URL: https://issues.apache.org/jira/browse/YARN-8085 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang We submited a distributed shell application after RM failover and back to active, then got NPE error in RM log: {noformat} java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) {noformat} The cause is that currently resourceProfilesManager is not transferred to new RMContext instance in RMContext#resetRMContext. We should do this transfer to fix this error. {code:java} @@ -1488,6 +1488,10 @@ private void resetRMContext() { // transfer service context to new RM service Context rmContextImpl.setServiceContext(rmContext.getServiceContext()); +// transfer resource profiles manager +rmContextImpl +.setResourceProfilesManager(rmContext.getResourceProfilesManager()); + // reset dispatcher Dispatcher dispatcher = setupDispatcher(); ((Service) dispatcher).init(this.conf); {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA
[ https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated YARN-7497: Attachment: (was: YARN-7497.010.patch) > Add HDFSSchedulerConfigurationStore for RM HA > - > > Key: YARN-7497 > URL: https://issues.apache.org/jira/browse/YARN-7497 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Jiandan Yang >Assignee: Jiandan Yang >Priority: Major > Attachments: YARN-7497.001.patch, YARN-7497.002.patch, > YARN-7497.003.patch, YARN-7497.004.patch, YARN-7497.005.patch, > YARN-7497.006.patch, YARN-7497.007.patch, YARN-7497.008.patch, > YARN-7497.009.patch > > > YARN-5947 add LeveldbConfigurationStore using Leveldb as backing store, but > it does not support Yarn RM HA. > YARN-6840 supports RM HA, but too many scheduler configurations may exceed > znode limit, for example 10 thousand queues. > HDFSSchedulerConfigurationStore store conf file in HDFS, when RM failover, > new active RM can load scheduler configuration from HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA
[ https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418484#comment-16418484 ] Jiandan Yang commented on YARN-7497: - Hi, [~cheersyang] Thanks for your review. I will rename the name of variable to general ones and close fileSystem in v10 patch. > Add HDFSSchedulerConfigurationStore for RM HA > - > > Key: YARN-7497 > URL: https://issues.apache.org/jira/browse/YARN-7497 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Jiandan Yang >Assignee: Jiandan Yang >Priority: Major > Attachments: YARN-7497.001.patch, YARN-7497.002.patch, > YARN-7497.003.patch, YARN-7497.004.patch, YARN-7497.005.patch, > YARN-7497.006.patch, YARN-7497.007.patch, YARN-7497.008.patch, > YARN-7497.009.patch, YARN-7497.010.patch > > > YARN-5947 add LeveldbConfigurationStore using Leveldb as backing store, but > it does not support Yarn RM HA. > YARN-6840 supports RM HA, but too many scheduler configurations may exceed > znode limit, for example 10 thousand queues. > HDFSSchedulerConfigurationStore store conf file in HDFS, when RM failover, > new active RM can load scheduler configuration from HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418411#comment-16418411 ] Weiwei Yang commented on YARN-7494: --- Hi [~sunilg] I was proposing to have configuration like the following: {noformat} yarn.scheduler.capacity.multi-node-sorting.polices = resource-usage yarn.scheduler.capacity.multi-node-sorting.policy.resource-usage.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.ResourceUsageSortingNodesPolicy yarn.scheduler.capacity.multi-node-sorting.policy.resource-usage.sorting-task.interval.ms = 3000 {noformat} Such a layout makes it easy for users to plug in their own policies, and it mirrors how queues are configured, so it should be easy for users to understand. What do you think? I am also open to your approach if you can elaborate the config details (currently they cannot be seen cleanly from the patch); I can vote yes as long as it addresses these problems. Thanks > Add muti node lookup support for better placement > - > > Key: YARN-7494 > URL: https://issues.apache.org/jira/browse/YARN-7494 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil G >Assignee: Sunil G >Priority: Major > Attachments: YARN-7494.001.patch, YARN-7494.002.patch, > YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, > YARN-7494.v0.patch, YARN-7494.v1.patch, multi-node-designProposal.png > > > Instead of a single node, for effectiveness we can consider a multi-node lookup > based on partition to start with. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
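[Editor's note] A small sketch of how the properties proposed in the comment above could be set programmatically, for example in a test. The property keys and the policy class name are taken verbatim from the comment's proposal and are not a finalized API; they may change before the feature lands.
{code:java}
import org.apache.hadoop.conf.Configuration;

public class MultiNodePolicyConfigExample {

  // Builds a Configuration that registers one node-sorting policy named
  // "resource-usage", using the keys exactly as proposed in the comment.
  public static Configuration buildConf() {
    Configuration conf = new Configuration();
    conf.set("yarn.scheduler.capacity.multi-node-sorting.polices",
        "resource-usage");
    conf.set("yarn.scheduler.capacity.multi-node-sorting.policy"
        + ".resource-usage.class",
        "org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement"
        + ".ResourceUsageSortingNodesPolicy");
    conf.set("yarn.scheduler.capacity.multi-node-sorting.policy"
        + ".resource-usage.sorting-task.interval.ms", "3000");
    return conf;
  }
}
{code}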
[jira] [Updated] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active
[ https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8085: --- Attachment: YARN-8085.001.patch > RMContext#resourceProfilesManager is lost after RM went standby then back to > active > --- > > Key: YARN-8085 > URL: https://issues.apache.org/jira/browse/YARN-8085 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8085.001.patch > > > We submited a distributed shell application after RM failover and back to > active, then got NPE error in RM log: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {noformat} > The cause is that currently resourceProfilesManager is not transferred to new > RMContext instance in RMContext#resetRMContext. We should do this transfer to > fix this error. > {code:java} > @@ -1488,6 +1488,10 @@ private void resetRMContext() { > // transfer service context to new RM service Context > rmContextImpl.setServiceContext(rmContext.getServiceContext()); > +// transfer resource profiles manager > +rmContextImpl > +.setResourceProfilesManager(rmContext.getResourceProfilesManager()); > + > // reset dispatcher > Dispatcher dispatcher = setupDispatcher(); > ((Service) dispatcher).init(this.conf); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active
[ https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418461#comment-16418461 ] Tao Yang commented on YARN-8085: Thanks [~cheersyang] for your suggestion. Yes, RMServiceContext contains services which will be running always irrespective of the HA state of the RM. It's better to move ResourceProfilesManager into RMServiceContext. Attached v2 patch for review. > RMContext#resourceProfilesManager is lost after RM went standby then back to > active > --- > > Key: YARN-8085 > URL: https://issues.apache.org/jira/browse/YARN-8085 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.1.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8085.001.patch, YARN-8085.002.patch > > > We submited a distributed shell application after RM failover and back to > active, then got NPE error in RM log: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {noformat} > The cause is that currently resourceProfilesManager is not transferred to new > RMContext instance in RMContext#resetRMContext. We should do this transfer to > fix this error. > {code:java} > @@ -1488,6 +1488,10 @@ private void resetRMContext() { > // transfer service context to new RM service Context > rmContextImpl.setServiceContext(rmContext.getServiceContext()); > +// transfer resource profiles manager > +rmContextImpl > +.setResourceProfilesManager(rmContext.getResourceProfilesManager()); > + > // reset dispatcher > Dispatcher dispatcher = setupDispatcher(); > ((Service) dispatcher).init(this.conf); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active
[ https://issues.apache.org/jira/browse/YARN-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8085: --- Attachment: YARN-8085.002.patch > RMContext#resourceProfilesManager is lost after RM went standby then back to > active > --- > > Key: YARN-8085 > URL: https://issues.apache.org/jira/browse/YARN-8085 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.1.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8085.001.patch, YARN-8085.002.patch > > > We submited a distributed shell application after RM failover and back to > active, then got NPE error in RM log: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {noformat} > The cause is that currently resourceProfilesManager is not transferred to new > RMContext instance in RMContext#resetRMContext. We should do this transfer to > fix this error. > {code:java} > @@ -1488,6 +1488,10 @@ private void resetRMContext() { > // transfer service context to new RM service Context > rmContextImpl.setServiceContext(rmContext.getServiceContext()); > +// transfer resource profiles manager > +rmContextImpl > +.setResourceProfilesManager(rmContext.getResourceProfilesManager()); > + > // reset dispatcher > Dispatcher dispatcher = setupDispatcher(); > ((Service) dispatcher).init(this.conf); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources
[ https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-5881: Assignee: Sunil G (was: Wangda Tan) > Enable configuration of queue capacity in terms of absolute resources > - > > Key: YARN-5881 > URL: https://issues.apache.org/jira/browse/YARN-5881 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sean Po >Assignee: Sunil G >Priority: Major > Attachments: > YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf, > YARN-5881.v0.patch, YARN-5881.v1.patch > > > Currently, Yarn RM supports the configuration of queue capacity in terms of a > proportion to cluster capacity. In the context of Yarn being used as a public > cloud service, it makes more sense if queues can be configured absolutely. > This will allow administrators to set usage limits more concretely and > simplify customer expectations for cluster allocation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources
[ https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan resolved YARN-5881. -- Resolution: Done > Enable configuration of queue capacity in terms of absolute resources > - > > Key: YARN-5881 > URL: https://issues.apache.org/jira/browse/YARN-5881 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sean Po >Assignee: Sunil G >Priority: Major > Fix For: 3.1.0 > > Attachments: > YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf, > YARN-5881.v0.patch, YARN-5881.v1.patch > > > Currently, Yarn RM supports the configuration of queue capacity in terms of a > proportion to cluster capacity. In the context of Yarn being used as a public > cloud service, it makes more sense if queues can be configured absolutely. > This will allow administrators to set usage limits more concretely and > simplify customer expectations for cluster allocation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources
[ https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418424#comment-16418424 ] Wangda Tan commented on YARN-5881: -- Closing this JIRA as all sub jiras are completed. > Enable configuration of queue capacity in terms of absolute resources > - > > Key: YARN-5881 > URL: https://issues.apache.org/jira/browse/YARN-5881 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sean Po >Assignee: Sunil G >Priority: Major > Fix For: 3.1.0 > > Attachments: > YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf, > YARN-5881.v0.patch, YARN-5881.v1.patch > > > Currently, Yarn RM supports the configuration of queue capacity in terms of a > proportion to cluster capacity. In the context of Yarn being used as a public > cloud service, it makes more sense if queues can be configured absolutely. > This will allow administrators to set usage limits more concretely and > simplify customer expectations for cluster allocation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources
[ https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-5881: - Fix Version/s: 3.1.0 > Enable configuration of queue capacity in terms of absolute resources > - > > Key: YARN-5881 > URL: https://issues.apache.org/jira/browse/YARN-5881 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sean Po >Assignee: Sunil G >Priority: Major > Fix For: 3.1.0 > > Attachments: > YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf, > YARN-5881.v0.patch, YARN-5881.v1.patch > > > Currently, Yarn RM supports the configuration of queue capacity in terms of a > proportion to cluster capacity. In the context of Yarn being used as a public > cloud service, it makes more sense if queues can be configured absolutely. > This will allow administrators to set usage limits more concretely and > simplify customer expectations for cluster allocation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org