[jira] [Resolved] (YARN-9689) Router does not support kerberos proxy when in secure mode
[ https://issues.apache.org/jira/browse/YARN-9689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giovanni Matteo Fumarola resolved YARN-9689.
Resolution: Fixed

> Router does not support kerberos proxy when in secure mode
> ----------------------------------------------------------
>
> Key: YARN-9689
> URL: https://issues.apache.org/jira/browse/YARN-9689
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: federation
> Affects Versions: 3.1.2
> Reporter: zhoukang
> Assignee: zhoukang
> Priority: Major
> Attachments: YARN-9689.001.patch
>
> When Kerberos is enabled in YARN Federation mode, the Router cannot get a new
> application: the call throws the Kerberos exception below, which should be handled.
> {code:java}
> 2019-07-22,18:43:25,523 WARN org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 2019-07-22,18:43:25,528 WARN org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor: Unable to create a new ApplicationId in SubCluster xxx
> java.io.IOException: DestHost:destPort xxx , LocalHost:localPort xxx. Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1564)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1506)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1416)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy91.getNewApplication(Unknown Source)
>   at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:274)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy92.getNewApplication(Unknown Source)
>   at org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.getNewApplication(FederationClientInterceptor.java:252)
>   at org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.getNewApplication(RouterClientRMService.java:218)
>   at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getNewApplication(ApplicationClientProtocolPBServiceImpl.java:263)
>   at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:559)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:525)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:992)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:885)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:831)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1716)
>   at
[jira] [Resolved] (YARN-8996) [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs
[ https://issues.apache.org/jira/browse/YARN-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giovanni Matteo Fumarola resolved YARN-8996.
Resolution: Duplicate

> [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs
> ------------------------------------------------------------------
>
> Key: YARN-8996
> URL: https://issues.apache.org/jira/browse/YARN-8996
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
> Priority: Minor
>
> In YarnServiceJobSubmitter#needHdfs, the code below can be simplified to a single line.
> {code:java}
> if (content != null && content.contains("hdfs://")) {
>   return true;
> }
> return false;{code}
> {code:java}
> return content != null && content.contains("hdfs://");{code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
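The two snippets quoted in the issue should behave identically for every input, including null. A minimal sketch of that equivalence follows; the class and method names (NeedHdfsSketch, needHdfsVerbose, needHdfsCompact) are hypothetical and are not the actual signature of YarnServiceJobSubmitter#needHdfs, which the message does not show.

```java
final class NeedHdfsSketch {
  // Form quoted first in the issue: explicit if / return true / return false.
  static boolean needHdfsVerbose(String content) {
    if (content != null && content.contains("hdfs://")) {
      return true;
    }
    return false;
  }

  // Proposed one-line form: return the boolean condition directly.
  static boolean needHdfsCompact(String content) {
    return content != null && content.contains("hdfs://");
  }

  public static void main(String[] args) {
    // Sample inputs are illustrative only.
    String[] samples = {null, "", "file:///tmp/data", "hdfs://nn:8020/user/data"};
    for (String s : samples) {
      System.out.println(s + " -> " + needHdfsVerbose(s) + " / " + needHdfsCompact(s));
    }
  }
}
```

Because `&&` short-circuits, the null check still guards the `contains` call in the one-line form, so no NullPointerException is introduced by the simplification.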
[jira] [Created] (YARN-8982) [Router] Add locality policy
Giovanni Matteo Fumarola created YARN-8982:
-------------------------------------------

Summary: [Router] Add locality policy
Key: YARN-8982
URL: https://issues.apache.org/jira/browse/YARN-8982
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8973) [Router] Add missing methods in RMWebProtocol
Giovanni Matteo Fumarola created YARN-8973:
-------------------------------------------

Summary: [Router] Add missing methods in RMWebProtocol
Key: YARN-8973
URL: https://issues.apache.org/jira/browse/YARN-8973
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8972) [Router] Add support to prevent DoS attack over ApplicationSubmissionContext size
Giovanni Matteo Fumarola created YARN-8972:
-------------------------------------------

Summary: [Router] Add support to prevent DoS attack over ApplicationSubmissionContext size
Key: YARN-8972
URL: https://issues.apache.org/jira/browse/YARN-8972
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8900) [Router] Federation: routing getContainers REST invocations transparently to multiple RMs
Giovanni Matteo Fumarola created YARN-8900:
-------------------------------------------

Summary: [Router] Federation: routing getContainers REST invocations transparently to multiple RMs
Key: YARN-8900
URL: https://issues.apache.org/jira/browse/YARN-8900
Project: Hadoop YARN
Issue Type: Sub-task
Components: federation, router
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8626) Create an AMRMProxyPolicy that sends all the ResourceRequests to the home subcluster
Giovanni Matteo Fumarola created YARN-8626:
-------------------------------------------

Summary: Create an AMRMProxyPolicy that sends all the ResourceRequests to the home subcluster
Key: YARN-8626
URL: https://issues.apache.org/jira/browse/YARN-8626
Project: Hadoop YARN
Issue Type: Bug
Reporter: Giovanni Matteo Fumarola

Create an AMRMProxyPolicy that sends resources to the home subcluster
[jira] [Resolved] (YARN-8580) yarn.resourcemanager.am.max-attempts is not respected for yarn services
[ https://issues.apache.org/jira/browse/YARN-8580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giovanni Matteo Fumarola resolved YARN-8580.
Resolution: Invalid

> yarn.resourcemanager.am.max-attempts is not respected for yarn services
> -----------------------------------------------------------------------
>
> Key: YARN-8580
> URL: https://issues.apache.org/jira/browse/YARN-8580
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn-native-services
> Affects Versions: 3.1.1
> Reporter: Yesha Vora
> Priority: Major
>
> 1) The max AM attempts value is set to 100 on all nodes (including the gateway).
> {code}
>
> yarn.resourcemanager.am.max-attempts
> 100
> {code}
> 2) Start a YARN service (HBase tarball) application.
> 3) Kill the AM 20 times.
> The application then fails with the diagnostics below.
> {code}
> bash-4.2$ /usr/hdp/current/hadoop-yarn-client/bin/yarn application -status application_1532481557746_0001
> 18/07/25 18:43:34 INFO client.AHSProxy: Connecting to Application History server at xxx/xxx:10200
> 18/07/25 18:43:34 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
> 18/07/25 18:43:34 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/3.0.0.0-1634/0/resource-types.xml
> Application Report :
>   Application-Id : application_1532481557746_0001
>   Application-Name : hbase-tarball-lr
>   Application-Type : yarn-service
>   User : hbase
>   Queue : default
>   Application Priority : 0
>   Start-Time : 1532481864863
>   Finish-Time : 1532522943103
>   Progress : 100%
>   State : FAILED
>   Final-State : FAILED
>   Tracking-URL : https://xxx:8090/cluster/app/application_1532481557746_0001
>   RPC Port : -1
>   AM Host : N/A
>   Aggregate Resource Allocation : 252150112 MB-seconds, 164141 vcore-seconds
>   Aggregate Resource Preempted : 0 MB-seconds, 0 vcore-seconds
>   Log Aggregation Status : SUCCEEDED
>   Diagnostics : Application application_1532481557746_0001 failed 20 times (global limit =100; local limit is =20) due to AM Container for appattempt_1532481557746_0001_20 exited with exitCode: 137
> Failing this attempt.Diagnostics: [2018-07-25 12:49:00.784]Container killed on request. Exit code is 137
> [2018-07-25 12:49:03.045]Container exited with a non-zero exit code 137.
> [2018-07-25 12:49:03.045]Killed by external signal
> For more detailed output, check the application tracking page: https://xxx:8090/cluster/app/application_1532481557746_0001 Then click on links to logs of each attempt.
> . Failing the application.
>   Unmanaged Application : false
>   Application Node Label Expression :
>   AM container Node Label Expression :
>   TimeoutType : LIFETIME ExpiryTime : 2018-07-25T22:26:15.419+ RemainingTime : 0seconds
> {code}
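The diagnostics line reports two limits ("global limit =100; local limit is =20"), and the application failed after exactly 20 attempts. One reading of that output is that a per-application limit caps the cluster-wide setting. The sketch below illustrates that reading only; it is an assumption, not code taken from the ResourceManager, and the class and method names (MaxAttemptsSketch, effectiveMaxAttempts) are hypothetical.

```java
final class MaxAttemptsSketch {
  // Assumed rule: a positive per-application limit that does not exceed the
  // global limit wins; otherwise the global limit applies.
  static int effectiveMaxAttempts(int globalLimit, int localLimit) {
    if (localLimit <= 0 || localLimit > globalLimit) {
      return globalLimit;
    }
    return localLimit;
  }

  public static void main(String[] args) {
    // Values from the report: yarn.resourcemanager.am.max-attempts = 100,
    // per-application ("local") limit = 20.
    System.out.println(effectiveMaxAttempts(100, 20));
  }
}
```

Under this assumed rule, raising only the global value would not change the number of AM attempts for an application that submits its own lower limit.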
[jira] [Created] (YARN-8484) Fix NPE during ServiceStop in Router classes
Giovanni Matteo Fumarola created YARN-8484:
-------------------------------------------

Summary: Fix NPE during ServiceStop in Router classes
Key: YARN-8484
URL: https://issues.apache.org/jira/browse/YARN-8484
Project: Hadoop YARN
Issue Type: Bug
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8482) [Router] Add cache service for fast answers to getApps
Giovanni Matteo Fumarola created YARN-8482:
-------------------------------------------

Summary: [Router] Add cache service for fast answers to getApps
Key: YARN-8482
URL: https://issues.apache.org/jira/browse/YARN-8482
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8422) TestAMSimulator failing with NPE
Giovanni Matteo Fumarola created YARN-8422:
-------------------------------------------

Summary: TestAMSimulator failing with NPE
Key: YARN-8422
URL: https://issues.apache.org/jira/browse/YARN-8422
Project: Hadoop YARN
Issue Type: Bug
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Resolved] (YARN-8324) Fix TestPrivilegedOperationExecutor.testExecutorPath on Windows
[ https://issues.apache.org/jira/browse/YARN-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giovanni Matteo Fumarola resolved YARN-8324.
Resolution: Invalid

> Fix TestPrivilegedOperationExecutor.testExecutorPath on Windows
> ---------------------------------------------------------------
>
> Key: YARN-8324
> URL: https://issues.apache.org/jira/browse/YARN-8324
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: Giovanni Matteo Fumarola
> Assignee: Giovanni Matteo Fumarola
> Priority: Major
> Attachments: YARN-8324.v1.patch, YARN-8324.v2.patch, YARN-8324.v3.patch, image-2018-05-18-14-42-56-314.png, image-2018-05-18-14-43-09-321.png
>
> Fix TestPrivilegedOperationExecutor.testExecutorPath on Windows because of the path format.
[jira] [Created] (YARN-8359) Disable containermanager.linux.runtime.TEST to run on Windows
Giovanni Matteo Fumarola created YARN-8359:
-------------------------------------------

Summary: Disable containermanager.linux.runtime.TEST to run on Windows
Key: YARN-8359
URL: https://issues.apache.org/jira/browse/YARN-8359
Project: Hadoop YARN
Issue Type: Bug
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
Giovanni Matteo Fumarola created YARN-8348:
-------------------------------------------

Summary: Incorrect and missing AfterClass in HBase-tests
Key: YARN-8348
URL: https://issues.apache.org/jira/browse/YARN-8348
Project: Hadoop YARN
Issue Type: Bug
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8344) Missing nm.close() in TestNodeManagerResync
Giovanni Matteo Fumarola created YARN-8344:
-------------------------------------------

Summary: Missing nm.close() in TestNodeManagerResync
Key: YARN-8344
URL: https://issues.apache.org/jira/browse/YARN-8344
Project: Hadoop YARN
Issue Type: Bug
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8336) Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils
Giovanni Matteo Fumarola created YARN-8336:
-------------------------------------------

Summary: Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils
Key: YARN-8336
URL: https://issues.apache.org/jira/browse/YARN-8336
Project: Hadoop YARN
Issue Type: Bug
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8334) Fix potential connection leak in GPGUtils
Giovanni Matteo Fumarola created YARN-8334:
-------------------------------------------

Summary: Fix potential connection leak in GPGUtils
Key: YARN-8334
URL: https://issues.apache.org/jira/browse/YARN-8334
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8327) Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows
Giovanni Matteo Fumarola created YARN-8327:
-------------------------------------------

Summary: Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows
Key: YARN-8327
URL: https://issues.apache.org/jira/browse/YARN-8327
Project: Hadoop YARN
Issue Type: Bug
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8324) Fix TestPrivilegedOperationExecutor.testExecutorPath in Windows
Giovanni Matteo Fumarola created YARN-8324:
-------------------------------------------

Summary: Fix TestPrivilegedOperationExecutor.testExecutorPath in Windows
Key: YARN-8324
URL: https://issues.apache.org/jira/browse/YARN-8324
Project: Hadoop YARN
Issue Type: Bug
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8282) [C] Create a JNI interface to interact with Windows
Giovanni Matteo Fumarola created YARN-8282:
-------------------------------------------

Summary: [C] Create a JNI interface to interact with Windows
Key: YARN-8282
URL: https://issues.apache.org/jira/browse/YARN-8282
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8281) [Java] Create a JNI interface to interact with Windows
Giovanni Matteo Fumarola created YARN-8281:
-------------------------------------------

Summary: [Java] Create a JNI interface to interact with Windows
Key: YARN-8281
URL: https://issues.apache.org/jira/browse/YARN-8281
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-8275) Create a JNI interface to interact with Windows
Giovanni Matteo Fumarola created YARN-8275:
-------------------------------------------

Summary: Create a JNI interface to interact with Windows
Key: YARN-8275
URL: https://issues.apache.org/jira/browse/YARN-8275
Project: Hadoop YARN
Issue Type: New Feature
Components: nodemanager
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola

I did a quick investigation of the performance of WinUtils. On average, the NM calls WinUtils 4.76 times per second and 65.51 times per container.

| |Requests|Requests/sec|Requests/min|Requests/container|
|*Sum [WinUtils]*|*135354*|*4.761*|*286.160*|*65.51*|
|[WinUtils] Execute -help|4148|0.145|8.769|2.007|
|[WinUtils] Execute -ls|2842|0.0999|6.008|1.37|
|[WinUtils] Execute -systeminfo|9153|0.321|19.35|4.43|
|[WinUtils] Execute -symlink|115096|4.048|243.33|57.37|
|[WinUtils] Execute -task isAlive|4115|0.144|8.699|2.05|

Interval: 7 hours, 53 minutes and 48 seconds

Each execution of WinUtils does around *140 IO ops*, of which 130 are DDL ops.
This means *666.58* IO ops/second due to WinUtils.

We should start considering removing WinUtils from Hadoop and creating a JNI interface instead.
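The headline figures in the message can be reproduced from the raw counts it reports. The sketch below is only a worked calculation using the numbers quoted above (135354 requests over 7 hours, 53 minutes and 48 seconds, and about 140 IO ops per execution); the class and method names are hypothetical.

```java
final class WinUtilsLoadSketch {
  // Converts the measurement interval to seconds.
  static long intervalSeconds(int hours, int minutes, int seconds) {
    return hours * 3600L + minutes * 60L + seconds;
  }

  // Average number of WinUtils executions per second over the interval.
  static double requestsPerSecond(long requests, long seconds) {
    return (double) requests / seconds;
  }

  // IO operations per second caused by WinUtils, given IO ops per execution.
  static double ioOpsPerSecond(double requestsPerSecond, int ioOpsPerExecution) {
    return requestsPerSecond * ioOpsPerExecution;
  }

  public static void main(String[] args) {
    // Per-command request counts from the table: -help, -ls, -systeminfo, -symlink, -task isAlive.
    long total = 4148L + 2842L + 9153L + 115096L + 4115L;
    long seconds = intervalSeconds(7, 53, 48);
    double rps = requestsPerSecond(total, seconds);
    System.out.printf("total=%d seconds=%d req/s=%.3f io/s=%.2f%n",
        total, seconds, rps, ioOpsPerSecond(rps, 140));
  }
}
```

The per-command counts add up to the reported total, and multiplying the per-second rate by 140 gives the IO figure quoted in the message.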
[jira] [Created] (YARN-8186) [Router] Federation: routing getAppState REST invocations transparently to multiple RMs
Giovanni Matteo Fumarola created YARN-8186:
-------------------------------------------

Summary: [Router] Federation: routing getAppState REST invocations transparently to multiple RMs
Key: YARN-8186
URL: https://issues.apache.org/jira/browse/YARN-8186
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-7898) [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
Giovanni Matteo Fumarola created YARN-7898:
-------------------------------------------

Summary: [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
Key: YARN-7898
URL: https://issues.apache.org/jira/browse/YARN-7898
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-7228) GenericExceptionHandler does not log StackTrace in case of INTERNAL_SERVER_ERROR
Giovanni Matteo Fumarola created YARN-7228:
-------------------------------------------

Summary: GenericExceptionHandler does not log StackTrace in case of INTERNAL_SERVER_ERROR
Key: YARN-7228
URL: https://issues.apache.org/jira/browse/YARN-7228
Project: Hadoop YARN
Issue Type: Bug
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-7097) Federation: routing REST invocations transparently to multiple RMs (part 5 - getNode)
Giovanni Matteo Fumarola created YARN-7097:
-------------------------------------------

Summary: Federation: routing REST invocations transparently to multiple RMs (part 5 - getNode)
Key: YARN-7097
URL: https://issues.apache.org/jira/browse/YARN-7097
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-7096) Federation: routing REST invocations transparently to multiple RMs (part 4 - getMetrics)
Giovanni Matteo Fumarola created YARN-7096:
-------------------------------------------

Summary: Federation: routing REST invocations transparently to multiple RMs (part 4 - getMetrics)
Key: YARN-7096
URL: https://issues.apache.org/jira/browse/YARN-7096
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-7095) Federation: routing REST invocations transparently to multiple RMs (part 3 - getNodes)
Giovanni Matteo Fumarola created YARN-7095:
-------------------------------------------

Summary: Federation: routing REST invocations transparently to multiple RMs (part 3 - getNodes)
Key: YARN-7095
URL: https://issues.apache.org/jira/browse/YARN-7095
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-7010) Federation: routing REST invocations transparently to multiple RMs (part 2 - getApps)
Giovanni Matteo Fumarola created YARN-7010:
-------------------------------------------

Summary: Federation: routing REST invocations transparently to multiple RMs (part 2 - getApps)
Key: YARN-7010
URL: https://issues.apache.org/jira/browse/YARN-7010
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-6973) Adding RM Cluster Id in ApplicationReport
Giovanni Matteo Fumarola created YARN-6973:
-------------------------------------------

Summary: Adding RM Cluster Id in ApplicationReport
Key: YARN-6973
URL: https://issues.apache.org/jira/browse/YARN-6973
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-6972) Adding SubClusterId in AppInfo
Giovanni Matteo Fumarola created YARN-6972:
-------------------------------------------

Summary: Adding SubClusterId in AppInfo
Key: YARN-6972
URL: https://issues.apache.org/jira/browse/YARN-6972
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-6970) Add PoolInitializationException as retriable exception in FederationFacade
Giovanni Matteo Fumarola created YARN-6970:
-------------------------------------------

Summary: Add PoolInitializationException as retriable exception in FederationFacade
Key: YARN-6970
URL: https://issues.apache.org/jira/browse/YARN-6970
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-6924) Metrics for Federation AMRMProxy
Giovanni Matteo Fumarola created YARN-6924:
-------------------------------------------

Summary: Metrics for Federation AMRMProxy
Key: YARN-6924
URL: https://issues.apache.org/jira/browse/YARN-6924
Project: Hadoop YARN
Issue Type: Bug
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-6923) Metrics for Federation Router
Giovanni Matteo Fumarola created YARN-6923:
-------------------------------------------

Summary: Metrics for Federation Router
Key: YARN-6923
URL: https://issues.apache.org/jira/browse/YARN-6923
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-6897) Refactoring RMWebServices by moving some util methods in RMWebAppUtil
Giovanni Matteo Fumarola created YARN-6897:
-------------------------------------------

Summary: Refactoring RMWebServices by moving some util methods in RMWebAppUtil
Key: YARN-6897
URL: https://issues.apache.org/jira/browse/YARN-6897
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola

In YARN-6896 the router needs to use some methods already implemented in {{RMWebServices}}.
This jira continues the work done in YARN-6634.
[jira] [Created] (YARN-6896) Federation: routing REST invocations transparently to multiple RMs
Giovanni Matteo Fumarola created YARN-6896:
-------------------------------------------

Summary: Federation: routing REST invocations transparently to multiple RMs
Key: YARN-6896
URL: https://issues.apache.org/jira/browse/YARN-6896
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola
Assignee: Giovanni Matteo Fumarola

This JIRA tracks the design/implementation of the layer for routing RMWebServicesProtocol requests to the appropriate RM(s) in a federated YARN cluster.
[jira] [Created] (YARN-6887) WebServices methods should be moved to an interface
Giovanni Matteo Fumarola created YARN-6887:
-------------------------------------------

Summary: WebServices methods should be moved to an interface
Key: YARN-6887
URL: https://issues.apache.org/jira/browse/YARN-6887
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Giovanni Matteo Fumarola

{{WebServices}} implements 3 methods that are called from {{RMWebServices}}. These 3 methods should be defined in an interface, which will make the code cleaner.
[jira] [Created] (YARN-6871) Add a new REST API GetAppReport call lighter and faster than the current one
Giovanni Matteo Fumarola created YARN-6871:
-------------------------------------------

Summary: Add a new REST API GetAppReport call lighter and faster than the current one
Key: YARN-6871
URL: https://issues.apache.org/jira/browse/YARN-6871
Project: Hadoop YARN
Issue Type: New Feature
Components: resourcemanager, router
Reporter: Giovanni Matteo Fumarola

This jira tracks the effort to create a new REST API similar to the current GetAppReport but lighter and faster.
With the current one we are facing scalability issues. E.g. with ~500 applications running, the AppReport can reach up to 300MB in size due to the {{ResourceRequest}} in the {{AppInfo}}.

The new method returns the following essential information for each application:
* The application id;
* The application name;
* The application state according to the ResourceManager - valid values are members of the YarnApplicationState enum: NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED;
* The final status of the application if finished - reported by the application itself - valid values are: UNDEFINED, SUCCEEDED, FAILED, KILLED;
* The web URL that can be used to track the application;
* Detailed diagnostics information;
* The URL of the application master container logs;
* The node's HTTP address of the application master;
* The progress of the application as a percent.

The YARN RM will return the new result faster and use fewer compute cycles to create the report, which will improve the performance of both the YARN RM and the client.
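The field list above can be pictured as a small value class that carries only those nine items and omits the {{ResourceRequest}} data. The sketch below is illustrative only: the class name LightAppReportSketch, its field names, and the two nested enums are hypothetical and do not correspond to any class in Hadoop; the enum constants are the values listed in the message.

```java
final class LightAppReportSketch {
  // Values listed in the message for the ResourceManager-side state.
  enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

  // Values listed in the message for the application-reported final status.
  enum FinalStatus { UNDEFINED, SUCCEEDED, FAILED, KILLED }

  final String id;
  final String name;
  final AppState state;
  final FinalStatus finalStatus;
  final String trackingUrl;
  final String diagnostics;
  final String amContainerLogsUrl;
  final String amHostHttpAddress;
  final float progressPercent;

  LightAppReportSketch(String id, String name, AppState state, FinalStatus finalStatus,
      String trackingUrl, String diagnostics, String amContainerLogsUrl,
      String amHostHttpAddress, float progressPercent) {
    this.id = id;
    this.name = name;
    this.state = state;
    this.finalStatus = finalStatus;
    this.trackingUrl = trackingUrl;
    this.diagnostics = diagnostics;
    this.amContainerLogsUrl = amContainerLogsUrl;
    this.amHostHttpAddress = amHostHttpAddress;
    this.progressPercent = progressPercent;
  }

  @Override
  public String toString() {
    return id + " [" + name + "] " + state + "/" + finalStatus + " " + progressPercent + "%";
  }

  public static void main(String[] args) {
    // All values below are made-up sample data.
    LightAppReportSketch report = new LightAppReportSketch(
        "application_0000000000000_0001", "sample-app", AppState.RUNNING,
        FinalStatus.UNDEFINED, "http://rm.example:8088/proxy/app", "",
        "http://nm.example:8042/logs", "nm.example:8042", 42.0f);
    System.out.println(report);
  }
}
```

Keeping the report to a fixed set of scalar fields is what bounds its size per application, which is the scalability concern the message raises.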
[jira] [Created] (YARN-6853) Add MySql Scripts for FederationStateStore
Giovanni Matteo Fumarola created YARN-6853:
-------------------------------------------

Summary: Add MySql Scripts for FederationStateStore
Key: YARN-6853
URL: https://issues.apache.org/jira/browse/YARN-6853
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Giovanni Matteo Fumarola

In YARN-3663 we added the SQL scripts for SQLServer. We want to add the MySQL scripts so that Federation can run with a MySQL server, which will be less performant but convenient.
[jira] [Created] (YARN-6817) Add XML encode/decode tests for ResourceManager.webapp.dao classes
Giovanni Matteo Fumarola created YARN-6817:
-------------------------------------------

Summary: Add XML encode/decode tests for ResourceManager.webapp.dao classes
Key: YARN-6817
URL: https://issues.apache.org/jira/browse/YARN-6817
Project: Hadoop YARN
Issue Type: Test
Reporter: Giovanni Matteo Fumarola
Priority: Minor
[jira] [Created] (YARN-6792) Incorrect XML convertion in NodeIDsInfo and LabelsToNodesInfo
Giovanni Matteo Fumarola created YARN-6792: -- Summary: Incorrect XML convertion in NodeIDsInfo and LabelsToNodesInfo Key: YARN-6792 URL: https://issues.apache.org/jira/browse/YARN-6792 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Giovanni Matteo Fumarola Assignee: Giovanni Matteo Fumarola Priority: Blocker NodeIDsInfo contains a typo and LabelsToNodesInfo is missing a constructor. These bugs prevent LabelsToNodesInfo from being converted to XML correctly.
[jira] [Created] (YARN-6740) Federation Router (hiding multiple RMs for ApplicationClientProtocol) phase 2
Giovanni Matteo Fumarola created YARN-6740: -- Summary: Federation Router (hiding multiple RMs for ApplicationClientProtocol) phase 2 Key: YARN-6740 URL: https://issues.apache.org/jira/browse/YARN-6740 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola This JIRA tracks the implementation of the layer for routing ApplicationClientProtocol requests to the appropriate RM(s) in a federated YARN cluster. Under YARN-3659 we implemented only getNewApplication, submitApplication, forceKillApplication and getApplicationReport, enough to execute applications E2E.
[jira] [Created] (YARN-6587) Refactor of ResourceManager#startWebApp in a Util class
Giovanni Matteo Fumarola created YARN-6587: -- Summary: Refactor of ResourceManager#startWebApp in a Util class Key: YARN-6587 URL: https://issues.apache.org/jira/browse/YARN-6587 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola Assignee: Giovanni Matteo Fumarola This JIRA tracks the refactoring of ResourceManager#startWebApp into a util class, since the Router in YARN-5412 has to implement the same logic for filtering and authentication.
[jira] [Created] (YARN-6572) Refactoring Router services to use common util classes for pipeline creations
Giovanni Matteo Fumarola created YARN-6572: -- Summary: Refactoring Router services to use common util classes for pipeline creations Key: YARN-6572 URL: https://issues.apache.org/jira/browse/YARN-6572 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-6553) Remove MockResourceManagerFacade and use MockRM for AMRMProxy/Router tests
Giovanni Matteo Fumarola created YARN-6553: -- Summary: Remove MockResourceManagerFacade and use MockRM for AMRMProxy/Router tests Key: YARN-6553 URL: https://issues.apache.org/jira/browse/YARN-6553 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-6539) Create SecureLogin inside Router
Giovanni Matteo Fumarola created YARN-6539: -- Summary: Create SecureLogin inside Router Key: YARN-6539 URL: https://issues.apache.org/jira/browse/YARN-6539 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola Priority: Minor
[jira] [Created] (YARN-6537) Consider running RM tests against the Router
Giovanni Matteo Fumarola created YARN-6537: -- Summary: Consider running RM tests against the Router Key: YARN-6537 URL: https://issues.apache.org/jira/browse/YARN-6537 Project: Hadoop YARN Issue Type: Sub-task Components: federation, resourcemanager Reporter: Giovanni Matteo Fumarola Priority: Minor
[jira] [Created] (YARN-6526) Refactoring SQLFederationStateStore by avoiding to recreate the connections at every call
Giovanni Matteo Fumarola created YARN-6526: -- Summary: Refactoring SQLFederationStateStore by avoiding to recreate the connections at every call Key: YARN-6526 URL: https://issues.apache.org/jira/browse/YARN-6526 Project: Hadoop YARN Issue Type: Sub-task Components: federation Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-5612) Return SubClusterId in FederationStateStoreFacade#addApplicationHomeSubCluster for Router Failover
Giovanni Matteo Fumarola created YARN-5612: -- Summary: Return SubClusterId in FederationStateStoreFacade#addApplicationHomeSubCluster for Router Failover Key: YARN-5612 URL: https://issues.apache.org/jira/browse/YARN-5612 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-5602) Utils for Federation State and Policy Store
Giovanni Matteo Fumarola created YARN-5602: -- Summary: Utils for Federation State and Policy Store Key: YARN-5602 URL: https://issues.apache.org/jira/browse/YARN-5602 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-5525) Make log aggregation service class configurable
Giovanni Matteo Fumarola created YARN-5525: -- Summary: Make log aggregation service class configurable Key: YARN-5525 URL: https://issues.apache.org/jira/browse/YARN-5525 Project: Hadoop YARN Issue Type: Improvement Components: log-aggregation Reporter: Giovanni Matteo Fumarola Priority: Minor Make the log aggregation class configurable and extensible, so that alternative log aggregation behaviors, such as an app-specific log aggregation directory or a different log aggregation format, can be implemented and plugged in.
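One common way to make a service class pluggable is to read its class name from configuration and instantiate it reflectively. The sketch below illustrates that idea with plain `java.util.Properties`; the interface `LogAggregator`, the two implementations, and the key `example.log-aggregation.service.class` are all hypothetical and do not correspond to actual YARN classes or configuration properties.

```java
import java.util.Properties;

/** Hypothetical extension point; not a YARN interface. */
interface LogAggregator {
  String describe();
}

/** Stand-in for the built-in behavior. */
final class DefaultLogAggregator implements LogAggregator {
  public String describe() { return "default"; }
}

/** Stand-in for an alternative, e.g. an app-specific aggregation directory. */
final class AppDirLogAggregator implements LogAggregator {
  public String describe() { return "app-specific directory"; }
}

final class LogAggregatorFactory {
  // Hypothetical configuration key, chosen for this example only.
  static final String CLASS_KEY = "example.log-aggregation.service.class";

  /** Instantiates the configured implementation, falling back to the default. */
  static LogAggregator create(Properties conf) throws ReflectiveOperationException {
    String className = conf.getProperty(CLASS_KEY, DefaultLogAggregator.class.getName());
    // asSubclass fails fast if the configured class does not implement the interface.
    Class<? extends LogAggregator> cls =
        Class.forName(className).asSubclass(LogAggregator.class);
    return cls.getDeclaredConstructor().newInstance();
  }
}
```

With no key set the factory returns the default implementation; setting the key to another class name swaps the behavior without changing the caller.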
[jira] [Created] (YARN-5519) Add SubClusterId in AddApplicationHomeSubClusterResponse for Router Failover
Giovanni Matteo Fumarola created YARN-5519: -- Summary: Add SubClusterId in AddApplicationHomeSubClusterResponse for Router Failover Key: YARN-5519 URL: https://issues.apache.org/jira/browse/YARN-5519 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola Assignee: Ellen Hui This JIRA tracks the addition of SubClusterId to AddApplicationHomeSubClusterResponse. In the design of [YARN-3659|https://issues.apache.org/jira/browse/YARN-3659], the response needs SubClusterId as a field to better handle fail-over scenarios.
[jira] [Created] (YARN-5471) Adding null pointer checks in ResourceRequest#newInstance
Giovanni Matteo Fumarola created YARN-5471: -- Summary: Adding null pointer checks in ResourceRequest#newInstance Key: YARN-5471 URL: https://issues.apache.org/jira/browse/YARN-5471 Project: Hadoop YARN Issue Type: Bug Components: applications, resourcemanager Reporter: Giovanni Matteo Fumarola Assignee: Giovanni Matteo Fumarola ResourceRequest#newInstance has Priority, Resource and String fields. The application master can set these values to null. The proposal is to add null pointer checks in the class.
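A factory method can reject null arguments up front with `java.util.Objects.requireNonNull`, so that the failure surfaces at the caller instead of later inside the RM. The class below is a simplified, hypothetical stand-in (`ResourceRequestSketch`, with boxed primitives in place of YARN's Priority and Resource types); it is not the actual ResourceRequest and not necessarily how the issue was resolved.

```java
import java.util.Objects;

/** Hypothetical stand-in for a resource request; not the YARN ResourceRequest class. */
final class ResourceRequestSketch {
  private final Integer priority;    // stand-in for Priority
  private final String resourceName; // host, rack, or "*"
  private final Long memoryMb;       // stand-in for Resource
  private final int numContainers;

  private ResourceRequestSketch(Integer priority, String resourceName,
      Long memoryMb, int numContainers) {
    this.priority = priority;
    this.resourceName = resourceName;
    this.memoryMb = memoryMb;
    this.numContainers = numContainers;
  }

  /** Validates every reference argument before constructing the request. */
  static ResourceRequestSketch newInstance(Integer priority, String resourceName,
      Long memoryMb, int numContainers) {
    Objects.requireNonNull(priority, "priority must not be null");
    Objects.requireNonNull(resourceName, "resourceName must not be null");
    Objects.requireNonNull(memoryMb, "memoryMb must not be null");
    return new ResourceRequestSketch(priority, resourceName, memoryMb, numContainers);
  }

  Integer getPriority() { return priority; }
  String getResourceName() { return resourceName; }
  Long getMemoryMb() { return memoryMb; }
  int getNumContainers() { return numContainers; }
}
```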
[jira] [Created] (YARN-5186) Adding config to enable/disable completed reservations in queue metrics
Giovanni Matteo Fumarola created YARN-5186: -- Summary: Adding config to enable/disable completed reservations in queue metrics Key: YARN-5186 URL: https://issues.apache.org/jira/browse/YARN-5186 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Giovanni Matteo Fumarola Priority: Minor After some stress testing on reservations we found out that YARN queue metrics show up in JMX for both finished and active reservations. This causes JMX to accumulate queue metrics for all finished reservations and become inefficient to use. We want to introduce a config parameter to enable/disable the finished reservations in JMX.
[jira] [Created] (YARN-5120) Metric for RM async dispatcher queue size
Giovanni Matteo Fumarola created YARN-5120: -- Summary: Metric for RM async dispatcher queue size Key: YARN-5120 URL: https://issues.apache.org/jira/browse/YARN-5120 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Giovanni Matteo Fumarola Priority: Minor It is difficult to identify the health of the RM AsyncDispatcher. Solution: Add a metric for the AsyncDispatcher queue size.
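The idea of exposing the dispatcher backlog can be illustrated with a queue-backed dispatcher that reports its pending event count. This is a minimal sketch with a hypothetical `DispatcherSketch` class; it does not use the real AsyncDispatcher or the Hadoop metrics system, and a real metric would be registered with whatever metrics framework the RM uses.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical dispatcher that exposes its backlog; not the YARN AsyncDispatcher. */
final class DispatcherSketch {
  private final BlockingQueue<Runnable> eventQueue = new LinkedBlockingQueue<>();

  /** Enqueues an event for later handling. */
  void dispatch(Runnable event) {
    eventQueue.add(event);
  }

  /** Handles one queued event, returning false if the queue was empty. */
  boolean handleNext() {
    Runnable event = eventQueue.poll();
    if (event == null) {
      return false;
    }
    event.run();
    return true;
  }

  /** The value a queue-size gauge would report: events waiting to be handled. */
  int getQueueSize() {
    return eventQueue.size();
  }
}
```

A steadily growing value from `getQueueSize()` would indicate that events arrive faster than the dispatcher thread handles them.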
[jira] [Created] (YARN-5119) Timeout HA issue: Enable IPC ping for all calls by default
Giovanni Matteo Fumarola created YARN-5119: -- Summary: Timeout HA issue: Enable IPC ping for all calls by default Key: YARN-5119 URL: https://issues.apache.org/jira/browse/YARN-5119 Project: Hadoop YARN Issue Type: Improvement Reporter: Giovanni Matteo Fumarola We have an RM work-preserving HA setup with 2 RM instances – RM1 and RM2. RM1 is initially active, so clients connect successfully to RM1. RM1 subsequently hangs (for any reason) and RM2 takes over as active, but clients wait indefinitely on RM1.
[jira] [Created] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die
Giovanni Matteo Fumarola created YARN-4976: -- Summary: Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die Key: YARN-4976 URL: https://issues.apache.org/jira/browse/YARN-4976 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Giovanni Matteo Fumarola Assignee: Giovanni Matteo Fumarola
[jira] [Created] (YARN-4470) Application Master in-place upgrade
Giovanni Matteo Fumarola created YARN-4470: -- Summary: Application Master in-place upgrade Key: YARN-4470 URL: https://issues.apache.org/jira/browse/YARN-4470 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Reporter: Giovanni Matteo Fumarola Assignee: Giovanni Matteo Fumarola It would be nice if clients could ask for an AM in-place upgrade. This would give YARN the possibility to upgrade the AM without losing the work done within its containers, and would allow deploying bug fixes and new versions of the AM without incurring long service downtimes.
[jira] [Created] (YARN-4471) Add a public API to request an application upgrade
Giovanni Matteo Fumarola created YARN-4471: -- Summary: Add a public API to request an application upgrade Key: YARN-4471 URL: https://issues.apache.org/jira/browse/YARN-4471 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Giovanni Matteo Fumarola This JIRA tracks the definition of a new public API for YARN, which allows users to upgrade the AM. This is part of the AM in-place upgrade enhancement proposed in YARN-4470.
[jira] [Created] (YARN-4472) Introduce additional states in the app and app attempt state machines to keep track of the upgrade process
Giovanni Matteo Fumarola created YARN-4472: -- Summary: Introduce additional states in the app and app attempt state machines to keep track of the upgrade process Key: YARN-4472 URL: https://issues.apache.org/jira/browse/YARN-4472 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola Assignee: Marco Rabozzi
[jira] [Created] (YARN-4473) Add version information for the application and the application attempts
Giovanni Matteo Fumarola created YARN-4473: -- Summary: Add version information for the application and the application attempts Key: YARN-4473 URL: https://issues.apache.org/jira/browse/YARN-4473 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola Assignee: Marco Rabozzi In order to upgrade an application master across different attempts, we need to keep track of the versions of the different attempts and provide a means to temporarily store the upgrade information until the upgrade completes. Concretely we would add: - A version identifier for each attempt - A temporary upgrade context for each application
[jira] [Created] (YARN-4475) Add the possibility to recover the state of an ongoing upgrade in case of RM restart
Giovanni Matteo Fumarola created YARN-4475: -- Summary: Add the possibility to recover the state of an ongoing upgrade in case of RM restart Key: YARN-4475 URL: https://issues.apache.org/jira/browse/YARN-4475 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-4474) Implementation of the application master upgrade logic
Giovanni Matteo Fumarola created YARN-4474: -- Summary: Implementation of the application master upgrade logic Key: YARN-4474 URL: https://issues.apache.org/jira/browse/YARN-4474 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola
[jira] [Created] (YARN-4188) MoveApplicationAcrossQueuesResponse should be an abstract class
Giovanni Matteo Fumarola created YARN-4188: -- Summary: MoveApplicationAcrossQueuesResponse should be an abstract class Key: YARN-4188 URL: https://issues.apache.org/jira/browse/YARN-4188 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Giovanni Matteo Fumarola Assignee: Brahma Reddy Battula Priority: Blocker Fix For: 2.7.0, 2.6.1
In AppSchedulingInfo.java the method checkForDeactivation() has these 2 consecutive lines:
{code}
ResourceRequest request = getResourceRequest(priority, ResourceRequest.ANY);
if (request.getNumContainers() > 0) {
{code}
The first line calls getResourceRequest, which can return null.
{code}
synchronized public ResourceRequest getResourceRequest(
    Priority priority, String resourceName) {
  Map<String, ResourceRequest> nodeRequests = requests.get(priority);
  return (nodeRequests == null) ? null : nodeRequests.get(resourceName);
}
{code}
The second line dereferences the pointer directly without a check. If the pointer is null, the RM dies.
{quote}2015-03-17 14:14:04,757 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.checkForDeactivation(AppSchedulingInfo.java:383) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.decrementOutstanding(AppSchedulingInfo.java:375) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateOffSwitch(AppSchedulingInfo.java:360) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:270) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:142) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1559) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1384) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1263) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:816) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:588) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:449) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1017) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1059) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:114) at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:739) at java.lang.Thread.run(Thread.java:722) {color:red} *2015-03-17 14:14:04,758 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..*{color} {quote}
[jira] [Created] (YARN-3661) Federation UI
Giovanni Matteo Fumarola created YARN-3661: -- Summary: Federation UI Key: YARN-3661 URL: https://issues.apache.org/jira/browse/YARN-3661 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Reporter: Giovanni Matteo Fumarola The UIs provided by each RM provide a correct local view of what is running in a sub-cluster. In the context of federation we need new UIs that can track load, jobs, and users across sub-clusters.
[jira] [Created] (YARN-3659) Federation Router (hiding multiple RMs for ApplicationSubmissionProtocol)
Giovanni Matteo Fumarola created YARN-3659: -- Summary: Federation Router (hiding multiple RMs for ApplicationSubmissionProtocol) Key: YARN-3659 URL: https://issues.apache.org/jira/browse/YARN-3659 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Giovanni Matteo Fumarola This JIRA tracks the design/implementation of the layer for routing ApplicationSubmissionProtocol requests to the appropriate RM(s) in a federated YARN cluster.
[jira] [Created] (YARN-3663) Federation State and Policy Store (DBMS implementation)
Giovanni Matteo Fumarola created YARN-3663: -- Summary: Federation State and Policy Store (DBMS implementation) Key: YARN-3663 URL: https://issues.apache.org/jira/browse/YARN-3663 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Reporter: Giovanni Matteo Fumarola This JIRA tracks a SQL-based implementation of the Federation State and Policy Store, which implements YARN-3662 APIs.
[jira] [Created] (YARN-3369) Missing NullPointer check in AppSchedulingInfo causes RM to die
Giovanni Matteo Fumarola created YARN-3369: -- Summary: Missing NullPointer check in AppSchedulingInfo causes RM to die Key: YARN-3369 URL: https://issues.apache.org/jira/browse/YARN-3369 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Giovanni Matteo Fumarola
In AppSchedulingInfo.java the method checkForDeactivation() has these 2 consecutive lines:
{quote}
{color:red}
ResourceRequest request = getResourceRequest(priority, ResourceRequest.ANY);
if (request.getNumContainers() > 0) {
{color}
{quote}
The first line calls getResourceRequest, which can return null.
{quote}
synchronized public ResourceRequest getResourceRequest(
    Priority priority, String resourceName) {
  Map<String, ResourceRequest> nodeRequests = requests.get(priority);
  {color:red}*return*{color} (nodeRequests == null) ? {color:red}*null*{color} : nodeRequests.get(resourceName);
}
{quote}
The second line dereferences the pointer directly without a check. If the pointer is null, the RM dies.
{quote}2015-03-17 14:14:04,757 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.checkForDeactivation(AppSchedulingInfo.java:383) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.decrementOutstanding(AppSchedulingInfo.java:375) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateOffSwitch(AppSchedulingInfo.java:360) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:270) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:142) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1559) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1384) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1263) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:816) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:588) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:449) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1017) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1059) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:114) at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:739) at java.lang.Thread.run(Thread.java:722) {color:red} *2015-03-17 14:14:04,758 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..*{color} {quote}
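The failure mode described in this report, a lookup that may return null followed by an unguarded dereference, can be avoided by checking the result before use. The sketch below reproduces the shape of the lookup with a hypothetical `SchedulingInfoSketch` class that stores plain container counts; it is not the AppSchedulingInfo code and not necessarily the fix that was committed.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in that mirrors the lookup pattern from the report. */
final class SchedulingInfoSketch {
  static final String ANY = "*"; // stand-in for the "any resource" name

  // priority -> (resource name -> outstanding container count)
  private final Map<Integer, Map<String, Integer>> requests = new HashMap<>();

  synchronized void addRequest(int priority, String resourceName, int numContainers) {
    requests.computeIfAbsent(priority, p -> new HashMap<>())
        .put(resourceName, numContainers);
  }

  /** May return null, either for an unknown priority or an unknown resource name. */
  synchronized Integer getOutstanding(int priority, String resourceName) {
    Map<String, Integer> nodeRequests = requests.get(priority);
    return (nodeRequests == null) ? null : nodeRequests.get(resourceName);
  }

  /** Null-safe check: a missing ANY entry is treated as "nothing pending". */
  synchronized boolean hasPendingContainers() {
    for (Integer priority : requests.keySet()) {
      Integer outstanding = getOutstanding(priority, ANY);
      if (outstanding != null && outstanding > 0) {
        return true;
      }
    }
    return false;
  }
}
```

Treating a missing entry as "nothing pending" is a design choice made for this example; the appropriate behavior in the scheduler depends on why the entry is absent.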