[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199728#comment-16199728 ]

Naganarasimha G R commented on YARN-6523:
-----------------------------------------

Thanks [~subru] for looking into this, and [~jlowe] for the comment. Sorry for not updating the priority earlier. I agree that this is not critical, but it will impact clusters with multiple long-running jobs.

> RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
> ---------------------------------------------------------------------------------------------------
>
> Key: YARN-6523
> URL: https://issues.apache.org/jira/browse/YARN-6523
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: RM
> Affects Versions: 2.8.0, 2.7.3
> Reporter: Naganarasimha G R
> Assignee: Manikandan R
>
> Currently, as part of the heartbeat response, the RM sets all applications' tokens even though not all applications may be active on the node. On top of that, NodeHeartbeatResponsePBImpl converts the tokens for each app into SystemCredentialsForAppsProto. Hence, for each node and each heartbeat, too many SystemCredentialsForAppsProto objects were getting created.
> We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with 8GB RAM configured for the RM.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198847#comment-16198847 ]

Jason Lowe commented on YARN-6523:
----------------------------------

[~Naganarasimha] is this issue still Critical after the revelations from [this comment|https://issues.apache.org/jira/browse/YARN-6523?focusedCommentId=16000998&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16000998]? I got the impression this is not a problem for most deployments in practice because it is rare for a job to be submitted without the HDFS token needed for logging.
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197691#comment-16197691 ]

Subru Krishnan commented on YARN-6523:
--------------------------------------

Thanks [~maniraj...@gmail.com], [~Naganarasimha], [~jlowe] for working on this. This looks to be a serious performance bottleneck, especially considering that Federation (YARN-2915) is now part of 2.9.0. Can we accelerate this and get it resolved by 27th October?
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196438#comment-16196438 ]

Naganarasimha G R commented on YARN-6523:
-----------------------------------------

Thanks [~maniraj...@gmail.com] for reworking the comments and providing the approach as discussed offline. The steps seem fine to start off with. I will give additional comments once you upload the initial patch. Assigning the JIRA to you.
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151817#comment-16151817 ]

Manikandan R commented on YARN-6523:
------------------------------------

[~Naganarasimha] I made an attempt to understand the discussion in this JIRA and came up with the steps below to address it. Please review and share your thoughts. Can I work on this?

A) Sequence number flow:
1. Introduce an atomic long variable in RMContextImpl to hold the sequence number.
2. Make sure the current value of this sequence number is passed to nodes in the response during node registration.
3. Increment the sequence number whenever there is an update to the delegation tokens, specifically in DelegationTokenRenewer#requestNewHdfsDelegationTokenAsProxyUser.
4. ResourceTrackerService#nodeHeartbeat would compare the sequence number with the one received in NodeHeartbeatRequest to decide whether to update SystemCredentialsForApps in NodeHeartbeatResponse.
5. NodeHeartbeatRequest will start carrying the sequence number. This requires a change in the corresponding proto class.
6. NodeHeartbeatResponse will start carrying the sequence number. This requires a change in the corresponding proto class.

B) Caching system credentials objects:
I will go through the code and share my proposal for this.
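The sequence number flow in steps A.1 to A.6 can be sketched roughly as follows. This is only an illustration in plain Java: `TokenSequenceFlow`, `ResourceManagerSketch`, `HeartbeatRequest`, `HeartbeatResponse` and their fields are hypothetical stand-ins, not the real RMContextImpl, ResourceTrackerService or proto classes. It also assumes that a node obtains the current credentials together with the sequence number at registration, which the steps above do not spell out.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of the proposed flow; none of these are real Hadoop classes. */
final class TokenSequenceFlow {

    /** Stand-in for NodeHeartbeatRequest carrying the node's last known sequence number (step 5). */
    static final class HeartbeatRequest {
        final long tokenSequenceNo;

        HeartbeatRequest(long tokenSequenceNo) {
            this.tokenSequenceNo = tokenSequenceNo;
        }
    }

    /** Stand-in for NodeHeartbeatResponse carrying the RM's current sequence number (step 6). */
    static final class HeartbeatResponse {
        final long tokenSequenceNo;
        final boolean includesSystemCredentials;

        HeartbeatResponse(long tokenSequenceNo, boolean includesSystemCredentials) {
            this.tokenSequenceNo = tokenSequenceNo;
            this.includesSystemCredentials = includesSystemCredentials;
        }
    }

    static final class ResourceManagerSketch {
        // Step 1: a single atomic counter held by the RM context.
        private final AtomicLong tokenSequenceNo = new AtomicLong();

        // Step 2: registration hands the current value to the node.
        // Assumption: the node also receives the current credentials here.
        long registerNode() {
            return tokenSequenceNo.get();
        }

        // Step 3: bump the counter whenever a delegation token is updated.
        void onDelegationTokenUpdate() {
            tokenSequenceNo.incrementAndGet();
        }

        // Step 4: attach credentials only if the node's number is stale.
        HeartbeatResponse nodeHeartbeat(HeartbeatRequest request) {
            long current = tokenSequenceNo.get();
            return new HeartbeatResponse(current, request.tokenSequenceNo != current);
        }
    }
}
```

A node would keep the `tokenSequenceNo` from the last response and echo it in its next request, so credentials travel only on the first heartbeat after a token update.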
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136281#comment-16136281 ]

Naganarasimha G R commented on YARN-6523:
-----------------------------------------

Thanks [~rchiang] for having a look at this. As per the discussion above, this is a good-to-have improvement rather than a bug, and it is also not of high priority, hence I am planning to get it into the next version. So I have removed the target version and changed the type to Improvement.
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16135951#comment-16135951 ]

Ray Chiang commented on YARN-6523:
----------------------------------

Looks like there hasn't been much movement on this recently. [~Naganarasimha], we're about a month away from beta1. In case there isn't a second beta, what is the likelihood we can get this one done?
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001009#comment-16001009 ]

Naganarasimha G R commented on YARN-6523:
-----------------------------------------

[~jlowe], in our offline discussion you had mentioned:
bq. believe there's still some optimization that can be done given that once a token is retrieved by the RM on behalf of an application that token is sent for every heartbeat to every node in the cluster until that application completes. That's very wasteful. Doing a sequence number version thing as I suggested earlier with a precomputed system credentials would drastically cut down on the traffic and garbage created for every heartbeat. However I agree in light of the custom release findings that the priority of fixing this is far lower than before.

I agree that for a long-running app, tokens will be exchanged unnecessarily after 7 days, which means unnecessary traffic and memory reclamation. The {{sequence number version thing}} seems to be a good fit; I will try to work on it further.
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000998#comment-16000998 ]

Naganarasimha G R commented on YARN-6523:
-----------------------------------------

Thanks [~jlowe] for helping with the analysis of this issue. Sorry to report that we were facing this issue because of a private fix in our code base (Spark JDBCServer sending a different token when starting the AM and when launching executors later on). I did not realize that it was only in our code that tokens were being sent out for each app when the app was submitted. We are trying to analyze this further.

Thanks for helping out with it. But given that in open source we already send only the newly created tokens as part of NM-RM communication, I think we do not require much further optimization, right? A token renewal, which happens once a day, does not need to be propagated to others; only when a new token is requested from the RM (after the token expires after 7 days) do we need to inform the other containers using it?
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996976#comment-15996976 ]

Jason Lowe commented on YARN-6523:
----------------------------------

bq. well actually i was trying to say here was not a delta, but send the tokens for all apps for which atleast one of the tokens gets renewed (assuming that there will be less #apps for which renewal happens).

When I last looked at the code, I thought the system credentials map only had the tokens that the RM had to go get on behalf of the app? If that's indeed the case, then the credentials being sent to each node on every heartbeat already is the subset you are proposing. Looking at the code again, I only see the system credentials being added by DelegationTokenRenewer#requestNewHdfsDelegationTokenAsProxyUser, and that is only called if the HDFS token is missing or bad for the app. In addition it is only adding the token that was retrieved and not the entire app credentials. That already seems to be the minimum set of credentials that the RM needs to send to all nodes if we're not doing a per-node delta approach.

Given this is really bad on your cluster, it'd be good to understand why the RM has put so many tokens in there since it should only be putting in ones where the app was either missing the HDFS token or the token couldn't be renewed for some reason (e.g.: already expired). Maybe I'm missing something in the code.

bq. But having delta per node does not solve the first issue.

It does solve the issue if we're tracking the delta for all nodes in the cluster, not just nodes that have run the app's containers in the past. I hope we're all in agreement now that we cannot make this work if we're only sending the system credentials for an app to a subset of the nodes. An optimal, minimal data transfer approach is where we only send the changed credentials for an app since the last time a node heartbeats. That credential delta will be different for some nodes vs. others since their heartbeats can occur before or after an app credential update. A bit complicated to implement in practice, but it is doable.

bq. IIUC there is no version concept as of now between RM and NM

There is a version that is sent between the NM and RM when the NM registers. That's how the yarn.nodemanager.resourcemanager.minimum.version functionality works, check ResourceTrackerService#registerNodeManager. It is true that the version the NM reports is discarded once it passes the minimum version check, and we'd need to store the NM version (or a feature bit derived from the version) somewhere like the RMNodeImpl to handle the heartbeat semantic change properly.
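One possible way to realize the per-node delta described in the comment above is to stamp each app's credentials with the sequence number at which they last changed, and let each node report the highest sequence number it has applied. The sketch below is only an illustration under that assumption: `PerNodeCredentialDelta` and its methods are hypothetical, tokens are modelled as opaque byte arrays, and removal of completed apps is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a per-node credential delta; not an actual Hadoop class. */
final class PerNodeCredentialDelta {

    /** Credentials for one app plus the sequence number at which they last changed. */
    private static final class Versioned {
        final long changedAt;
        final byte[] tokens;

        Versioned(long changedAt, byte[] tokens) {
            this.changedAt = changedAt;
            this.tokens = tokens;
        }
    }

    private final Map<String, Versioned> byApp = new HashMap<>();
    private long sequence = 0L;

    /** Records new or replaced credentials for an app and returns the new sequence number. */
    synchronized long update(String appId, byte[] tokens) {
        sequence++;
        byApp.put(appId, new Versioned(sequence, tokens.clone()));
        return sequence;
    }

    /** The value a node should store after applying a delta. */
    synchronized long currentSequence() {
        return sequence;
    }

    /**
     * Returns only the apps whose credentials changed after the sequence number
     * the node last applied. Two nodes that heartbeat on either side of an update
     * therefore receive different deltas.
     */
    synchronized Map<String, byte[]> deltaSince(long nodeSequence) {
        Map<String, byte[]> delta = new TreeMap<>();
        for (Map.Entry<String, Versioned> e : byApp.entrySet()) {
            if (e.getValue().changedAt > nodeSequence) {
                delta.put(e.getKey(), e.getValue().tokens);
            }
        }
        return delta;
    }
}
```

Because the node reports what it has actually applied, a lost heartbeat response simply leads to the same delta being recomputed on the next heartbeat. The cost is a scan over all apps per heartbeat, which a real implementation would probably want to avoid.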
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996281#comment-15996281 ]

Naganarasimha G R commented on YARN-6523:
-----------------------------------------

Thanks for the quick reply [~jlowe],
bq. I also think we can get the delta to work with some effort. Note however that the delta is per node not some global delta, because nodes may be heart beating at drastically different times. Therefore there isn't going to be a good way to build a single, pre-computed

Well, actually what I was trying to suggest here was not a delta, but to send the tokens for all apps for which at least one of the tokens gets renewed (assuming there will be fewer apps for which renewal happens). Based on what you mentioned and what I could understand from the code: if the tokens have not expired, they are usually available in the ContainerLaunchContext for the NM to localize the resources. So we need the RM to send tokens to the NM only for the renewed ones. And as you mentioned earlier, there are two issues to be addressed:
# A long-running job with a renewed token can get an allocation on a node which has not launched any container for this app.
# Tokens are renewed for an app while a node is either down or having connectivity issues.

Sending all tokens during registration might solve the latter issue, but having a delta per node does not solve the first issue. Hence I was suggesting that we send the tokens for all apps for which at least one of the tokens gets renewed.

bq. If we suddenly start sending a delta in heartbeats instead of the full set then that's an incompatible semantic change even though the technical signature of the interface did not change. Old nodemanagers during a rolling upgrade will not do the correct thing and apps could fail.

Oh, I missed this scenario; thanks for pointing it out and also helping with the solution. IIUC there is no version concept between the RM and NM as of now, and we need to bring one in now, right?
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995030#comment-15995030 ]

Jason Lowe commented on YARN-6523:
----------------------------------

Sending the full list at registration time makes a lot of sense to me, and I also think we can get the delta to work with some effort. Note however that the delta is _per node_ not some global delta, because nodes may be heartbeating at drastically different times. Therefore there isn't going to be a good way to build a single, pre-computed SystemCredentialsForAppsProto for deltas. Each node will have to receive the app tokens that have been renewed since their last heartbeat, and that will be a different list than for other nodes in the cluster. There will be many that will share the same delta, but it won't be the same for all of them.

Also note that there is going to be an interface change even with your proposal. The current code assumes that the system credentials received in a heartbeat _replace_ the previous set of credentials. If we suddenly start sending a delta in heartbeats instead of the full set then that's an incompatible semantic change even though the technical signature of the interface did not change. Old nodemanagers during a rolling upgrade will not do the correct thing and apps could fail. So minimally the RM would need to check the NM version and always send the full system credentials in each heartbeat if the NM version is "old" and only use the delta when the NM is beyond a certain version.
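The version gate described in the comment above (full system credentials for "old" NMs, deltas only for NMs beyond a certain version) might look roughly like the sketch below. `HeartbeatModeSelector`, the `Mode` enum and the `firstDeltaAwareVersion` parameter are hypothetical names for illustration only; the thread does not name a threshold version, and the sketch does not use Hadoop's own version utilities.

```java
/** Hypothetical helper; the version threshold and all names are illustrative only. */
final class HeartbeatModeSelector {

    enum Mode { FULL_REPLACE, DELTA }

    /** Compares dotted versions numerically, ignoring suffixes such as "-SNAPSHOT". */
    static int compareVersions(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Keeps only the leading digits of each dot-separated component. */
    private static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] numbers = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            String digits = parts[i].replaceFirst("^(\\d*).*$", "$1");
            numbers[i] = digits.isEmpty() ? 0 : Integer.parseInt(digits);
        }
        return numbers;
    }

    /**
     * Old NMs assume that credentials in a heartbeat replace the previous set,
     * so they must keep receiving the full set; newer NMs may receive a delta.
     */
    static Mode select(String nmVersion, String firstDeltaAwareVersion) {
        return compareVersions(nmVersion, firstDeltaAwareVersion) >= 0
                ? Mode.DELTA
                : Mode.FULL_REPLACE;
    }
}
```

The RM would need to remember the version each NM reported at registration in order to call something like `select` on every heartbeat during a rolling upgrade.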
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994995#comment-15994995 ]

Naganarasimha G R commented on YARN-6523:
-----------------------------------------

Sorry for the delay in responding [~jlowe], and thanks for the very detailed response. I agree that the delta approaches mentioned initially can introduce a certain amount of complexity in the cases you mentioned. Though the approach you mentioned was initially appealing and less complicated, I was thinking of the following scenarios:
# When there are a large number of small jobs in a large cluster, we almost always send the tokens, as the sequence number keeps increasing when more and more jobs get submitted.
# Since we are modifying the interface anyway, it would be better to go for a complete solution so that it is not revisited again for deprecation.

One other approach I can think of: send all the tokens during node registration (this will avoid most of the corner cases), and as part of the heartbeat send all the app tokens which have been renewed (which can be done in an event-based model). Further, we can keep a pre-computed cache of the SystemCredentialsForAppsProto objects that are sent as part of the heartbeat, so that we reduce the memory footprint. Thus this approach would solve the large-number-of-small-jobs case too, without an interface change. Thoughts?
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984919#comment-15984919 ] Jason Lowe commented on YARN-6523: -- I don't know the full story behind the SystemCredentialsForApps thing. Looks like something that was put in for Slider and other long-running services where the initial tokens can expire. It would be good to get input from [~vinodkv] and [~jianhe] since they were more involved in this. I agree it seems silly for every node in the cluster to get _all_ apps HDFS credentials on _every heartbeat_. I suspect this was the simplest thing to implement, but it's far from efficient. Going to the other extreme of just sending the app credentials only once for just the apps that could be active on the node is a lot more complicated. It's true that RMNodeImpl is tracking what applications are on the node, but this is _reactive_ tracking to what the node is already doing. There are some scenarios where the updated tokens need to be on the node _before_ the container launch request arrives at the node and therefore the app becomes active in the node's RMNodeImpl. For example, a Slider app runs for months. The initial tokens at app submit time have long expired, so the RM has had to re-fetch the tokens. Then suddenly the Slider app wants to launch a container on a node it's never touched before. The node's RMNodeImpl doesn't know the app is active until a container starts running on it, but the container can't localize without the updated tokens that the node has never received yet. So we'd need to send the credentials when the scheduler allocates an app's container on the node for the first time and then also when any of the app's credentials are updated (e.g.: when a token is replaced with a refreshed version). And then there's handling lost heartbeats, node reconnect, etc. In short, efficient delta is a lot more complicated. 
Rather than going straight to the complicated, fully optimal implementation, we could do something in between. For example, we could have a sequence number associated with the system credentials. Nodes would send the last sequence number that they have received, and if it matches the current sequence number then the RM does _not_ send the credentials in the heartbeat response. If the sequence numbers don't match then the RM sends the current sequence number along with the system credentials. It's still sending all the credentials instead of optimal deltas, but at least they're only being sent when the node needs the updated version. And yes, we should precompute the SystemCredentialsForAppsProto once when the credentials change and re-send the same object to any node that needs the updated credentials rather than recreate the same object over and over. That should drastically cut down on the number of objects related to system credentials in heartbeats and how often we're sending them.

> RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
>
> Key: YARN-6523
> URL: https://issues.apache.org/jira/browse/YARN-6523
> Project: Hadoop YARN
> Issue Type: Bug
> Components: RM
> Affects Versions: 2.8.0, 2.7.3
> Reporter: Naganarasimha G R
> Assignee: Naganarasimha G R
> Priority: Critical
>
> Currently as part of heartbeat response RM sets all application's tokens though all applications might not be active on the node. On top of it NodeHeartbeatResponsePBImpl converts tokens for each app into SystemCredentialsForAppsProto. Hence for each node and each heartbeat too many SystemCredentialsForAppsProto objects were getting created.
> We hit a OOM while testing for 2000 concurrent apps on 500 nodes cluster with 8GB RAM configured for RM
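The sequence-number scheme proposed in Jason Lowe's comment above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the actual YARN change: `SystemCredentialsTracker`, `CredentialsSnapshot`, and their methods are hypothetical names, app IDs are plain strings, and a `ByteBuffer` stands in for each app's serialized credentials. The snapshot is rebuilt once per credential change, and the same object is handed to every node whose last-seen sequence number is stale.

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Immutable snapshot of all system credentials, tagged with a sequence number. */
final class CredentialsSnapshot {
    final long sequenceNumber;
    final Map<String, ByteBuffer> credentialsByApp;

    CredentialsSnapshot(long sequenceNumber, Map<String, ByteBuffer> credentialsByApp) {
        this.sequenceNumber = sequenceNumber;
        // Defensive copy so later updates do not leak into an already shared snapshot.
        this.credentialsByApp = Collections.unmodifiableMap(new HashMap<>(credentialsByApp));
    }
}

/** Hypothetical RM-side holder; not an actual YARN class. */
final class SystemCredentialsTracker {
    private final Map<String, ByteBuffer> current = new HashMap<>();
    private CredentialsSnapshot snapshot = new CredentialsSnapshot(0L, current);

    /** Called when an app's credentials are added or renewed; rebuilds the snapshot once. */
    synchronized void updateCredentials(String appId, ByteBuffer credentials) {
        current.put(appId, credentials.asReadOnlyBuffer());
        snapshot = new CredentialsSnapshot(snapshot.sequenceNumber + 1, current);
    }

    /**
     * Called for each heartbeat with the sequence number the node reported.
     * Returns empty when the node is up to date, otherwise the shared snapshot.
     */
    synchronized Optional<CredentialsSnapshot> forHeartbeat(long lastSeqSeenByNode) {
        return lastSeqSeenByNode == snapshot.sequenceNumber
                ? Optional.empty()
                : Optional.of(snapshot);
    }
}
```

In this sketch a newly registered or reconnecting node would report a value that can never match (for example -1, an assumption of the sketch) so that it always receives the full set once.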
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984853#comment-15984853 ] Sunil G commented on YARN-6523:

Yes. It's ideal to keep a copy on each RMNodeImpl itself. I think RMNodeImpl could pull renewed tokens from {{DelegationTokenRenewer}} at regular intervals if there is a concern over the increased number of events fired to RMNodeImpl for token renewal. I still feel an event publishing model will be more concrete.
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984830#comment-15984830 ] Rohith Sharma K S commented on YARN-6523:

+1 for the issue. I am not sure why all the apps' tokens are sent to the NM rather than only those of the applications running on that node. The 2nd and 3rd approaches have to deal with renewal of credentials: it is never known whether the credentials have been renewed. The 1st approach, on the other hand, compromises node heartbeat response time.

How about keeping app credentials in RMNodeImpl? That is, the proposal is to let RMNodeImpl maintain a copy of the credentials, and these are sent in the heartbeat response. This way, RMNodeImpl maintains app credentials for the applications running on the node. On credential renewal, an event is sent to RMNodeImpl to update its credentials. There would be corner cases, though, where updated credentials are missed for a couple of heartbeats.

cc: [~jlowe]
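The per-node copy proposed in the comment above might look roughly like the following Java sketch. `NodeCredentialState` and its methods are hypothetical stand-ins for state that would live in RMNodeImpl, app IDs are plain strings, and a `ByteBuffer` stands in for an app's serialized credentials; how the renewal event is delivered is left out.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-node credential state; not an actual YARN class. */
final class NodeCredentialState {
    private final Map<String, ByteBuffer> credentialsForRunningApps = new HashMap<>();

    /** Called when an app becomes active on this node. */
    synchronized void onAppStarted(String appId, ByteBuffer credentials) {
        credentialsForRunningApps.put(appId, credentials);
    }

    /** Called when the app no longer runs on this node. */
    synchronized void onAppFinished(String appId) {
        credentialsForRunningApps.remove(appId);
    }

    /**
     * Renewal event: updates the copy only if the app runs on this node.
     * Returns true when the node's copy was changed.
     */
    synchronized boolean onCredentialsRenewed(String appId, ByteBuffer renewed) {
        if (!credentialsForRunningApps.containsKey(appId)) {
            return false;
        }
        credentialsForRunningApps.put(appId, renewed);
        return true;
    }

    /** Credentials to place in this node's heartbeat response. */
    synchronized Map<String, ByteBuffer> heartbeatPayload() {
        return new HashMap<>(credentialsForRunningApps);
    }
}
```

Because the copy is updated by an asynchronous event, a heartbeat answered before the event is processed still carries the old credentials, which is the corner case the comment mentions.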
[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982284#comment-15982284 ] Naganarasimha G R commented on YARN-6523:

The approach depends on why we are sending credentials for all apps, which I am not completely clear about. IMO it should be sufficient to send the tokens for the apps (containers) active on the node.

Possible solutions:
# Send only app credentials related to the node on each heartbeat.
# Send only app credentials related to the node on each heartbeat, and also delta modifications for the node since the last heartbeat.
# Cache the SystemCredentialsForAppsProto objects themselves and reuse them rather than recreating them for each node's heartbeat (if it is required to send all the apps' tokens to the node).

P.S. Credit goes to [~gu chi] for the analysis of this issue.
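Option 3 in the list above (encode once, reuse for every node) could be sketched in Java along these lines. `EncodedAppCredentials` is a hypothetical stand-in for SystemCredentialsForAppsProto, and `EncodedCredentialsCache` with its methods is likewise invented for illustration; the point is only that the encoded object is created when credentials change, not per node and per heartbeat.

```java
import java.nio.ByteBuffer;
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical encoded form, standing in for SystemCredentialsForAppsProto. */
final class EncodedAppCredentials {
    final String appId;
    final ByteBuffer tokens;

    EncodedAppCredentials(String appId, ByteBuffer tokens) {
        this.appId = appId;
        this.tokens = tokens.asReadOnlyBuffer();
    }
}

/** Hypothetical RM-wide cache of encoded credentials, keyed by app ID. */
final class EncodedCredentialsCache {
    private final Map<String, EncodedAppCredentials> cache = new ConcurrentHashMap<>();

    /** Encode once, when an app's credentials are added or renewed. */
    void put(String appId, ByteBuffer tokens) {
        cache.put(appId, new EncodedAppCredentials(appId, tokens));
    }

    /** Drop the entry when the app completes. */
    void remove(String appId) {
        cache.remove(appId);
    }

    /** Every heartbeat response reuses the same encoded objects. */
    Collection<EncodedAppCredentials> forHeartbeat() {
        return Collections.unmodifiableCollection(cache.values());
    }
}
```

With this shape the number of encoded objects alive at once grows with the number of apps rather than with apps times nodes, although every heartbeat still carries all apps' tokens.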