[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk
[ https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822960#comment-16822960 ]

Xianghao Lu commented on YARN-6825:
-----------------------------------

[~Feng Yuan] As [~bibinchundatt] said, applicationUpdate can be handled by configuring a smaller value; attemptAdd and attemptUpdate can be handled by YARN-6125 and YARN-6967.
For more details on the ZK data size limit in YARN, please refer to YARN-9498.

> RM quit due to ApplicationStateData exceed the limit size of znode in zk
> ------------------------------------------------------------------------
>
>                 Key: YARN-6825
>                 URL: https://issues.apache.org/jira/browse/YARN-6825
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Rohith Sharma K S
>            Priority: Major
>
> YARN-5006 fixes this issue by strictly validating the ApplicationStateData length against 1MB (the default max jute buffer) during application submission only. There is a possibility of thrashing because the dead zone was not properly defined or taken care of.
> But it does not consider scenarios where ApplicationStateData can grow at a later point in time, i.e.
> # An app is submitted with less than 1MB, and later its queue name, lifetime value, or priority is changed. The app update call is sent to the state store, which causes the same issue because the ApplicationStateData length has increased.
> # There is no app update, but the final state is stored in ZK. This adds several fields such as finishTime, finalState, and finalApplicationState, which increases the size of ApplicationStateData.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk
[ https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089630#comment-16089630 ]

Feng Yuan commented on YARN-6825:
---------------------------------

Hi all, could we handle this problem the way YARN-5006 does? That is, do the same size check in the applicationUpdate, attemptAdd, and attemptUpdate operations that YARN-5006 does at submission.
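
The size check suggested in the comment above could look roughly like the following. This is only a sketch under stated assumptions: `StateSizeGuard`, `Operation`, `fits`, and `check` are hypothetical names made up for illustration and are not taken from the YARN code base or from the YARN-5006 patch. It only shows the idea of applying one byte limit to every state-store write rather than to submission alone.

```java
// Hypothetical illustration; none of these names come from Hadoop YARN.
final class StateSizeGuard {

    /** State-store operations named in the comment, plus the submission-time add. */
    enum Operation { APPLICATION_ADD, APPLICATION_UPDATE, ATTEMPT_ADD, ATTEMPT_UPDATE }

    private final int maxBytes;

    /** @param maxBytes largest serialized state, in bytes, that may be written to a znode */
    StateSizeGuard(int maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive: " + maxBytes);
        }
        this.maxBytes = maxBytes;
    }

    /** Returns true when the serialized state is within the limit. */
    boolean fits(byte[] serializedState) {
        return serializedState.length <= maxBytes;
    }

    /**
     * Rejects an oversized write before it reaches the store, so the caller
     * can fail the single operation instead of the whole process.
     */
    void check(Operation op, byte[] serializedState) {
        if (!fits(serializedState)) {
            throw new IllegalStateException(op + ": serialized state is "
                + serializedState.length + " bytes, limit is " + maxBytes);
        }
    }
}
```

A caller would run `check` with the serialized bytes before each of the four operations and handle the exception per application, for example by rejecting the update.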
[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk
[ https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089355#comment-16089355 ]

Rohith Sharma K S commented on YARN-6825:
-----------------------------------------

[~bibinchundatt] would you like to provide a patch for this as per the previous comment?
[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk
[ https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087303#comment-16087303 ]

Bibin A Chundatt commented on YARN-6825:
----------------------------------------

{quote}
I think 80% of configured jute buffer should be taken implicitly rather than allowing admin to configure 80% of jute buffer.
{quote}
Since the default jute buffer size is currently 1MB, I agree we can make the default value {{.8MB}}. Does this solve the issues you have mentioned?

I had the cases mentioned in this jira in mind while working on YARN-5006 but missed mentioning them explicitly there. Do you see any other scenario that couldn't be handled by configuration?

Are we good to go ahead with YARN-6819? If it's fine with you, I will handle the default value change in the same jira as well.
[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk
[ https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087177#comment-16087177 ]

Rohith Sharma K S commented on YARN-6825:
-----------------------------------------

I think 80% of the configured jute buffer should be taken implicitly rather than allowing the admin to configure 80% of the jute buffer. This leaves buffer space for additional information added to ApplicationStateData later. Also, the diagnostics error message length, i.e. 64KB, should be a private configuration. Otherwise this issue will persist whenever the diagnostics length is misconfigured.
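
To make the arithmetic behind the 80% proposal concrete, here is a small sketch. It assumes the 1MB default jute buffer and the 64KB diagnostics limit mentioned in this thread. `ZnodeSizeLimit`, `implicitLimit`, and `headroom` are hypothetical names used only for illustration; they are not YARN or ZooKeeper APIs.

```java
// Hypothetical illustration; none of these names come from Hadoop YARN or ZooKeeper.
final class ZnodeSizeLimit {

    /** Default jute buffer size discussed in the thread: 1MB. */
    static final int DEFAULT_JUTE_MAX_BUFFER = 1024 * 1024;

    /** Share of the jute buffer usable at submission time, per the proposal. */
    static final int USABLE_PERCENT = 80;

    private ZnodeSizeLimit() {}

    /** Submission-time limit in bytes, derived implicitly from the jute buffer. */
    static int implicitLimit(int juteMaxBuffer) {
        if (juteMaxBuffer <= 0) {
            throw new IllegalArgumentException("juteMaxBuffer must be positive: " + juteMaxBuffer);
        }
        // Widen to long so the multiplication cannot overflow for large buffers.
        return (int) ((long) juteMaxBuffer * USABLE_PERCENT / 100);
    }

    /** Bytes left over for fields added after submission (final state, diagnostics, ...). */
    static int headroom(int juteMaxBuffer) {
        return juteMaxBuffer - implicitLimit(juteMaxBuffer);
    }
}
```

With the 1MB default, `implicitLimit` gives 838860 bytes and `headroom` gives 209716 bytes, which is larger than a 64KB (65536-byte) diagnostics message. That headroom only holds as long as the diagnostics limit itself is not raised, which is the reason given above for keeping it private.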
[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk
[ https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087093#comment-16087093 ]

Bibin A Chundatt commented on YARN-6825:
----------------------------------------

Hi [~rohithsharma]

IIUC this could be handled by configuring the value to about 80% of 1 MB (the jute buffer size).
# Exception logs could occupy space (YARN-6125 added a byte limit for the diagnostics message)
# Finish time, priority, and queue name can be handled by configuring a lesser value

So in combination we should be able to handle it.