[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk

2019-04-22 Thread Xianghao Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822960#comment-16822960
 ] 

Xianghao Lu commented on YARN-6825:
---

[~Feng Yuan]

As [~bibinchundatt] said, the applicationUpdate case can be handled by 
configuring a lower size limit.

The attemptAdd and attemptUpdate cases are handled by YARN-6125 and YARN-6967.

For more details on the ZK data size limit in YARN, please refer to YARN-9498.

> RM quit due to ApplicationStateData exceed the limit size of znode in zk
> 
>
> Key: YARN-6825
> URL: https://issues.apache.org/jira/browse/YARN-6825
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Priority: Major
>
> YARN-5006 fixes this issue by strictly validating the ApplicationStateData 
> length against 1MB (the default max jute buffer), but only during application 
> submission. There is a possibility of thrashing because the dead zone was not 
> properly defined or taken care of.
> It does not consider scenarios where ApplicationStateData can grow at a 
> later point in time, i.e.
> # The app is submitted with less than 1MB of state and is later updated: 
> the queue name, lifetime value, or priority is changed. The app update 
> call is sent to the state store, which causes the same issue because the 
> ApplicationStateData length has increased.
> # There is no app update, but the final state is stored in ZK. This 
> adds several fields such as finishTime, finalState, and finalApplicationState, 
> which increases the size of ApplicationStateData.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk

2017-07-17 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089630#comment-16089630
 ] 

Feng Yuan commented on YARN-6825:
-

Hi all, could we handle this problem the way YARN-5006 does?
That is, do the same size check in the applicationUpdate, attemptAdd, and 
attemptUpdate operations that YARN-5006 does at submission.
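
To illustrate the proposal, here is a minimal Python sketch of applying one size check to every state-store write path rather than only at submission. It is not the YARN implementation: `InMemoryStateStore`, `StateTooLargeError`, `check_size`, and the method names are hypothetical stand-ins, and the only figure taken from the issue is the 1MB default jute buffer.

```python
# Hypothetical sketch: none of these names are real YARN APIs.
JUTE_MAX_BUFFER_BYTES = 1024 * 1024  # 1MB, the default max jute buffer cited in the issue


class StateTooLargeError(Exception):
    """Raised when a serialized record would exceed the size limit."""


def check_size(operation, payload, limit=JUTE_MAX_BUFFER_BYTES):
    """Reject an oversized write up front instead of failing inside the store."""
    if len(payload) > limit:
        raise StateTooLargeError(
            f"{operation}: {len(payload)} bytes exceeds limit of {limit} bytes"
        )


class InMemoryStateStore:
    """Stand-in for a ZK-backed store; keeps serialized records in a dict."""

    def __init__(self, limit=JUTE_MAX_BUFFER_BYTES):
        self.limit = limit
        self.records = {}

    def _write(self, operation, key, payload):
        # Every write path goes through the same check.
        check_size(operation, payload, self.limit)
        self.records[key] = payload

    def application_add(self, app_id, payload):
        self._write("applicationAdd", app_id, payload)

    def application_update(self, app_id, payload):
        self._write("applicationUpdate", app_id, payload)

    def attempt_add(self, attempt_id, payload):
        self._write("attemptAdd", attempt_id, payload)

    def attempt_update(self, attempt_id, payload):
        self._write("attemptUpdate", attempt_id, payload)
```

In this sketch a rejected update leaves the previously stored record untouched, so the caller can decide how to report the failure.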





[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk

2017-07-17 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089355#comment-16089355
 ] 

Rohith Sharma K S commented on YARN-6825:
-

[~bibinchundatt], would you like to provide a patch for this, as per the 
previous comment? 




[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk

2017-07-14 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087303#comment-16087303
 ] 

Bibin A Chundatt commented on YARN-6825:


{quote}
I think 80% of configured jute buffer should be taken implicitly rather than 
allowing admin to configure 80% of jute buffer. 
{quote}
Since the default jute buffer size is currently 1MB, I agree we can 
make the default value {{.8MB}}. Does this solve the issues you have 
mentioned?
I had the cases mentioned in this jira in mind while working on YARN-5006, but 
missed mentioning them explicitly there.

Do you see any other scenario that couldn't be handled by configuration? 
Are we good to go ahead with YARN-6819? If that's fine with you, I will handle 
the default value change in the same jira as well.




[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk

2017-07-14 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087177#comment-16087177
 ] 

Rohith Sharma K S commented on YARN-6825:
-

I think 80% of the configured jute buffer should be taken implicitly, rather 
than letting the admin configure 80% of the jute buffer. This leaves buffer 
space for additional information to be added to ApplicationStateData. 

Also, the diagnostics error message length, i.e. 64KB, should be a private 
configuration. Otherwise, this issue will persist whenever the diagnostics 
length is misconfigured. 
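
As a rough illustration of taking the 80% implicitly, the limit can be derived from the jute buffer size instead of being read from a separate admin setting. This is a sketch in Python, not YARN code: `effective_state_limit` and the constant names are hypothetical, and only the 1MB default and the 80% share come from this discussion.

```python
# Hypothetical sketch: names are illustrative, not YARN configuration keys.
DEFAULT_JUTE_MAX_BUFFER = 1024 * 1024  # 1MB default jute buffer cited in the issue
STATE_SHARE_PERCENT = 80               # share of the buffer proposed in this comment


def effective_state_limit(jute_max_buffer=DEFAULT_JUTE_MAX_BUFFER):
    """Derive the ApplicationStateData limit from the jute buffer size.

    Integer arithmetic keeps the result a whole number of bytes.
    """
    return jute_max_buffer * STATE_SHARE_PERCENT // 100


limit = effective_state_limit()
headroom = DEFAULT_JUTE_MAX_BUFFER - limit
print(limit, headroom)  # 838860 209716
```

With the 1MB default this leaves roughly 200KB of headroom for fields that are added after submission.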




[jira] [Commented] (YARN-6825) RM quit due to ApplicationStateData exceed the limit size of znode in zk

2017-07-14 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087093#comment-16087093
 ] 

Bibin A Chundatt commented on YARN-6825:


Hi [~rohithsharma]

IIUC this could be handled by configuring the limit to about 80% of the 1 MB 
jute buffer size. 

# The exception log could occupy space (YARN-6125 adds a byte limit for the 
diagnostics message) 
# Finish time, priority, and queue name can be handled by configuring a lower value

So in combination, we should be able to handle this. 
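
For the first point, a byte limit on the diagnostics message could look roughly like the Python sketch below. It is an assumption-laden illustration, not what YARN-6125 does: `truncate_diagnostics` is a hypothetical helper, keeping the leading part of the message is an arbitrary choice made here, and the 64KB figure is the diagnostics length mentioned elsewhere in this thread.

```python
# Hypothetical sketch: not the YARN-6125 implementation.
DIAGNOSTICS_LIMIT_BYTES = 64 * 1024  # 64KB, the diagnostics length mentioned in this thread


def truncate_diagnostics(message, limit=DIAGNOSTICS_LIMIT_BYTES):
    """Return the message cut to at most `limit` bytes of UTF-8.

    The limit applies to encoded bytes, not characters. A multi-byte
    character that is split by the cut is dropped rather than left as
    invalid bytes.
    """
    encoded = message.encode("utf-8")
    if len(encoded) <= limit:
        return message
    return encoded[:limit].decode("utf-8", errors="ignore")
```

Measuring in bytes matters because the znode limit is a byte limit, so a character count would undercount messages that contain non-ASCII text.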



