[ 
https://issues.apache.org/jira/browse/FLINK-24815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444375#comment-17444375
 ] 

Piotr Nowojski commented on FLINK-24815:
----------------------------------------

I don't know much about this part of the code, so sorry if this is a basic 
question, but how would you know the actual state size value to pass to that 
builder? Wouldn't you have to iterate over all state handles and ultimately do 
the same thing that the {{OperatorSubtaskState}} is already doing?

> Reduce the cpu cost of calculating stateSize during state allocation
> --------------------------------------------------------------------
>
>                 Key: FLINK-24815
>                 URL: https://issues.apache.org/jira/browse/FLINK-24815
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.14.0
>            Reporter: ming li
>            Priority: Major
>
> When a task fails over, we reassign the state for each subtask and 
> create a new {{OperatorSubtaskState}} object. At that point, the {{stateSize}} 
> field of the {{OperatorSubtaskState}} is recalculated. With 
> incremental checkpoints, computing this field requires traversing all shared 
> states and accumulating their sizes.
> Taking a job with parallelism 2000 and 100 shared states per task as an 
> example, this requires 2000 * 100 = 200,000 traversal steps, which saturates 
> the CPU of the JM scheduling thread.
> I think we could either provide a constructor for {{OperatorSubtaskState}} 
> that accepts {{stateSize}}, or delay the calculation of {{stateSize}}.
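
For illustration only, the two options named in the description (a constructor that accepts {{stateSize}}, or a deferred calculation) could look roughly like the sketch below. {{SizedHandle}} and {{LazySizeSubtaskState}} are hypothetical names invented for this example. They are not Flink classes and do not reflect the actual {{OperatorSubtaskState}} implementation. The sketch assumes single-threaded access and that a negative value can serve as a "not yet computed" marker.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a state handle; not Flink's actual interface. */
interface SizedHandle {
    long getStateSize();
}

/** Hypothetical simplified subtask state whose total size is computed on demand. */
final class LazySizeSubtaskState {
    // Assumption: real sizes are never negative, so -1 marks "not computed yet".
    private static final long NOT_COMPUTED = -1L;

    private final List<SizedHandle> handles;
    private long cachedStateSize;

    /** Option 2: defer the calculation until the size is first requested. */
    LazySizeSubtaskState(List<SizedHandle> handles) {
        this(handles, NOT_COMPUTED);
    }

    /** Option 1: the caller supplies a size it already knows. */
    LazySizeSubtaskState(List<SizedHandle> handles, long knownStateSize) {
        this.handles = Collections.unmodifiableList(new ArrayList<>(handles));
        this.cachedStateSize = knownStateSize;
    }

    /** Traverses the handles at most once, then returns the cached sum. */
    long getStateSize() {
        if (cachedStateSize == NOT_COMPUTED) {
            long sum = 0L;
            for (SizedHandle handle : handles) {
                sum += handle.getStateSize();
            }
            cachedStateSize = sum;
        }
        return cachedStateSize;
    }
}
```

With the deferred variant, constructing the object during state reassignment does not touch the handles at all. The traversal cost is paid only if and when something actually reads the size. The constructor that takes a known size still leaves open where that value would come from.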



--
This message was sent by Atlassian Jira
(v8.20.1#820001)