[ 
https://issues.apache.org/jira/browse/FLINK-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-11517:
-----------------------------------
    Labels: auto-deprioritized-major stale-minor  (was: 
auto-deprioritized-major)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issue has been marked as 
Minor but is unassigned and neither itself nor its Sub-Tasks have been updated 
for 180 days. I have gone ahead and marked it "stale-minor". If this ticket is 
still Minor, please either assign yourself or give an update. Afterwards, 
please remove the label or in 7 days the issue will be deprioritized.


> Inefficient window state access when using RocksDB state backend
> ----------------------------------------------------------------
>
>                 Key: FLINK-11517
>                 URL: https://issues.apache.org/jira/browse/FLINK-11517
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream
>            Reporter: Elias Levy
>            Priority: Minor
>              Labels: auto-deprioritized-major, stale-minor
>
> When using an aggregate function on a window with a process function and the 
> RocksDB state backend, state access is inefficient.
> The WindowOperator calls windowState.add to merge the new element using the 
> aggregate function.  The add method of RocksDBAggregatingState will read the 
> state, deserialize the state, call the aggregate function, serialize the 
> state, and write it out.
> If the trigger decides the window must be fired, as windowState.add 
> does not return the state, the WindowOperator must call windowState.get to 
> get it and pass it to the window process function, resulting in another read 
> and deserialization.
> Finally, while the state is not passed in to the trigger, in some cases the 
> trigger may have a need to access the state.  That is our case.  As the state 
> is not passed to the trigger, we must read and deserialize the state one more 
> time from within the trigger.
> Thus, state must be read and deserialized three times to process a single 
> element.  If the state is large, this can be quite costly.
>  
> Ideally windowState.add would return the state, so that the WindowOperator 
> can pass it to the process function without having to read it again.  
> Additionally, the state would be made available to the trigger to enable more 
> use cases without having to go through the state descriptor again.
>  
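
The read pattern described in the issue can be sketched in plain Java. This is a toy model only, not Flink code: CountingAggregatingState, addAndGet, and WindowStateReadSketch are hypothetical names made up for illustration, and a HashMap of byte arrays stands in for RocksDB. It counts read-and-deserialize round trips for the current flow (add, then a read inside the trigger, then a get by the operator) and for the proposed flow, in which add returns the updated state.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical stand-in for a serialized keyed state; NOT Flink's RocksDBAggregatingState.
class CountingAggregatingState {
    private final Map<String, byte[]> store = new HashMap<>();
    private final BiFunction<Long, Long, Long> aggregate;
    int reads = 0; // number of read-and-deserialize round trips

    CountingAggregatingState(BiFunction<Long, Long, Long> aggregate) {
        this.aggregate = aggregate;
    }

    private Long readAndDeserialize(String window) {
        reads++;
        byte[] bytes = store.get(window);
        return bytes == null ? null : Long.parseLong(new String(bytes, StandardCharsets.UTF_8));
    }

    private void serializeAndWrite(String window, long value) {
        store.put(window, Long.toString(value).getBytes(StandardCharsets.UTF_8));
    }

    // Mirrors the add described in the issue: read, deserialize, aggregate, serialize, write.
    void add(String window, long element) {
        addAndGet(window, element);
    }

    // Assumed variant of the proposal: same work, but the updated state is returned.
    long addAndGet(String window, long element) {
        Long current = readAndDeserialize(window);
        long updated = aggregate.apply(current == null ? 0L : current, element);
        serializeAndWrite(window, updated);
        return updated;
    }

    Long get(String window) {
        return readAndDeserialize(window);
    }
}

class WindowStateReadSketch {
    // Current flow: add, then the trigger reads the state, then the operator reads it to fire.
    static int readsWithCurrentApi() {
        CountingAggregatingState state = new CountingAggregatingState(Long::sum);
        state.add("window-1", 5L); // read 1: inside add
        state.get("window-1");     // read 2: inside the trigger
        state.get("window-1");     // read 3: operator fetches state for the process function
        return state.reads;
    }

    // Proposed flow: the value returned by add is handed to the trigger and the process function.
    static int readsWithReturningAdd() {
        CountingAggregatingState state = new CountingAggregatingState(Long::sum);
        long updated = state.addAndGet("window-1", 5L); // read 1: inside add
        long seenByTrigger = updated;         // no extra read
        long seenByProcessFunction = updated; // no extra read
        return state.reads;
    }

    public static void main(String[] args) {
        System.out.println(readsWithCurrentApi());   // prints 3
        System.out.println(readsWithReturningAdd()); // prints 1
    }
}
```

In this model the per-element cost falls from three reads to one. How much a real job would save depends on the state size and the serializer, which the sketch does not attempt to measure.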



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
