Hi Gabriele, I'd recommend extending the existing window function whenever possible, as Flink will automatically cover state management for you and no need to be concerned with state backend details. Incremental aggregation for reduce state size is also out of the box if your usage can be satisfied with the reduce/aggregate function pattern, which is important for large windows.
Best, Zhanghao Chen ________________________________ From: Gabriele Mencagli <gabriele.menca...@gmail.com> Sent: Monday, March 4, 2024 19:38 To: user@flink.apache.org <user@flink.apache.org> Subject: Question about time-based operators with RocksDB backend Dear Flink Community, I am using Flink with the DataStream API and operators implemented using RichedFunctions. I know that Flink provides a set of window-based operators with time-based semantics and tumbling/sliding windows. By reading the Flink documentation, I understand that there is the possibility to change the memory backend utilized for storing the in-flight state of the operators. For example, using RocksDB for this purpose to cope with a larger-than-memory state. If I am not wrong, to transparently change the backend (e.g., from in-memory to RocksDB) we have to use a proper API to access the state. For example, the Keyed State API with different abstractions such as ValueState<T>, ListState<T>, etc... as reported here<https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/fault-tolerance/state/>. My question is related to the utilization of time-based window operators with the RocksDB backend. Suppose for example very large temporal windows with a huge number of keys in the stream. I am wondering if there is a possibility to use the built-in window operators of Flink (e.g., with an AggregateFunction or a more generic ProcessWindowFunction as here<https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/operators/windows/>) transparently with RocksDB support as a state back-end, or if I have to develop the window operator in a raw manner using the Keyed State API (e.g., ListState, AggregateState) for this purpose by implementing the underlying window logic manually in the code of RichedFunction of the operator (e.g., a FlatMap). Thanks for your support, -- Gabriele Mencagli