Hi Flink community, We're building a Flink topology that aggregates events by key. When a key is seen for the first time, we load its base state from an external store. After performing some calculations, we emit the result to Kafka, and another process is responsible for writing it back to the store.
Our goals are: 1. *Keep per-key state alive for a few hours*, but *remove it if no new events arrive during that time*. 2. *Suppress frequent updates* and *emit output only every 5 seconds*, but *only if updates occurred* during that interval. I see two possible implementation strategies: 1. *Session Windows (processing time):* This would allow automatic state cleanup based on inactivity, but I'm concerned about *merging session states*, since our logic depends on having a well-defined "base state" and applying updates to it. I can not handle merge basically. 2. *KeyedProcessFunction with timers:* In this approach, we’d maintain keyed state, and use: - A *5-second periodic timer* to check for and emit updates (if any). - That same timer can also track *last-seen timestamps* to expire state after a few idle hours. *My questions:* Is the session window approach even suitable here, given the merging issue I have? Is the KeyedProcessFunction with a "single recurring timer" a recommended pattern in such use cases? Are there any caveats or better alternatives for managing "suppressed + expiring" keyed state? Thanks in advance! -- Ehud Lev, Staff Engineer email: ehud....@forter.com web: www.forter.com mobile: 052-5832253