It's hard to tell whether we really need to buffer records until we actually receive some. This is a consequence (maybe a downside) of my choice to have the `TimeDefinition` treat the window-end time as "now" and the grace period as the `suppressDuration`. Because of this, within the buffering context, even with a `suppressDuration` of 0, we might still need to buffer, since the effective timestamp is in the future.
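To make that timing point concrete, here's a minimal sketch (hypothetical names, not the actual `TimeDefinition` code in this PR) of why a `suppressDuration` of 0 can still require buffering when the window-end time serves as "now":

```java
// Hypothetical sketch, not the PR's actual implementation. It only illustrates
// that when "now" is the window-end time, the emit deadline can lie ahead of
// the current stream time even if suppressDuration is 0.
public class EmitTimeSketch {

    /** Decide whether a windowed record can be emitted yet. */
    static boolean canEmit(final long windowEnd,
                           final long suppressDuration,
                           final long currentStreamTime) {
        // Effective emit time: window end ("now" under this TimeDefinition)
        // plus the suppression duration (here, the grace period).
        final long emitTime = windowEnd + suppressDuration;
        return currentStreamTime >= emitTime;
    }

    public static void main(final String[] args) {
        // A record arrives before its window closes, so even with a
        // suppressDuration of 0 the emit time is still in the future
        // and the record has to be buffered.
        System.out.println(canEmit(100L, 0L, 90L));  // false -> must buffer
        System.out.println(canEmit(100L, 0L, 100L)); // true  -> can emit
    }
}
```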
Thinking through this, we could instead use the window start as "now" and the window size plus the grace period as the suppress duration, but offhand it seems this wouldn't work well with SessionWindows (or other variable-sized windows). So what I chose to do instead is a lightweight check at the point where I need the buffer, initializing it only if it hasn't been already. I could even move the `if buffer == null` check to right here, and JIT branch prediction would make this lazy check nearly free once the buffer is initialized (see the sketch below). Some alternatives:

1. Discard the optimization and always initialize the buffer, in case we need it.
2. Drop the (maybe unnecessarily) flexible `TimeDefinition` function and instead use a "time strategy" enum that tells the processor whether to use record time or window-end time. In the former case, if the duration is zero, we know we'll never need a buffer; if it's greater than zero, we probably will. In the latter case, we'll probably need a buffer regardless of the suppression duration.

WDYT?
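For reference, here's a rough sketch of the lazy check I'm describing (the names `buffer`, `bufferRecord`, and `TimeOrderedBuffer` are illustrative, not the actual identifiers in this PR):

```java
// Hypothetical sketch of the lazy buffer initialization discussed above;
// not the PR's actual processor code.
public class LazyBufferSketch {

    // Stand-in for whatever buffer type the processor actually uses.
    static class TimeOrderedBuffer {
        void put(final long time, final String value) { /* ... */ }
    }

    private TimeOrderedBuffer buffer = null;

    // The null check runs on every buffered record, but once the buffer is
    // initialized the branch is perfectly predictable, so its cost is ~zero.
    private void bufferRecord(final long time, final String value) {
        if (buffer == null) {
            buffer = new TimeOrderedBuffer();
        }
        buffer.put(time, value);
    }

    public static void main(final String[] args) {
        final LazyBufferSketch sketch = new LazyBufferSketch();
        sketch.bufferRecord(42L, "v"); // first call initializes the buffer
        sketch.bufferRecord(43L, "v"); // later calls take the predicted fast path
    }
}
```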
