Hi Xiao,
Thanks for reporting this.
Your approach sounds good to me, but we have many similar problems in the
existing streaming SQL operator implementations.
So I think it would be great if the State API / state backend could provide
a better state structure to handle this situation.
Here is a similar problem.
Example SQL:
SELECT *
FROM stream1 s1, stream2 s2
WHERE s1.id = s2.id AND s1.rowtime = s2.rowtime
We have lots of messages in stream1 and stream2 that share the same rowtime.
It runs fine when using heap as the state backend,
but sometimes requires a lot of heap memory (when the upstream is out of sync, et