If the inputs of a Flatten advance at different speeds, should the Flatten operator cache tuples before emitting output watermarks? Doing so could prevent a late tuple from becoming early. But if the watermark gap among the inputs (i.e., the cache size) grows too large, can the application tell Beam or the runner to emit the output watermark anyway and treat tuples from the slow inputs as late?
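
To make the second part of the question concrete, here is a rough sketch of the behavior I have in mind. It is a toy model in plain Python, not Beam's API or any runner's implementation: `FlattenSketch`, `max_gap`, `on_watermark`, and `is_late` are hypothetical names, and it assumes the output watermark is normally held to the slowest input's watermark.

```python
class FlattenSketch:
    """Toy model of output-watermark handling for a Flatten (hypothetical)."""

    def __init__(self, num_inputs, max_gap=None):
        # max_gap is a hypothetical knob: the largest lead the fastest input
        # may have over the output watermark. None means "no limit".
        self.max_gap = max_gap
        self.input_wm = [float("-inf")] * num_inputs
        self.output_wm = float("-inf")

    def on_watermark(self, idx, wm):
        """Record a watermark from input idx and return the output watermark."""
        # Watermarks only move forward.
        self.input_wm[idx] = max(self.input_wm[idx], wm)
        # Assumed default: hold the output to the slowest input.
        candidate = min(self.input_wm)
        if self.max_gap is not None:
            # If the fastest input runs too far ahead, advance anyway.
            candidate = max(candidate, max(self.input_wm) - self.max_gap)
        self.output_wm = max(self.output_wm, candidate)
        return self.output_wm

    def is_late(self, timestamp):
        """A tuple is late if its timestamp is behind the output watermark."""
        return timestamp < self.output_wm


flatten = FlattenSketch(num_inputs=2, max_gap=10)
flatten.on_watermark(0, 5)
flatten.on_watermark(1, 3)    # output watermark held at the slow input: 3
flatten.on_watermark(0, 50)   # gap exceeds max_gap, output forced to 40
```

In this sketch, once input 0 reaches 50 the output watermark moves to 40 even though input 1 is still at 3, so a tuple from input 1 with timestamp 20 would be considered late.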
Thanks, Shen