Dear devs,

Today I'd like to start a discussion on extending the Sink API. I have
drafted a FLIP [1] with an accompanying PR [2].

This FLIP is a bit special, as it is actually several smaller amendment
FLIPs in one. In this discussion, we should decide on the scope and drop
any step that is too invasive if we can't reach an agreement on it.

Step 1 is to add a few more pieces of information to the context objects.
That is non-breaking and needed for the async communication pattern in
FLIP-171 [3]. Although we need to add a new Public API (MailboxExecutor), I
think that this step should need the least discussion.
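
To make the async communication pattern in Step 1 more concrete, here is a
minimal sketch of the general mailbox idea: callbacks of async requests do
not touch writer state directly but enqueue an action that later runs on the
task thread. SketchMailbox and SketchAsyncWriter are hypothetical stand-ins
written for this mail; they are not Flink's MailboxExecutor and not the
interfaces proposed in the FLIP or the PR.

```java
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical stand-in for a mailbox: any thread may enqueue, one thread runs the mails. */
class SketchMailbox {
    private final Queue<Runnable> mails = new ConcurrentLinkedQueue<>();

    /** Enqueues an action; safe to call from any thread. */
    void execute(Runnable mail) {
        mails.add(mail);
    }

    /** Runs all queued actions on the calling thread and returns how many ran. */
    int drain() {
        int executed = 0;
        Runnable mail;
        while ((mail = mails.poll()) != null) {
            mail.run();
            executed++;
        }
        return executed;
    }
}

/** Hypothetical writer whose counters are only ever touched by the task thread. */
class SketchAsyncWriter {
    private final SketchMailbox mailbox;
    private int inFlight = 0;
    private int acknowledged = 0;

    SketchAsyncWriter(SketchMailbox mailbox) {
        this.mailbox = mailbox;
    }

    /** Starts a pretend async request; its completion only enqueues a mail. */
    CompletableFuture<Void> write(String element) {
        inFlight++;
        return CompletableFuture
                .supplyAsync(element::length) // stands in for a remote call
                .thenAccept(result -> mailbox.execute(() -> {
                    inFlight--;
                    acknowledged++;
                }));
    }

    int inFlight() {
        return inFlight;
    }

    int acknowledged() {
        return acknowledged;
    }
}

class MailboxSketchDemo {
    public static void main(String[] args) {
        SketchMailbox mailbox = new SketchMailbox();
        SketchAsyncWriter writer = new SketchAsyncWriter(mailbox);
        // Wait until both callbacks have enqueued their mail.
        CompletableFuture.allOf(writer.write("a"), writer.write("bb")).join();
        System.out.println("in flight before drain: " + writer.inFlight());
        mailbox.drain(); // the task thread applies the state changes
        System.out.println("in flight after drain: " + writer.inFlight());
    }
}
```

Because the counters are only modified inside drain(), the writer needs no
locks even though the requests complete on other threads.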

Step 2 is to offer the same context information to committers as well. Here
we can provide compatibility methods so that existing sinks do not break.
The main use case would be an async exactly-once sink, but I'm not sure
whether we would use async communication patterns here at all (rather than
simply waiting synchronously for all async requests to finish). It may also
help with async cleanup tasks, though.

While drafting Step 2, I noticed how entangled the current API is. To find
out whether a sink has a committer during stream graph creation, we actually
need to create a committer, which can have unforeseen consequences. Thus, I
did a spike to see whether we can disentangle the interface and have
separate interfaces for the different use cases. The resulting Step 3 would
be a completely breaking change and is thus probably controversial.
However, I also see the disentanglement as a way to prepare for making Sinks
more expressive (write and commit coordinator) without completely
overloading the main interface.
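
The entanglement can be illustrated with a small sketch. All type names
below are hypothetical and made up for this mail; they are not the current
Sink interfaces and not the ones in the PR. The assumption is only that the
entangled style reports an optional component through a factory method,
while the disentangled style expresses it through a separate interface.

```java
import java.util.Optional;

/** Hypothetical committer; the committable type is simplified to String. */
interface SketchCommitter {
    void commit(String committable);
}

/** Entangled style: whether a committer exists is only known after creating it. */
interface SketchEntangledSink {
    Optional<SketchCommitter> createCommitter();
}

/** Disentangled style: the base interface says nothing about committing. */
interface SketchSink {
}

/** Disentangled style: the capability is visible in the type itself. */
interface SketchCommittingSink extends SketchSink {
    SketchCommitter createCommitter();
}

class SketchGraphBuilder {
    /** Has to instantiate a committer just to answer a yes/no question. */
    static boolean needsCommitter(SketchEntangledSink sink) {
        return sink.createCommitter().isPresent();
    }

    /** Answers the same question from the type alone, without side effects. */
    static boolean needsCommitter(SketchSink sink) {
        return sink instanceof SketchCommittingSink;
    }
}

class DisentangleSketchDemo {
    public static void main(String[] args) {
        int[] created = {0}; // counts committer instantiations

        SketchEntangledSink entangled = () -> {
            created[0]++;
            return Optional.<SketchCommitter>of(committable -> { });
        };
        SketchCommittingSink disentangled = () -> {
            created[0]++;
            return committable -> { };
        };

        SketchGraphBuilder.needsCommitter(entangled);
        System.out.println("committers created (entangled): " + created[0]);

        created[0] = 0;
        SketchGraphBuilder.needsCommitter((SketchSink) disentangled);
        System.out.println("committers created (disentangled): " + created[0]);
    }
}
```

In the second variant, graph creation never calls createCommitter(), so any
side effects of creating a committer are deferred until one is really needed.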

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
[2] https://github.com/apache/flink/pull/16399
[3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink
