Hi, Avrid
Thank you Avrid for perfecting Sink through this FLIP. I have two little
questions

1. What do you think of us directly providing an interface as follows? In
this way, there may be no need to expose the Mailbox to the user. We can
implement an `AsyncSinkWriterOperator` to control the length of the queue.
If it is too long, do not call SinkWriter::write.
public interface AsyncSinkWriter<InputT, CommT, WriterStateT>
        extends SinkWriter<Tuple2<InputT, XXXFuture<?>>, CommT,
WriterStateT> { //// please ignore the name of Tuple2 and XXXFuture at
first.
    int getQueueLimit();
}

2. Another question is: If users don't care about exactly once and the
unification of stream and batch, how about letting users use
`AsyncFunction` directly? I don’t have an answer either. I want to hear
your suggestions.

Best,
Guowei


On Mon, Jul 19, 2021 at 3:38 PM Arvid Heise <ar...@apache.org> wrote:

> Dear devs,
>
> today I'd like to start the discussion on the Sink API. I have drafted a
> FLIP [1] with an accompanying PR [2].
>
> This FLIP is a bit special as it's actually a few smaller Amend-FLIPs in
> one. In this discussion, we should decide on the scope and cut out too
> invasive steps if we can't reach an agreement.
>
> Step 1 is to add a few more pieces of information to context objects.
> That's non-breaking and needed for the async communication pattern in
> FLIP-171 [3]. While we need to add a new Public API (MailboxExecutor), I
> think that this should entail the least discussions.
>
> Step 2 is to also offer the same context information to committers. Here we
> can offer some compatibility methods to not break existing sinks. The main
> use case would be some async exactly-once sink but I'm not sure if we would
> use async communication patterns at all here (or simply wait for all async
> requests to finish in a sync way). It may also help with async cleanup
> tasks though.
>
> While drafting Step 2, I noticed the big entanglement of the current API.
> To figure out if there is a committer during the stream graph creation, we
> actually need to create a committer which can have unforeseen consequences.
> Thus, I spiked if we can disentangle the interface and have separate
> interfaces for the different use cases. The resulting step 3 would be a
> completely breaking change and thus is probably controversial. However, I'd
> also see the disentanglement as a way to prepare to make Sinks more
> expressive (write and commit coordinator) without completely overloading
> the main interface.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> [2] https://github.com/apache/flink/pull/16399
> [3]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink
>

Reply via email to