Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-07-26 Thread Thomas Weise
Hi Sebastian, Thank you for updating the FLIP page. It looks good and I think you can start a VOTE. Thomas On Tue, Jul 26, 2022 at 10:57 AM Sebastian Mattheis wrote: > > Hi everybody, > > I have updated FLIP-217 [1] and have implemented the respective changes in > [2]. Please review. If there

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-07-26 Thread Sebastian Mattheis
Hi everybody, I have updated FLIP-217 [1] and have implemented the respective changes in [2]. Please review. If there are no concerns, I would initiate the voting on Thursday. Best regards, Sebastian [1]

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-07-25 Thread Piotr Nowojski
Thanks for the update Sebastian :) Best, Piotrek pon., 25 lip 2022 o 08:12 Sebastian Mattheis napisaƂ(a): > Hi everybody, > > I discussed last week the semantics and an implementation stragegy of the > configuration parameter with Piotr and did the implementation and some > tests this weekend.

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-07-25 Thread Sebastian Mattheis
Hi everybody, I discussed last week the semantics and an implementation stragegy of the configuration parameter with Piotr and did the implementation and some tests this weekend. A short summary of what I discussed and recapped with Piotr: - The configuration parameter allows (and tolerates)

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-07-20 Thread Thomas Weise
Hi Sebastian, Thank you for updating the FLIP and sorry for my delayed response. As Piotr pointed out, we would need to incorporate the fallback flag into the design to reflect the outcome of the previous discussion. Based on the current FLIP and as detailed by Becket, the SourceOperator

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-07-13 Thread Becket Qin
Thanks for the explanation, Sebastian. I understand your concern now. 1. About the major concern. Personally I'd consider the coarse grained watermark alignment as a special case for backward compatibility. In the future, if for whatever reason we want to pause a split and that is not supported,

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-07-13 Thread Sebastian Mattheis
Hi Becket, Hi Thomas, Hi Piotrek, Thanks for the feedback. I would like to highlight some concerns: 1. Major: A configuration parameter like "allow coarse grained alignment" defines a semantic that mixes two contexts conditionally as follows: "ignore incapability to pause splits in

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-07-13 Thread Becket Qin
Hi Sebastian, Thanks for updating the FLIP wiki. Just to double confirm, I was thinking of a configuration like "allow.coarse.grained.watermark.alignment". This will allow the coarse grained watermark alignment as a fallback instead of bubbling up an exception if split pausing is not supported

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-07-13 Thread Sebastian Mattheis
Hi Piotrek, Sorry I've read it and forgot it when I was ripping out the supportsPauseOrResume method again. Thanks for pointing that out. I will add it as follows: The enables/disables split alignment in the SourceOperator where the default is that split alignment is enabled. (And I will add the

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-07-12 Thread Piotr Nowojski
Hi Sebastian, Thanks for picking this up. > 5. There is NO configuration option to enable watermark alignment of splits or disable pause/resume capabilities. Isn't this contradicting what we actually agreed on? > we are planning to have a configuration based way to revert to the previous

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-06-14 Thread Thomas Weise
Hi everyone, Thank you for all the effort that went into this discussion. The split level watermark alignment will be an important feature for Flink that will address operational problems for various use cases. From reading through this thread it appears that not too much remains to bring this

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-30 Thread Piotr Nowojski
Hi Becket, Thanks for summing this up. Just one correction: > Piotr prefers option 2, his opinions are: > e) It is OK that the code itself in option 2 indicates the developers that a feature is optional. We will rely on the documentation to correct that and clarify that the feature is actually

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-27 Thread Becket Qin
I had an offline discussion with Piotr and here is the summary. Please correct me if I miss something, Piotr. There are two things we would like to seek more opinions from the community, so we can make progress on this FLIP. 1. The General pattern to add obligatory features to existing

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-26 Thread Piotr Nowojski
Hi Becket, I still sustain what I wrote before: > I think I would still vote soft -1 on this option, but I wouldn't block it in case I am out-voted. > I think it might be helpful to agree on the definition of optional in our case. For me it doesn't matter whether a default method throwing an

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-11 Thread Becket Qin
Thanks for the clarification, Piotr and Sebastian. It looks like the key problem is still whether the implementation of pausable splits in the Sources should be optional or not. I think it might be helpful to agree on the definition of optional in our case. To me: Optional = "You CAN leave the

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-11 Thread Piotr Nowojski
Hi, Actually previously I thought about having a decorative interface and whenever watermark alignment is enabled, checking that the source implements the decorative interface. If not, throwing an exception. The option with default methods in the source interfaces throwing

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-11 Thread Sebastian Mattheis
Hi Becket, Thanks a lot for your fast and detailed response. For me, it converges and dropping the supportsX method sounds very reasonable to me. (Side note: With "pausable splits" enabled as "default" I think we misunderstood. As you described now "default" I understand as that it should be the

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-11 Thread Becket Qin
+dev Hi Sebastian, Thank you for the summary. Please see the detailed replies inline. As a recap of my suggestions. 1. Pausable splits API. a) Add default implementations to methods "pauseOrResumeSplits" in both SourceReader and SplitReader where both default implementations throw

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-11 Thread Becket Qin
Hi Sebastian, Thanks for the reply and patient discussion. I agree this is a tricky decision. > Nevertheless, Piotr has valid concerns about Option c) which I see as > follows: > (1) An interface with default NOOP implementation makes the implementation > optional. And in my opinion, a default

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-11 Thread Sebastian Mattheis
Hi Piotr, Hi Becket, Hi everybody, we, Dawid and I, discussed the various suggestions/options and we would be okay either way because we find neither solution is perfect just because of the already present complexity. Option c) Adding methods to the interfaces of SourceReader and SplitReader

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-05 Thread Piotr Nowojski
Hi Guowei, as Dawid wrote a couple of messages back: > This is covered in the previous FLIP[1] which has been already implemented in 1.15. In short, it must be enabled with the watermark strategy which also configures drift and update interval So by default watermark alignment is disabled,

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-05 Thread Guowei Ma
Hi, We know that in the case of Bounded input Flink supports the Batch execution mode. Currently in Batch execution mode, flink is executed on a stage-by-stage basis. In this way, perhaps watermark alignment might not gain much. So my question is: Is watermark alignment the default behavior(for

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-04 Thread Piotr Nowojski
Hi Becket and Dawid, > I feel that no matter which option we choose this can not be solved entirely in either of the options, because of the point above and because the signature of SplitReader#pauseOrResumeSplits and SourceReader#pauseOrResumeSplits are slightly different (one identifies splits

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-04 Thread Becket Qin
Thanks for the reply and patient discussion, Piotr and Dawid. Is there any reason for pausing reading from a split an optional feature, other than that this was not included in the original interface? To be honest I am really worried about the complexity of the user story here. Optional features

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-04 Thread Piotr Nowojski
Hi Becket, Is this feature really non-optional? If so, adding those methods with default implementation just defeats that purpose. On the other hand, if a Source doesn't support pausing splits, the system would work. Arguably watermark alignment would not, and maybe that would deserve logging

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-03 Thread Becket Qin
Hi Piotr, Thanks for the comment. Just to clarify, I am not against the decorative interfaces, but I do think we should use them with caution. The main argument for adding the methods to the SourceReader is that these methods are effectively NON-OPTIONAL to SourceReader impl, i.e. starting from

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-05-03 Thread Piotr Nowojski
Hi, Sorry for chipping in so late, but I was OoO for the last two weeks. Regarding the interfaces, I would be actually against adding those methods to the base interfaces for the reasons mentioned above. Clogging the base interface for new users with tons of methods that they do not need, do not

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-26 Thread Becket Qin
Thanks for the reply Sebastian and Dawid, I think Sebastion has a good summary. This is a really helpful discussion. Thinking a bit more, I feel that it might still be better to add the supportsXXX() method in the Source rather than SourceReader. Generally speaking, what we are trying to do

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-26 Thread Dawid Wysakowicz
Thanks @Sebastian for the nice summary. I think most of your points aligned with the suggestions I made to the FLIP, while you were writing your reply (I believe we hit enter nearly at the same time ;) ) Two points after we synced offline 1. I changed also the

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-26 Thread Sebastian Mattheis
Hi folks, Sorry for being a bit silent. Many thanks for all the input and suggestions. As I'm a bit new, I needed some time to catch up and structure (for myself) the discussion and I wanted to find a way to structure the conclusions. (Also because I had the feeling that some concerns got lost in

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-26 Thread Dawid Wysakowicz
@Arvid: While I also like Becket's capability approach, I fear that it doesn't work for this particular use case: Sources can always be aligned cross-task and this is just about intra-task alignment. So it's plausible to put sources into an alignment group even though they do not use

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-25 Thread Arvid Heise
Thanks for pushing this effort. I'd actually be in favor of 1b). I fully agree that decorator interfaces should be avoided but I'm also not a big fan of overloading the base interfaces (they are hard to implement as is). The usual feedback to Source-related interfaces are always that they are

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-25 Thread Dawid Wysakowicz
a) "MySourceReader implements SourceReader, WithSplitsAlignment", along with "MySplitReader implements SplitReader, WithSplitsAlignment", or b) "MySourceReader implements AlignedSourceReader" and "MySplitReader implements AlignedSplitReader", or c) "MySourceReader implements

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-25 Thread Becket Qin
Thanks for the comment, Jark. 3. Interface/Method Name. > Can the interface be used to align other things in the future? For example, > align read speed, I have > seen users requesting global rate limits. This feature may also need an > interface like this. > If we don't plan to extend this

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-25 Thread Jark Wu
Thank Dawid for the reminder on FLIP-182. Sorry I did miss it. I don't have other concerns then. Best, Jark On Mon, 25 Apr 2022 at 15:40, Dawid Wysakowicz wrote: > @Jark: > > 1. Will the framework always align with watermarks when the source > implements the interface? > I'm afraid not every

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-25 Thread Dawid Wysakowicz
@Jark: 1. Will the framework always align with watermarks when the source implements the interface? I'm afraid not every case needs watermark alignment even if Kafka implements the interface, and this will affect the throughput somehow. I agree with Becket we may need a

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-24 Thread Jark Wu
Thanks for the effort, Dawid and Sebastian! I just have some minor questions (maybe I missed something). 1. Will the framework always align with watermarks when the source implements the interface? I'm afraid not every case needs watermark alignment even if Kafka implements the interface, and

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-24 Thread Becket Qin
Hi Dawid, Thanks for the explanation. Apologies that I somehow misread a bunch of "align" and thought they were "assign". Regarding 1, by default implementation, I was thinking of the default no-op implementation. I am a little worried about the proliferation of decorative interfaces. I think

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-22 Thread Dawid Wysakowicz
@Konstantin: As part of this FLIP, the `AlignedSplitReader` interface (aka the stop & resume behavior) will be implemented for Kafka and Pulsar only, correct? Correct, as far as I know though, those are the only sources which consume concurrently from multiple splits and thus alignment

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-21 Thread Becket Qin
Thanks for driving the effort, Sebastion. I think the motivation makes a lot of sense. Just a few suggestions / questions. 1. I think watermark alignment is sort of a general use case, so should we just add the related methods to SourceReader directly instead of introducing the new interface of

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-21 Thread Steven Wu
> However, a single source operator may read data from multiple splits/partitions, e.g., multiple Kafka partitions, such that even with watermark alignment the source operator may need to buffer excessive amount of data if one split emits data faster than another. For this part from the

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-21 Thread Thomas Weise
Thanks for working on this! I wonder if "supporting" split alignment in SourceReaderBase and then doing nothing if the split reader does not implement AlignedSplitReader could be misleading? Perhaps WithSplitsAlignment can instead be added to the specific source reader (i.e. KafkaSourceReader) to

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-21 Thread Konstantin Knauf
Hi Sebastian, Hi Dawid, As part of this FLIP, the `AlignedSplitReader` interface (aka the stop & resume behavior) will be implemented for Kafka and Pulsar only, correct? +1 in general. I believe it is valuable to complete the watermark aligned story with this FLIP. Cheers, Konstantin On

Re: [DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-21 Thread Dawid Wysakowicz
To be explicit, having worked on it, I support it ;) I think we can start a vote thread soonish, as there are no concerns so far. Best, Dawid On 13/04/2022 11:27, Sebastian Mattheis wrote: Dear Flink developers, I would like to open a discussion on FLIP 217 [1] for an extension of Watermark

[DISCUSS] FLIP-217 Support watermark alignment of source splits

2022-04-13 Thread Sebastian Mattheis
Dear Flink developers, I would like to open a discussion on FLIP 217 [1] for an extension of Watermark Alignment to perform alignment also in SplitReaders. To do so, SplitReaders must be able to suspend and resume reading from split sources where the SourceOperator coordinates and controlls