Thanks Dong!

Piotrek
On Tue, Jul 18, 2023 at 6:04 AM Dong Lin <lindon...@gmail.com> wrote:

Hi all,

I have updated FLIP-309 as suggested by Piotr to include a reference to
FLIP-328 in the future work section.

Piotr, Stephan, and I discussed offline the choice between
execution.checkpointing.max-interval and
execution.checkpointing.interval-during-backlog. The advantage of using
"max-interval" is that the Flink runtime gains more flexibility to decide
when and how to adjust the checkpointing interval (based on information
other than the backlog status). The advantage of using
"interval-during-backlog" is that it is clearer to the user when and how
the configured interval is used. Since this FLIP has no immediate need
for the extra flexibility, we agreed to go with interval-during-backlog
for now. We can rename this config to e.g.
execution.checkpointing.max-interval when needed in the future.

Thanks everyone for all the reviews and suggestions! And special thanks
to Piotr and Stephan for taking extra time to provide detailed reviews
and suggestions offline!

Since there is no further comment, I will open the voting thread for this
FLIP.

Cheers,
Dong
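For illustration, a minimal sketch of how a job might set the two
intervals discussed above, assuming the key names proposed in FLIP-309
(the interval-during-backlog key is a proposal, not part of a released
Flink version):

    // Sketch only: "execution.checkpointing.interval-during-backlog" is
    // the key proposed in FLIP-309 and may still be renamed (e.g. to
    // "execution.checkpointing.max-interval").
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class BacklogIntervalExample {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Short interval used while isProcessingBacklog = false.
            conf.setString("execution.checkpointing.interval", "30s");
            // Longer interval used while any source reports
            // isProcessingBacklog = true.
            conf.setString("execution.checkpointing.interval-during-backlog", "10min");
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment(conf);
            // ... build the job topology and call env.execute() as usual.
        }
    }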
On Fri, Jul 14, 2023 at 11:39 PM Piotr Nowojski <piotr.nowoj...@gmail.com>
wrote:

Hi All,

We had a lot of offline discussions. As a result, I would suggest
dropping the idea of introducing an end-to-end-latency concept until we
can properly implement it, which will require more designing and
experimenting. I would suggest starting with a more manual solution,
where the user configures concrete parameters like
`execution.checkpointing.max-interval` or `execution.flush-interval`.

FLIP-309 looks good to me. I would just rename
`execution.checkpointing.interval-during-backlog` to
`execution.checkpointing.max-interval`.

I would also reference as future work that a solution allowing sources
like Kafka to set `isProcessingBacklog` will be introduced via FLIP-328
[1].

Best,
Piotrek

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-328%3A+Allow+source+operators+to+determine+isProcessingBacklog+based+on+watermark+lag

On Wed, Jul 12, 2023 at 3:49 AM Dong Lin <lindon...@gmail.com> wrote:

Hi Piotr,

I think I understand your motivation for suggesting
execution.slow-end-to-end-latency now. Please see my follow-up comments
(after the previous email) inline.

On Wed, Jul 12, 2023 at 12:32 AM Piotr Nowojski <pnowoj...@apache.org>
wrote:

> Hi Dong,
>
> Thanks for the updates, a couple of comments:
>
> > If a record is generated by a source when the source's
> > isProcessingBacklog is true, or some of the records used to derive
> > this record (by an operator) have isBacklog = true, then this record
> > should have isBacklog = true. Otherwise, this record should have
> > isBacklog = false.
>
> nit: I think this conflicts with the "Rule of thumb for non-source
> operators to set isBacklog = true for the records it emits" section
> later on, when it comes to the case where an operator has mixed
> isBacklog = false and isBacklog = true inputs.
>
> > execution.checkpointing.interval-during-backlog
>
> Do we need to define this as an interval config parameter? Won't that
> add an option that will be almost instantly deprecated, because what we
> actually would like to have is execution.slow-end-to-end-latency and
> execution.end-to-end-latency?

I guess you are suggesting that we should allow users to specify a higher
end-to-end latency budget for records that are emitted by a two-phase
commit sink than for records that are emitted by a non-two-phase-commit
sink.

My concern with this approach is that it will increase the complexity of
the definition of "processing latency requirement", as well as the
complexity of the Flink runtime code that handles it. Currently, FLIP-325
defines end-to-end latency as an attribute of the record that is
statically assigned when the record is generated at the source,
regardless of how it will be emitted later in the topology. If we make
the changes proposed above, we would need to define the latency
requirement w.r.t. the attributes of the operators that a record travels
through before its result is emitted, which is less intuitive and more
complex.

For now, it is not clear whether it is necessary to have two categories
of latency requirement for the same job. Maybe it is reasonable to assume
that if a job has a two-phase commit sink and the user is OK with
emitting some results at a 1-minute interval, then more likely than not
the user is also OK with emitting all results at a 1-minute interval,
including those that go through a non-two-phase-commit sink?

If we do want to support different end-to-end latencies depending on
whether the record is emitted by a two-phase commit sink, I would prefer
to still use execution.checkpointing.interval-during-backlog instead of
execution.slow-end-to-end-latency. This allows us to keep the concept of
end-to-end latency simple. Also, by explicitly including "checkpointing
interval" in the name of the config that directly affects the
checkpointing interval, we make it easier and more intuitive for users to
understand the impact and set a proper value for such configs.

What do you think?

Best,
Dong

> Maybe we can introduce only `execution.slow-end-to-end-latency` (maybe
> with a better name), and for the time being use it as the checkpoint
> interval value during backlog?
>
> Or do you envision that in the future users will be configuring only:
> - execution.end-to-end-latency
> and only optionally:
> - execution.checkpointing.interval-during-backlog
> ?
>
> Best,
> Piotrek
>
> PS, I will read the summary that you have just published later, but I
> think we don't need to block this FLIP on the existence of that
> high-level summary.
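As a side note for readers, the propagation rule quoted at the top of
this exchange boils down to a simple disjunction; a minimal sketch
(placeholder names, not Flink APIs; FLIP-325's RecordAttributes would
carry the actual flag):

    // Illustrative sketch of the isBacklog propagation rule quoted above.
    // These names are placeholders, not Flink APIs.
    public final class IsBacklogRule {

        /** Records emitted by a source inherit the source's backlog status. */
        static boolean forSourceRecord(boolean sourceIsProcessingBacklog) {
            return sourceIsProcessingBacklog;
        }

        /**
         * A record derived by an operator is backlog if any record used to
         * derive it was backlog (the mixed-input case Piotr's nit refers to).
         */
        static boolean forDerivedRecord(boolean... inputIsBacklog) {
            for (boolean b : inputIsBacklog) {
                if (b) {
                    return true;
                }
            }
            return false;
        }
    }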
On Tue, Jul 11, 2023 at 5:49 PM Dong Lin <lindon...@gmail.com> wrote:

Hi Piotr and everyone,

I have documented the vision, with a summary of the existing work, in
this doc. Please feel free to review/comment/edit the doc. Looking
forward to working with you together on this line of work.

https://docs.google.com/document/d/1CgxXvPdAbv60R9yrrQAwaRgK3aMAgAL7RPPr799tOsQ/edit?usp=sharing

Best,
Dong

On Tue, Jul 11, 2023 at 1:07 AM Piotr Nowojski <piotr.nowoj...@gmail.com>
wrote:

Hi All,

Dong and I chatted offline about the above-mentioned issues (thanks for
that offline chat, I think it helped both of us a lot). The summary is
below.

> Previously, I thought you meant to add generic logic in
> SourceReaderBase to read existing metrics (e.g. backpressure) and emit
> the IsProcessingBacklogEvent to the SourceCoordinator. I am sorry if I
> have misunderstood your suggestions.
>
> After double-checking your previous suggestion, I am wondering if you
> are OK with the following approach:
>
> - Add a job-level config
>   execution.checkpointing.interval-during-backlog.
> - Add an API SourceReaderContext#setProcessingBacklog(boolean
>   isProcessingBacklog).
> - When this API is invoked, it internally sends an internal
>   SourceReaderBacklogEvent to the SourceCoordinator.
> - The SourceCoordinator keeps track of the latest isProcessingBacklog
>   status of all its subtasks. For now, we hardcode the logic such that
>   if any source reader says it is under backlog, then
>   execution.checkpointing.interval-during-backlog is used.
>
> This approach looks good to me, as it can achieve the same performance
> with the same number of public APIs for the target use case. And I
> suppose in the future we might be able to re-use this API for a source
> reader to set its backlog status based on its backpressure metrics,
> which could be an extra advantage over the current approach.
>
> Do you think we can agree to adopt the approach described above?

Yes, I think that's a viable approach. I would be perfectly fine not
introducing `SourceReaderContext#setProcessingBacklog(boolean
isProcessingBacklog)` and sending the `SourceReaderBacklogEvent` from the
SourceReader to the JM in this FLIP. It could be implemented once we
decide to add some more generic way of detecting backlog/backpressure on
the SourceReader level.

I think we could also just keep the current proposal of adding
`SplitEnumeratorContext#setIsProcessingBacklog`, and use it in the
sources that can set it on the `SplitEnumerator` level. Later we could
merge this with other mechanisms of detecting "isProcessingBacklog", such
as ones based on watermark lag, backpressure, etc., via some component
running on the JM.
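A minimal sketch of the JM-side aggregation described in the summary
above (all class and method names here are hypothetical; the FLIP does
not prescribe this exact shape):

    // Hypothetical sketch: track the latest isProcessingBacklog flag per
    // source subtask and pick the effective checkpoint interval. These
    // names are placeholders, not Flink APIs.
    import java.time.Duration;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class BacklogStatusAggregator {

        private final Map<Integer, Boolean> backlogBySubtask = new ConcurrentHashMap<>();
        private final Duration regularInterval;
        private final Duration intervalDuringBacklog;

        public BacklogStatusAggregator(Duration regularInterval, Duration intervalDuringBacklog) {
            this.regularInterval = regularInterval;
            this.intervalDuringBacklog = intervalDuringBacklog;
        }

        /** Called when a SourceReaderBacklogEvent (or the enumerator's
         *  setIsProcessingBacklog) reports a status change. */
        public void onBacklogStatus(int subtaskId, boolean isProcessingBacklog) {
            backlogBySubtask.put(subtaskId, isProcessingBacklog);
        }

        /** Hardcoded rule from the discussion: if ANY source reports
         *  backlog, use the longer interval. */
        public Duration effectiveCheckpointInterval() {
            boolean anyBacklog = backlogBySubtask.containsValue(true);
            return anyBacklog ? intervalDuringBacklog : regularInterval;
        }
    }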
At the same time, I'm fine with having the "isProcessingBacklog" concept
to switch the runtime back and forth between high- and low-latency modes,
instead of "backpressure". In FLIP-325 I have asked:

> I think there is one thing that hasn't been discussed, neither here nor
> in FLIP-309. We have three dimensions:
> - e2e latency / checkpointing interval
> - enabling some kind of batching/buffering on the operator level
> - how many resources we want to allocate to the job
>
> How do we want Flink to adjust itself between those three? For example:
> a) Should we assume that a given job has a fixed amount of assigned
>    resources and make it paramount that Flink doesn't exceed those
>    available resources? So in case of backpressure, we should extend
>    checkpointing intervals and emit records less frequently and in
>    batches.
> b) Or should we assume that the amount of resources is flexible (up to
>    a point?), and that the desired e2e latency is the paramount aspect?
>    So in case of backpressure, we should still adhere to the configured
>    e2e latency, and wait for the user or autoscaler to scale up the
>    job?
>
> In case of a), I think the concept of "isProcessingBacklog" is not
> needed; we could steer the behaviour using only the backpressure
> information.
>
> On the other hand, in case of b), the "isProcessingBacklog" information
> might be helpful, to let Flink know that we can safely relax the e2e
> latency/checkpoint interval even if there is no backpressure, to use
> fewer resources (and let the autoscaler scale down the job).
>
> Do we want to have both, or only one of those? Do a) and b) complement
> one another? If the job is backpressured, we should follow a) and
> expose to the autoscaler/users the information "Hey! I'm barely keeping
> up! I need more resources!". While, when there is no backpressure and
> latency doesn't matter (isProcessingBacklog = true), we can limit the
> resource usage.

After thinking this over:
- The case where we don't have the "isProcessingBacklog" information but
  the source operator is backpressured must be intermittent. Either the
  backpressure will go away, or we should shortly reach the
  "isProcessingBacklog" state anyway.
- Even if we implement some backpressure-detecting algorithm to switch
  the runtime into the "high latency mode", we can always report that as
  "isProcessingBacklog" anyway, as the runtime should react the same way
  in both cases (backpressure and "isProcessingBacklog" states).
===============

With a common understanding of the final solution that we want to have in
the future, I'm pretty much fine with the current FLIP-309 proposal, with
a couple of remarks:
1. Could you include in FLIP-309 the long-term solution as we have
   discussed?
   a) It would be nice to have some diagram showing how the
      "isProcessingBacklog" information would travel, how it would be
      aggregated, and what would be done with that information (from the
      SourceReader/SplitEnumerator to some "component" aggregating it,
      and then ... ?).
2. For me, "processing backlog" doesn't necessarily equate to
   "backpressure" (a HybridSource can be both NOT backpressured and
   processing backlog at the same time). If you think the same way, can
   you include that definition of "processing backlog" in the FLIP,
   including its relation to the backpressure state? If not, we need to
   align on that definition first :)

Also, I'm missing a big-picture description that would show what you are
trying to achieve and the overarching vision behind all of the current
and future FLIPs that you are planning in this area (FLIP-309, FLIP-325,
FLIP-327, FLIP-331, ...?). Or was it described somewhere and I've missed
it?

Best,
Piotrek

On Thu, Jul 6, 2023 at 6:25 AM Dong Lin <lindon...@gmail.com> wrote:

Hi Piotr,

I am sorry if you feel unhappy or upset with us for not following/fixing
your proposal. It is not my intention to give you this feeling. After
all, we are all trying to make Flink better, to support more use cases
with the most maintainable code. I hope you understand that, just like
you, I have also been doing my best to think through the various design
options and taking time to evaluate the pros and cons. Eventually, we
probably still need to reach consensus by clearly listing and comparing
the objective pros/cons of the different proposals and identifying the
best choice.

Regarding your concern (or frustration) that we are always finding issues
in your proposals, I would say it is normal (and probably necessary) for
developers to find pros and cons in each other's solutions, so that we
can eventually pick the right one. I will appreciate anyone who can
correctly pinpoint a concrete issue in my proposal so that I can improve
it or choose an alternative solution.

Regarding your concern that we are not spending enough effort finding
solutions and that the problems in your solutions could be solved in a
minute, I would like to say that is not true.
For each of your previous proposals, I typically spent an hour or more
thinking through the proposal to understand whether it works and why it
does not work, and another hour or more writing down the details and
explaining why it does not work. And I have had a variety of offline
discussions with my colleagues about the various proposals (including
yours), 6+ hours in total. Maybe I am not capable enough to fix those
issues in a minute or so. If you think your proposal can be easily fixed
in a minute or so, I would really appreciate it if you could think
through your proposal and fix it in the first place :)

For your information, I have had several long discussions with my
colleagues at Alibaba, and also with Becket, on this FLIP. We have
seriously considered your proposals and discussed in detail what their
pros/cons are and whether we can improve these solutions. The initial
version of this FLIP (which allowed the source operator to specify
checkpoint intervals) did not get enough support, due to concerns about
not being generic (i.e. users needed to specify checkpoint intervals on a
per-source basis). Only after I updated the FLIP to use the job-level
execution.checkpointing.interval-during-backlog did they agree to give +1
to the FLIP. What I want to tell you is that your suggestions have been
taken seriously, and the quality of the FLIP has been taken seriously by
all those who have voted. As a result of taking your suggestions
seriously and trying to find improvements, we updated the FLIP to use
isProcessingBacklog.

I am wondering, do you think it would be useful to discuss face-to-face
via a video conference call? It would not be just between you and me; we
can invite the developers who are interested to join and help with the
discussion. That might improve communication efficiency and help us
understand each other better :)

I am writing this long email hopefully to get your understanding. I care
much more about the quality of the eventual solution than about who
proposed the solution. Please bear with me and see my comments inline,
with an explanation of the pros/cons of these proposals.

On Wed, Jul 5, 2023 at 11:06 PM Piotr Nowojski <piotr.nowoj...@gmail.com>
wrote:

> Hi Guys,
>
> I would like to ask you again to spend a bit more effort on trying to
> find solutions, not just pointing out problems.
> For 1.5 months the discussion hasn't so much gone in circles as
> followed a pattern: I suggest a solution, you try to undermine it with
> some arguments, I come back with a fix, often an extremely easy one,
> only for you to try to find yet another "issue". It doesn't bode well
> if you are finding a "problem" that can be solved with a minute or so
> of thinking, or that has even already been solved.
>
> I have provided you so far with at least three distinct solutions that
> could address your exact target use case. Two of them [1][2] are
> generic enough to probably be good enough for the foreseeable future;
> one is intermediate and not generic [3], but it wouldn't require
> @Public API changes or some custom hidden interfaces.
>
> All in all:
> - [1] with added metric hints like "isProcessingBacklog" solves your
>   target use case pretty well. The downside is having to improve how
>   the JM collects/aggregates metrics.

Here is my analysis of this proposal compared to the current approach in
FLIP-309.

pros:
- No need to add the public API
  SplitEnumeratorContext#setIsProcessingBacklog.

cons:
- Need to add a public API that subclasses of SourceReader can use to
  specify their isProcessingBacklog metric value.
- The SourceCoordinator needs to periodically pull the
  isProcessingBacklog metrics from all TMs throughout the job execution.

Here is why I think the cons outweigh the pros:
1) The JM needs to collect/aggregate metrics, with extra runtime overhead
   that is not necessary for the target use case with the push-based
   approach in FLIP-309.
2) For the target use case, it is simpler and more intuitive for source
   operators (e.g. HybridSource, MySQL CDC source) to be able to set
   their isProcessingBacklog status in the SplitEnumerator. This is
   because the switch between bounded/unbounded stages happens in their
   SplitEnumerator.
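For reference, a rough sketch of what option [1] could look like on the
reader side, exposing the flag as a gauge that the JM would then poll and
aggregate (the metric name is made up for illustration; only
MetricGroup#gauge is an existing Flink API here):

    // Rough sketch of option [1]: a reader exposes isProcessingBacklog as
    // a metric. The metric name is illustrative only.
    import org.apache.flink.metrics.Gauge;
    import org.apache.flink.metrics.MetricGroup;

    public class BacklogMetricRegistration {

        private volatile boolean isProcessingBacklog;

        public void register(MetricGroup readerMetricGroup) {
            // Exposed as 0/1 so a plain numeric gauge can carry the flag.
            readerMetricGroup.gauge(
                    "isProcessingBacklog",
                    (Gauge<Integer>) () -> isProcessingBacklog ? 1 : 0);
        }

        public void setProcessingBacklog(boolean backlog) {
            this.isProcessingBacklog = backlog;
        }
    }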
> - [2] is basically an equivalent of [1], replacing metrics with events.
>   It is also a superset of your proposal.

Previously, I thought you meant to add generic logic in SourceReaderBase
to read existing metrics (e.g. backpressure) and emit the
IsProcessingBacklogEvent to the SourceCoordinator. I am sorry if I have
misunderstood your suggestions.

After double-checking your previous suggestion, I am wondering if you are
OK with the following approach:

- Add a job-level config execution.checkpointing.interval-during-backlog.
- Add an API SourceReaderContext#setProcessingBacklog(boolean
  isProcessingBacklog).
- When this API is invoked, it internally sends an internal
  SourceReaderBacklogEvent to the SourceCoordinator.
- The SourceCoordinator keeps track of the latest isProcessingBacklog
  status of all its subtasks. For now, we hardcode the logic such that if
  any source reader says it is under backlog, then
  execution.checkpointing.interval-during-backlog is used.

This approach looks good to me, as it can achieve the same performance
with the same number of public APIs for the target use case. And I
suppose in the future we might be able to re-use this API for a source
reader to set its backlog status based on its backpressure metrics, which
could be an extra advantage over the current approach.

Do you think we can agree to adopt the approach described above?

> - [3] yes, it's hacky, but it's a solution that could be thrown away
>   once we implement [1] or [2]. The only real theoretical downside is
>   that it cannot control the long checkpoint interval exactly (the
>   short checkpoint interval has to be a divisor of the long checkpoint
>   interval), but I simply cannot imagine a practical case where that
>   would be a blocker for a user. Please..., someone wanting to set the
>   short checkpoint interval to 3 minutes and the long one to 7 minutes,
>   and that someone cannot accept the long interval being 9 minutes? And
>   that's even ignoring the fact that if someone has an issue with the
>   3-minute checkpoint interval, I can hardly think that merely doubling
>   the interval to 7 minutes would significantly solve any problem for
>   that user.

Yes, this is a fabricated example that shows
execution.checkpointing.interval-during-backlog might not be accurately
enforced with this option. I think you are probably right that it might
not matter that much. I just think we should try our best to make the
semantics of Flink's public APIs (including configuration) clear, simple,
and enforceable. If we can make the user-facing configuration enforceable
at the cost of an extra developer-facing API (i.e.
setProcessingBacklog(...)), I would prefer to do that.

It seems that we both agree that option [2] is better than [3]. I will
skip further comments on this option, and we can probably focus on option
[2] :)
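For concreteness, a small sketch of the rounding effect mentioned above,
under the assumption that option [3] realizes the long interval by firing
a full checkpoint on every k-th tick of the short interval (which is
where the divisor constraint comes from):

    // Sketch of the divisor constraint in option [3], assuming the long
    // interval is realized by firing a full checkpoint on every k-th tick
    // of the short interval. With short = 3 min and requested long = 7
    // min, the effective long interval becomes 9 min.
    import java.time.Duration;

    public class EffectiveLongInterval {

        static Duration effectiveLongInterval(Duration shortInterval, Duration requestedLong) {
            // Round the requested long interval up to the next multiple of
            // the short interval: ceil(7 / 3) = 3 ticks -> 9 minutes.
            long ticks = (requestedLong.toMillis() + shortInterval.toMillis() - 1)
                    / shortInterval.toMillis();
            return shortInterval.multipliedBy(Math.max(1, ticks));
        }

        public static void main(String[] args) {
            System.out.println(effectiveLongInterval(
                    Duration.ofMinutes(3), Duration.ofMinutes(7))); // PT9M
        }
    }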
> Dong, a long time ago you wrote:
>
> > Sure. Then let's decide the final solution first.
>
> Have you thought about that? Maybe I'm wrong, but I don't remember you
> describing in any of your proposals how they could be extended in the
> future, to cover more generic cases. Regardless of whether you either
> don't believe in the generic solution or struggle to

Yes, I have thought about the plan to extend the current FLIP to support
the metrics-based (e.g. backpressure) solution you described earlier.
Actually, I mentioned multiple times in earlier emails that your
suggestion of using metrics is valuable and that I will pursue it in a
follow-up FLIP.

Here are my comments from the previous emails:
- See "I will add follow-up FLIPs to make use of the event-time metrics
  and backpressure metrics" from Jul 3, 2023, 6:39 PM.
- See "I agree it is valuable" from Jul 1, 2023, 11:00 PM.
- See "we will create a follow-up FLIP (probably in FLIP-328)" from Jun
  29, 2023, 11:01 AM.

Frankly speaking, I think the idea of using the backpressure metrics
still needs a bit more thinking before we can propose a FLIP. But I am
pretty sure we can make use of the watermark/event-time to determine the
backlog status.

> grasp it, if you can come back with something that can be easily
> extended in the future, up to a point where one could implement
> something similar to the backpressure-detecting algorithm that I
> mentioned many times before, I would be happy to discuss and support
> it.

Here is my idea for extending the source reader to support
event-time-based backlog-detecting algorithms:

- Add a job-level config such as watermark-lag-threshold-for-backlog. If
  any source reader determines that the event timestamp is available and
  that system-time - watermark exceeds this threshold, then the source
  reader considers its isProcessingBacklog = true.
- The source reader can send an event to the source coordinator. Note
  that this might be doable in SourceReaderBase without adding any public
  API that the concrete SourceReader subclass needs to explicitly invoke.
- In the future, if FLIP-325 is accepted, instead of sending the event to
  the SourceCoordinator and letting the SourceCoordinator inform the
  checkpoint coordinator, the source reader might just emit the
  information as part of the RecordAttributes and let the two-phase
  commit sink inform the checkpoint coordinator.

Note that this is a sketch of the idea and it might need further
improvement. I just hope you understand that we have thought about this
idea and have done quite a lot of thinking on these design options. If it
is OK with you, I hope we can make incremental progress and discuss the
metrics-based solution separately in a follow-up FLIP.
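A compact sketch of the first bullet above (hypothetical:
watermark-lag-threshold-for-backlog is only a suggested config key, and
this class is not a Flink API):

    // Hypothetical sketch of the watermark-lag rule described above: the
    // reader considers itself processing backlog when the watermark lags
    // behind system time by more than the configured threshold.
    import java.time.Duration;

    public class WatermarkLagBacklogDetector {

        private final long thresholdMillis;

        public WatermarkLagBacklogDetector(Duration threshold) {
            this.thresholdMillis = threshold.toMillis();
        }

        /** Returns true if system-time - watermark exceeds the threshold. */
        public boolean isProcessingBacklog(long currentWatermarkMillis) {
            long lag = System.currentTimeMillis() - currentWatermarkMillis;
            return lag > thresholdMillis;
        }
    }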
Last but not least, thanks for taking so much time to leave comments and
help us improve the FLIP. Please kindly bear with us in this discussion.
I am looking forward to collaborating with you to find the best design
for the target use cases.

Best,
Dong

> Hang, about your points 1. and 2., do you think those problems are
> insurmountable and blockers for that counter-proposal?
>
> > 1. It is hard to find the error checkpoint.
>
> No, it's not. Please take a look at what I exactly proposed, and maybe
> at the code.
>
> > 2. (...) The failed checkpoint may make them think the job is
> > unhealthy.
>
> Please read again what I wrote in [3]. I mention there a solution for
> this exact "problem".
>
> About the necessity of the config value, I'm still not convinced that
> it's needed from the start, but yes, we can add some config option if
> you think otherwise. This option, if named properly, could be re-used
> in the future for different solutions, so that's fine by me.
>
> Best,
> Piotrek
>
> [1] Introduced in my very first e-mail from May 23, 2023, 16:26, and
> refined later with point "2." in my e-mail from June 16, 2023, 17:58.
> [2] Section "2. ===============" in my e-mail from June 30, 2023,
> 16:34.
> [3] Section "3. ===============" in my e-mail from June 30, 2023,
> 16:34.
>
> All times are in CEST.