curcur edited a comment on pull request #13648:
URL: https://github.com/apache/flink/pull/13648#issuecomment-716973831
Hey @rkhachatryan and @pnowojski , thanks for the response.
> We realized that if we check view reference in subpartition (or null out
view.parent) then the downstream which view was overwritten by an older thread,
will sooner or later be fenced by the upstream (or checkpoint will time out).
> FLINK-19774 then becomes only an optimization.
1. Do you have preferences on `check view reference` vs `null out
view.parent`?
`null out parent` may have a problems of race condition.
Notice that not just `pollNext()` needs the check, I think everything that
touches view.parent needs a check.
2. How this can solve the problem of a new view is replaced by an old one?
Let's say downstream reconnects, asking for a new view; the new view is
created; replaced by an old view that is triggered by an old handler event. The
new view's parent is null-out. Then what will happen? I do not think "fenced by
the upstream (or checkpoint will time out)" can **solve** this problem, it just
ends with more failures caused by this problem.
> So we can prevent most of the issues without touching deployment
descriptors.
> Therefore, I think it makes sense to implement it in this PR. Sorry for
changing the decision.
I am overall fine to put the check within this PR. However, may I ask how it
is different from we have this PR as it is, and I do a follow-up one to
null-out the parent?
In this PR, my main purpose is to introduce a different result partition
type, and scheduler changes are based upon this new type. That's the main
reason I prefer to do it in a follow-up PR otherwise the scheduler part is
blocked.
It is easier to review as well.
> Another concern is a potential resource leak if downstream continuously
fail without notifying the upstream (instances of
CreditBasedSequenceNumberingViewReader will accumulate). Can you create a
follow-up ticket for that?
> This can be addressed by firing user event (see
PartitionRequestQueue.userEventTriggered).
Yes, that's a good point.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]