AlanConfluent commented on PR #28061:
URL: https://github.com/apache/flink/pull/28061#issuecomment-4433799847
> Hey @AlanConfluent, thanks for the clean PR. I added some comments in the
first pass and I'm thinking of how we can make this more complete. We're only
using the traits but not using some internal planner medatada. Some things to
check:
>
> 1 - requireDeterminismExcludeUpsertKey is being skipped. Every other
recursive visitor pushes input requirements through visitInputs, which calls
requireDeterminismExcludeUpsertKey(input, requireDeterminism). Take a look if
that makes sense here
Done! I think it makes sense to use requireDeterminismExcludeUpsertKey so
as not to impose redundant requirements. Good catch.
> 2 - getNonDeterministicCallName doesn't catch dynamic functions.
FlinkRexUtil.getNonDeterministicCallName recurses into operands but only checks
!operator.isDeterministic. It does not check isDynamicFunction() which is done
in other parts here. Check if that is relevant
I think due to `FlinkSqlOperatorTable.validateNoDynamicFunction`, it should
essentially prevent Flink registered operators from being dynamic and
deterministic, so I don't think there is a gap since the only things allowed to
be dynamic are non-deterministic as well. And my use of
`getNonDeterministicCallName` is consistent with the rest of
`StreamNonDeterministicUpdatePlanVisitor`, so I think it's not required and it
makes sense to be consistent. Sound reasonable?
>
> And one observation to the logic that you might want to adjust in the
comments. Per SUPPORT_UPDATES doc (lines 167-172):
>
> > "The function receives {+I,+U,-D} if the input table is upserting using
the same upsert key as the partition key. Otherwise, retractions {+I,-U,+U,-D}
(i.e. including UPDATE_BEFORE) enter the function."
>
> So even without REQUIRE_UPDATE_BEFORE declared, the function physically
receives UB when partition_key ⊄ upsert_key. The visitor only consults the
static trait. Practically I believe this doesn't affect correctness (the
function isn't required to consume UB just because it arrives). So we can
assume the UB is not relevant but there **might** be UBs.
I updated the comment. It's a little verbose, but explains clearly. Please
make any suggestions to it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]