gustavodemorais commented on code in PR #28403:
URL: https://github.com/apache/flink/pull/28403#discussion_r3412558585
##########
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/optimize/program/FlinkChangelogModeInferenceProgram.scala:
##########
@@ -114,9 +114,24 @@ class FlinkChangelogModeInferenceProgram extends FlinkOptimizeProgram[StreamOpti
// step4: sanity check and return non-empty root
if (finalRoot.isEmpty) {
- val plan = FlinkRelOptUtil.toString(root, withChangelogTraits = true)
- throw new TableException(
- "Can't generate a valid execution plan for the given query:\n" + plan)
+ val errorMessage =
+ if (containsUpdates(rootWithModifyKindSet)) {
+ // Point at the failing node and its inputs instead of dumping the whole plan.
+ val conflict = new StringBuilder(describeChangelog(rootWithModifyKindSet))
+ rootWithModifyKindSet.getInputs.foreach(
+ input => conflict.append("\n +- ").append(describeChangelog(input)))
+ "Can't generate a valid execution plan for the given query.\n\n" +
+ "There is a changelog mismatch between two operators. One produces an upsert " +
+ "changelog (UPDATE_AFTER without UPDATE_BEFORE). The other requires a retract " +
+ "changelog (UPDATE_BEFORE and UPDATE_AFTER), for example a sink without a primary " +
+ "key. To resolve it, declare a PRIMARY KEY on the sink, or make the input produce " +
Review Comment:
Otherwise it sounds imperative, as if this were a guaranteed fix.
```suggestion
"key. In such cases, declare a PRIMARY KEY on the sink, or make the input produce " +
```
##########
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/stream/sql/ProcessTableFunctionTest.java:
##########
@@ -513,6 +517,11 @@ private static Stream<ErrorSpec> errorSpecs() {
"SELECT * FROM f(r => TABLE t_watermarked PARTITION BY name, on_time => DESCRIPTOR(ts))",
Review Comment:
Try a test like "SELECT name, SUM(count) OVER (PARTITION BY name ORDER BY
name) FROM f(...)" and check whether the error message makes sense. I think it
outputs no changelog mismatch.
##########
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/optimize/program/FlinkChangelogModeInferenceProgram.scala:
##########
@@ -114,9 +114,24 @@ class FlinkChangelogModeInferenceProgram extends FlinkOptimizeProgram[StreamOpti
// step4: sanity check and return non-empty root
if (finalRoot.isEmpty) {
- val plan = FlinkRelOptUtil.toString(root, withChangelogTraits = true)
- throw new TableException(
- "Can't generate a valid execution plan for the given query:\n" + plan)
+ val errorMessage =
+ if (containsUpdates(rootWithModifyKindSet)) {
+ // Point at the failing node and its inputs instead of dumping the whole plan.
Review Comment:
I find it hard to grasp that all the errors here are related to upsert vs.
retract mismatches. At first it reads like: were there any errors? Then check
whether the plan contains an update and say that's the issue. This is not about
the PR but about the existing code. Maybe we can add a comment, since the code
is not very intuitive.
##########
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/stream/sql/ProcessTableFunctionTest.java:
##########
@@ -513,6 +517,11 @@ private static Stream<ErrorSpec> errorSpecs() {
"SELECT * FROM f(r => TABLE t_watermarked PARTITION BY name, on_time => DESCRIPTOR(ts))",
"Time operations using the `on_time` argument are currently not supported for "
+ "PTFs that consume or produce updates."),
+ ErrorSpec.ofInsertInto(
+ "upsert output into sink without primary key",
+ UpdatingUpsertFunction.class,
+ "INSERT INTO t_no_pk_sink SELECT * FROM f(r => TABLE t_updating PARTITION BY name)",
+ "declare a PRIMARY KEY on the sink"),
Review Comment:
Can you assert the rendered conflict block, not just one substring? "The
conflict is at:\n..."
##########
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/stream/sql/ProcessTableFunctionTest.java:
##########
@@ -513,6 +517,11 @@ private static Stream<ErrorSpec> errorSpecs() {
"SELECT * FROM f(r => TABLE t_watermarked PARTITION BY name, on_time => DESCRIPTOR(ts))",
"Time operations using the `on_time` argument are currently not supported for "
+ "PTFs that consume or produce updates."),
+ ErrorSpec.ofInsertInto(
Review Comment:
Can you add a more complex test to see if things make sense? Add a case with
the conflict ≥2 levels down (e.g. over-aggregate buried under a
projection/filter/join)