wenlong88 commented on a change in pull request #18017:
URL: https://github.com/apache/flink/pull/18017#discussion_r769442384
##########
File path:
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/operations/MergeTableLikeUtilTest.java
##########
@@ -126,6 +126,72 @@ public void mergeWithIncludeFailsOnDuplicateColumn() {
null);
}
+ @Test
+ public void mergeWithIncludeFailsOnDuplicateRegularColumn() {
+ TableSchema sourceSchema =
+ TableSchema.builder().add(TableColumn.physical("one",
DataTypes.INT())).build();
+
+ List<SqlNode> derivedColumns =
+ Arrays.asList(
+ regularColumn("two", DataTypes.INT()),
+ regularColumn("two", DataTypes.INT()),
+ regularColumn("four", DataTypes.STRING()));
+
+ thrown.expect(ValidationException.class);
+ thrown.expectMessage("A regular Column named 'two' already exists in
the table.");
+ util.mergeTables(
+ getDefaultMergingStrategies(),
+ sourceSchema,
+ derivedColumns,
+ Collections.emptyList(),
+ null);
+ }
+
+ @Test
+ public void
mergeWithIncludeFailsOnDuplicateRegularColumnAndComputeColumn() {
+ TableSchema sourceSchema =
+ TableSchema.builder().add(TableColumn.physical("one",
DataTypes.INT())).build();
+
+ List<SqlNode> derivedColumns =
+ Arrays.asList(
+ regularColumn("two", DataTypes.INT()),
+ computedColumn("three", plus("two", "3")),
+ regularColumn("three", DataTypes.INT()),
+ regularColumn("four", DataTypes.STRING()));
+
+ thrown.expect(ValidationException.class);
+ thrown.expectMessage("A column named 'three' already exists in the
table.");
+ util.mergeTables(
+ getDefaultMergingStrategies(),
+ sourceSchema,
+ derivedColumns,
+ Collections.emptyList(),
+ null);
+ }
+
+ @Test
+ public void
mergeWithIncludeFailsOnDuplicateRegularColumnAndMetadataColumn() {
+ TableSchema sourceSchema =
+ TableSchema.builder().add(TableColumn.physical("one",
DataTypes.INT())).build();
+
+ List<SqlNode> derivedColumns =
+ Arrays.asList(
+ metadataColumn("two", DataTypes.INT(), true),
+ computedColumn("three", plus("two", "3")),
+ regularColumn("two", DataTypes.INT()),
+ regularColumn("four", DataTypes.STRING()));
+
+ thrown.expect(ValidationException.class);
+ thrown.expectMessage(
Review comment:
in this case, I think it would be better to throw with information that
there are duplicate column with name 'two' in metadata column and regular
column?
##########
File path:
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/operations/MergeTableLikeUtil.java
##########
@@ -426,6 +433,14 @@ private void appendDerivedColumns(
}
}
+ Set<String> physicalFieldNames =
physicalFieldNamesToTypes.keySet();
+ Set<String> metadataFieldNames =
metadataFieldNamesToTypes.keySet();
+ final Set<String> result = new
LinkedHashSet<>(physicalFieldNames);
+ result.retainAll(metadataFieldNames);
+ if (!result.isEmpty()) {
+ throw new ValidationException(
+ "A field name conflict exists between a field
of the regular type and a field of the Metadata type.");
+ }
Review comment:
hi, @jelly-1203 thanks for the update and analysis, I agree that it not
possible to add unified check at the end. but I still think that the check
here can be improved a bit:
1. it seems that the check here is not relevant to the new added
ComputingColumn, if you want to check the duplication, it is better to add it
when updating metadataFieldNamesToTypes.
2. according to current implementation, regular columns have top priority(we
collect collectPhysicalFieldsTypes at the beginning), so we may also need to
check if there is duplicated name in physicalFieldNamesToTypes when trying to a
metadata column or computed column. If we add such check, the check in 1 is not
necessary any more.
what do you think?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]