dawidwys commented on a change in pull request #10213:
[FLINK-12846][table-common] Carry primary key information in TableSchema
URL: https://github.com/apache/flink/pull/10213#discussion_r346829178
##########
File path:
flink-table/flink-table-common/src/main/java/org/apache/flink/table/api/TableSchema.java
##########
@@ -22,32 +22,57 @@
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeutils.CompositeType;
import org.apache.flink.table.types.DataType;
-import org.apache.flink.table.types.FieldsDataType;
-import org.apache.flink.table.types.logical.LogicalType;
import org.apache.flink.table.types.utils.TypeConversions;
import org.apache.flink.types.Row;
import org.apache.flink.util.Preconditions;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
-import java.util.HashMap;
import java.util.List;
-import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
+import static java.util.Collections.emptyList;
import static org.apache.flink.table.api.DataTypes.FIELD;
import static org.apache.flink.table.api.DataTypes.Field;
import static org.apache.flink.table.api.DataTypes.ROW;
-import static org.apache.flink.table.types.logical.LogicalTypeRoot.TIMESTAMP_WITHOUT_TIME_ZONE;
import static org.apache.flink.table.types.utils.TypeConversions.fromDataTypeToLegacyInfo;
import static org.apache.flink.table.types.utils.TypeConversions.fromLegacyInfoToDataType;
+import static org.apache.flink.table.utils.TableSchemaValidation.validateNameTypeNumberEqual;
+import static org.apache.flink.table.utils.TableSchemaValidation.validateSchema;
/**
- * A table schema that represents a table's structure with field names and data types.
+ * A table schema that represents a table's structure with field names, data types and
+ * constraint information (e.g. primary key, unique key).
+ *
+ * <p>Concepts about primary key and unique key:</p>
+ * <ul>
+ * <li>
+ * Primary key and unique key can consist of single or multiple columns (fields).
+ * </li>
+ * <li>
+ * A primary key or unique key on source will be simply trusted, we won't validate the
+ * constraint. The primary key and unique key information will then be used for query
+ * optimization. If a bounded or unbounded table source defines any primary key or
+ * unique key, it must contain a unique value for each row of data. You cannot have
+ * two records having the same value of that field(s). Otherwise, the result of query
+ * might be wrong.
+ * </li>
+ * <li>
+ * A primary key or unique key on sink is a weak constraint. Currently, we won't validate
+ * the constraint, but we may add some check in the future to validate whether the
+ * primary/unique key of the query matches the primary/unique key of the sink during
+ * compile time.
+ * </li>
+ * <li>
+ * The difference between primary key and unique key is that there can be only one primary
+ * key and there can be more than one unique key. And a primary key doesn't need to be
+ * declared in unique key list again.
Review comment:
```suggestion
* declared in the unique key list again.
```