dawidwys commented on a change in pull request #10213: 
[FLINK-12846][table-common] Carry primary key information in TableSchema
URL: https://github.com/apache/flink/pull/10213#discussion_r346827569
 
 

 ##########
 File path: flink-table/flink-table-common/src/main/java/org/apache/flink/table/api/TableSchema.java
 ##########
 @@ -22,32 +22,57 @@
 import org.apache.flink.api.common.typeinfo.TypeInformation;
 import org.apache.flink.api.common.typeutils.CompositeType;
 import org.apache.flink.table.types.DataType;
-import org.apache.flink.table.types.FieldsDataType;
-import org.apache.flink.table.types.logical.LogicalType;
 import org.apache.flink.table.types.utils.TypeConversions;
 import org.apache.flink.types.Row;
 import org.apache.flink.util.Preconditions;
 
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.Collections;
-import java.util.HashMap;
 import java.util.List;
-import java.util.Map;
 import java.util.Objects;
 import java.util.Optional;
 import java.util.stream.Collectors;
 import java.util.stream.IntStream;
 
+import static java.util.Collections.emptyList;
 import static org.apache.flink.table.api.DataTypes.FIELD;
 import static org.apache.flink.table.api.DataTypes.Field;
 import static org.apache.flink.table.api.DataTypes.ROW;
-import static org.apache.flink.table.types.logical.LogicalTypeRoot.TIMESTAMP_WITHOUT_TIME_ZONE;
 import static org.apache.flink.table.types.utils.TypeConversions.fromDataTypeToLegacyInfo;
 import static org.apache.flink.table.types.utils.TypeConversions.fromLegacyInfoToDataType;
+import static org.apache.flink.table.utils.TableSchemaValidation.validateNameTypeNumberEqual;
+import static org.apache.flink.table.utils.TableSchemaValidation.validateSchema;
 
 /**
- * A table schema that represents a table's structure with field names and data types.
+ * A table schema that represents a table's structure with field names, data types and
+ * constraint information (e.g. primary key, unique key).
+ *
+ * <p>Concepts about primary key and unique key:</p>
+ * <ul>
+ *     <li>
+ *         Primary key and unique key can consist of single or multiple columns (fields).
+ *     </li>
+ *     <li>
+ *         A primary key or unique key on source will be simply trusted, we won't validate the
+ *         constraint. The primary key and unique key information will then be used for query
+ *         optimization. If a bounded or unbounded table source defines any primary key or
+ *         unique key, it must contain a unique value for each row of data. You cannot have
+ *         two records having the same value of that field(s). Otherwise, the result of query
+ *         might be wrong.
 
 Review comment:
   Maybe?: 
   "Primary key and unique key constraints on a source table are not validated. They are assumed to be correct and are used for query optimization. If a bounded or unbounded table source defines a primary key or unique key, the key must have a unique value for each row of data: no two records may have the same value in the key field(s). Otherwise, the result of the query might be wrong."
   
   What about retract/changelog streams? Do we just assume that the change flag and timestamp are part of every key? I understand this will be handled correctly in the runtime, but I think it should also be explained in this Javadoc.
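
To illustrate why a violated key can produce a wrong result, here is a minimal plain-Java sketch. It does not use any Flink API: `TrustedKeySketch`, `distinct`, and `distinctTrustingKey` are hypothetical names, and "skip deduplication when the column is declared as a key" is only an assumed example of a key-based optimization, not something this PR is confirmed to implement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class TrustedKeySketch {

    // Reference behaviour of SELECT DISTINCT id: duplicates are removed,
    // first-seen order is kept.
    static List<Integer> distinct(List<Integer> ids) {
        return new ArrayList<>(new LinkedHashSet<>(ids));
    }

    // Hypothetical optimization: if `id` is declared as a key, the planner
    // trusts the declaration and skips the deduplication step entirely.
    static List<Integer> distinctTrustingKey(List<Integer> ids, boolean declaredKey) {
        return declaredKey ? new ArrayList<>(ids) : distinct(ids);
    }

    public static void main(String[] args) {
        List<Integer> valid = Arrays.asList(1, 2, 3);      // honours the declared key
        List<Integer> violating = Arrays.asList(1, 2, 2);  // duplicate key value

        // The declaration holds, so the shortcut is safe.
        System.out.println(distinctTrustingKey(valid, true));      // [1, 2, 3]
        // The declaration is violated, so the shortcut returns a duplicate.
        System.out.println(distinctTrustingKey(violating, true));  // [1, 2, 2]
        // The unoptimized plan would have returned the correct answer.
        System.out.println(distinct(violating));                   // [1, 2]
    }
}
```

The second call shows the failure mode the Javadoc warns about: nothing checks the constraint, so a source that breaks it silently yields a wrong result rather than an error.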

