[GitHub] [iceberg] rdblue commented on a change in pull request #1393: Flink: Support creating table and altering table in Flink SQL

GitBox Fri, 28 Aug 2020 15:51:32 -0700


rdblue commented on a change in pull request #1393:
URL: https://github.com/apache/iceberg/pull/1393#discussion_r479565256




##########
File path: flink/src/main/java/org/apache/iceberg/flink/FlinkCatalog.java
##########
@@ -320,19 +338,158 @@ public void renameTable(ObjectPath tablePath, String newTableName, boolean ignor
     }
   }
 
-  /**
-   * TODO Add partitioning to the Flink DDL parser.
-   */
   @Override
   public void createTable(ObjectPath tablePath, CatalogBaseTable table, boolean ignoreIfExists)
-      throws CatalogException {
-    throw new UnsupportedOperationException("Not support createTable now.");
+      throws CatalogException, TableAlreadyExistException {
+    validateFlinkTable(table);
+
+    Schema icebergSchema = FlinkSchemaUtil.convert(table.getSchema());
+    PartitionSpec spec = toPartitionSpec(((CatalogTable) table).getPartitionKeys(), icebergSchema);
+    Map<String, String> options = Maps.newHashMap(table.getOptions());
+
+    try {
+      icebergCatalog.createTable(
+          toIdentifier(tablePath),
+          icebergSchema,
+          spec,
+          options.get("location"),
+          options);
+    } catch (AlreadyExistsException e) {
+      throw new TableAlreadyExistException(getName(), tablePath, e);
+    }
   }
 
   @Override
   public void alterTable(ObjectPath tablePath, CatalogBaseTable newTable, boolean ignoreIfNotExists)
-      throws CatalogException {
-    throw new UnsupportedOperationException("Not support alterTable now.");
+      throws CatalogException, TableNotExistException {
+    validateFlinkTable(newTable);
+    Table icebergTable = getIcebergTable(tablePath);
+    CatalogTable table = toCatalogTable(icebergTable);
+
+    // Currently, Flink SQL only support altering table properties.

Review comment:
       I should also note that adding, removing, or renaming columns cannot be supported by comparing `CatalogTable` instances unless the Flink schema contains Iceberg column IDs.
   
   The problem is clear when you consider a simple example:
   * Iceberg table schema: `id bigint, a float, b float`
   * Flink table schema: `id bigint, x float, y float`
   
   There are two ways to get from the Iceberg schema to the Flink schema: rename `a` -> `x` and `b` -> `y`, or drop `a`, drop `b`, add `x`, add `y`. Guessing which one the user intended is not okay because a wrong guess corrupts data. If values from `a` are read when projecting `x` after `a` was actually dropped, that is a serious correctness bug.
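
   The ambiguity can be shown with a small sketch. This is a toy model written for illustration, not Iceberg's or Flink's classes: `ToySchema` and its methods are hypothetical names, and the only behavior assumed is that each column gets a field ID that is never reused after a drop.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of an ID-tracked schema (not an Iceberg class).
class ToySchema {
  // Field ID -> column name, kept in column order.
  private final Map<Integer, String> columns = new LinkedHashMap<>();
  // IDs are never reused, so the counter only grows.
  private int lastAssignedId = 0;

  ToySchema add(String name) {
    columns.put(++lastAssignedId, name);
    return this;
  }

  ToySchema drop(String name) {
    // Removing from the values view removes the matching map entry.
    columns.values().remove(name);
    return this;
  }

  ToySchema rename(String from, String to) {
    // A rename keeps the field ID and changes only the name.
    columns.replaceAll((id, name) -> name.equals(from) ? to : name);
    return this;
  }

  List<String> names() {
    return new ArrayList<>(columns.values());
  }

  Set<Integer> ids() {
    return new LinkedHashSet<>(columns.keySet());
  }

  // The Iceberg table schema from the example: id, a, b.
  static ToySchema base() {
    return new ToySchema().add("id").add("a").add("b");
  }

  public static void main(String[] args) {
    ToySchema renamed = base().rename("a", "x").rename("b", "y");
    ToySchema replaced = base().drop("a").drop("b").add("x").add("y");

    // Same names, so a name-based comparison cannot tell the two apart.
    System.out.println(renamed.names() + " vs " + replaced.names());
    // Different IDs, so the stored data each schema reads is different.
    System.out.println(renamed.ids() + " vs " + replaced.ids());
  }
}
```

   Both paths end at the column names `id, x, y`, but the renamed schema still points at the original field IDs while the drop-and-add schema points at new ones. In Iceberg's own API the two intents are separate calls on `UpdateSchema` (`renameColumn` versus `deleteColumn` followed by `addColumn`), which is the information a name-only Flink schema does not carry.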
   
   Also note that some transformations can't be detected at all by comparing schemas. For example, drop `a` then add `a`: the schemas before and after look identical, but the result should be that all existing values of column `a` are discarded. This comes up when the wrong data was written to a column but the column is still needed for newer data.
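
   The drop-then-add case can be sketched the same way. Again this is a toy model under stated assumptions, not Iceberg code: `DropThenAddExample` is a hypothetical class, stored rows are modeled as maps keyed by field ID, and reads resolve a column name to its current ID before looking up the value.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy model: reads are resolved by field ID, not by name.
class DropThenAddExample {
  // Current schema: column name -> field ID.
  private final Map<String, Integer> nameToId = new LinkedHashMap<>();
  // IDs are never reused, so the counter only grows.
  private int lastAssignedId = 0;

  void add(String name) {
    nameToId.put(name, ++lastAssignedId);
  }

  void drop(String name) {
    nameToId.remove(name);
  }

  // Returns the stored value for the column's current ID, or null if the
  // row was written before that ID existed.
  Float read(Map<Integer, Float> storedRow, String column) {
    Integer id = nameToId.get(column);
    return id == null ? null : storedRow.get(id);
  }

  // Writes a bad value to "a", then drops and re-adds "a", then reads it back.
  static Float readAfterDropAndAdd() {
    DropThenAddExample schema = new DropThenAddExample();
    schema.add("id");
    schema.add("a");

    // A row written while "a" had its original field ID (2 in this model).
    Map<Integer, Float> oldRow = new HashMap<>();
    oldRow.put(2, 42.0f);

    schema.drop("a");
    schema.add("a"); // same name, new field ID

    return schema.read(oldRow, "a");
  }

  public static void main(String[] args) {
    // The old value is no longer visible through the re-added column.
    System.out.println(readAfterDropAndAdd());
  }
}
```

   The column name and type are unchanged, so the before and after schemas compare as equal, yet the old value is no longer reachable. A diff of two name-based schemas has no way to express that intent.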



