Copilot commented on code in PR #10088:
URL: https://github.com/apache/gravitino/pull/10088#discussion_r2887247073


##########
catalogs-contrib/catalog-jdbc-clickhouse/src/main/java/org/apache/gravitino/catalog/clickhouse/operations/ClickHouseTableOperations.java:
##########
@@ -168,13 +168,15 @@ protected String generateCreateTableSql(
     // Add Create table clause
     appendCreateTableClause(notNullProperties, sqlBuilder, tableName);
 
-    // Add columns
-    buildColumnsDefinition(columns, sqlBuilder);
+    // Add columns, if columns are empty, it means it's a distributed table.
+    if (columns.length > 0) {
+      buildColumnsDefinition(columns, sqlBuilder);

Review Comment:
   This comment is misleading: an empty `columns` array doesn’t inherently mean 
“distributed table”; it only becomes valid when combined with 
`engine=Distributed` and an `AS remote_db.remote_table` clause. As written, 
creating a non-Distributed table with empty columns would now generate invalid 
ClickHouse SQL (no column list and no AS/LIKE). Please either (a) tighten the 
condition based on the engine/properties, and/or (b) throw an 
IllegalArgumentException when `columns` is empty for non-Distributed engines, 
and adjust the comment accordingly.



##########
catalogs-contrib/catalog-jdbc-clickhouse/src/main/java/org/apache/gravitino/catalog/clickhouse/operations/ClickHouseTableOperations.java:
##########
@@ -343,21 +345,31 @@ private void handleDistributeTable(
     Preconditions.checkArgument(
         StringUtils.isNotBlank(shardingKey), "Sharding key must be specified 
for Distributed");
 
-    List<String> shardingColumns = 
ClickHouseTableSqlUtils.extractShardingKeyColumns(shardingKey);
-    if (CollectionUtils.isNotEmpty(shardingColumns)) {
-      for (String columnName : shardingColumns) {
-        JdbcColumn shardingColumn = findColumn(columns, columnName);
-        Preconditions.checkArgument(
-            shardingColumn != null,
-            "Sharding key column %s must be defined in the table",
-            columnName);
+    // Users have defined the columns explicitly for the distributed table, we 
will check the
+    // columns should contain the sharding key, as clickhouse requires the 
sharding key must be
+    // defined in the columns of the distributed table, and the sharding key 
column must be integer
+    // and not nullable.
+    if (ArrayUtils.isNotEmpty(columns)) {
+      List<String> shardingColumns = 
ClickHouseTableSqlUtils.extractShardingKeyColumns(shardingKey);
+      if (CollectionUtils.isNotEmpty(shardingColumns)) {
+        for (String columnName : shardingColumns) {
+          JdbcColumn shardingColumn = findColumn(columns, columnName);
+          Preconditions.checkArgument(
+              shardingColumn != null,
+              "Sharding key column %s must be defined in the table",
+              columnName);
+        }
       }
     }

Review Comment:
   The added comment claims the sharding-key column “must be integer and not 
nullable” and implies this method checks that, but the code only verifies that 
the referenced column names exist (and skips all validation when `columns` is 
empty). Please align the comment with the actual validation performed, or add 
the missing type/nullability checks (since `JdbcColumn` includes `dataType()` 
and `nullable()`).



##########
catalogs-contrib/catalog-jdbc-clickhouse/src/test/java/org/apache/gravitino/catalog/clickhouse/operations/TestClickHouseTableOperationsCluster.java:
##########
@@ -189,7 +193,28 @@ void 
testGenerateCreateTableSqlWithDistributedEngineWithoutOnCluster() {
         ops.buildCreateSql(
             "tbl", columns, "comment", props, null, Distributions.NONE, new 
Index[0], null);
 
-    Assertions.assertTrue(sql.startsWith("CREATE TABLE `tbl` ("));
+    Assertions.assertTrue(sql.startsWith("CREATE TABLE `tbl`"));
+    Assertions.assertTrue(
+        sql.contains("ENGINE = 
Distributed(`ck_cluster`,`remote_db`,`remote_table`,`user_id`)"));
+  }
+
+  @Test
+  void testDistributedTableWithEmptyColumns() {
+    JdbcColumn[] columns = new JdbcColumn[] {};
+
+    Map<String, String> props = new HashMap<>();
+    props.put(ClusterConstants.CLUSTER_NAME, "ck_cluster");
+    props.put(GRAVITINO_ENGINE_KEY, "Distributed");
+    props.put(DistributedTableConstants.REMOTE_DATABASE, "remote_db");
+    props.put(DistributedTableConstants.REMOTE_TABLE, "remote_table");
+    props.put(DistributedTableConstants.SHARDING_KEY, "user_id");
+
+    String sql =
+        ops.buildCreateSql(
+            "tbl", columns, "comment", props, null, Distributions.NONE, new 
Index[0], null);
+
+    Assertions.assertTrue(sql.startsWith("CREATE TABLE `tbl`"));
+    Assertions.assertEquals(sql.contains("AS `remote_db`.`remote_table`"), 
true);

Review Comment:
   Use Assertions.assertTrue(...) (or assertFalse) instead of 
assertEquals(booleanExpr, true). The current assertion is harder to read and 
produces less helpful failure messages.
   ```suggestion
       Assertions.assertTrue(sql.contains("AS `remote_db`.`remote_table`"));
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to