Copilot commented on code in PR #10088:
URL: https://github.com/apache/gravitino/pull/10088#discussion_r2878249536
##########
docs/jdbc-clickhouse-catalog.md:
##########
@@ -191,6 +191,7 @@ Other ClickHouse types are exposed as [External
Type](./manage-relational-metada
:::note
- `settings.*` keys are passed to the ClickHouse `SETTINGS` clause verbatim.
- The `engine` value is immutable after creation.
+- When loading the table meta, Gravitino can't distinguish whether it's a
cluster table or a local table as properties like `cluster-name` and
`on-cluster` can't be fetched from the JDBC metadata currently.
Review Comment:
In this note, “table meta” should be “table metadata”, and consider using
“cannot” instead of “can't” to keep docs tone consistent and formal. The
sentence could also be tightened for readability (it’s currently a bit
long/run-on).
```suggestion
- When loading table metadata, Gravitino cannot determine whether it is a
cluster table or a local table, because properties such as `cluster-name` and
`on-cluster` are not available from the JDBC metadata.
```
##########
catalogs-contrib/catalog-jdbc-clickhouse/src/main/java/org/apache/gravitino/catalog/clickhouse/operations/ClickHouseTableOperations.java:
##########
@@ -343,22 +345,20 @@ private void handleDistributeTable(
Preconditions.checkArgument(
StringUtils.isNotBlank(shardingKey), "Sharding key must be specified
for Distributed");
- List<String> shardingColumns =
ClickHouseTableSqlUtils.extractShardingKeyColumns(shardingKey);
- if (CollectionUtils.isNotEmpty(shardingColumns)) {
- for (String columnName : shardingColumns) {
- JdbcColumn shardingColumn = findColumn(columns, columnName);
- Preconditions.checkArgument(
- shardingColumn != null,
- "Sharding key column %s must be defined in the table",
- columnName);
- }
- }
+ // When creating a distributed table, the columns should be empty as the
distributed table will
+ // reference the remote table's columns, and the sharding key must be
defined in the columns.
+ Preconditions.checkArgument(
+ ArrayUtils.isEmpty(columns),
+ "Columns should be empty when creating a distributed table, but got:
%s",
+ Arrays.toString(columns));
Review Comment:
The inline comment is contradictory: it says the sharding key must be
defined in the columns, but this method now enforces `columns` to be empty for
distributed tables. Please update/remove that part of the comment so it matches
the new behavior.
##########
catalogs-contrib/catalog-jdbc-clickhouse/src/main/java/org/apache/gravitino/catalog/clickhouse/operations/ClickHouseTableOperations.java:
##########
@@ -343,22 +345,20 @@ private void handleDistributeTable(
Preconditions.checkArgument(
StringUtils.isNotBlank(shardingKey), "Sharding key must be specified
for Distributed");
- List<String> shardingColumns =
ClickHouseTableSqlUtils.extractShardingKeyColumns(shardingKey);
- if (CollectionUtils.isNotEmpty(shardingColumns)) {
- for (String columnName : shardingColumns) {
- JdbcColumn shardingColumn = findColumn(columns, columnName);
- Preconditions.checkArgument(
- shardingColumn != null,
- "Sharding key column %s must be defined in the table",
- columnName);
- }
- }
+ // When creating a distributed table, the columns should be empty as the
distributed table will
+ // reference the remote table's columns, and the sharding key must be
defined in the columns.
+ Preconditions.checkArgument(
+ ArrayUtils.isEmpty(columns),
+ "Columns should be empty when creating a distributed table, but got:
%s",
+ Arrays.toString(columns));
Review Comment:
Enforcing distributed-table creation to require an empty `columns` array
means `JdbcCatalogOperations.createTable(...)` will return a `Table` with an
empty schema (it returns the input `columns` without re-loading from the
database). This is a behavior change for API consumers and can be surprising.
Consider either (a) allowing columns to be provided (and just skipping index/PK
generation for Distributed), or (b) making the catalog operation load the
created table from ClickHouse when engine=Distributed so the returned `Table`
has the real columns.
```suggestion
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]