FANNG1 opened a new issue, #10368:
URL: https://github.com/apache/gravitino/issues/10368

   Tables created through the Flink Paimon connector do not persist their distribution today.
   
   Background:
   - Flink createTable always sends `Distributions.NONE` and empty sort orders:
     - flink-connector/flink/src/main/java/org/apache/gravitino/flink/connector/catalog/BaseCatalog.java:307
   - Paimon table builder removes `bucket`/`bucket-key` from properties, then only writes them back from `distribution`:
     - catalogs/catalog-lakehouse-paimon/src/main/java/org/apache/gravitino/catalog/lakehouse/paimon/GravitinoPaimonTable.java:89
     - catalogs/catalog-lakehouse-paimon/src/main/java/org/apache/gravitino/catalog/lakehouse/paimon/GravitinoPaimonTable.java:91
   
   So when the Flink path sends `Distributions.NONE`, the `bucket`/`bucket-key` options are removed and never written back, and the distribution info is lost.
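
A minimal sketch of the interaction described above, to make the loss concrete. It is not the Gravitino code: `DistributionLossSketch`, `HashDistribution`, and `effectiveOptions` are hypothetical names, and the model only assumes what the background states (bucket options are stripped from properties and restored solely from the distribution).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class DistributionLossSketch {

  /** Hypothetical stand-in for a hash distribution; not a Gravitino type. */
  record HashDistribution(int bucketNum, String bucketKey) {}

  /**
   * Models the described builder behavior: drop the bucket options from the
   * properties, then restore them only if a distribution is supplied.
   */
  static Map<String, String> effectiveOptions(
      Map<String, String> properties, Optional<HashDistribution> distribution) {
    Map<String, String> options = new HashMap<>(properties);
    options.remove("bucket");
    options.remove("bucket-key");
    distribution.ifPresent(d -> {
      options.put("bucket", String.valueOf(d.bucketNum()));
      options.put("bucket-key", d.bucketKey());
    });
    return options;
  }

  public static void main(String[] args) {
    Map<String, String> props = Map.of("bucket", "4", "bucket-key", "id");
    // No distribution (the Distributions.NONE case): bucket options are gone.
    System.out.println(effectiveOptions(props, Optional.empty()));
    // With a distribution: bucket options are written back.
    System.out.println(effectiveOptions(props, Optional.of(new HashDistribution(4, "id"))));
  }
}
```

In this model the first call returns an empty map, which mirrors the reported symptom: the options given in the `WITH` clause never reach the stored table.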
   
   Suggested SQL for validation:
   
   ```sql
   USE CATALOG paimon_catalog;
   CREATE DATABASE IF NOT EXISTS dist_db;
   USE dist_db;
   
   CREATE TABLE dist_tbl (
     id BIGINT,
     name STRING,
     PRIMARY KEY (id) NOT ENFORCED
   ) WITH (
     'bucket' = '4',
     'bucket-key' = 'id'
   );
   
   SHOW CREATE TABLE dist_tbl;
   ```
   
   How should we improve?
   - In the Flink connector, parse the Paimon bucket options (`bucket`, `bucket-key`) into a Gravitino `Distribution` for createTable.
   - Keep behavior backward-compatible when bucket options are absent.
   - Add a Flink Paimon integration test to assert that the distribution is persisted (via Gravitino table metadata).
   - Update the Flink Paimon docs with a distribution example and limitation notes (HASH only).
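
The parsing step in the first bullet could look roughly like the sketch below. It is self-contained and deliberately avoids the real Gravitino and Paimon classes: `BucketOptionSketch`, `HashDistribution`, and `fromOptions` are hypothetical names. Treating a missing or non-positive `bucket` value as "no distribution" is an assumption made here for backward compatibility, not confirmed connector behavior; an actual change would return `Distributions.NONE` or a HASH `Distribution` instead of the `Optional` used here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class BucketOptionSketch {

  /** Hypothetical stand-in for a HASH distribution; not a Gravitino type. */
  record HashDistribution(int bucketNum, List<String> bucketKeys) {}

  /**
   * Derives a hash distribution from table options.
   * Returns empty when no fixed bucket count is configured, so tables
   * without bucket options behave as they do today.
   */
  static Optional<HashDistribution> fromOptions(Map<String, String> options) {
    String bucket = options.get("bucket");
    if (bucket == null) {
      return Optional.empty();
    }
    int bucketNum = Integer.parseInt(bucket.trim());
    if (bucketNum <= 0) {
      // Assumption: non-positive values do not describe a fixed hash distribution.
      return Optional.empty();
    }
    // `bucket-key` is read as a comma-separated column list; blanks are ignored.
    List<String> keys = Arrays.stream(options.getOrDefault("bucket-key", "").split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .toList();
    return Optional.of(new HashDistribution(bucketNum, keys));
  }

  public static void main(String[] args) {
    System.out.println(fromOptions(Map.of("bucket", "4", "bucket-key", "id")));
    System.out.println(fromOptions(Map.of("path", "/tmp/t")));
  }
}
```

When `bucket-key` is absent the key list is empty; whether the connector should then fall back to the primary key columns is left open in this sketch.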
   
   Out of scope:
   - Sort orders.
   - Row-level operations.
   

