LiebingYu commented on issue #846: URL: https://github.com/apache/fluss/issues/846#issuecomment-3400363626
> [@luoyuxia](https://github.com/luoyuxia) If no one is working on it, I’m willing to take it on.
>
> I plan to add a new interface method in `LakeCatalog`. Each lake plugin can then implement its own logic, and in `CoordinatorService` we can check whether the existing lake table's `TableDescriptor` is compatible with the Fluss table's `TableDescriptor`.
>
>     @PublicEvolving
>     public interface LakeCatalog extends AutoCloseable {
>
>         /**
>          * Get a table in lake.
>          *
>          * @param tablePath path of the table to get
>          * @throws TableNotExistException if the table does not exist
>          */
>         TableDescriptor getTable(TablePath tablePath) throws TableNotExistException;
>     }

After some attempts, I found that it is difficult to rebuild a `TableDescriptor` from a Paimon `Table`. For example:

```sql
-- create a fluss table
CREATE TABLE `fluss_catalog`.`fluss`.`fluss_t1` (
  `a` VARCHAR(2147483647),
  `b` VARCHAR(2147483647)
) WITH (
  'table.replication.factor' = '1',
  'table.datalake.format' = 'paimon',
  'table.datalake.freshness' = '30s',
  'table.datalake.paimon.metastore' = 'filesystem',
  'table.datalake.enabled' = 'true',
  'bucket.num' = '1',
  'table.datalake.paimon.warehouse' = '/tmp/paimon',
  'bootstrap.servers' = 'localhost:9123',
  'lookup.max-retries' = '3'
);

-- get lake table
-- will have extra options: bucket, path
-- In addition, there are options such as bucket-key and branch.
-- Attempting to exhaustively enumerate all possible options that Paimon might add is error-prone.
CREATE TABLE `paimon`.`fluss`.`fluss_t1` (
  `a` VARCHAR(2147483647),
  `b` VARCHAR(2147483647),
  `__bucket` INT,
  `__offset` BIGINT,
  `__timestamp` TIMESTAMP(6) WITH LOCAL TIME ZONE
) WITH (
  'bucket' = '-1',
  'fluss.lookup.max-retries' = '3',
  'path' = 'file:/tmp/paimon/fluss.db/fluss_t1',
  'fluss.table.replication.factor' = '1',
  'fluss.table.datalake.enabled' = 'true',
  'fluss.bucket.num' = '1',
  'fluss.table.datalake.format' = 'paimon',
  'fluss.table.datalake.freshness' = '30s'
)
```

Therefore, my suggestion is that in `LakeCatalog#createTable`, if the table already exists, we directly compare the schemas of the two Paimon tables (the one about to be created and the existing one) to check for consistency.

Of course, this has a drawback: if the newly created Fluss table changes a property of the existing table, an exception will still be thrown, even if that property is allowed to be changed.

What do you think? CC @luoyuxia @wuchong

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
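
The consistency check proposed in the comment above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the Fluss or Paimon implementation: `LakeSchemaCheck` and its methods are hypothetical, columns are modeled as a plain name-to-type map instead of Paimon's own schema classes, and the choice to compare only options carrying the `fluss.` prefix (ignoring Paimon-added ones such as `bucket` and `path`) is an assumption rather than something the comment settles.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration; not part of Fluss or Paimon. */
class LakeSchemaCheck {

    // Assumption: options that originate from Fluss are stored with this prefix.
    static final String FLUSS_PREFIX = "fluss.";

    /** Keeps only the options whose key starts with "fluss.". */
    static Map<String, String> flussOptions(Map<String, String> options) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> e : options.entrySet()) {
            if (e.getKey().startsWith(FLUSS_PREFIX)) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    /** Columns (name -> type) must match exactly, including their order. */
    static boolean sameColumns(LinkedHashMap<String, String> a,
                               LinkedHashMap<String, String> b) {
        return new ArrayList<>(a.entrySet()).equals(new ArrayList<>(b.entrySet()));
    }

    /** True if the schema to be created matches the existing lake table. */
    static boolean isCompatible(LinkedHashMap<String, String> expectedColumns,
                                Map<String, String> expectedOptions,
                                LinkedHashMap<String, String> existingColumns,
                                Map<String, String> existingOptions) {
        return sameColumns(expectedColumns, existingColumns)
                && flussOptions(expectedOptions).equals(flussOptions(existingOptions));
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> columns = new LinkedHashMap<>();
        columns.put("a", "STRING");
        columns.put("b", "STRING");
        columns.put("__bucket", "INT");
        columns.put("__offset", "BIGINT");

        Map<String, String> expected = new TreeMap<>();
        expected.put("fluss.table.datalake.freshness", "30s");

        Map<String, String> existing = new TreeMap<>(expected);
        existing.put("path", "file:/tmp/paimon/fluss.db/fluss_t1"); // added by Paimon

        System.out.println(isCompatible(columns, expected, columns, existing));

        // Changing a Fluss property makes the strict comparison fail.
        expected.put("fluss.table.datalake.freshness", "1min");
        System.out.println(isCompatible(columns, expected, columns, existing));
    }
}
```

The second call in `main` illustrates the drawback raised in the comment: under a strict comparison, even a changed property that is allowed to change is reported as an incompatibility.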
