[
https://issues.apache.org/jira/browse/HUDI-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HUDI-7242:
---------------------------------
Labels: pull-request-available (was: )
> Avoid unnecessary bigquery table update when using sync tool
> ------------------------------------------------------------
>
> Key: HUDI-7242
> URL: https://issues.apache.org/jira/browse/HUDI-7242
> Project: Apache Hudi
> Issue Type: Bug
> Components: meta-sync
> Reporter: Jinpeng Zhou
> Priority: Minor
> Labels: pull-request-available
> Fix For: 0.14.0, 0.14.1
>
>
> The [PR](https://github.com/apache/hudi/pull/9482) added a table schema
> update step to the BigQuery sync tool. There appear to be two issues:
> 1. When the tool rebuilds the schema to compare it against the BigQuery table
> schema, the rebuilt schema puts partition fields at the beginning, while the
> BigQuery table schema by default has partition fields at the end. The
> comparison therefore unnecessarily triggers a schema update because of the
> order difference.
> 2. Although the sync tool in 0.14.0 does not support a BigLake connection ID
> (a recent PR from last month adds this support), users can still recreate
> their table manually with a connection ID. The table update writes the new
> schema into the external table definition. This does not work for BigLake
> tables and causes the error: "Schema can be specified only on the
> Table.Schema field for external tables with an associated connection_id but
> schema was provided on Table.Externaldataconfig.Schema".
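
The ordering problem in item 1 can be illustrated with a minimal Python sketch. This is not the Hudi implementation: the `Field` type and the `schemas_equivalent` helper are hypothetical names, and the sketch assumes a flat schema with unique field names in which column order carries no meaning for the comparison.

```python
from typing import NamedTuple, Sequence


class Field(NamedTuple):
    # Hypothetical stand-in for a BigQuery column definition.
    name: str
    type: str
    mode: str = "NULLABLE"


def schemas_equivalent(a: Sequence[Field], b: Sequence[Field]) -> bool:
    """Compare two flat schemas while ignoring column order."""
    if len(a) != len(b):
        return False
    # Key by name so position does not matter; assumes unique names.
    return {f.name: f for f in a} == {f.name: f for f in b}


# Schema rebuilt by the sync step: partition field first.
rebuilt = [Field("dt", "STRING"), Field("id", "INTEGER"), Field("name", "STRING")]
# Schema as stored on the table: partition field last.
existing = [Field("id", "INTEGER"), Field("name", "STRING"), Field("dt", "STRING")]

# An order-sensitive comparison reports a difference ...
needs_update_ordered = rebuilt != existing
# ... while an order-insensitive one does not.
needs_update_unordered = not schemas_equivalent(rebuilt, existing)
```

With plain list equality the reordered schema looks changed, so an update would be issued; with the order-insensitive check, the same two schemas compare as equal and no update would be needed.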
--
This message was sent by Atlassian Jira
(v8.20.10#820010)