[ 
https://issues.apache.org/jira/browse/HUDI-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-7242:
---------------------------------
    Labels: pull-request-available  (was: )

> Avoid unnecessary bigquery table update when using sync tool
> ------------------------------------------------------------
>
>                 Key: HUDI-7242
>                 URL: https://issues.apache.org/jira/browse/HUDI-7242
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: meta-sync
>            Reporter: Jinpeng Zhou
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.14.0, 0.14.1
>
>
> The PR (https://github.com/apache/hudi/pull/9482) added a table schema 
> update step to the BigQuery sync tool. It appears to have two issues:
> 1. When it rebuilds the schema that is then compared to the BigQuery table 
> schema, the rebuilt schema puts partition fields at the beginning, while the 
> BigQuery table schema by default has partition fields at the end. So it 
> unnecessarily triggers a schema update due to the difference in field order.
> 2. Though the sync tool in 0.14.0 does not support a BigLake connection ID 
> (a recent PR from last month adds this support), users can still 
> recreate their table manually with a connection ID. The table update 
> adds the new schema to the external table definition. This does not work for 
> BigLake tables and causes the error: "Schema can be specified only on the 
> Table.Schema field for external tables with an associated connection_id but 
> schema was provided on Table.Externaldataconfig.Schema". 
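
The first issue above comes down to comparing two schemas in a way that is sensitive to field order. A minimal sketch of an order-insensitive comparison follows. It assumes a flat schema modeled as a map from field name to type name; the class SchemaOrderCheck and its methods are hypothetical illustrations, not part of Hudi or the BigQuery client library, and nested fields are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SchemaOrderCheck {

    // Hypothetical helper, not a Hudi API: compares two flat schemas
    // (field name -> type name) while ignoring field order.
    public static boolean sameFieldsIgnoringOrder(Map<String, String> a, Map<String, String> b) {
        // Map.equals compares the entry sets, so insertion order does not matter.
        return a.equals(b);
    }

    // Builds a schema from alternating name/type arguments, preserving the given order.
    public static Map<String, String> schema(String... nameTypePairs) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i + 1 < nameTypePairs.length; i += 2) {
            fields.put(nameTypePairs[i], nameTypePairs[i + 1]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Rebuilt schema: partition field "dt" placed first.
        Map<String, String> rebuilt = schema("dt", "STRING", "id", "INT64", "name", "STRING");
        // Existing table schema: partition field "dt" placed last.
        Map<String, String> existing = schema("id", "INT64", "name", "STRING", "dt", "STRING");
        // Same fields, different order: no update should be needed.
        System.out.println(sameFieldsIgnoringOrder(rebuilt, existing));

        // A real change (type of "dt" differs) should still be detected.
        Map<String, String> changed = schema("id", "INT64", "name", "STRING", "dt", "DATE");
        System.out.println(sameFieldsIgnoringOrder(rebuilt, changed));
    }
}
```

With a check like this, a schema update would be triggered only when a field is added, removed, or changes type, not when the partition fields merely appear in a different position.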



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
