Armelabdelkbir opened a new issue, #11803:
URL: https://github.com/apache/hudi/issues/11803
**Describe the problem you faced**
Hello, I am testing several schema evolution use cases using Hudi 0.15 and
Spark 3.5 with HMS 4.
1. Adding a column in PG --> Debezium / schema registry OK --> Hudi
(MOR) HiveSyncTool KO
```
org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:250)
at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:193)
at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:167)
... 69 more
Caused by: InvalidOperationException(message:The following columns have types incompatible with the existing columns in their respective positions : username)
```
2. Type promotion following this doc:
https://hudi.apache.org/docs/schema_evolution/#type-promotions. In PG, Double
to String OK --> Debezium / schema registry OK --> Hudi (MOR) HiveSyncTool KO
`Caused by: org.apache.hudi.hive.HoodieHiveSyncException: Could not convert field Type from DOUBLE to string for field salary`
3. DROP column in PG --> Debezium / schema registry OK --> Hudi (MOR)
HiveSyncTool job does not fail, but the dropped column is still present in the
table, with null values for the newest inserts.
My configuration is:
```
"hoodie.datasource.hive_sync.ignore_exceptions" -> "true",
"hoodie.write.set.null.for.missing.columns" -> "true",
"hoodie.schema.on.read.enable" -> "true",
"hive.metastore.disallow.incompatible.col.type.changes" -> "false"
```
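For anyone reproducing this, the settings above could be grouped as in the sketch below. The object and value names (`SyncSettings`, `hudiWriterOptions`, `metastoreSettings`) are hypothetical and not part of the original job. The split rests on an assumption: with a remote Hive Metastore, `hive.metastore.disallow.incompatible.col.type.changes` is read by the metastore service from its own configuration, so passing it alongside the Hudi writer options may not influence the `alter_table` check that appears in the stack trace.

```scala
// Hypothetical grouping of the settings quoted in this report; names are illustrative only.
object SyncSettings {
  // Options from the report that are Hudi writer options
  // (typically passed to the Spark write via .options(...)).
  val hudiWriterOptions: Map[String, String] = Map(
    "hoodie.datasource.hive_sync.ignore_exceptions" -> "true",
    "hoodie.write.set.null.for.missing.columns" -> "true",
    "hoodie.schema.on.read.enable" -> "true"
  )

  // Assumption: this is a Hive Metastore property, evaluated by the metastore
  // service itself when a remote metastore is used, not by the Hudi writer.
  val metastoreSettings: Map[String, String] = Map(
    "hive.metastore.disallow.incompatible.col.type.changes" -> "false"
  )

  def main(args: Array[String]): Unit = {
    // Print both groups so it is clear where each setting is expected to live.
    hudiWriterOptions.foreach { case (k, v) => println(s"writer: $k=$v") }
    metastoreSettings.foreach { case (k, v) => println(s"metastore: $k=$v") }
  }
}
```

Keeping the two groups apart makes it easier to check whether the metastore-side value is actually in effect on the HMS instance that rejects the schema update.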
When I drop the tables (ro / rt) and restart my job, it creates the new tables
correctly, but this is not a production-ready way to handle schema evolution.
**To Reproduce**
Steps to reproduce the behavior:
1. On the PG source side:
`cdc_hudi=> ALTER TABLE employees ADD COLUMN test_str VARCHAR ;
ALTER TABLE
cdc_hudi=> INSERT INTO employees (name, department, username, test_str)
VALUES ('armel011', 'Engineering', 'arm23220', 'teststr');
INSERT 0 1`
2. Debezium / schema registry OK (the latest schema version contains the added column)

3. Restart the Spark job
**Expected behavior**
Case 1. add column
Case 2. change datatype double to string
Case 3. drop column
**Environment Description**
* Hudi version : 0.15.0
* Spark version : 3.5.1
* Hive version : 4.0.0
* Hadoop version : 3.4
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : no
**Stacktrace**
```
Caused by: org.apache.hudi.exception.HoodieException: Got runtime exception when hive syncing employees
at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:170)
at org.apache.hudi.sync.common.util.SyncUtilHelpers.runHoodieMetaSync(SyncUtilHelpers.java:79)
... 68 more
Caused by: org.apache.hudi.hive.HoodieHiveSyncException: Failed to update table for employees_ro
at org.apache.hudi.hive.ddl.HMSDDLExecutor.updateTableDefinition(HMSDDLExecutor.java:162)
at org.apache.hudi.hive.HoodieHiveSyncClient.updateTableSchema(HoodieHiveSyncClient.java:205)
at org.apache.hudi.hive.HiveSyncTool.syncSchema(HiveSyncTool.java:347)
at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:250)
at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:193)
at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:167)
... 69 more
Caused by: InvalidOperationException(message:The following columns have types incompatible with the existing columns in their respective positions : test_str)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_with_environment_context_result$alter_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:59744)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_with_environment_context_result$alter_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:59730)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_with_environment_context_result.read(ThriftHiveMetastore.java:59672)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:88)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table_with_environment_context(ThriftHiveMetastore.java:1693)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table_with_environment_context(ThriftHiveMetastore.java:1677)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:373)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:322)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)
```