voonhous commented on PR #10951:
URL: https://github.com/apache/hudi/pull/10951#issuecomment-2034230672

   We ran some tests today and found out why 
`hoodie.datasource.hive_sync.support_timestamp` was defaulted to `true`. 
   
   When a schema is altered, hive sync is performed via Spark's external 
catalog in 
   `org.apache.spark.sql.hudi.command.AlterTableCommand#commitWithSchema`, 
and Spark syncs TIMESTAMP types as TIMESTAMP: 
   
   ```scala
   sparkSession.sessionState.catalog
         .externalCatalog
         .alterTableDataSchema(db, tableName, dataSparkSchema)
   ```
   
   If the option defaults to `false`, altering the schema of a table 
containing a TIMESTAMP column changes the type on HMS (Spark writes it as 
TIMESTAMP). Subsequent hive-syncs then fail, which is not ideal.
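
   As a rough sketch (not taken from this PR), pinning the option explicitly 
on the write path could look something like the following. The table and 
database names, `df`, and `basePath` are placeholders, not values from the 
tests above. 

   ```scala
   // Hypothetical write options; table and database names are placeholders.
   val hiveSyncOpts: Map[String, String] = Map(
     "hoodie.table.name" -> "my_table",
     "hoodie.datasource.hive_sync.enable" -> "true",
     "hoodie.datasource.hive_sync.database" -> "my_db",
     "hoodie.datasource.hive_sync.table" -> "my_table",
     // Sync TIMESTAMP columns as TIMESTAMP, matching what Spark's external
     // catalog writes to HMS after an ALTER TABLE.
     "hoodie.datasource.hive_sync.support_timestamp" -> "true"
   )

   // Assumed usage, given an existing DataFrame `df` and a table path `basePath`:
   // df.write.format("hudi").options(hiveSyncOpts).mode("append").save(basePath)
   ```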
   
   I think it's best that we stay consistent with Spark. I will submit 
another PR to change the default back to `true` and add documentation there 
explaining why. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
