CTTY commented on PR #9816: URL: https://github.com/apache/hudi/pull/9816#issuecomment-1744120561
Hi @danny0405, here is a detailed explanation:

1. There are two database configs: `hoodie.datasource.hive_sync.database` and `hoodie.database.name`. The infer function of `hoodie.datasource.hive_sync.database` uses `hoodie.database.name` to infer its value.
2. `hoodie.database.name` accepts empty values, but `hoodie.datasource.hive_sync.database` does not.
3. This [LOC](https://github.com/apache/hudi/blob/b77286f176f1a606c807139042c2bd1f56883016/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala#L184) sets `hoodie.database.name` to an empty string by default, and that value is persisted to the `hoodie.properties` file.
4. Once the empty `hoodie.database.name` is persisted to the table config, the infer function makes `hoodie.datasource.hive_sync.database` always default to that persisted empty value, which the Hive metastore does not accept.

The root cause of this issue is the discrepancy between `hoodie.datasource.hive_sync.database` and `hoodie.database.name`: the latter accepts empty values and the former does not.
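
To make the chain above concrete, here is a minimal, self-contained sketch of the behavior. It does not use Hudi's actual classes: `DatabaseInferSketch`, `inferNaive`, `inferGuarded`, and the plain `Map` standing in for the table config are all hypothetical, and the `"default"` fallback is an assumption made for illustration. Only the two config keys come from the explanation above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the infer logic; not Hudi's real implementation.
final class DatabaseInferSketch {
    static final String DATABASE_NAME = "hoodie.database.name";
    static final String HIVE_SYNC_DATABASE = "hoodie.datasource.hive_sync.database";
    // Assumed fallback used when no usable database name is available.
    static final String FALLBACK_DATABASE = "default";

    // Mirrors the problem described above: any persisted value of
    // hoodie.database.name, including the empty string, is inferred as-is.
    static String inferNaive(Map<String, String> props) {
        if (props.containsKey(HIVE_SYNC_DATABASE)) {
            return props.get(HIVE_SYNC_DATABASE);
        }
        String dbName = props.get(DATABASE_NAME);
        return dbName != null ? dbName : FALLBACK_DATABASE;
    }

    // One possible guard: treat an empty or blank database name as unset,
    // so the sync database never resolves to an empty string via inference.
    static String inferGuarded(Map<String, String> props) {
        if (props.containsKey(HIVE_SYNC_DATABASE)) {
            return props.get(HIVE_SYNC_DATABASE);
        }
        String dbName = props.get(DATABASE_NAME);
        return (dbName == null || dbName.trim().isEmpty()) ? FALLBACK_DATABASE : dbName;
    }

    public static void main(String[] args) {
        // Simulates a table config where the empty name was already persisted.
        Map<String, String> persisted = new HashMap<>();
        persisted.put(DATABASE_NAME, "");

        System.out.println("naive:   '" + inferNaive(persisted) + "'");
        System.out.println("guarded: '" + inferGuarded(persisted) + "'");
    }
}
```

With an empty `hoodie.database.name` in the map, the naive variant resolves the sync database to an empty string, while the guarded variant falls back to the assumed default. Whether the guard belongs in the infer function or at the point where the empty value is first written is a separate design question.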
