nada-attia opened a new issue, #18008: URL: https://github.com/apache/hudi/issues/18008
### Task Description

**Why this task is needed:**
Currently, Hudi performs HMS schema sync as a post-commit operation. This creates a critical failure scenario: if a writer successfully commits data with an evolved schema but the subsequent HMS sync fails, the Hudi table schema and the HMS schema diverge. This divergence causes query failures for downstream consumers (Spark, Presto) that rely on HMS for schema metadata, and it requires manual intervention to reconcile the schemas (i.e., rolling back the commits that introduced the schema changes).

**What needs to be done:**
To prevent this issue, Deltastreamer and Datasource writers should perform HMS schema sync before creating a commit when schema changes are detected and `hoodie.datasource.hive_sync.enable=true`. If the pre-commit HMS sync fails, the write operation should fail without creating a commit, ensuring that the Hudi table schema and the HMS schema always remain consistent. This approach provides fail-fast behavior and eliminates the schema divergence window entirely.

### Task Type

Code improvement/refactoring

### Related Issues

**Parent feature issue:** (if applicable)

**Related issues:**

NOTE: Use the `Relationships` button to add parent/blocking issues after the issue is created.
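The proposed ordering can be illustrated with a minimal sketch. This is not Hudi code: `write_with_pre_commit_sync`, `MetastoreSyncError`, and the two callbacks are hypothetical names chosen for illustration, and a plain dict stands in for a real table schema. It only shows the control flow described above: sync first when the schema changed and sync is enabled, and create the commit only if that sync did not fail.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a table schema: column name -> type name.
Schema = Dict[str, str]


class MetastoreSyncError(Exception):
    """Hypothetical error raised when the metastore rejects a schema update."""


def write_with_pre_commit_sync(
    table_schema: Schema,
    incoming_schema: Schema,
    hive_sync_enabled: bool,
    sync_schema_to_hms: Callable[[Schema], None],
    create_commit: Callable[[Schema], str],
) -> str:
    """Sync an evolved schema to the metastore before committing.

    All parameters are assumptions for this sketch; hive_sync_enabled
    mirrors the role of hoodie.datasource.hive_sync.enable.
    """
    schema_changed = incoming_schema != table_schema

    if hive_sync_enabled and schema_changed:
        # If this raises, the exception propagates and create_commit is
        # never called, so no commit with the new schema exists.
        sync_schema_to_hms(incoming_schema)

    # Reached only when the sync succeeded or was not needed.
    return create_commit(incoming_schema)
```

With this ordering, a failed sync surfaces as a failed write rather than as a committed schema that the metastore does not know about. The sketch does not address the opposite case, where the sync succeeds and the commit then fails, which a real implementation would also need to consider.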
