yihua commented on PR #9876: URL: https://github.com/apache/hudi/pull/9876#issuecomment-1778076328
I discussed the comments with @danny0405 offline. There are two things to address in this PR:

(1) Instead of putting both the partial and full schemas in the log block header, when partial updates are enabled, only the partial schema should be added to the log block header, in the existing `SCHEMA` header. The full schema for snapshot reads is always passed in from the table schema. To indicate that the schema is partial, a new log block header `IS_PARTIAL` should be added.

(2) We should let users specify in the MERGE INTO statement whether they want partial updates in the log files of MOR tables, e.g., using something like `col = EXISTING` to indicate that the column values should be kept as is. We may not support this in this PR; instead, we should have an interim write config to control this behavior.
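
To make point (1) concrete, here is a minimal sketch of the idea, not Hudi's actual implementation. It assumes a log block header is a plain string map, represents a schema as a comma-separated list of field names (a real schema would be Avro), and uses the hypothetical helpers `build_header` and `merge_record`. Only the header names `SCHEMA` and `IS_PARTIAL` come from the discussion above.

```python
from typing import Any, Dict, List

# Header keys named in the comment; everything else here is hypothetical.
SCHEMA = "SCHEMA"
IS_PARTIAL = "IS_PARTIAL"


def build_header(written_fields: List[str], partial_updates_enabled: bool) -> Dict[str, str]:
    """Build a log block header that carries only the schema actually written."""
    header = {SCHEMA: ",".join(written_fields)}
    if partial_updates_enabled:
        # Flag the block so readers know SCHEMA is not the full table schema.
        header[IS_PARTIAL] = "true"
    return header


def merge_record(
    existing: Dict[str, Any],
    update: Dict[str, Any],
    header: Dict[str, str],
    table_fields: List[str],
) -> Dict[str, Any]:
    """Apply one log record on top of an existing record.

    The full schema (table_fields) is passed in by the reader rather than
    being read from the block header.
    """
    block_fields = header[SCHEMA].split(",")
    if header.get(IS_PARTIAL) == "true":
        # Partial block: overwrite only the written columns, keep the rest.
        merged = dict(existing)
        for field in block_fields:
            merged[field] = update[field]
        return merged
    # Full block: the update replaces every column of the table schema.
    return {field: update[field] for field in table_fields}
```

For example, with a table schema of `id, name, price`, a partial block written with `id, price` leaves `name` untouched when merged, while a block without `IS_PARTIAL` replaces the whole record.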
