yihua commented on code in PR #9876:
URL: https://github.com/apache/hudi/pull/9876#discussion_r1367806982
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java:
##########
@@ -652,6 +660,16 @@ private static Map<HeaderMetadataType, String> getUpdatedHeader(Map<HeaderMetada
     if (addBlockIdentifier && !HoodieTableMetadata.isMetadataTable(config.getBasePath())) { // add block sequence numbers only for data table.
       updatedHeader.put(HeaderMetadataType.BLOCK_IDENTIFIER, attemptNumber + "," + blockSequenceNumber);
     }
+    if (config.shouldWritePartialUpdates()) {
+      // When enabling writing partial updates to the data blocks, the full schema is also written
+      // to the block header so that the reader can differentiate partial updates vs schema
+      // evolution, based on the "SCHEMA" which contains the partial schema and the "FULL_SCHEMA"
+      // which contains the full schema of the table at this time.
+      updatedHeader.put(
+          HeaderMetadataType.FULL_SCHEMA,
+          HoodieAvroUtils.addMetadataFields(
+              getWriteSchema(config), config.allowOperationMetadataField()).toString());
Review Comment:
`getWriteSchema(config)` fetches the full schema at this particular commit.
For schema evolution, I think it's better to keep the full schema in each log
block, as the schema can evolve across log blocks.
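
To illustrate why a per-block full schema helps, here is a rough sketch of how a reader might tell a partial-update block from an ordinary block. This is not Hudi's reader code: the class `PartialUpdateHeaderSketch`, the `BlockKind` enum, and the `classify` method are hypothetical, the header is modeled as a plain string map, and schemas are simplified to comma-separated field names instead of Avro JSON.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only; names and header layout are assumptions, not Hudi APIs.
final class PartialUpdateHeaderSketch {

    enum BlockKind { FULL_RECORDS, PARTIAL_UPDATE }

    // Assumption: a schema is represented as "field1,field2,..." for brevity.
    static Set<String> fields(String schema) {
        return new LinkedHashSet<>(Arrays.asList(schema.split(",")));
    }

    // Classifies one log block from its own header, so that each block is
    // interpreted against the full schema that was current when it was written.
    static BlockKind classify(Map<String, String> header) {
        String fullSchema = header.get("FULL_SCHEMA");
        if (fullSchema == null) {
            // No full schema recorded: treat "SCHEMA" as the complete record schema.
            return BlockKind.FULL_RECORDS;
        }
        Set<String> blockFields = fields(header.get("SCHEMA"));
        Set<String> fullFields = fields(fullSchema);
        // A partial update carries a strict subset of the full schema's fields.
        boolean strictSubset =
            fullFields.containsAll(blockFields) && blockFields.size() < fullFields.size();
        return strictSubset ? BlockKind.PARTIAL_UPDATE : BlockKind.FULL_RECORDS;
    }

    public static void main(String[] args) {
        Map<String, String> partialBlock = new HashMap<>();
        partialBlock.put("SCHEMA", "id,price");
        partialBlock.put("FULL_SCHEMA", "id,name,price");

        // A later block written after the table schema evolved (new "discount" field).
        Map<String, String> evolvedBlock = new HashMap<>();
        evolvedBlock.put("SCHEMA", "id,name,price,discount");

        System.out.println(classify(partialBlock)); // PARTIAL_UPDATE
        System.out.println(classify(evolvedBlock)); // FULL_RECORDS
    }
}
```

Because the decision uses only the header of the block being read, it stays correct when the table schema changes between blocks, which is the case raised in the comment above.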