mzheng-plaid commented on issue #11865: URL: https://github.com/apache/hudi/issues/11865#issuecomment-2322512573
OK, I think the root cause is that the upgrade silently turned on schema validation, via https://github.com/apache/hudi/blob/release-0.14.1/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java#L822

```
boolean shouldValidate = config.shouldValidateAvroSchema();
boolean allowProjection = config.shouldAllowAutoEvolutionColumnDrop();
if ((!shouldValidate && allowProjection)
    || getActiveTimeline().getCommitsTimeline().filterCompletedInstants().empty()
    || StringUtils.isNullOrEmpty(config.getSchema())
) {
  // Check not required
  return;
}
```

Previously, in 0.12.2: https://github.com/apache/hudi/blob/release-0.12.2/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java#L749C1-L754C6

```
if (!config.getAvroSchemaValidate() || getActiveTimeline().getCommitsTimeline().filterCompletedInstants().empty()) {
  // Check not required
  return;
}
```

Questions:

1. Why is disabling schema validation coupled with `hoodie.datasource.write.schema.allow.auto.evolution.column.drop`? Why is schema validation now silently turned _on_ by default?
2. Did some table upgrade silently fail to run, leaving the schemas broken? The namespace/name change seems like an internal detail.
3. It seems like we can just turn off schema validation, **but** is there a way to fix the schemas of our tables without a re-write?
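To make the difference between the two guards concrete, here is a minimal sketch that models each early-return condition as a plain boolean function. The class and method names are hypothetical, and the boolean parameters are stand-ins (my assumption) for the config getters and timeline/schema checks in the snippets above; this is not Hudi code.

```java
public class SkipCheckSketch {

    // Models the 0.12.2 guard: the check is skipped when validation is off
    // or when the table has no completed commits.
    static boolean skipsCheckIn0122(boolean validate, boolean hasCompletedCommits) {
        return !validate || !hasCompletedCommits;
    }

    // Models the 0.14.1 guard: turning validation off only skips the check
    // when auto-evolution column drop is also allowed.
    static boolean skipsCheckIn0141(boolean validate,
                                    boolean allowColumnDrop,
                                    boolean hasCompletedCommits,
                                    boolean hasWriterSchema) {
        return (!validate && allowColumnDrop) || !hasCompletedCommits || !hasWriterSchema;
    }

    public static void main(String[] args) {
        boolean[] flags = {false, true};
        // Assume an existing table with completed commits and a writer schema set.
        for (boolean validate : flags) {
            for (boolean allowDrop : flags) {
                System.out.printf(
                    "validate=%b allowColumnDrop=%b -> 0.12.2 skips: %b, 0.14.1 skips: %b%n",
                    validate, allowDrop,
                    skipsCheckIn0122(validate, true),
                    skipsCheckIn0141(validate, allowDrop, true, true));
            }
        }
    }
}
```

Under these assumptions, the case with validation off and column drop not allowed is the one that changes: the 0.12.2 guard returns early, while the 0.14.1 guard falls through to the schema check.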
