mzheng-plaid commented on issue #11865:
URL: https://github.com/apache/hudi/issues/11865#issuecomment-2322512573

   OK, I think the root cause is that the upgrade silently turned on schema 
validation via this check in 0.14.1:
   
https://github.com/apache/hudi/blob/release-0.14.1/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java#L822
   
   ```
       boolean shouldValidate = config.shouldValidateAvroSchema();
       boolean allowProjection = config.shouldAllowAutoEvolutionColumnDrop();
       if ((!shouldValidate && allowProjection)
           || getActiveTimeline().getCommitsTimeline().filterCompletedInstants().empty()
           || StringUtils.isNullOrEmpty(config.getSchema())
       ) {
         // Check not required
         return;
       }
   ```
   
   Previously, in 0.12.2, the check was:
https://github.com/apache/hudi/blob/release-0.12.2/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java#L749C1-L754C6
   ```
       if (!config.getAvroSchemaValidate() || getActiveTimeline().getCommitsTimeline().filterCompletedInstants().empty()) {
         // Check not required
         return;
       }
   ```
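   
   To make the difference concrete, here is a minimal sketch that reduces the two early-return conditions quoted above to plain boolean functions and prints when each version skips the check. The class name, method names, and parameter names are hypothetical, and the timeline and schema lookups are replaced by boolean stand-ins; this is not Hudi code.
   
   ```java
   public class SchemaCheckSketch {
       // Stand-in for the 0.12.2 condition: skip when validation is off
       // or when the table has no completed commits.
       public static boolean skipsCheck0122(boolean shouldValidate, boolean hasCompletedCommits) {
           return !shouldValidate || !hasCompletedCommits;
       }

       // Stand-in for the 0.14.1 condition: validation being off is only
       // enough to skip when column drop is also allowed.
       public static boolean skipsCheck0141(boolean shouldValidate, boolean allowColumnDrop,
                                            boolean hasCompletedCommits, boolean hasWriterSchema) {
           return (!shouldValidate && allowColumnDrop)
               || !hasCompletedCommits
               || !hasWriterSchema;
       }

       public static void main(String[] args) {
           boolean[] flags = {false, true};
           // Assume a table with completed commits and a non-empty writer schema.
           for (boolean validate : flags) {
               for (boolean drop : flags) {
                   System.out.printf(
                       "validate=%b allowColumnDrop=%b -> 0.12.2 skips: %b, 0.14.1 skips: %b%n",
                       validate, drop,
                       skipsCheck0122(validate, true),
                       skipsCheck0141(validate, drop, true, true));
               }
           }
       }
   }
   ```
   
   With validation off and column drop not allowed, the 0.12.2 condition returns early while the 0.14.1 condition falls through to the schema check, which matches the behavior change described above.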
   
   Questions:
   
   1. Why is there a coupling between 
`hoodie.datasource.write.schema.allow.auto.evolution.column.drop` and disabling 
schema validation? Why is schema validation now silently turned _on_ by 
default?
   2. Did some table upgrade silently fail, leaving the schemas broken? The 
namespace/name change seems like an internal detail.
   3. It seems like we can just turn off schema validation, **but** is there a 
way to fix the schemas of our tables without a rewrite?
   
   

