trushev commented on PR #5830: URL: https://github.com/apache/hudi/pull/5830#issuecomment-1314898583
> Actually I'm totally confused by these schema use cases. Can we list a summary here of the cases in which we use the writer/reader schema, for schema evolution enabled/disabled?

Schema evolution **disabled**:

- Reader schema example: [FormatUtils](https://github.com/apache/hudi/blob/6b0b03b12b5b35efd16eb976d48edba876803ca0/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/FormatUtils.java#L173-L189)
- Writer schema example: [HoodieLogCompactionPlanGenerator](https://github.com/apache/hudi/blob/6b0b03b12b5b35efd16eb976d48edba876803ca0/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/plan/generators/HoodieLogCompactionPlanGenerator.java#L85-L95)

Schema evolution **enabled**:

- Writer schema: used in all the previous cases (the same ones as with schema evolution disabled), as well as wherever we can find an appropriate `internalSchema`. For example, [HoodieCompactor](https://github.com/apache/hudi/blob/6b0b03b12b5b35efd16eb976d48edba876803ca0/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/HoodieCompactor.java#L163-L204).
- Reader schema: the rest of the cases.

**Why do we use the writer schema here?** The main idea is to read the log block as is. Then we "cast" the log block using `HoodieAvroUtils.rewriteRecordWithNewSchema` here: [AbstractHoodieLogRecordReader](https://github.com/apache/hudi/blob/6b0b03b12b5b35efd16eb976d48edba876803ca0/hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java#L634)

> Isn't this proof that two schemas may cause bugs in corner cases?

Agree. Maybe we should create another `AbstractHoodieEvolveLogReader` that works with `InternalSchema` only. That looks like a lot of changes that are related not to Flink but to the core of schema evolution and to the other engines in Hudi.
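To make the "read the log block as is, then cast" idea concrete, here is a minimal, dependency-free sketch. It is not Hudi code: the class `SchemaCastSketch`, both methods, and the field names are hypothetical, records are modeled as plain maps instead of Avro records, and the cast only handles added and dropped columns. The real `HoodieAvroUtils.rewriteRecordWithNewSchema` may behave differently in detail.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; none of these names are Hudi APIs. */
public class SchemaCastSketch {

    /** Step 1: decode a stored row "as is" by pairing its values with the writer schema's field order. */
    public static Map<String, Object> readWithWriterSchema(List<String> writerFields, List<Object> storedRow) {
        if (writerFields.size() != storedRow.size()) {
            throw new IllegalArgumentException("row does not match the writer schema");
        }
        Map<String, Object> record = new LinkedHashMap<>();
        for (int i = 0; i < writerFields.size(); i++) {
            record.put(writerFields.get(i), storedRow.get(i));
        }
        return record;
    }

    /**
     * Step 2: "cast" the record to the target schema. Fields shared by both schemas are copied,
     * fields missing from the target are dropped, and fields new in the target get their default.
     */
    public static Map<String, Object> rewriteToTargetSchema(Map<String, Object> record,
                                                            Map<String, Object> targetFieldsWithDefaults) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : targetFieldsWithDefaults.entrySet()) {
            String name = field.getKey();
            out.put(name, record.containsKey(name) ? record.get(name) : field.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed example data: the block was written with three columns.
        List<String> writerFields = Arrays.asList("id", "name", "legacy_flag");
        List<Object> storedRow = Arrays.asList(1, "a", true);

        // Assumed evolved schema: "legacy_flag" was dropped and "price" was added with a null default.
        Map<String, Object> target = new LinkedHashMap<>();
        target.put("id", null);
        target.put("name", null);
        target.put("price", null);

        Map<String, Object> asWritten = readWithWriterSchema(writerFields, storedRow);
        Map<String, Object> evolved = rewriteToTargetSchema(asWritten, target);
        System.out.println(evolved);
    }
}
```

The point of the two-step split is that decoding never has to guess: the block is always interpreted with the schema it was written with, and all evolution logic lives in the single rewrite step.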
