kbuci commented on issue #18711: URL: https://github.com/apache/hudi/issues/18711#issuecomment-4423834271
@cshuo:

> The Hudi read path should use the writer/table HoodieSchema to read the file and then project/convert to the requested schema.

Thanks for the clarification! When I first encountered this issue, my thought process was to check whether we could generally ensure that all Hudi Flink read/write code paths "infer" with the HoodieSchema. But I think we should instead follow the model you highlighted: this schema-inference logic only needs to happen at read time, which is already the precedent in Spark and Flink.

I'm updating my existing Flink-variant PR to take this approach (still similar to approach A in essence): https://github.com/apache/hudi/pull/18539

Once this is wired up, the Blob and Vector implementations should be easier to understand and add.
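For illustration, the read-time model described above can be sketched roughly as follows. This is a conceptual sketch only, not actual Hudi/Flink APIs; all names (`WRITER_SCHEMA`, `read_record`, `project`, the sample fields) are hypothetical:

```python
# Conceptual sketch of the read-time model: decode each record with the
# writer/table schema first, then project/convert it to the schema the
# query requested. None of these names correspond to real Hudi APIs.

WRITER_SCHEMA = ["id", "name", "embedding", "blob_data"]   # full table schema
REQUESTED_SCHEMA = ["id", "embedding"]                     # query projection

def read_record(raw_fields):
    """Decode a raw row using the full writer schema."""
    return dict(zip(WRITER_SCHEMA, raw_fields))

def project(record, requested_schema):
    """Convert a fully decoded record to the requested schema."""
    return {field: record[field] for field in requested_schema}

def read(rows, requested_schema):
    # Read with the writer schema, then project -- never attempt to
    # decode the file directly against the (narrower) requested schema.
    return [project(read_record(r), requested_schema) for r in rows]

rows = [(1, "a", [0.1, 0.2], b"\x00"), (2, "b", [0.3, 0.4], b"\x01")]
print(read(rows, REQUESTED_SCHEMA))
```

The point of the sketch is that schema inference lives only on the read side: the writer schema is the single source of truth for decoding, and the requested schema is applied as a projection afterwards.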
