[GitHub] [hudi] xiarixiaoyao commented on pull request #4910: [RFC-33] [HUDI-2429][Stacked on HUDI-2560] Support full Schema evolution for Spark

GitBox Mon, 14 Mar 2022 02:16:57 -0700


xiarixiaoyao commented on pull request #4910:
URL: https://github.com/apache/hudi/pull/4910#issuecomment-1066547940



   @YannByron 
   1) Yes.
   2) Yes. We save the latestSchema in the meta file for performance reasons.
      Of course we could remove the latestSchema from the meta file, since we 
      save all historical schemas in the .schema directory, but when the schema 
      is too large, searching .schema will be slow.
   3) 4) Please see the detailed design in RFC-33. We have already discussed 
those problems.
   5) Yes, this PR has nothing to do with the metadata table.
   According to the discussion results, we will store the latestSchema in the 
metadata table later.
   6) Different versions of Spark have different Parquet reading logic. 
   I don't want to change that logic, because doing so would make the actual 
results inconsistent with what users expect.
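
   To illustrate the trade-off in point 2, here is a minimal sketch of caching 
the latest schema next to a full history. The file layout (`meta.json`, one 
JSON file per version under `.schema`), the function names, and the assumption 
that the slow path is a scan over stored versions are all hypothetical and 
chosen for illustration; this is not Hudi's actual on-disk format or API.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical layout (an assumption, not Hudi's real format):
#   <table>/.schema/<version_id>.json  one file per historical schema
#   <table>/meta.json                  metadata holding a cached latest schema


def save_schema(table: Path, version_id: int, schema: dict) -> None:
    """Append a schema to the history and refresh the cached copy.

    Assumes callers save versions in increasing order.
    """
    history_dir = table / ".schema"
    history_dir.mkdir(parents=True, exist_ok=True)
    (history_dir / f"{version_id}.json").write_text(json.dumps(schema))
    # Cache the newest schema so readers can skip the history lookup.
    meta = {"version_id": version_id, "latest_schema": schema}
    (table / "meta.json").write_text(json.dumps(meta))


def latest_schema_from_history(table: Path) -> dict:
    """Slow path: list every stored version and pick the newest one."""
    files = list((table / ".schema").glob("*.json"))
    newest = max(files, key=lambda p: int(p.stem))
    return json.loads(newest.read_text())


def latest_schema(table: Path) -> dict:
    """Fast path: read the cached copy, falling back to the history."""
    meta = table / "meta.json"
    if meta.exists():
        return json.loads(meta.read_text())["latest_schema"]
    return latest_schema_from_history(table)


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        table = Path(tmp)
        save_schema(table, 1, {"fields": ["id"]})
        save_schema(table, 2, {"fields": ["id", "name"]})
        fast = latest_schema(table)   # served from meta.json
        (table / "meta.json").unlink()
        slow = latest_schema(table)   # served from the .schema history
        return fast, slow
```

   Both paths return the same schema; the cached copy only avoids the lookup 
in the history directory, which is why it could be dropped from the meta file 
at the cost of slower reads.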


