[GitHub] [hudi] nbalajee commented on pull request #2309: [HUDI-1441] - HoodieAvroUtils - rewrite() is not handling evolution o…

GitBox Fri, 11 Dec 2020 11:08:00 -0800


nbalajee commented on pull request #2309:
URL: https://github.com/apache/hudi/pull/2309#issuecomment-743371048



   > @nbalajee Can you please explain why do we need this ? If the latest 
schema is passed (which is the case for Hudi now) is this still a problem ?
   > @bvaradar can you please take a look at this one ?
   
   @n3nash - Correct. When reading parquet files, Hudi uses the writer 
schema (the evolved schema with the added fields), so optional fields are 
automatically populated with null (native schema evolution). For 
rewrite(), Hudi's use cases always pass the writerSchema, so we don't run into 
this issue.
   
   An added advantage of fixing this correctly is that Hudi will be able to 
support "external schema evolution": read parquet using the reader schema, 
then rewrite the records using the evolved schema.
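
   To make the "rewrite using the evolved schema" step concrete, here is a 
minimal sketch. It is not the HoodieAvroUtils code and does not use the Avro 
API: records are modeled as plain maps, and `SchemaEvolutionSketch`, `Field`, 
and this `rewrite` are hypothetical names chosen for illustration. The 
assumption is that a field added by evolution is optional and can be filled 
with null when the old record does not carry it.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a map-based stand-in for rewriting a record to an evolved schema.
final class SchemaEvolutionSketch {

    // Hypothetical schema field: a name plus a flag saying whether it may be null.
    static final class Field {
        final String name;
        final boolean optional;

        Field(String name, boolean optional) {
            this.name = name;
            this.optional = optional;
        }
    }

    // Copies every field of the evolved schema from the old record.
    // Optional fields missing from the old record are set to null;
    // a missing required field is reported as an error.
    static Map<String, Object> rewrite(Map<String, Object> oldRecord, List<Field> evolvedSchema) {
        Map<String, Object> rewritten = new LinkedHashMap<>();
        for (Field field : evolvedSchema) {
            if (oldRecord.containsKey(field.name)) {
                rewritten.put(field.name, oldRecord.get(field.name));
            } else if (field.optional) {
                rewritten.put(field.name, null);
            } else {
                throw new IllegalArgumentException("Missing required field: " + field.name);
            }
        }
        return rewritten;
    }
}
```

   With an old record holding `id` and `name`, and an evolved schema that adds 
an optional `city`, `rewrite` returns a record with all three fields and `city` 
set to null. Iterating over the evolved schema rather than over the old record 
is the design choice that lets newly added fields show up in the output.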


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
