nsivabalan edited a comment on pull request #2012:
URL: https://github.com/apache/hudi/pull/2012#issuecomment-835429596


   Yes, thanks for clarifying. I guess embedding the schema in every payload 
might be detrimental, as you have experienced. So I have thought of a different 
approach: regenerate records with the new schema at the Spark datasource layer. 
Only a batch that is ingested with the old schema after the table's schema has 
evolved will take a hit from this conversion. 
   
   https://github.com/apache/hudi/pull/2927
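   
   To make the idea concrete, here is a rough sketch of the rewrite step. It is 
not the code from the patch and uses no Hudi or Avro APIs: plain dicts stand in 
for schemas and records, and every name in it (`rewrite_record`, 
`rewrite_batch`, the `email` field) is hypothetical. 

```python
# Illustrative only: plain dicts stand in for Avro schemas and records.
# None of these names come from Hudi or Avro.

OLD_SCHEMA = [
    {"name": "id", "default": None},
    {"name": "name", "default": None},
]

# Evolved table schema: one new field, added with a default value.
NEW_SCHEMA = OLD_SCHEMA + [{"name": "email", "default": "unknown"}]


def rewrite_record(record, target_schema):
    """Return a copy of `record` laid out according to `target_schema`.

    Fields missing from the record are filled with the schema default.
    """
    return {
        field["name"]: record.get(field["name"], field["default"])
        for field in target_schema
    }


def rewrite_batch(batch, batch_schema, table_schema):
    """Rewrite a batch only when its schema lags behind the table schema."""
    if batch_schema == table_schema:
        return batch  # already up to date, so no conversion cost
    return [rewrite_record(r, table_schema) for r in batch]
```

   A batch that already carries the table's schema is returned untouched, so 
only batches written with the old schema pay for the conversion. 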
   
   Also, as I mentioned earlier, if others (@n3nash, @bvaradar) confirm that 
the schema post processor is not required as a mandatory step given this 
[fix](https://github.com/apache/hudi/pull/2765) for default values, we don't 
need any changes in delta streamer as such; just 
https://github.com/apache/hudi/pull/2927 would suffice. 
   
   @n3nash is doing more testing around this as well, so I will wait for him 
to comment on the patch. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

