nsivabalan commented on pull request #2012:
URL: https://github.com/apache/hudi/pull/2012#issuecomment-834312959


   > I spent some time understanding this PR. Thanks for putting it up, @sathyaprakashg. I have a few clarifications.
   > 
   > 1. Can you update the description to reflect the latest status? I don't see SchemaBasedSchemaProvider etc.
   > 2. FYI, we landed a [fix](https://github.com/apache/hudi/pull/2765) for default values and null in unions. If the schema post-processing is not required at all with this fix, it would simplify things. I guess the namespace fix in this PR may not be needed if the post-processing step is not required. @bvaradar @n3nash: can you folks chime in here, please? Another related item is the [fixed datatype jira](https://issues.apache.org/jira/browse/HUDI-1607). The backwards incompatibility may not be an issue if we go this route.
   > 3. Also, I pulled the test locally and tried to verify things. It looks like the test is not generating records as intended in the 3rd step. Here is what is happening:
   >    
   >    * TestDataSource generates data with the intended (old) schema.
   >    * But in SourceFormatAdapter, when we call AvroConversionUtils.createDataFrame(...), the evolved schema is passed in, so the InputBatch<Dataset> returned from here has the new column set to null for all records.
   >    * I also verified this from within the IdentityTransformer, which showed the evolved schema and records that have the new column as well.
   >      So, essentially, the test also needs to be fixed.
   
   @vinothchandar: We need to iron out the perf issue, but these were my earlier comments. They could simplify the backwards-compatibility issue that was being discussed.
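
To illustrate point 3 above, here is a minimal sketch in plain Python of what seems to happen when records produced with the old schema are converted using the evolved schema. It does not use the Hudi or Avro APIs; the column names and the `conform` helper are hypothetical and only model the behaviour described.

```python
# Plain-Python model of the behaviour described in point 3.
# Not Hudi or Avro code: column names and `conform` are hypothetical.

OLD_SCHEMA = ["id", "name"]
EVOLVED_SCHEMA = ["id", "name", "new_col"]


def conform(record, schema):
    """Project a record onto `schema`; fields absent from the record become None."""
    return {field: record.get(field) for field in schema}


# Step 1: the test source generates records with the old schema.
old_records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# Step 2: the conversion step is handed the evolved schema instead of the old one.
converted = [conform(record, EVOLVED_SCHEMA) for record in old_records]

# Every converted record now carries the new column, set to None.
all_null = all(record["new_col"] is None for record in converted)
```

Under this model, a downstream step would only ever see the evolved schema with a null new column, so it could not tell that the input was generated with the old schema. That would be consistent with the observation made from within the IdentityTransformer.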


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

