nsivabalan edited a comment on pull request #2012:
URL: https://github.com/apache/hudi/pull/2012#issuecomment-825077766


   I spent some time understanding this PR. Thanks for putting it up, 
@sathyaprakashg. I have a few requests for clarification. 
   
   1. Can you update the description to reflect the latest status? I don't see 
SchemaBasedSchemaProvider etc. in the PR. 
   2. FYI, we landed a [fix](SchemaBasedSchemaProvider) for default values and 
null in unions. If schema post-processing is not required at all with this 
fix, that would simplify things. I guess the namespace fix in this PR may not 
be needed either if the post-processing step goes away. @bvaradar @n3nash: 
can you folks chime in here, please? [fixed datatype 
jira](https://issues.apache.org/jira/browse/HUDI-1607).
   3. Also, I pulled the test locally to verify things. It looks like the 
test is not generating records as intended in the 3rd step. Here is what is 
happening: 
       - TestDataSource generates data with the intended (old) schema. 
       - But in SourceFormatAdapter, when we call 
AvroConversionUtils.createDataFrame(...), the evolved schema is passed in, so 
the InputBatch<Dataset<Row>> returned from there has the new column set to 
null for all records. 
       - I also verified this from within the IdentityTransformer, which 
showed the evolved schema and records that have the new column as well. 
   So, essentially, the test also needs to be fixed. 
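
To make the problem in point 3 concrete, here is a minimal sketch in plain Python of what I believe is happening. It does not use Hudi, Spark, or Avro. `generate_records`, `to_rows`, and the column names are hypothetical stand-ins for TestDataSource, the createDataFrame step, and the test schemas.

```python
# Hypothetical stand-ins only; none of these names are Hudi or Spark APIs.
OLD_SCHEMA = ["id", "name"]                 # columns the source actually writes
EVOLVED_SCHEMA = ["id", "name", "new_col"]  # schema handed to the conversion step


def generate_records(schema, n):
    """Mimic a test source that emits n records with the given columns."""
    return [{col: f"{col}_{i}" for col in schema} for i in range(n)]


def to_rows(records, schema):
    """Project records onto `schema`; columns missing from a record become None."""
    return [{col: rec.get(col) for col in schema} for rec in records]


# Records are generated with the old schema ...
records = generate_records(OLD_SCHEMA, 3)

# ... but converted with the evolved schema, so the new column is added
# to every row with no value in it.
rows = to_rows(records, EVOLVED_SCHEMA)
new_col_values = [row["new_col"] for row in rows]
print(new_col_values)
```

Every row ends up carrying the evolved column set with `new_col` empty, so downstream steps never see a record that is genuinely in the old shape. That is why I think the test is not exercising the case it was written for.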
   



