nsivabalan commented on issue #2675: URL: https://github.com/apache/hudi/issues/2675#issuecomment-810288191
There are two code paths in `HoodieSparkSqlWriter`:

1. `AvroConversionUtils.convertStructTypeToAvroSchema(df.schema, structName, nameSpace)`, which uses `SchemaConverters.toAvroType(...)`.
2. `HoodieSparkUtils.createRdd(df, schema, structName, nameSpace)`, which uses our custom converter function (`createConverterToAvro`) in `AvroConversionHelper`.

What I meant is: fixing (1) is strictly needed, and that is what I tried out. Fixing (2) is not strictly required, since that schema does not get serialized into the commit metadata. But yeah, we can try to keep both in sync. I am all for it.

W.r.t. testing:
- You can run the usual unit tests and integration tests. [This](https://github.com/apache/hudi) should have details on running tests.
- I assume you will write tests covering schema evolution for the new code you put up.
- For testing schema evolution, you can try out the steps you used when reporting this issue. We don't have end-to-end schema evolution tests for MOR, as you might have realized from this issue.
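For reference, here is a minimal sketch of what path (1) boils down to. This assumes the `spark-avro` `SchemaConverters.toAvroType` API (`org.apache.spark.sql.avro.SchemaConverters`); the object name `SchemaConversionSketch` and the sample field names are illustrative, not taken from the Hudi codebase:

```scala
import org.apache.avro.Schema
import org.apache.spark.sql.avro.SchemaConverters
import org.apache.spark.sql.types._

object SchemaConversionSketch {
  // Conceptually what convertStructTypeToAvroSchema does: wrap the
  // Catalyst StructType into a named Avro record schema, using the
  // caller-supplied record name and namespace.
  def toAvroSchema(structType: StructType,
                   structName: String,
                   nameSpace: String): Schema =
    SchemaConverters.toAvroType(structType, nullable = false, structName, nameSpace)

  def main(args: Array[String]): Unit = {
    // Hypothetical source schema, just to show the shape of the call.
    val schema = StructType(Seq(
      StructField("id", StringType, nullable = false),
      StructField("ts", LongType, nullable = true)))
    val avroSchema = toAvroSchema(schema, "triprec", "hoodie.source")
    println(avroSchema.toString(true)) // pretty-printed Avro record schema
  }
}
```

Path (2) differs in that the per-row conversion (`createConverterToAvro`) builds the Avro records themselves, not just the schema, which is why only (1)'s output ends up in the commit metadata.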
