voonhous commented on issue #7444:
URL: https://github.com/apache/hudi/issues/7444#issuecomment-1353610958

   @codope While this issue can be fixed with the 2 parameters provided above, 
implicit schema changes can still occur with the default parameter values 
(both parameters set to false).
   
   I do believe this is not a "proper" fix for such cases. If these implicit 
schema changes have already been written to the table, users might have no 
recourse to "fix" the table.
   
   I believe the proper way of fixing this issue is to:
   
   1. Enable these 2 parameters by default (requires #6358 and its 
accompanying fixes)
   2. Should there be any implicit schema changes detected, enable these 2 
parameters (requires #6358 and its accompanying fixes)
   3. Prevent implicit changes if these 2 parameters are not enabled (requires 
#6358 and its accompanying fixes)
   4. Modify SparkXXParquetFileFormat.scala to handle these type changes when 
reading
   
   I am currently using approach (4) and will raise a PR for review 
tomorrow.
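
   To illustrate the general idea behind approach (4), here is a minimal 
sketch of deciding, per column, whether an older file's type can be widened 
to the table's current type at read time. It is not Hudi or Spark code: the 
function name `plan_read_casts`, the string type names, and the dict-based 
schemas are all hypothetical. The promotion table follows Avro's schema 
resolution rules; whether the actual reader change should allow exactly this 
set is an assumption.

```python
# Hypothetical illustration only; this is not Hudi or Spark code.
# Widening rules modeled on Avro's schema-resolution promotions
# (assumed here, not confirmed as what the reader fix will use).
SAFE_PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
}


def plan_read_casts(file_schema, table_schema):
    """Return {column: target_type} for columns that need a cast on read.

    file_schema:  column -> type as written in an older data file
    table_schema: column -> type in the table's current schema
    Raises ValueError when a type change is not a known safe widening.
    """
    casts = {}
    for column, table_type in table_schema.items():
        file_type = file_schema.get(column)
        if file_type is None or file_type == table_type:
            # Column is absent from the old file, or its type is unchanged.
            continue
        if table_type in SAFE_PROMOTIONS.get(file_type, set()):
            casts[column] = table_type
        else:
            raise ValueError(
                f"cannot read column {column!r}: {file_type} -> {table_type}"
            )
    return casts


# Example: a file written before "id" and "price" were implicitly widened.
old_file = {"id": "int", "price": "float"}
table = {"id": "long", "price": "double", "note": "string"}
print(plan_read_casts(old_file, table))
```

   The point of planning the casts up front is that an unsupported change 
(for example double to int) fails with a clear error instead of surfacing 
as a low-level read failure.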


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

