attilapiros opened a new pull request #31501:
URL: https://github.com/apache/spark/pull/31501


   ### What changes were proposed in this pull request?
   
   With https://github.com/apache/spark/pull/31133, Avro schema evolution was
introduced for partitioned Hive tables whose schema is given by
`avro.schema.literal`.
   This PR extends that functionality to support schema evolution where the
schema is defined via `avro.schema.url`.
   
   ### Why are the changes needed?
   
   Without this PR, the problem described in
https://github.com/apache/spark/pull/31133 can be reproduced with tables that
use `avro.schema.url`, because in that case the `avro.schema.url` property
value given at the partition level is always used.
   
   So, for example, when a new column (with a default value) is added to the
table, one of the following problems occurs:
   -  when the new field is added after the last one, the cell values are
null instead of the default value
   -  when the schema is extended somewhere before the last field, values
are listed in the wrong column positions
   
   A similar error occurs when one of the fields is removed from the schema.
   
   For details, see the attached unit tests, which cover both cases.
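
The two failure modes listed above can be illustrated with a toy model. This is a minimal sketch in plain Python, not Spark's or Avro's actual code: the helper names `read_positionally` and `resolve_by_name`, the column names, and the default value `"x"` are all hypothetical, and the positional mapping is an assumption about how values end up misplaced when the partition-level schema is used for reading.

```python
# Toy model only: illustrates the symptoms, not Spark's real code path.

def read_positionally(values, table_columns):
    """Assign values decoded with the OLD (partition) schema to the
    NEW table columns purely by position, padding with None."""
    padded = list(values) + [None] * (len(table_columns) - len(values))
    return dict(zip(table_columns, padded))


def resolve_by_name(record, reader_fields):
    """Schema-evolution style read: match fields by name and use the
    reader schema's default for fields missing from the record."""
    return {name: record.get(name, default) for name, default in reader_fields}


# A record written with the old schema (col1, col2).
old_record = {"col1": "a", "col2": 1}
old_values = list(old_record.values())

# Case 1: new column appended at the end -> null instead of the default.
appended = read_positionally(old_values, ["col1", "col2", "col3"])

# Case 2: new column inserted before the last field -> values shift.
inserted = read_positionally(old_values, ["col1", "col3", "col2"])

# Expected behavior: the table-level schema is the reader schema.
evolved = resolve_by_name(
    old_record, [("col1", None), ("col3", "x"), ("col2", None)]
)
```

In this sketch, `appended["col3"]` is `None` rather than the default, `inserted` places the old `col2` value under `col3`, and `evolved` carries the default for `col3` while keeping `col2` intact.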
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. It fixes the incorrect values (nulls or values in the wrong columns)
described above for tables that use `avro.schema.url`.
   
   ### How was this patch tested?
   
   The existing unit tests for schema evolution are generalized and reused.
   New tests:
   - SPARK-34370: support Avro schema evolution (add column with 
avro.schema.url)
   - SPARK-34370: support Avro schema evolution (remove column with 
avro.schema.url)

