steven-aerts opened a new pull request, #36506:
URL: https://github.com/apache/spark/pull/36506

   Add the capability to write complex unions, in addition to reading them.
   Complex unions map to struct types whose field names are member0, member1, 
etc.
   This is consistent with the behavior of SchemaConverters when reading them
   and with the conversion between Avro and Parquet.
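   
   To illustrate the mapping described above, here is a minimal sketch that 
converts an Avro record containing a complex union into its Spark SQL type. 
The record name `Event`, the field name `payload`, and the object name 
`ComplexUnionMapping` are hypothetical and chosen only for this example. It 
assumes the spark-avro module is on the classpath.
   
   ```scala
   import org.apache.avro.Schema
   import org.apache.spark.sql.avro.SchemaConverters

   object ComplexUnionMapping {
     // Hypothetical schema: a record with one complex union field (int or string).
     val avroSchemaJson: String =
       """{
         |  "type": "record",
         |  "name": "Event",
         |  "fields": [
         |    {"name": "payload", "type": ["int", "string"]}
         |  ]
         |}""".stripMargin

     def main(args: Array[String]): Unit = {
       val avroSchema = new Schema.Parser().parse(avroSchemaJson)
       // Convert the Avro schema to the Spark SQL type used when reading.
       val sqlType = SchemaConverters.toSqlType(avroSchema).dataType
       // The union field is expected to appear as a struct with
       // fields named member0, member1, one per union branch.
       println(sqlType.simpleString)
     }
   }
   ```
   
   A union such as `["int", "string"]` is used here because simpler unions 
(for example a type paired only with `null`) are not mapped to a struct.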
   
   
   ### What changes were proposed in this pull request?
   Spark could already read complex unions but could not write them.
   With this change it can also write them. Given a schema that contains a 
complex union, the following code now works:
   
   ```scala
   spark
     .read.format("avro").option("avroSchema", avroSchema).load(path)
     .write.format("avro").option("avroSchema", avroSchema).save("/tmp/b")
   ```
   Before this patch, writing threw `Unsupported Avro UNION type`.
   
   ### Why are the changes needed?
   Fixes SPARK-25050 and aligns read and write compatibility for complex unions.
   
   
   ### Does this PR introduce _any_ user-facing change?
   The behaviour improves: complex unions can now be written. As far as I can 
see, this does not affect any user-facing APIs or documentation.
   
   
   ### How was this patch tested?
   - Added extra unit tests.
   - Updated existing unit tests for improved behaviour.
   - Manually validated against an internal corpus of Avro files that they can 
now be read and written without problems, which was not possible before this patch.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

