steven-aerts opened a new pull request, #36506:
URL: https://github.com/apache/spark/pull/36506
Add the capability to write complex unions, in addition to reading them.
Complex unions map to struct types whose field names are member0, member1,
etc.
This is consistent with the behavior in SchemaConverters for reading them
and when converting between Avro and Parquet.
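
As a rough illustration of the naming convention described above, the sketch below assigns positional field names to the branches of a union. `UnionNamingSketch` and `memberFields` are hypothetical names made up for this example; this is not the code in SchemaConverters, and it deliberately does not model how `null` branches are treated.

```scala
// Hypothetical helper that only illustrates the member0, member1, ... naming.
// It is not Spark's SchemaConverters and ignores null-branch handling.
object UnionNamingSketch {
  // Pair each union branch with a positional struct field name.
  def memberFields(branches: Seq[String]): Seq[(String, String)] =
    branches.zipWithIndex.map { case (avroType, i) => (s"member$i", avroType) }

  def main(args: Array[String]): Unit = {
    // A union of int and string yields the fields member0 and member1.
    val fields = memberFields(Seq("int", "string"))
    assert(fields == Seq(("member0", "int"), ("member1", "string")))
    fields.foreach { case (name, tpe) => println(s"$name: $tpe") }
  }
}
```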
### What changes were proposed in this pull request?
Spark could already read complex unions, but it could not write them.
This patch adds write support. Given a schema that contains a complex
union, the following code now works:
```scala
spark
.read.format("avro").option("avroSchema", avroSchema).load(path)
.write.format("avro").option("avroSchema", avroSchema).save("/tmp/b")
```
Before this patch, the write step threw `Unsupported Avro UNION type`.
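
For readers who want something concrete to pass as `avroSchema` in the snippet above, here is a minimal sketch of a schema with a complex union. The record name `Event` and field name `payload` are invented for this example and do not come from the patch or its tests.

```scala
// Hypothetical example schema; the names Event and payload are illustrative only.
object ComplexUnionSchemaSketch {
  // The payload field is a union of int and string. Following the mapping
  // described in this PR, it should surface as a struct with the fields
  // member0 and member1.
  val avroSchema: String =
    """{
      |  "type": "record",
      |  "name": "Event",
      |  "fields": [
      |    {"name": "payload", "type": ["int", "string"]}
      |  ]
      |}""".stripMargin
}
```

With this value in scope as `avroSchema`, and `path` pointing at Avro files written with the same schema, the round trip shown above is the case this patch is meant to enable.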
### Why are the changes needed?
Fixes SPARK-25050 and brings write support in line with the existing read support for complex unions.
### Does this PR introduce _any_ user-facing change?
Yes, in behaviour only: writing complex unions now succeeds instead of failing.
As far as I can see, this does not affect any user-facing APIs or documentation.
### How was this patch tested?
- Added extra unit tests.
- Updated existing unit tests for improved behaviour.
- Validated manually against an internal corpus of Avro files, which can now
be read and written without problems. This was not possible before this patch.