Github user lindblombr commented on the issue: https://github.com/apache/spark/pull/21847

@gengliangwang I think I agree with your take on multi-type unions in principle. The only issue is that Avro itself supports them as a valid use case. However, consider the behavior: if we read a record A into Spark SQL where field `a` has type `["null", "int", "long"]`, Spark will automatically up-convert the values to long. This means that even though the original data may have contained a mix of records stored as either "int" or "long", any attempt to write that same data back out with a user-specified schema will convert all ints to longs, producing a slightly different dataset. That side effect may be undesirable in some cases (maybe). I personally don't have a use case for this, except that the test data in this module itself includes a multi-type union of this nature, and I wanted this functionality to work for as much of the test data as possible. If, for the sake of simplicity, we'd like to restrict user-specified schemas to two-type unions only, that would still cover my use cases.
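
To make the round-trip concrete, here is a minimal sketch (not part of this PR) of the behavior described above. It assumes the built-in `avro` data source and its `avroSchema` write option; the paths, record name, and field name are hypothetical and only for illustration:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("avro-union-roundtrip").getOrCreate()

// Avro schema with a multi-type union field: ["null", "int", "long"].
val avroSchema =
  """{
    |  "type": "record",
    |  "name": "A",
    |  "fields": [
    |    {"name": "a", "type": ["null", "int", "long"]}
    |  ]
    |}""".stripMargin

// Reading promotes the int/long union to Spark's LongType, so int and long
// values become indistinguishable once loaded.
val df = spark.read.format("avro").load("/path/to/input")  // hypothetical path
df.printSchema()  // a: long (nullable = true)

// Writing back with a user-specified schema keeps the union in the Avro
// schema, but every value of `a` is now emitted as a long.
df.write.format("avro").option("avroSchema", avroSchema).save("/path/to/output")
```

So the union survives in the schema, but the original int/long distinction per record is lost on the round-trip, which is the "slightly different dataset" mentioned above.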