mzabaluev opened a new issue, #9304:
URL: https://github.com/apache/arrow-rs/issues/9304

   **Describe the bug**
   [The logic](https://github.com/apache/arrow-rs/blob/2c0eba46ec447d815ddc7f8185edbedd2ae3596b/arrow-avro/src/codec.rs#L312) that handles a `null` default value in a field schema whose type is a union accepts the default only when the null variant is listed first. The comment above it claims this is per the spec, but as of version 1.12.0 the specification says only this:
   
   > Default values for union fields correspond to the first schema that 
matches in the union.
   
   There is no restriction that the default value must match the first schema 
variant.
   
   **To Reproduce**
   
   An application uses an Avro schema (e.g. as the reader schema) specifying a 
field like:
   
   ```json
   {
     "name": "optional_int",
     "type": ["int", "null"],
     "default": null
   }
   ``` 
   
   Building the reader fails with an error:
   > Schema error: JSON null default is only valid for `null` type or for a 
union whose first branch is `null`
   
   **Expected behavior**
   The default should be accepted because it matches one of the union schemas, as per [the Avro spec](https://avro.apache.org/docs/1.12.0/specification/#schema-record) version 1.12.0.
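
   The expected check can be sketched as follows. This is a minimal illustration, not the code in `arrow-avro`: the `Schema` enum and the `accepts_null_default` function are hypothetical names that model only enough of Avro to show a `null` default being accepted when any union branch is `null`, regardless of its position.

```rust
/// Hypothetical, heavily simplified model of an Avro schema.
/// These are NOT the types used by arrow-avro.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Schema {
    Null,
    Int,
    String,
    Union(Vec<Schema>),
}

/// Returns true when a JSON `null` default is acceptable for `schema`.
///
/// Assumption: following the Avro 1.12.0 wording quoted above, a union
/// accepts the default if any of its branches matches, not only the first.
fn accepts_null_default(schema: &Schema) -> bool {
    match schema {
        // A plain `null` type trivially accepts a `null` default.
        Schema::Null => true,
        // Avro unions cannot directly contain other unions, so checking
        // the immediate branches is sufficient.
        Schema::Union(branches) => branches.iter().any(|b| matches!(b, Schema::Null)),
        // Any other type has no way to represent `null`.
        _ => false,
    }
}
```

   Under this rule, both `["null", "int"]` and `["int", "null"]` accept `"default": null`, while a union with no `null` branch still rejects it.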
   
   **Additional context**
   An example of this ordering, with the non-null variant listed before `null`, is [found](https://github.com/apache/spark/blob/v3.5.8/connector/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala#L277) in Spark 3.5.
   

