tfeda opened a new issue, #1637:
URL: https://github.com/apache/arrow-rs/issues/1637

   **Describe the bug**
   The `Union` `DataType` produced by `UnionBuilder` has non-nullable child 
   `Field`s, even after nulls have been appended through the builder.
   
   **To Reproduce**
   Steps to reproduce the behavior: Try the following code
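   ```
   use std::sync::Arc;

   use arrow::array::UnionBuilder;
   use arrow::datatypes::{DataType, Field, Float64Type, Int32Type, Schema, UnionMode};
   use arrow::record_batch::RecordBatch;

   fn main() {
       // Append one value and one null to each child of a dense union.
       let mut builder = UnionBuilder::new_dense(4);
       builder.append::<Int32Type>("a", 1).unwrap();
       builder.append::<Float64Type>("b", 3.0).unwrap();
       builder.append_null::<Float64Type>("b").unwrap();
       builder.append_null::<Int32Type>("a").unwrap();
       let union = builder.build().unwrap();

       // Declare both children as nullable, since each one holds a null.
       let schema = Schema::new(vec![
           Field::new(
               "Teamsters",
               DataType::Union(
                   vec![
                       Field::new("a", DataType::Int32, true),
                       Field::new("b", DataType::Float64, true),
                   ],
                   UnionMode::Dense,
               ),
               false,
           ),
       ]);

       // Panics: the built array reports its children as non-nullable.
       let batch = RecordBatch::try_new(
           Arc::new(schema),
           vec![Arc::new(union)]
       ).unwrap();
   }
   ```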
   This code panics:
   
   InvalidArgumentError("column types must match schema types, expected 
   Union([
       Field { name: \"a\", data_type: **Int32, nullable: true**, dict_id: 0, dict_is_ordered: false, metadata: None }, 
       Field { name: \"b\", data_type: **Float64, nullable: true**, dict_id: 0, dict_is_ordered: false, metadata: None }
       ], Dense
   ) but found Union([
       Field { name: \"a\", data_type: **Int32, nullable: false**, dict_id: 0, dict_is_ordered: false, metadata: None }, 
       Field { name: \"b\", data_type: **Float64, nullable: false**, dict_id: 0, dict_is_ordered: false, metadata: None }
       ], Dense) 
   at column index 0")
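   
   The two types in the message differ only in the `nullable` flag of each child. A minimal sketch of why that is enough to fail the check, assuming the type comparison is a plain structural equality that includes the flag: `ToyType`, `ToyField` and `union_of` below are hypothetical stand-ins written for illustration, not the arrow-rs types.
   
   ```rust
   // Toy stand-ins for illustration only; these are NOT the arrow-rs types.
   #[derive(Debug, Clone, PartialEq)]
   enum ToyType {
       Int32,
       Float64,
       Union(Vec<ToyField>),
   }

   #[derive(Debug, Clone, PartialEq)]
   struct ToyField {
       name: String,
       data_type: ToyType,
       nullable: bool,
   }

   // Build a union of children "a" and "b" that share one nullability flag.
   fn union_of(nullable: bool) -> ToyType {
       ToyType::Union(vec![
           ToyField { name: "a".to_string(), data_type: ToyType::Int32, nullable },
           ToyField { name: "b".to_string(), data_type: ToyType::Float64, nullable },
       ])
   }

   fn main() {
       let expected = union_of(true); // what the schema declares
       let found = union_of(false); // what the builder reports
       // The derived equality compares every field, including `nullable`.
       assert_ne!(expected, found);
       println!("types match: {}", expected == found);
   }
   ```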
   
   **Expected behavior**
   
   **Depending on how the specification is interpreted, one of two things 
   should happen:**
   
   *A `Union`'s child `Field`s should inherit its nullability (i.e. always 
   be false):* In that case, I think `Field::new()` should produce an error when 
   it is given a `DataType` like the one above (a non-nullable `Union` with 
   nullable children).
   
   *A child should be nullable if it can return `None` to the parent 
   when `unionArray.value(index)` is called:* In that case, this code should run 
   without error.
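   
   Under the first interpretation, the check could look roughly like the sketch below. It assumes a fallible constructor, which is not how `Field::new()` is used in the code above; `ToyField` and `try_new_union` are hypothetical names made up for this example.
   
   ```rust
   // Hypothetical child description; NOT the arrow-rs `Field` type.
   #[derive(Debug, Clone, PartialEq)]
   struct ToyField {
       name: String,
       nullable: bool,
   }

   // Hypothetical fallible constructor: every child must carry the same
   // nullability as the union that contains it.
   fn try_new_union(children: Vec<ToyField>, nullable: bool) -> Result<Vec<ToyField>, String> {
       for child in &children {
           if child.nullable != nullable {
               return Err(format!(
                   "child `{}` has nullable = {}, but the union has nullable = {}",
                   child.name, child.nullable, nullable
               ));
           }
       }
       Ok(children)
   }

   fn main() {
       let nullable_children = vec![
           ToyField { name: "a".to_string(), nullable: true },
           ToyField { name: "b".to_string(), nullable: true },
       ];
       // A non-nullable union with nullable children is rejected up front.
       assert!(try_new_union(nullable_children, false).is_err());
   }
   ```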
   
   **Additional context**
   I ran into this while working on #1594. I think it's a simple fix: track the 
   nullability of the `UnionBuilder` fields rather than always hardcoding the 
   child `Field`s' nullability to false. That said, I'm not sure whether that is 
   the correct reading of the specification. 
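   
   A rough sketch of the tracking idea, under the assumption that a child counts as nullable once a null has been appended to it. `ToyUnionBuilder` and `ToyField` are hypothetical and record only the metadata relevant here; this is not the arrow-rs implementation, and the real builder also has to manage type ids, offsets and buffers.
   
   ```rust
   use std::collections::BTreeMap;

   // Hypothetical child description; NOT the arrow-rs `Field` type.
   #[derive(Debug, Clone, PartialEq)]
   struct ToyField {
       name: String,
       nullable: bool,
   }

   // Hypothetical builder that records only per-child nullability.
   #[derive(Default)]
   struct ToyUnionBuilder {
       // Maps each child name to whether a null was ever appended to it.
       saw_null: BTreeMap<String, bool>,
   }

   impl ToyUnionBuilder {
       // Appending a value registers the child without marking it nullable.
       fn append(&mut self, name: &str) {
           self.saw_null.entry(name.to_string()).or_insert(false);
       }

       // Appending a null marks the child as nullable from then on.
       fn append_null(&mut self, name: &str) {
           self.saw_null.insert(name.to_string(), true);
       }

       // Report children with the tracked flag instead of a hardcoded `false`.
       // A BTreeMap yields children sorted by name, which is a simplification.
       fn fields(&self) -> Vec<ToyField> {
           self.saw_null
               .iter()
               .map(|(name, &nullable)| ToyField { name: name.clone(), nullable })
               .collect()
       }
   }

   fn main() {
       let mut builder = ToyUnionBuilder::default();
       builder.append("a");
       builder.append("b");
       builder.append_null("b");
       let fields = builder.fields();
       assert!(!fields[0].nullable); // "a" never received a null
       assert!(fields[1].nullable); // "b" did
   }
   ```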
   

