[ 
https://issues.apache.org/jira/browse/ARROW-11263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qingyou Meng closed ARROW-11263.
--------------------------------
    Resolution: Incomplete

> [Rust] problem of Field nullable
> --------------------------------
>
>                 Key: ARROW-11263
>                 URL: https://issues.apache.org/jira/browse/ARROW-11263
>             Project: Apache Arrow
>          Issue Type: Bug
>            Reporter: Qingyou Meng
>            Priority: Major
>
> Quoting from section *Schema message*
>  
> [https://github.com/apache/arrow/blob/master/docs/source/format/Columnar.rst#schema-message]
> {noformat}
> Whether the field is semantically nullable. While this has no bearing on the 
> array's physical layout, many systems distinguish nullable and non-nullable 
> fields and we want to allow them to preserve this metadata to enable faithful 
> schema round trips.{noformat}
> This can be read as: for a field with nullable set as true, when encounters 
> null array data from the field, data processor CAN continue or refuse to 
> process.
> In current rust implementation, apart from read Fields from schema, we also 
> construct `Field` with datafusion and`Field::new`in 
> arrow::array::*StructArray*.
>  * in datafusion, the nullable is determined by DF schema
>  * in arrow::array::StructArray::try_from(values: Vec<(&str, ArrayRef)>) , 
> the nullable is determined actual data. This is error-prone if ArrayRef's 
> null buffer are all 1s (built by builder). The following test shows a bug:
> {noformat}
>     #[test]
>     fn test_struct_bug() {
>         let ints: ArrayRef = Arc::new(Int32Array::from(vec![
>             Some(1),
>             Some(2),
>             Some(3),
>         ]));
>         let array = StructArray::try_from(vec![("f1", ints.clone())])         
>    .unwrap()
>             .data();
>         let arrays = vec![array.as_ref()];
>         let mut mutable = MutableArrayData::new(arrays, false, 0);
>         mutable.extend(0, 1, 3);
>         let data = mutable.freeze();
>         let array = StructArray::from(Arc::new(data));
>         let expected = StructArray::try_from(vec![
>             ("f1", ints.slice(1, 2)),
>         ])
>         .unwrap();
>         assert_eq!(array, expected);
>     }{noformat}
> Conclusions:
>  * It's questionable to set Field's nullable according to data.
>  * Perhaps builders should set null buffer back to None when the buffer has 
> all bits set.
>  * StructArray::
> try_from(values: Vec<(&str, ArrayRef)>) incorrectly sets nullable when null 
> buffer is Some with all bits set.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to