helgikrs opened a new issue #1208:
URL: https://github.com/apache/arrow-rs/issues/1208


   **Describe the bug**
   Two string arrays that contain only empty strings do not compare as equal 
when one has a validity bitmap and the other does not, even though neither 
array contains a logical null.
   
   **To Reproduce**
   ```rust
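   use arrow::array::{Array, ArrayData, StringArray};
   use arrow::datatypes::DataType;
   
   let s = StringArray::from(vec![Some(""), Some(""), Some("")]);
   
   // `data()` returns a reference to the array's underlying `ArrayData`.
   let string1 = s.data();
   
   // Rebuild the array from the same buffers, but without a validity buffer.
   let string2 = ArrayData::builder(DataType::Utf8)
       .len(string1.len())
       .buffers(string1.buffers().to_vec())
       .build()
       .unwrap();
   
   // string2 is identical to string1 except that it has no validity buffer.
   // Since there are no nulls, string1 and string2 should be equal,
   // but this assertion currently fails.
   assert_eq!(string1, &string2);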
   ```
   
   This happens with both the IPC reader and the Parquet reader: if the 
source buffer contains no nulls, the reader skips the validity bitmap.
   Consequently, an array of empty strings that is written and read back no 
longer compares equal to the original, as in the example above.
   
   **Expected behavior**
   These arrays should be considered equivalent.
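
To make the expected semantics concrete, here is a minimal, self-contained sketch of a logical comparison in which a missing validity bitmap means "all slots valid". `Utf8Model` and `logically_equal` are hypothetical names invented for this illustration; this is a simplified model that uses only the standard library, not arrow's `ArrayData` or its actual equality code.

```rust
/// Simplified stand-in for a Utf8 array (hypothetical, not arrow's `ArrayData`):
/// `offsets[i]..offsets[i + 1]` delimits the bytes of element `i` in `values`.
#[derive(Debug, Clone)]
struct Utf8Model {
    offsets: Vec<i32>,
    values: Vec<u8>,
    /// `None` models "no validity bitmap", i.e. every slot is valid.
    validity: Option<Vec<bool>>,
}

impl Utf8Model {
    /// Number of elements (one fewer than the number of offsets).
    fn len(&self) -> usize {
        self.offsets.len().saturating_sub(1)
    }

    /// A slot is valid if there is no bitmap, or if the bitmap marks it valid.
    fn is_valid(&self, i: usize) -> bool {
        self.validity.as_ref().map_or(true, |v| v[i])
    }

    /// Bytes of element `i`.
    fn value(&self, i: usize) -> &[u8] {
        let start = self.offsets[i] as usize;
        let end = self.offsets[i + 1] as usize;
        &self.values[start..end]
    }
}

/// Compare element by element on logical content, so that the mere presence
/// or absence of a validity bitmap does not affect the result.
fn logically_equal(a: &Utf8Model, b: &Utf8Model) -> bool {
    a.len() == b.len()
        && (0..a.len()).all(|i| match (a.is_valid(i), b.is_valid(i)) {
            (true, true) => a.value(i) == b.value(i),
            (false, false) => true, // two nulls are equal regardless of bytes
            _ => false,
        })
}
```

Under this model, three empty strings with an all-valid bitmap and three empty strings with no bitmap compare as equal, while an array with an actual null in any slot does not.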
   
   **Additional context**
   Binary arrays and the large variants of strings and binary arrays exhibit 
the same behavior.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.