Tobias Zagorni created ARROW-17956:
--------------------------------------

             Summary: [C++] RandomArrayGenerator does not properly generate ListArrays with Nulls
                 Key: ARROW-17956
                 URL: https://issues.apache.org/jira/browse/ARROW-17956
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
            Reporter: Tobias Zagorni
            Assignee: Tobias Zagorni


There are multiple problems with the {{OffsetsFromLengthsArray}} method:
 * It assumes that the first and last length values in the input are never 
null. This does not hold for the usage of this method in 
GENERATE_LIST_CASE, where the input is generated completely at random, 
respecting null_probability: 
[https://github.com/apache/arrow/blob/ed36fcd218d381bd7420f1b762a28c5feea4665f/cpp/src/arrow/testing/random.cc#L730]
 * The SetBit call for non-null items is off by one. The index variable 
represents the index of the next offset, which is based on the current 
element's length, but the validity bit should still be set for the current 
element.
 * I don't see what effect the {{force_empty_nulls}} argument should have. I 
think the desired effect, that null items also have zero length, is always 
given, based on how the method is implemented. Please correct me if I'm wrong.
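
To make the expected behavior concrete, here is a minimal, self-contained sketch of building offsets from nullable lengths. It is not the Arrow implementation: {{OffsetsResult}} and {{OffsetsFromLengthsSketch}} are hypothetical names, and {{std::optional}} and {{std::vector<bool>}} stand in for Arrow's arrays and validity bitmap. It only illustrates the points above: a null may appear at any position (including first and last), the validity flag is set at the index of the current element rather than the next offset, and null entries contribute zero length.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the generated list layout; not an Arrow type.
struct OffsetsResult {
  std::vector<int32_t> offsets;  // size() == lengths.size() + 1
  std::vector<bool> valid;       // size() == lengths.size()
};

// Builds offsets from per-element lengths; std::nullopt marks a null list.
OffsetsResult OffsetsFromLengthsSketch(
    const std::vector<std::optional<int32_t>>& lengths) {
  OffsetsResult out;
  out.offsets.reserve(lengths.size() + 1);
  out.valid.assign(lengths.size(), false);

  int32_t next_offset = 0;
  out.offsets.push_back(next_offset);  // offsets[0] is always 0

  for (std::size_t i = 0; i < lengths.size(); ++i) {
    if (lengths[i].has_value()) {
      // Validity belongs to element i, not to offset slot i + 1.
      out.valid[i] = true;
      next_offset += *lengths[i];
    }
    // A null element repeats the previous offset, so its length is zero.
    out.offsets.push_back(next_offset);
  }
  return out;
}
```

With lengths {{[null, 2, null, 3, null]}} this sketch yields offsets {{[0, 0, 2, 2, 5, 5]}} and validity {{[false, true, false, true, false]}}, so nulls at the first and last positions need no special handling.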



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
