[ 
https://issues.apache.org/jira/browse/ARROW-17956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17956:
-----------------------------------
    Labels: pull-request-available  (was: )

> [C++] RandomArrayGenerator does not properly generate ListArrays with Nulls
> ---------------------------------------------------------------------------
>
>                 Key: ARROW-17956
>                 URL: https://issues.apache.org/jira/browse/ARROW-17956
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Tobias Zagorni
>            Assignee: Tobias Zagorni
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are multiple problems with the {{OffsetsFromLengthsArray}} method:
>  * It assumes that the first and last length values in the input are never 
> null. This does not hold where the method is used in GENERATE_LIST_CASE, 
> whose input is generated randomly and respects null_probability: 
> [https://github.com/apache/arrow/blob/ed36fcd218d381bd7420f1b762a28c5feea4665f/cpp/src/arrow/testing/random.cc#L730]
>  * The SetBit call for non-null items is off by one. The index variable 
> represents the index of the next offset, which is based on the current 
> element's length, but the validity bit should still be set for the current 
> element.
>  * I don't see what effect the {{force_empty_nulls}} argument should have. I 
> think the desired effect, that null items also have zero length, is always 
> given, based on how the method is implemented. Please correct me if I'm wrong.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
