[
https://issues.apache.org/jira/browse/ARROW-17956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated ARROW-17956:
-----------------------------------
Labels: pull-request-available (was: )
> [C++] RandomArrayGenerator does not properly generate ListArrays with Nulls
> ---------------------------------------------------------------------------
>
> Key: ARROW-17956
> URL: https://issues.apache.org/jira/browse/ARROW-17956
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Tobias Zagorni
> Assignee: Tobias Zagorni
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> There are multiple problems with the {{OffsetsFromLengthsArray}} method:
> * It assumes that the first and last length values in the input are never
> null. That does not hold where this method is used in GENERATE_LIST_CASE,
> where the input is generated entirely at random, respecting
> null_probability:
> [https://github.com/apache/arrow/blob/ed36fcd218d381bd7420f1b762a28c5feea4665f/cpp/src/arrow/testing/random.cc#L730]
> * The SetBit call for non-null items is off by one. The index variable
> represents the index of the next offset, which is based on the current
> element's length. But the validity bit should still be set for the current
> element.
> * I don't see what effect the {{force_empty_nulls}} argument should have. I
> think the desired effect, that null items also have zero length, is always
> given by how the method is implemented. Please correct me if I'm wrong.
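
To illustrate the first two points above, here is a minimal standalone sketch of the intended behaviour. It does not use Arrow and is not the actual {{OffsetsFromLengthsArray}} code: the names OffsetsAndValidity and OffsetsFromLengthsSketch are hypothetical, std::optional stands in for a nullable length, and a std::vector<bool> stands in for the validity bitmap. A null length may appear anywhere, including first and last, it contributes zero to the running offset, and the validity flag is written at the index of the current element rather than at the index of the next offset.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result type for this sketch; not an Arrow type.
struct OffsetsAndValidity {
  std::vector<int32_t> offsets;  // n + 1 entries for n list elements
  std::vector<bool> valid;       // n entries, one flag per list element
};

// Hypothetical helper that models the idea only. A std::nullopt length
// stands for a null list element.
OffsetsAndValidity OffsetsFromLengthsSketch(
    const std::vector<std::optional<int32_t>>& lengths) {
  OffsetsAndValidity out;
  out.offsets.reserve(lengths.size() + 1);
  out.valid.reserve(lengths.size());

  int32_t next_offset = 0;
  out.offsets.push_back(next_offset);  // offsets[0] is always 0

  for (std::size_t i = 0; i < lengths.size(); ++i) {
    const bool is_valid = lengths[i].has_value();
    if (is_valid) {
      assert(*lengths[i] >= 0);
      next_offset += *lengths[i];
    }
    // Null elements add nothing, so they always end up with zero length.
    out.valid.push_back(is_valid);       // flag for element i (current)
    out.offsets.push_back(next_offset);  // offset at index i + 1 (next)
  }
  return out;
}
```

With the input {null, 2, null, 3, null}, this sketch yields offsets {0, 0, 2, 2, 5, 5} and validity {false, true, false, true, false}, so nulls at the first and last positions need no special handling.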
--
This message was sent by Atlassian Jira
(v8.20.10#820010)