[
https://issues.apache.org/jira/browse/ARROW-10411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated ARROW-10411:
-----------------------------------
Labels: pull-request-available (was: )
> [C++] Fix incorrect child array lengths for Concatenate of FixedSizeList
> ------------------------------------------------------------------------
>
> Key: ARROW-10411
> URL: https://issues.apache.org/jira/browse/ARROW-10411
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Affects Versions: 2.0.0, 3.0.0
> Reporter: Johan Peltenburg
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When calling CombineChunks() on an arrow::Table containing a
> FixedSizeList type Array, the child arrays of the FixedSizeLists are not
> properly concatenated: the lengths of the child arrays are set incorrectly.
> I ran into this when trying to ToString() the combined RecordBatch.
> This seems to be because the following function in
> cpp/arrow/array/concatenate.cc:
> {code:java}
> Result<std::vector<std::shared_ptr<const ArrayData>>> ChildData(size_t index)
> {code}
> ... which is used to calculate offsets and slice lengths before the actual
> concatenation, does not take the list lengths into account.
> The bug can be reproduced by adding the following unit test to:
> cpp/arrow/array/concatenate_test.cc
>
> {code:java}
> TEST_F(ConcatenateTest, FixedSizeListType) {
>   Check([this](int32_t size, double null_probability, std::shared_ptr<Array>* out) {
>     auto list_size = 3;
>     auto values_size = size * list_size;
>     auto values = this->GeneratePrimitive<Int8Type>(values_size, null_probability);
>     ASSERT_OK_AND_ASSIGN(*out, FixedSizeListArray::FromArrays(values, list_size));
>     ASSERT_OK((**out).ValidateFull());
>   });
> }
> {code}
> One possible approach to fix this would be to add another ChildData overload
> to ConcatenateImpl with a multiplier parameter, and multiply the offset and
> length of the slice by the multiplier. This function can be called by the
> FixedSizeList Visitor and be supplied with the list length as multiplier.
> I have this fix ready but would like to know if this would be the right
> approach.
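
The multiplier idea described above can be illustrated with a minimal,
standalone sketch. This is not the Arrow implementation: SliceRange and
ChildRanges are hypothetical names that stand in for the offset/length
bookkeeping done by ChildData, and the sketch assumes that each parent slot of
a FixedSizeList covers exactly `multiplier` consecutive child values.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the (offset, length) pair of an array slice.
struct SliceRange {
  int64_t offset;
  int64_t length;
};

// Maps each parent slice onto the range of child values it covers.
// `multiplier` plays the role of the fixed list size; the default of 1
// reproduces the unscaled behaviour described in the report.
std::vector<SliceRange> ChildRanges(const std::vector<SliceRange>& parents,
                                    int64_t multiplier = 1) {
  assert(multiplier > 0);
  std::vector<SliceRange> out;
  out.reserve(parents.size());
  for (const SliceRange& p : parents) {
    // Scale both the start and the extent of the slice by the list size.
    out.push_back({p.offset * multiplier, p.length * multiplier});
  }
  return out;
}
```

With a list size of 3, a parent slice at offset 2 with length 4 maps to child
offset 6 with length 12 (child values 6 through 17). Without the multiplier the
same slice would select child values 2 through 5, which matches the incorrect
child lengths reported here.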