[ 
https://issues.apache.org/jira/browse/ARROW-10411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Peltenburg updated ARROW-10411:
-------------------------------------
    Summary: [C++] Fix incorrect child array lengths for Concatenate of 
FixedSizeList  (was: [C++] Fix incorrect child array lengths for Concatenate of 
FixedSizeArray)

> [C++] Fix incorrect child array lengths for Concatenate of FixedSizeList
> ------------------------------------------------------------------------
>
>                 Key: ARROW-10411
>                 URL: https://issues.apache.org/jira/browse/ARROW-10411
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 2.0.0, 3.0.0
>            Reporter: Johan Peltenburg
>            Priority: Major
>
> When attempting to CombineChunks() on an arrow::Table containing a 
> FixedSizeList type Array, the child arrays of the FixedSizeLists are not 
> properly concatenated: the lengths of the child arrays are set incorrectly. 
> I ran into this when trying to ToString() the combined RecordBatch.
> This seems to be because the following function in
> cpp/arrow/array/concatenate.cc:
> {code:java}
> Result<std::vector<std::shared_ptr<const ArrayData>>> ChildData(size_t index)
> {code}
> ... which is used to calculate offsets and slice lengths before the actual 
> concatenation, doesn't take the list lengths into account.
> The bug can be reproduced by adding the following unit test to:
> cpp/arrow/array/concatenate_test.cc
>  
> {code:java}
> TEST_F(ConcatenateTest, FixedSizeListType) {
>   Check([this](int32_t size, double null_probability, std::shared_ptr<Array>* out) {
>     auto list_size = 3;
>     auto values_size = size * list_size;
>     auto values = this->GeneratePrimitive<Int8Type>(values_size, null_probability);
>     ASSERT_OK_AND_ASSIGN(*out, FixedSizeListArray::FromArrays(values, list_size));
>     ASSERT_OK((**out).ValidateFull());
>   });
> }
> {code}
> One possible approach to fix this would be to add another ChildData overload 
> to ConcatenateImpl with a multiplier parameter, and multiply the offset and 
> length of the slice by the multiplier. This function can be called by the 
> FixedSizeList visitor, which supplies the list length as the multiplier.
> I have this fix ready but would like to know if this would be the right 
> approach.
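
The multiplier idea described in the report can be illustrated with a small, self-contained sketch. It does not use the Arrow API: ToyFixedSizeList and ConcatChildValues are hypothetical names invented for this illustration, and the child data is modelled as a plain std::vector. The assumption is only that a sliced FixedSizeList stores its offset and length in units of lists, while its child array is indexed in units of elements.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a sliced FixedSizeList array (hypothetical, not Arrow):
// `offset` and `length` count lists, `values` holds the child elements.
struct ToyFixedSizeList {
  std::vector<int32_t> values;  // child data, list_size elements per list
  int64_t offset;               // index of the first list in the slice
  int64_t length;               // number of lists in the slice
};

// Hypothetical helper: concatenates the child values of several slices.
// `multiplier` scales the parent's offset and length into child units.
// multiplier == 1 mimics the behaviour described in the report;
// multiplier == list_size corresponds to the proposed fix.
std::vector<int32_t> ConcatChildValues(const std::vector<ToyFixedSizeList>& in,
                                       int64_t multiplier) {
  std::vector<int32_t> out;
  for (const auto& a : in) {
    const int64_t begin = a.offset * multiplier;
    const int64_t end = begin + a.length * multiplier;
    assert(end <= static_cast<int64_t>(a.values.size()));
    out.insert(out.end(), a.values.begin() + begin, a.values.begin() + end);
  }
  return out;
}
```

For two slices of two lists each with a list size of 3, a multiplier of 3 yields 12 child values, whereas a multiplier of 1 yields only 4, which is the kind of length mismatch that validation of the concatenated array would flag.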



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
