[
https://issues.apache.org/jira/browse/ARROW-8508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089198#comment-17089198
]
Mark Hildreth commented on ARROW-8508:
--------------------------------------
I believe there are a few things going on here:
1.) I wouldn't consider myself an expert on these APIs, but it seems like the
builders are being used correctly.
2.) The debug output definitely appears broken. I opened a [PR to fix
this|https://github.com/apache/arrow/pull/7006], which brings it more in line
with how the non-fixed-size *ListArray* works. This should fix the *value()*
method on the *FixedSizeListArray* so that it properly takes the offset into
the child array into account.
3.) As for the failing asserts, I'm less certain. The values checked in these
asserts are taken from the *values()* method, which seems to just return the
underlying array without taking offsets into account. This seems to be similar
to how other arrays work (including primitives), so my guess is that it is by
design. I don't know of a better way to use the API, so maybe someone else can
provide input.
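A minimal sketch of the distinction in point 3, written in plain Rust with no arrow dependency. The type *FixedSizeListView*, its fields, and the helper *second_multi_point* are hypothetical stand-ins, not the arrow crate's API. The sketch only assumes that a list entry is a view (offset and length) over one shared child buffer, so an accessor that ignores the offset always starts at the beginning of that buffer.

```rust
/// Hypothetical stand-in for a fixed-size list array: a view over a shared
/// child buffer. This is NOT the arrow crate's FixedSizeListArray.
struct FixedSizeListView<'a> {
    child: &'a [f64],    // the whole underlying child buffer
    value_length: usize, // child items per list entry (2 for a coordinate)
    offset: usize,       // index of the first list entry covered by this view
    len: usize,          // number of list entries covered by this view
}

impl<'a> FixedSizeListView<'a> {
    /// Returns the whole child buffer and ignores `offset`, which is the
    /// behaviour assumed for `values()` in point 3.
    fn values(&self) -> &'a [f64] {
        self.child
    }

    /// Offset-aware access to list entry `i` of this view.
    fn value(&self, i: usize) -> &'a [f64] {
        assert!(i < self.len, "entry index out of bounds");
        let start = (self.offset + i) * self.value_length;
        &self.child[start..start + self.value_length]
    }

    /// Offset-aware flattened child values covered by this view.
    fn sliced_values(&self) -> &'a [f64] {
        let start = self.offset * self.value_length;
        &self.child[start..start + self.len * self.value_length]
    }
}

/// Hypothetical helper: the second multi point from the issue, which covers
/// coordinates 2..5 of the shared child buffer.
fn second_multi_point(child: &[f64]) -> FixedSizeListView<'_> {
    FixedSizeListView { child, value_length: 2, offset: 2, len: 3 }
}
```

With the ten floats from the issue as the child buffer, *values()* on the second view still starts at 0.0, which matches the failing assert, while *sliced_values()* covers 2.0 through 4.1.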
> [Rust] ListBuilder of FixedSizeListBuilder creates wrong offsets
> ----------------------------------------------------------------
>
> Key: ARROW-8508
> URL: https://issues.apache.org/jira/browse/ARROW-8508
> Project: Apache Arrow
> Issue Type: Bug
> Components: Rust
> Affects Versions: 0.16.0
> Reporter: Christian Beilschmidt
> Priority: Major
> Labels: pull-request-available
>
> I created an example of storing multi points with Arrow.
> # A coordinate consists of two floats (Float64Builder)
> # A multi point consists of one or more coordinates (FixedSizeListBuilder)
> # A list of multi points consists of multiple multi points (ListBuilder)
> This is the corresponding code snippet:
> {code:java}
> let float_builder = arrow::array::Float64Builder::new(0);
> let coordinate_builder =
>     arrow::array::FixedSizeListBuilder::new(float_builder, 2);
> let mut multi_point_builder =
>     arrow::array::ListBuilder::new(coordinate_builder);
>
> // First multi point: two coordinates.
> multi_point_builder
>     .values()
>     .values()
>     .append_slice(&[0.0, 0.1])
>     .unwrap();
> multi_point_builder.values().append(true).unwrap();
> multi_point_builder
>     .values()
>     .values()
>     .append_slice(&[1.0, 1.1])
>     .unwrap();
> multi_point_builder.values().append(true).unwrap();
> multi_point_builder.append(true).unwrap(); // first multi point
>
> // Second multi point: three coordinates.
> multi_point_builder
>     .values()
>     .values()
>     .append_slice(&[2.0, 2.1])
>     .unwrap();
> multi_point_builder.values().append(true).unwrap();
> multi_point_builder
>     .values()
>     .values()
>     .append_slice(&[3.0, 3.1])
>     .unwrap();
> multi_point_builder.values().append(true).unwrap();
> multi_point_builder
>     .values()
>     .values()
>     .append_slice(&[4.0, 4.1])
>     .unwrap();
> multi_point_builder.values().append(true).unwrap();
> multi_point_builder.append(true).unwrap(); // second multi point
>
> let multi_point = dbg!(multi_point_builder.finish());
>
> // Check the coordinates of the first multi point.
> let first_multi_point_ref = multi_point.value(0);
> let first_multi_point: &arrow::array::FixedSizeListArray =
>     first_multi_point_ref.as_any().downcast_ref().unwrap();
> let coordinates_ref = first_multi_point.values();
> let coordinates: &arrow::array::Float64Array =
>     coordinates_ref.as_any().downcast_ref().unwrap();
> assert_eq!(coordinates.value_slice(0, 2 * 2), &[0.0, 0.1, 1.0, 1.1]);
>
> // Check the coordinates of the second multi point.
> let second_multi_point_ref = multi_point.value(1);
> let second_multi_point: &arrow::array::FixedSizeListArray =
>     second_multi_point_ref.as_any().downcast_ref().unwrap();
> let coordinates_ref = second_multi_point.values();
> let coordinates: &arrow::array::Float64Array =
>     coordinates_ref.as_any().downcast_ref().unwrap();
> assert_eq!(
>     coordinates.value_slice(0, 2 * 3),
>     &[2.0, 2.1, 3.0, 3.1, 4.0, 4.1]
> );
> {code}
> The second assertion fails and the output is {{[0.0, 0.1, 1.0, 1.1, 2.0,
> 2.1]}}.
> Moreover, the debug output produced from {{dbg!}} confirms this:
> {noformat}
> [
> FixedSizeListArray<2>
> [
> PrimitiveArray<Float64>
> [
> 0.0,
> 0.1,
> ],
> PrimitiveArray<Float64>
> [
> 1.0,
> 1.1,
> ],
> ],
> FixedSizeListArray<2>
> [
> PrimitiveArray<Float64>
> [
> 0.0,
> 0.1,
> ],
> PrimitiveArray<Float64>
> [
> 1.0,
> 1.1,
> ],
> PrimitiveArray<Float64>
> [
> 2.0,
> 2.1,
> ],
> ],
> ]{noformat}
> The second list should contain the values 2-4.
>
> So either I am using the builder wrong or there is a bug with the offsets. I
> used {{0.16}} as well as the current {{master}} from GitHub.