jorisvandenbossche commented on code in PR #15210:
URL: https://github.com/apache/arrow/pull/15210#discussion_r1064847072
##########
cpp/src/arrow/array/array_list_test.cc:
##########
@@ -509,6 +509,18 @@ class TestListArray : public ::testing::Test {
ASSERT_RAISES(Invalid, ValidateOffsets(2, {0, 7, 4}, values));
}
+ void TestSliced() {
+ auto arr = ArrayFromJSON(list(int16()), "[[1, 2], [3, 4, 5], [6], [7, 8]]");
+
+ auto arr_sliced = arr->Slice(0, 2);
+ auto expected_sliced = ArrayFromJSON(list(int16()), "[[1, 2], [3, 4, 5]]");
+ AssertArraysEqual(*expected_sliced, *arr_sliced);
+
+ auto values = checked_cast<ListArray*>(arr_sliced.get())->values();
+ auto expected_values = ArrayFromJSON(int16(), "[1, 2, 3, 4, 5]");
+ AssertArraysEqual(*expected_values, *values);
Review Comment:
Yes, but your example uses an actual null inside a list, not a "null list".
For example, here the first null is not in the flattened output:
```
In [3]: arr = pa.array([[1, 2], None, [3, None]])
In [4]: arr.flatten()
Out[4]:
<pyarrow.lib.Int64Array object at 0x7f8b1fbcaa40>
[
1,
2,
3,
null
]
```
And I suppose there can be arbitrary data in the values array behind that null
list (by default, when constructed as above, there is no data behind it, so the
offset doesn't increment for that list element):
```
In [5]: arr.offsets
Out[5]:
<pyarrow.lib.Int32Array object at 0x7f8b4f0f5a80>
[
0,
2,
2,
4
]
```
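To make the role of the offsets concrete, here is a minimal pure-Python sketch of how offsets and validity select slices of the child values. It is only a model of the layout, not pyarrow code: `decode_lists` is a hypothetical helper, and the plain lists stand in for the Arrow buffers shown above.

```python
def decode_lists(offsets, values, validity):
    """Rebuild Python lists from Arrow-style offsets, child values and validity.

    Hypothetical helper for illustration only; not part of pyarrow.
    """
    out = []
    for i, valid in enumerate(validity):
        start, end = offsets[i], offsets[i + 1]
        # A null list yields None regardless of what its offset range covers.
        out.append(values[start:end] if valid else None)
    return out


# Same layout as the example above: the null list spans the empty range 2..2.
offsets = [0, 2, 2, 4]
values = [1, 2, 3, None]
validity = [True, False, True]

decoded = decode_lists(offsets, values, validity)
print(decoded)
```

Nothing in this model requires the null slot to have a zero-length range, which is why data can sit behind it.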
A null list with data behind it is a bit tricky to construct manually, but something like this works:
```
In [10]: arr = pa.ListArray.from_arrays(pa.array([0, 2, 4, 6]), pa.array([1, 2, 99, 99, 3, None]), mask=pa.array([False, True, False]))
In [11]: arr
Out[11]:
<pyarrow.lib.ListArray object at 0x7f8b1fbcba60>
[
[
1,
2
],
null,
[
3,
null
]
]
In [12]: arr.flatten()
Out[12]:
<pyarrow.lib.Int64Array object at 0x7f8b4f065960>
[
1,
2,
3,
null
]
In [13]: arr.values
Out[13]:
<pyarrow.lib.Int64Array object at 0x7f8b4f065780>
[
1,
2,
99,
99,
3,
null
]
In [14]: arr.offsets
Out[14]:
<pyarrow.lib.Int32Array object at 0x7f8b1fbc9300>
[
0,
2,
4,
6
]
```
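The difference between the two outputs above can be sketched in plain Python. This is a rough model under stated assumptions, not how pyarrow implements `flatten()`: `flatten_model` is a hypothetical name, and `validity` is the inverse of the `mask` passed to `from_arrays` (where `True` marks a null list).

```python
def flatten_model(offsets, values, validity):
    """Concatenate the child values of valid lists only.

    Hypothetical illustration; not pyarrow's implementation.
    """
    flat = []
    for i, valid in enumerate(validity):
        if valid:
            # Only ranges that belong to non-null lists are copied.
            flat.extend(values[offsets[i]:offsets[i + 1]])
    return flat


offsets = [0, 2, 4, 6]
values = [1, 2, 99, 99, 3, None]   # the two 99s sit behind the null list
validity = [True, False, True]     # inverse of mask=[False, True, False]

flat = flatten_model(offsets, values, validity)
print(flat)     # data behind the null list is skipped
print(values)   # the raw child values still contain it
```

In this model the flattened result is not a contiguous slice of the child values, so it cannot be indexed with the original offsets.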
So I am not sure whether, for this case of converting to numpy, you actually
need the flattened values (with the null lists removed) or not.