jorisvandenbossche commented on a change in pull request #10101:
URL: https://github.com/apache/arrow/pull/10101#discussion_r618363682
##########
File path: python/pyarrow/tests/test_array.py
##########
@@ -688,6 +688,69 @@ def test_dictionary_from_numpy():
assert d2[i].as_py() == dictionary[indices[i]]
+def test_dictionary_to_numpy():
+ expected = pa.array(
+ ["foo", "bar", None, "foo"]
+ ).to_numpy(zero_copy_only=False)
+ a = pa.DictionaryArray.from_arrays(
+ pa.array([0, 1, None, 0]),
+ pa.array(['foo', 'bar'])
+ )
+ assert (a.to_numpy(zero_copy_only=False) == expected).all()
Review comment:
I am not sure if we are very consistent about it, but we also use
`np.testing.assert_array_equal` for asserting numpy arrays
##########
File path: python/pyarrow/tests/test_array.py
##########
@@ -688,6 +688,69 @@ def test_dictionary_from_numpy():
assert d2[i].as_py() == dictionary[indices[i]]
+def test_dictionary_to_numpy():
+ expected = pa.array(
+ ["foo", "bar", None, "foo"]
+ ).to_numpy(zero_copy_only=False)
+ a = pa.DictionaryArray.from_arrays(
+ pa.array([0, 1, None, 0]),
+ pa.array(['foo', 'bar'])
+ )
+ assert (a.to_numpy(zero_copy_only=False) == expected).all()
+
+ with pytest.raises(pa.ArrowInvalid):
+ # to_numpy takes for granted that when zero_copy_only=True
+ # there will be no nulls.
+ # If that changes, nulls handling will have to be updated in to_numpy
+ # as we won't be able to rely anymore on decoding the DictionaryArray
+ # to handle nulls.
Review comment:
I don't fully understand this comment about nulls. For this specific
case isn't it just testing that conversion to numpy can never be done zero-copy
for dictionary (nulls or no nulls) and is thus raising an error? (this already
worked correctly on master as well)
##########
File path: python/pyarrow/tests/test_array.py
##########
@@ -688,6 +688,69 @@ def test_dictionary_from_numpy():
assert d2[i].as_py() == dictionary[indices[i]]
+def test_dictionary_to_numpy():
+ expected = pa.array(
+ ["foo", "bar", None, "foo"]
+ ).to_numpy(zero_copy_only=False)
+ a = pa.DictionaryArray.from_arrays(
+ pa.array([0, 1, None, 0]),
+ pa.array(['foo', 'bar'])
+ )
+ assert (a.to_numpy(zero_copy_only=False) == expected).all()
+
+ with pytest.raises(pa.ArrowInvalid):
+ # to_numpy takes for granted that when zero_copy_only=True
+ # there will be no nulls.
+ # If that changes, nulls handling will have to be updated in to_numpy
+ # as we won't be able to rely anymore on decoding the DictionaryArray
+ # to handle nulls.
+ a.to_numpy(zero_copy_only=True)
+
+ anonulls = pa.DictionaryArray.from_arrays(
+ pa.array([0, 1, 1, 0]),
+ pa.array(['foo', 'bar'])
+ )
+ expected = pa.array(
+ ["foo", "bar", "bar", "foo"]
+ ).to_numpy(zero_copy_only=False)
Review comment:
```suggestion
expected = np.array(["foo", "bar", "bar", "foo"], dtype=object)
```
I think we can just define the expected here? (it's also a bit simpler in
code)
(and the same for the others)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]