jorisvandenbossche commented on a change in pull request #12178:
URL: https://github.com/apache/arrow/pull/12178#discussion_r796442124



##########
File path: python/pyarrow/tests/test_pandas.py
##########
@@ -4082,6 +4082,66 @@ def test_array_to_pandas():
         # tm.assert_series_equal(result, expected)
 
 
+def test_to_pandas_types_mapper():

Review comment:
       ```suggestion
   def test_array_to_pandas_types_mapper():
   ```
   
   (to differentiate it from the tests for `Table.to_pandas`)

##########
File path: python/pyarrow/tests/test_pandas.py
##########
@@ -4082,6 +4082,66 @@ def test_array_to_pandas():
         # tm.assert_series_equal(result, expected)
 
 
+def test_to_pandas_types_mapper():
+    # https://issues.apache.org/jira/browse/ARROW-9664
+    if Version(pd.__version__) < Version("1.0.0"):
+        pytest.skip("ExtensionDtype to_pandas method missing")
+
+    data = pa.array([1, 2, 3], pa.int64())
+
+    # Test with mapper function
+    types_mapper = {pa.int64(): pd.Int64Dtype()}.get
+    result = data.to_pandas(types_mapper=types_mapper)
+    assert result.dtype == types_mapper(data.type)
+
+    # Test mapper function returning None
+    types_mapper = {pa.int64(): None}.get
+    result = data.to_pandas(types_mapper=types_mapper)
+    assert result.dtype == data.type.to_pandas_dtype()
+
+    # Test mapper function not containing the dtype
+    types_mapper = {pa.float64(): pd.Float64Dtype()}.get
+    result = data.to_pandas(types_mapper=types_mapper)
+    assert result.dtype == data.type.to_pandas_dtype()
+
+    # Test for the interval extension dtype
+    # -> ignores mapping and uses default conversion
+    types_mapper = {pa.float64(): pd.IntervalDtype()}.get
+    result = data.to_pandas(types_mapper=types_mapper)

Review comment:
       For this test, it would be good to check whether we can actually 
round-trip a pandas `IntervalDtype`. As written, the test is basically the same 
as the case above: the mapper only has a `pa.float64()` key, so it never 
matches the int64 data. So if we start from 
   
   ```
   import pandas as pd
   import pyarrow as pa

   interval = pd.Series(pd.interval_range(0, 5, 5))
   data = pa.array(interval)
   ```
   
   Can we call `data.to_pandas(..)` in some way that gives back the original 
pandas `interval` series? 
   
   This might not be straightforward, since you need to know the exact struct 
type that was created. That makes me wonder whether we should change the 
interface a bit: `types_mapper` makes sense for Table conversion, where many 
columns can share a type, but for an Array there is only one result dtype you 
might want to enforce.



