thisisnic commented on PR #45818:
URL: https://github.com/apache/arrow/pull/45818#issuecomment-2823427616

   @pitrou - I've implemented most of the changes suggested in your original 
issue, but the one I'm stuck on is the example you gave:
   
   ```
   >>> a, b = pc.min_max([1,2,3])
   >>> a  # expecting 1
   'min'
   >>> b  # expecting 3
   'max'
   ```
   
   This happens because the compute function returns a StructScalar, and the 
`__iter__()` method for StructScalars returns keys rather than values.  This 
is analogous to how Python dictionaries work, e.g.
   
   ```
   >>> a, b = {'min': 1, 'max': 3}
   >>> a
   'min'
   >>> b
   'max'
   ```
   
   I tried updating it to return values instead of keys, but this breaks 
several existing tests, since it fundamentally changes how StructScalars 
behave:
   
   ```
   1. FAILED pyarrow/tests/test_scalars.py::test_basics[builtin_pickle-value35-None-StructScalar] - TypeError: Expected integer or string index
   2. FAILED pyarrow/tests/test_scalars.py::test_map[builtin_pickle] - ValueError: Converting to Python dictionary is not supported when duplicate field names are present
   3. FAILED pyarrow/tests/test_scalars.py::test_struct - AssertionError: assert [2, 3.5] == ['x', 'y']
   4. FAILED pyarrow/tests/test_scalars.py::test_struct_duplicate_fields - AssertionError: assert [1, 2.0, 3] == ['x', 'y', 'x']
   5. FAILED pyarrow/tests/test_scalars.py::test_nested_map_types_with_maps_as_pydicts - TypeError: Expected integer or string index
   ```
   
   It feels like this could have significant user impact.  Is it desirable to 
make such an in-depth change to how StructScalars work, or should we just 
document that users can call something like `items()` to get these values, 
like this?
   
   ```
   >>> a, b = pc.min_max([1,2,3]).items()
   >>> a 
   ('min', <pyarrow.Int64Scalar: 1>)
   >>> b 
   ('max', <pyarrow.Int64Scalar: 3>)
   ```
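
To make the trade-off concrete, here is a minimal pure-Python sketch of the mapping-style behaviour described above. `FakeStructScalar` is a hypothetical stand-in written only for illustration, not pyarrow code. It shows that iteration yields keys while `values()` or lookup by field name gives the numbers directly. If StructScalar exposes the usual mapping methods (the `items()` call above suggests it does), the same pattern would presumably apply there, but that is an assumption.

```python
from collections.abc import Mapping


class FakeStructScalar(Mapping):
    """Hypothetical stand-in for a struct scalar; not pyarrow code."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def __getitem__(self, key):
        return self._fields[key]

    def __iter__(self):
        # Like dict, iterating yields the field names (keys), not the values.
        return iter(self._fields)

    def __len__(self):
        return len(self._fields)


result = FakeStructScalar({"min": 1, "max": 3})

# Plain iteration (and therefore tuple unpacking) gives the keys.
keys = list(result)

# The Mapping mixin supplies values(), so unpacking it gives the values.
lo, hi = result.values()

# Lookup by field name does not depend on field order.
by_name = (result["min"], result["max"])
```

Lookup by field name is the more defensive of the two options, since it keeps working even if the field order changes.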

