thisisnic commented on PR #45818: URL: https://github.com/apache/arrow/pull/45818#issuecomment-2823427616
@pitrou - I've implemented most of the changes suggested in your original issue, but the one I'm stuck on is the example you gave:

```
>>> a, b = pc.min_max([1,2,3])
>>> a # expecting 1
'min'
>>> b # expecting 3
'max'
```

This happens because the compute function returns a StructScalar, and the `__iter__()` method for StructScalars returns keys, not values. This seems analogous to how Python dictionaries work, e.g.

```
>>> a, b = {'min': 1, 'max': 3}
>>> a
'min'
>>> b
'max'
```

I tried updating it to return values instead of keys, but this introduces a load of breaking changes in the tests, as we're now fundamentally changing how StructScalars work.

```
FAILED pyarrow/tests/test_scalars.py::test_basics[builtin_pickle-value35-None-StructScalar] - TypeError: Expected integer or string index
FAILED pyarrow/tests/test_scalars.py::test_map[builtin_pickle] - ValueError: Converting to Python dictionary is not supported when duplicate field names are present
FAILED pyarrow/tests/test_scalars.py::test_struct - AssertionError: assert [2, 3.5] == ['x', 'y']
FAILED pyarrow/tests/test_scalars.py::test_struct_duplicate_fields - AssertionError: assert [1, 2.0, 3] == ['x', 'y', 'x']
FAILED pyarrow/tests/test_scalars.py::test_nested_map_types_with_maps_as_pydicts - TypeError: Expected integer or string index
```

It feels like this could have significant user impact. Is it desirable to make such an in-depth change to how StructScalars work, or do you reckon we should just document that users should use something like `items()` to get these values, like this?

```
>>> a, b = pc.min_max([1,2,3]).items()
>>> a
('min', <pyarrow.Int64Scalar: 1>)
>>> b
('max', <pyarrow.Int64Scalar: 3>)
```
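To make the key-versus-value behaviour concrete, here is a minimal sketch using only the standard library. `FakeStructScalar` is a hypothetical stand-in written for illustration, not pyarrow's implementation; it only assumes that a struct-like scalar follows the `collections.abc.Mapping` protocol, iterating over field names the way a dict does.

```python
from collections.abc import Mapping


class FakeStructScalar(Mapping):
    """Hypothetical stand-in for a struct-like scalar; not pyarrow's class."""

    def __init__(self, fields):
        # fields: ordered list of (name, value) pairs
        self._names = [name for name, _ in fields]
        self._values = [value for _, value in fields]

    def __getitem__(self, key):
        # Accept either a position or a field name
        if isinstance(key, int):
            return self._values[key]
        if key not in self._names:
            raise KeyError(key)
        return self._values[self._names.index(key)]

    def __iter__(self):
        # Iterating yields field names, like a dict
        return iter(self._names)

    def __len__(self):
        return len(self._names)


result = FakeStructScalar([("min", 1), ("max", 3)])

a, b = result                 # unpacking iterates over the keys
lo, hi = result.values()      # Mapping.values() yields the values instead
pairs = list(result.items())  # (name, value) pairs, as with items() above
```

Under this assumption, unpacking `values()` would give the `a, b` behaviour from the original example without changing what `__iter__()` returns.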