gabrielsimoes opened a new pull request, #48813:
URL: https://github.com/apache/arrow/pull/48813

   ### Rationale for this change
   
   Fixes https://github.com/apache/arrow/issues/40053
   
   When converting Python dictionaries to PyArrow arrays, struct fields are 
sorted alphabetically instead of preserving the original dictionary key 
insertion order. Dictionaries have preserved insertion order since Python 3.7, 
and users expect PyArrow to keep that order.
   
   ```python
   >>> import pyarrow as pa
   >>> pa.array([{"b": 2, "a": 1}]).type
   struct<a: int64, b: int64>
   ```
   
   Expected: `struct<b: int64, a: int64>`
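
For reference, the two orderings in question can be reproduced in plain Python without PyArrow. This is only an illustration of the semantics; the variable names are made up for the example.

```python
# A record whose keys were inserted in the order "b", then "a".
record = {"b": 2, "a": 1}

# Iterating a dict yields keys in insertion order (guaranteed since Python 3.7).
insertion_order = list(record)

# Sorting the keys gives the alphabetical order that the inferred struct type
# currently uses.
sorted_order = sorted(record)

print(insertion_order, sorted_order)
```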
   
   ### What changes are included in this PR?
   
   Replace `std::map<std::string, TypeInferrer>` with 
`std::vector<std::pair<std::string, TypeInferrer>>` + 
`std::unordered_map<std::string, size_t>` in the type inference code. This 
follows the same pattern used in Arrow's JSON parser 
(`cpp/src/arrow/json/parser.cc`) for the same problem.
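
The vector-plus-index pattern can be sketched in Python as follows. This is a simplified model, not the actual C++ change: `OrderedFieldMap`, `get_or_insert`, and `names` are hypothetical names, and a plain list of type names stands in for `TypeInferrer`. The list plays the role of the `std::vector` (it keeps fields in first-seen order) and the dict plays the role of the `std::unordered_map` (it maps a field name to its position in the list).

```python
class OrderedFieldMap:
    """Name -> value map that remembers first-seen order (illustrative only)."""

    def __init__(self):
        self._fields = []  # (name, value) pairs in first-seen order
        self._index = {}   # name -> position in self._fields

    def get_or_insert(self, name, factory):
        """Return the value stored for `name`, creating it on first sight."""
        pos = self._index.get(name)
        if pos is None:
            pos = len(self._fields)
            self._index[name] = pos
            self._fields.append((name, factory()))
        return self._fields[pos][1]

    def names(self):
        """Field names in the order they were first inserted."""
        return [name for name, _ in self._fields]


# Visit the keys of several rows; a list of observed Python type names
# stands in for the per-field inferrer.
fields = OrderedFieldMap()
for row in [{"b": 2, "a": 1}, {"a": 3, "c": 4}]:
    for key, value in row.items():
        fields.get_or_insert(key, list).append(type(value).__name__)

field_names = fields.names()
print(field_names)
```

In this sketch, a key that first appears in a later row is appended after the fields already seen. Whether the PR orders such fields the same way is an assumption here.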
   
   ### Are these changes tested?
   
   Yes. Added `test_struct_from_dicts_field_order` test and updated existing 
tests to verify field ordering.
   
   ### Are there any user-facing changes?
   
   Yes. Struct field order now matches dictionary key insertion order instead 
of being sorted alphabetically. This is a behavioral change but aligns with 
user expectations and Python semantics.

