landlord-matt commented on code in PR #43871:
URL: https://github.com/apache/spark/pull/43871#discussion_r1399028056
##########
python/pyspark/sql/functions.py:
##########
@@ -14102,7 +14102,7 @@ def sort_array(col: "ColumnOrName", asc: bool = True) -> Column:
Collection function: sorts the input array in ascending or descending order according
to the natural ordering of the array elements. Null elements will be placed at the beginning
of the returned array in ascending order or at the end of the returned array in descending
- order.
+ order. The natural order of a struct is the natural order of the first field in the struct schema.
Review Comment:
Maybe you are right and StackOverflow is wrong. I'm not very good at Scala,
but I think it is decided
[here](https://github.com/apache/spark/blob/398bff77b1c837aed55f4f10ff1157f7da813570/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateOrdering.scala#L58)
or perhaps
[here](https://github.com/apache/spark/blob/398bff77b1c837aed55f4f10ff1157f7da813570/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala#L318).
Maybe it sorts on the first field first, then on the second, and so on.
In that case, a more correct description would be:
```suggestion
order. The natural order of a struct is the natural order of the
underlying fields of the struct schema in positional order.
```
I'm not 100% sure this is true either. Are we sure the fields come back in
positional order when the code iterates through them?
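
To make the hypothesis concrete, here is a minimal pure-Python sketch of what "positional order" would mean for struct comparison. It is a model of the behaviour I am guessing at, not Spark's implementation: structs are modelled as tuples, the helper names (`compare_fields`, `compare_structs`, `sort_structs`) are made up for illustration, and treating a null field as smaller than any value is my assumption.

```python
from functools import cmp_to_key
from typing import List, Optional, Tuple

# Assumption: a struct is modelled as a tuple of nullable int fields.
Struct = Tuple[Optional[int], ...]


def compare_fields(a: Optional[int], b: Optional[int]) -> int:
    """Compare two field values; None is assumed to sort before any value."""
    if a is None and b is None:
        return 0
    if a is None:
        return -1
    if b is None:
        return 1
    return (a > b) - (a < b)


def compare_structs(left: Struct, right: Struct) -> int:
    """Walk the fields in positional order; the first difference decides."""
    for a, b in zip(left, right):
        result = compare_fields(a, b)
        if result != 0:
            return result
    return 0


def sort_structs(structs: List[Struct], asc: bool = True) -> List[Struct]:
    """Hypothetical stand-in for sort_array on an array of structs."""
    return sorted(structs, key=cmp_to_key(compare_structs), reverse=not asc)


rows = [(2, 1), (1, 9), (1, 3), (None, 5)]
ascending = sort_structs(rows)
descending = sort_structs(rows, asc=False)
print(ascending)
print(descending)
```

If the hypothesis holds, `(1, 3)` sorts before `(1, 9)` because the tie on the first field is broken by the second. Running `sort_array` on an equivalent array of structs in a real Spark session would be the actual check.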