[GitHub] [spark] brandondahler opened a new pull request, #37361: [SPARK-39925][SQL] Add array_sort(column, comparator) overload to DataFrame operations

GitBox Mon, 01 Aug 2022 05:59:23 -0700


brandondahler opened a new pull request, #37361:
URL: https://github.com/apache/spark/pull/37361


   ### What changes were proposed in this pull request?
   Adds a new `array_sort(column, comparator)` overload to 
`org.apache.spark.sql.functions` that matches the SQL expression overload 
defined in [SPARK-29020](https://issues.apache.org/jira/browse/SPARK-29020) 
and added via #25728.
   
   ### Why are the changes needed?
   Exposes the comparator overload to users of the DataFrame API so that they 
no longer need the `expr` escape hatch.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. Users can now optionally pass a comparator function to `array_sort`, 
which makes it possible to sort in descending order and to sort items that 
have no natural ordering.
   
   #### Example:
   Old:
   ```
   df.selectExpr("array_sort(a, (x, y) -> cardinality(x) - cardinality(y))");
   ```
   
   Added:
   ```
   df.select(array_sort(col("a"), (x, y) => size(x) - size(y)));
   ```
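   
   A slightly fuller sketch of how the overload might be used, assuming a 
Spark build that includes it. The object name `ArraySortComparatorSketch`, the 
helper names, the column name `a`, and the sample data are hypothetical and 
only serve the illustration. The comparator returns a negative, zero, or 
positive integer column, as in the SQL lambda form.
   
   ```scala
   import org.apache.spark.sql.{DataFrame, SparkSession}
   import org.apache.spark.sql.functions.{array_sort, col, size, when}

   // Hypothetical helper object, not part of this PR.
   object ArraySortComparatorSketch {

     // Sort an array of arrays by element length, shortest first.
     def sortBySize(df: DataFrame): DataFrame =
       df.select(array_sort(col("a"), (x, y) => size(x) - size(y)).as("sorted"))

     // Sort an array of numbers in descending order by inverting the
     // usual comparison: return -1 when x should come before y.
     def sortDescending(df: DataFrame): DataFrame =
       df.select(
         array_sort(col("a"), (x, y) => when(x > y, -1).when(x < y, 1).otherwise(0))
           .as("sorted"))

     def main(args: Array[String]): Unit = {
       // Local session for illustration only.
       val spark = SparkSession.builder()
         .master("local[1]")
         .appName("array-sort-comparator-sketch")
         .getOrCreate()
       import spark.implicits._

       val nested = Seq(Seq(Seq(1, 2, 3), Seq(1), Seq(1, 2))).toDF("a")
       sortBySize(nested).show(false)

       val numbers = Seq(Seq(3, 1, 2)).toDF("a")
       sortDescending(numbers).show(false)

       spark.stop()
     }
   }
   ```
   
   The descending variant uses `when`/`otherwise` rather than subtraction so 
that the comparator's result does not depend on the difference of two values 
fitting in an integer.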
   
   ### How was this patch tested?
   Unit tests were updated to validate that the new overload matches the 
expression's behavior.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

