[ 
https://issues.apache.org/jira/browse/SPARK-46349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Majid Hajiheidari updated SPARK-46349:
--------------------------------------
    Summary: Prevent Multiple SortOrders for an Expression  (was: Prevent 
SortOrder from Accepting Nested SortOrder Instances)

> Prevent Multiple SortOrders for an Expression
> ---------------------------------------------
>
>                 Key: SPARK-46349
>                 URL: https://issues.apache.org/jira/browse/SPARK-46349
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 4.0.0
>            Reporter: Majid Hajiheidari
>            Priority: Minor
>
> Hello everyone,
> This is my first contribution to the project. I welcome any feedback and 
> edits to improve this pull request.
> Currently, it is possible to create redundant sort expressions in both the 
> Scala and Python APIs, leading to potentially incorrect and confusing SQL 
> statements. For example:
> Scala:
> {code:scala}
> import spark.implicits._  // enables the $"id" column syntax
> spark.range(10).orderBy($"id".desc.asc).show()
> {code}
> Python:
> {code:python}
> from pyspark.sql import functions as f
> spark.range(10).orderBy(f.desc('id'), ascending=False).show()
> {code}
>  
> Such usage generates SQL such as ORDER BY id DESC NULLS LAST DESC NULLS 
> LAST, which results in non-descriptive error messages.
> I created a pull request to handle the issue. It introduces a constraint in 
> the SortOrder class, ensuring that its child cannot be another instance of 
> SortOrder. This change prevents the creation of nested, redundant sort 
> expressions.
> Additionally, PySpark's DataFrame.sort accepts an ascending keyword 
> argument that can conflict with expressions that already specify a sort 
> direction. I've added an exception handler to generate more descriptive 
> error messages in such cases.
> A test case has been added to verify that no double ordering occurs after 
> this fix.
>  
> I look forward to your feedback and thank you for considering this 
> contribution.
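
The two checks described above can be illustrated with a small, self-contained Python model. This is only a sketch and not the code from the pull request: `Column`, `SortOrder`, and `sort_exprs` are hypothetical stand-ins for Spark's expression classes and for `DataFrame.sort`, and the error messages are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a column expression with no sort direction."""
    name: str


@dataclass(frozen=True)
class SortOrder:
    """Hypothetical stand-in for a sort expression: a child plus a direction."""
    child: Union[Column, "SortOrder"]
    ascending: bool = True

    def __post_init__(self) -> None:
        # Constraint from the description: the child may not be a SortOrder.
        if isinstance(self.child, SortOrder):
            raise ValueError(
                f"'{self.child.sql()}' already has a sort direction and "
                "cannot be wrapped in another sort order."
            )

    def sql(self) -> str:
        # Mirrors Spark's default null ordering for each direction.
        direction = "ASC NULLS FIRST" if self.ascending else "DESC NULLS LAST"
        return f"{self.child.name} {direction}"


def sort_exprs(
    cols: List[Union[Column, SortOrder]],
    ascending: Optional[bool] = None,
) -> List[SortOrder]:
    """Toy analogue of DataFrame.sort: reject `ascending` on ordered inputs."""
    result = []
    for col in cols:
        if isinstance(col, SortOrder):
            if ascending is not None:
                # Descriptive error instead of silently ordering twice.
                raise ValueError(
                    f"'{col.sql()}' is already ordered; do not also pass "
                    "the 'ascending' argument for it."
                )
            result.append(col)
        else:
            direction = True if ascending is None else ascending
            result.append(SortOrder(col, direction))
    return result
```

In this model, `SortOrder(SortOrder(Column("id"), False), True)` and `sort_exprs([SortOrder(Column("id"), False)], ascending=False)` both raise a `ValueError` that names the offending expression, while `sort_exprs([Column("id")], ascending=False)` yields a single descending sort order.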



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
