[ 
https://issues.apache.org/jira/browse/SPARK-44240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dzcxzl updated SPARK-44240:
---------------------------
    Attachment: topKSortFallbackThreshold.png

> Setting the topKSortFallbackThreshold value may lead to inaccurate results
> --------------------------------------------------------------------------
>
>                 Key: SPARK-44240
>                 URL: https://issues.apache.org/jira/browse/SPARK-44240
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0, 3.0.0, 3.1.0, 3.2.0, 3.3.0, 3.4.0
>            Reporter: dzcxzl
>            Priority: Minor
>         Attachments: topKSortFallbackThreshold.png
>
>
>  
> {code:java}
> set spark.sql.execution.topKSortFallbackThreshold=10000;
> SELECT min(id) FROM (SELECT id FROM range(999999999) ORDER BY id LIMIT 10000) a;
> {code}
>  
> If GlobalLimitExec is not the final operator, the shuffle read that feeds it 
> does not guarantee row order, so the rows kept by the limit may be arbitrary 
> and the query above can return an incorrect result.
> TakeOrderedAndProjectExec applies the ordering itself, so it does not have this problem.
>  
>  
>  
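The ordering problem described in the quoted report can be illustrated with a small toy model. This is a sketch in plain Python, not Spark code: the function names (`sorted_range_partitions`, `local_limit`, `global_limit_after_shuffle`, `take_ordered`), the partition count, and the `arrival_order` parameter are all hypothetical and only mimic the idea of "sorted partitions, per-partition limit, single-reducer read in arbitrary block order". It assumes that the final limit simply keeps the first k rows it receives.

```python
import heapq


def sorted_range_partitions(n, num_partitions):
    """Split 0..n-1 into contiguous, already-sorted partitions (toy stand-in for a global sort)."""
    size = -(-n // num_partitions)  # ceiling division
    return [list(range(start, min(start + size, n))) for start in range(0, n, size)]


def local_limit(partitions, k):
    """Keep at most k leading rows of every partition."""
    return [part[:k] for part in partitions]


def global_limit_after_shuffle(partitions, k, arrival_order):
    """Concatenate blocks in the order they happen to arrive, then keep the first k rows."""
    rows = [row for idx in arrival_order for row in partitions[idx]]
    return rows[:k]


def take_ordered(partitions, k):
    """Order-aware alternative: merge all candidates by key and keep the k smallest."""
    return heapq.nsmallest(k, (row for part in partitions for row in part))


parts = local_limit(sorted_range_partitions(1000, 4), 10)

# Blocks arrive in partition order: the limit happens to be correct.
in_order = global_limit_after_shuffle(parts, 10, arrival_order=[0, 1, 2, 3])
# Blocks arrive in a different order: the limit keeps the wrong rows.
out_of_order = global_limit_after_shuffle(parts, 10, arrival_order=[2, 0, 1, 3])
# An operator that re-applies the ordering is unaffected by arrival order.
ordered = take_ordered(parts, 10)

print(min(in_order), min(out_of_order), min(ordered))  # prints: 0 500 0
```

In this model the limit is only correct when blocks happen to arrive in partition order, while the order-aware merge returns the same rows for any arrival order. That matches the contrast the report draws between GlobalLimitExec and TakeOrderedAndProjectExec.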



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

