dzcxzl created SPARK-44240:
------------------------------

             Summary: Setting the topKSortFallbackThreshold value may lead to inaccurate results
                 Key: SPARK-44240
                 URL: https://issues.apache.org/jira/browse/SPARK-44240
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.4.0, 3.3.0, 3.2.0, 3.1.0, 3.0.0, 2.4.0
            Reporter: dzcxzl


 
{code:java}
set spark.sql.execution.topKSortFallbackThreshold=10000;
SELECT min(id) FROM (SELECT id FROM range(999999999) ORDER BY id LIMIT 10000) a;
{code}

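With the threshold set to 10000, the LIMIT of 10000 is no longer below it, so
the query falls back to a global sort followed by GlobalLimitExec instead of a
top-K sort. If GlobalLimitExec is not the final operator, the shuffle read that
feeds it does not guarantee row order, so the rows the limit keeps may be
arbitrary rather than the first 10000 in sorted order.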

TakeOrderedAndProjectExec applies the ordering itself when it selects rows, so
it does not have this problem.
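
The difference between the two plans can be illustrated with a small
plain-Python model. This is only a sketch and not Spark code: the function
names, the three-partition layout, and the explicit arrival_order parameter
are assumptions made for illustration. It assumes the limit simply keeps the
first rows it receives, in whatever order the shuffle blocks arrive.

```python
import heapq

def global_limit_after_shuffle(sorted_partitions, limit, arrival_order):
    """Model of sort + local limit + single-partition shuffle + global limit."""
    # Local limit: each range partition is already sorted, keep its first rows.
    local = [part[:limit] for part in sorted_partitions]
    # Assumption: the reader concatenates blocks in arrival order, which
    # need not match the partition order.
    received = [row for idx in arrival_order for row in local[idx]]
    # Global limit: keep the first `limit` rows as received, without re-sorting.
    return received[:limit]

def take_ordered(sorted_partitions, limit):
    """Model of a top-K that merges per-partition results by the ordering."""
    per_partition = [heapq.nsmallest(limit, part) for part in sorted_partitions]
    return heapq.nsmallest(limit, (row for part in per_partition for row in part))

# Three range partitions of a globally sorted dataset (ids 0..14).
partitions = [list(range(0, 5)), list(range(5, 10)), list(range(10, 15))]

in_order = global_limit_after_shuffle(partitions, 5, [0, 1, 2])
out_of_order = global_limit_after_shuffle(partitions, 5, [2, 0, 1])
top_k = take_ordered(partitions, 5)

print(min(in_order))      # 0
print(min(out_of_order))  # 10
print(min(top_k))         # 0
```

In this model the limit returns the correct minimum only when the blocks
happen to arrive in partition order, while the ordering-aware merge returns it
regardless of arrival order.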

 

 

 



