Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/21252#discussion_r186349524
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1238,6 +1238,14 @@ object SQLConf {
.booleanConf
.createWithDefault(true)
+ val SORT_IN_MEM_FOR_LIMIT_THRESHOLD =
+ buildConf("spark.sql.limit.sortInMemThreshold")
+ .internal()
+ .doc("In sql like 'select x from t order by y limit m', if m is
under this threshold, " +
+ "sort in memory, otherwise do a global sort with disk.")
+ .intConf
+ .createWithDefault(2000)
--- End diff ---
I would suggest `Int.MaxValue` as the default value, which preserves the
previous behavior. Users can then tune it w.r.t. their workload.
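For reference, a minimal sketch of what the definition would look like with
that default, assuming the same `buildConf` builder chain shown in the diff
above (the config name and doc string are carried over unchanged; only the
default value differs):

```scala
// Sketch only: identical to the diff except for the default.
// With Int.MaxValue, every LIMIT falls under the threshold check's
// "otherwise" branch is never the surprise path -- i.e. the new
// in-memory sort applies to all limits until a user lowers it,
// matching the reviewer's intent of preserving previous behavior
// while leaving the knob available for tuning.
val SORT_IN_MEM_FOR_LIMIT_THRESHOLD =
  buildConf("spark.sql.limit.sortInMemThreshold")
    .internal()
    .doc("In sql like 'select x from t order by y limit m', if m is under this threshold, " +
      "sort in memory, otherwise do a global sort with disk.")
    .intConf
    .createWithDefault(Int.MaxValue)
```

A user could then opt into a lower threshold per session, e.g.
`spark.conf.set("spark.sql.limit.sortInMemThreshold", 10000)` (config name
taken from the diff; the value 10000 is just an illustration).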