[GitHub] spark pull request #21252: [SPARK-24193] Sort by disk when number of limit i...

cloud-fan Mon, 07 May 2018 00:28:36 -0700

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21252#discussion_r186345913
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -1238,6 +1238,14 @@ object SQLConf {
           .booleanConf
           .createWithDefault(true)
     
    +  val SORT_IN_MEM_FOR_LIMIT_THRESHOLD =
    +    buildConf("spark.sql.limit.sortInMemThreshold")
    +      .internal()
    +      .doc("In sql like 'select x from t order by y limit m', if m is under this threshold, " +
    +          "sort in memory, otherwise do a global sort with disk.")
    +      .intConf
    +      .createWithDefault(2000)
    --- End diff --
    
    What if users have only a few queries with a large limit and want to
    disable the top-N sort for them? I feel this config is more flexible than
    a boolean flag.
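
To make the flexibility argument concrete, here is a minimal sketch of how an integer threshold can cover both cases a boolean flag would. It is not Spark's planner code: `SortStrategySketch`, `SortStrategy`, `InMemoryTopN`, `GlobalDiskSort`, and `choose` are hypothetical names for illustration, and the comparison assumes "under this threshold" in the doc string means strictly less than.

```scala
// Hypothetical sketch, not Spark's planner code: models how an integer
// threshold generalizes a boolean on/off flag for top-N sorting.
object SortStrategySketch {
  sealed trait SortStrategy
  case object InMemoryTopN extends SortStrategy   // sort in memory
  case object GlobalDiskSort extends SortStrategy // global sort that may use disk

  // Assumption: "under this threshold" means limit < threshold.
  def choose(limit: Int, sortInMemThreshold: Int): SortStrategy =
    if (limit < sortInMemThreshold) InMemoryTopN else GlobalDiskSort
}
```

With the default of 2000 from the diff, `choose(100, 2000)` picks the in-memory path and `choose(1000000, 2000)` falls back to the global sort. A threshold of 0 behaves like a flag set to "off" for every non-negative limit, and a very large threshold keeps the in-memory path for almost every limit, so the single setting covers both ends plus everything in between.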



