Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/17164
  
    This PR adds a new SQL option, `spark.sql.aggregate.preferSortAggregate`, 
that makes Spark prefer `SortAggregate`, so that it is easy to test in 
`DataFrameAggregateSuite.scala`. In some cases (e.g., when the input data is 
already sorted in the cache), sort aggregate is faster than hash aggregate (see 
https://issues.apache.org/jira/browse/SPARK-18591). However, Spark currently 
does not adaptively select sort aggregate in these cases, so I think this 
option would be useful for letting users control the aggregate strategy. What 
do you think? cc: @hvanhovell If you agree, I'd like to open a separate PR that 
adds this option before this PR is reviewed: 
https://github.com/apache/spark/compare/master...maropu:SPARK-16844-3
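
To illustrate why already-sorted input can favor sort aggregate, here is a minimal conceptual sketch in plain Python. It is not Spark's implementation; the function names `hash_aggregate` and `sort_aggregate` are hypothetical and chosen only for this example. The hash variant builds a table holding every key, while the sort variant streams over rows that are already grouped by key and keeps only one running group at a time.

```python
from itertools import groupby
from operator import itemgetter


def hash_aggregate(rows):
    """Sum values per key with a hash table; works for any input order."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals


def sort_aggregate(sorted_rows):
    """Sum values per key in one streaming pass.

    Assumes rows with the same key are adjacent (e.g. input sorted by key),
    so no hash table is needed and only the current group is held.
    """
    for key, group in groupby(sorted_rows, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


# Input that is already sorted by key, as in the cached-and-sorted case.
rows = [("a", 1), ("a", 2), ("b", 3), ("c", 4), ("c", 5)]

hash_result = hash_aggregate(rows)
sort_result = dict(sort_aggregate(rows))
```

Both functions return the same totals on sorted input; the difference is only in how much state each one keeps while scanning.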

