cloud-fan commented on a change in pull request #34270:
URL: https://github.com/apache/spark/pull/34270#discussion_r727865330
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -1705,11 +1705,21 @@ object SQLConf {
.doc("Enable two-level aggregate hash map. When enabled, records will first be " +
"inserted/looked-up at a 1st-level, small, fast map, and then fallback to a " +
"2nd-level, larger, slower map when 1st level is full or keys cannot be found. " +
- "When disabled, records go directly to the 2nd level.")
+ "When disabled, records go directly to the 2nd level. Enable for partial aggregate only.")
.version("2.3.0")
.booleanConf
.createWithDefault(true)
+ val ENABLE_TWOLEVEL_FINAL_AGG_MAP =
+ buildConf("spark.sql.codegen.aggregate.final.map.twolevel.enabled")
+ .internal()
+ .doc("Enable two-level aggregate hash map for final aggregate as well. Disable by default " +
+ "because final aggregate might get more distinct keys compared to partial aggregate. " +
+ "Overhead of looking up 1st-level map might dominate when having a lot of distinct keys.")
+ .version("3.2.0")
Review comment:
The version should be 3.2.1.
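
For context on the doc string under review, the two-level lookup it describes can be modelled roughly as follows. This is a toy sketch and not Spark's generated code: `TwoLevelAggMap`, `firstLevelCapacity`, and the two counters are hypothetical names, and plain `HashMap`s stand in for both levels.

```scala
import scala.collection.mutable

// Toy model of a two-level aggregate map that sums values per key.
// All names here are hypothetical; Spark's real implementation differs.
final class TwoLevelAggMap(firstLevelCapacity: Int) {
  private val firstLevel = mutable.HashMap.empty[String, Long]
  private val secondLevel = mutable.HashMap.empty[String, Long]

  var firstLevelHits = 0L // updates served by the 1st-level map
  var fallbacks = 0L      // updates that probed the 1st level, then fell back

  def add(key: String, value: Long): Unit = {
    if (firstLevel.contains(key)) {
      // Key already lives in the small map: update it in place.
      firstLevel(key) += value
      firstLevelHits += 1
    } else if (firstLevel.size < firstLevelCapacity) {
      // Key is new and the small map still has room.
      firstLevel(key) = value
    } else {
      // Small map is full and the key is not in it: use the larger map.
      secondLevel(key) = secondLevel.getOrElse(key, 0L) + value
      fallbacks += 1
    }
  }

  // Keys never move between levels, so the two maps are disjoint.
  def result: Map[String, Long] = (firstLevel ++ secondLevel).toMap
}
```

In this model, once the number of distinct keys exceeds `firstLevelCapacity`, every update for a later key pays a failed first-level probe before reaching the second level, which is the overhead the new doc string mentions for final aggregates.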
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]