[ https://issues.apache.org/jira/browse/SPARK-21870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan reassigned SPARK-21870:
-----------------------------------

    Assignee: Takeshi Yamamuro

> Split codegen'd aggregation code into small functions for the HotSpot
> ---------------------------------------------------------------------
>
>                 Key: SPARK-21870
>                 URL: https://issues.apache.org/jira/browse/SPARK-21870
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Takeshi Yamamuro
>            Assignee: Takeshi Yamamuro
>            Priority: Minor
>
> In SPARK-21603, we saw a performance regression when HotSpot did not compile
> overly long functions (the limit is 8000 bytes of bytecode).
> I checked and found that the code generated for `HashAggregateExec` frequently
> exceeds the limit, for example:
> {code}
> spark.range(10000000).selectExpr("id % 1024 AS a", "id AS b").write.saveAsTable("t")
> sql("SELECT a, KURTOSIS(b) FROM t GROUP BY a")
> {code}
> This query exceeds the limit: the actual bytecode size is `12356`.
> So, it might be better to split the aggregation code into pieces.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
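
The splitting idea in the description can be illustrated with a minimal, hypothetical sketch. This is not Spark's actual code generator: the object `SplitAggCodegen`, the method names `doAggregate` and `doAggregate_N`, and the update statements are all assumptions made up for illustration. It also chunks by statement count as a stand-in for bytecode size, which a real implementation would need to estimate. The point is only that one large generated function can be emitted as several small helpers plus a short entry point that calls them, so each function is more likely to stay under the HotSpot limit.

```scala
// Hypothetical sketch only; names and structure are assumptions, not Spark internals.
object SplitAggCodegen {
  // Result of splitting: the small helper functions and the entry point that calls them.
  final case class Generated(helpers: Seq[String], entryPoint: String)

  // Emit at most `maxStatementsPerFunction` update statements per helper function.
  // Statement count is a crude proxy for bytecode size in this sketch.
  def split(updates: Seq[String], maxStatementsPerFunction: Int): Generated = {
    require(maxStatementsPerFunction > 0, "chunk size must be positive")

    // One small helper per chunk of update statements.
    val helpers = updates
      .grouped(maxStatementsPerFunction)
      .zipWithIndex
      .map { case (chunk, i) =>
        chunk.mkString(s"private void doAggregate_$i(long b) {\n  ", "\n  ", "\n}")
      }
      .toVector

    // The entry point only dispatches to the helpers, so it stays short.
    val calls = helpers.indices.map(i => s"doAggregate_$i(b);")
    val entryPoint =
      calls.mkString("private void doAggregate(long b) {\n  ", "\n  ", "\n}")

    Generated(helpers, entryPoint)
  }
}
```

For example, `SplitAggCodegen.split(Seq("n += 1;", "sum += b;", "m2 += b * b;"), 2)` yields two helper functions and an entry point that calls `doAggregate_0(b);` and `doAggregate_1(b);` in order.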