[ 
https://issues.apache.org/jira/browse/SPARK-54000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18032413#comment-18032413
 ] 

lifulong edited comment on SPARK-54000 at 10/23/25 8:51 AM:
------------------------------------------------------------

!https://wiki.in.zhihu.com/download/attachments/640447372/image2025-10-2_13-32-26.png?version=1&modificationDate=1759383146774&api=v2!

The flame graph shows that most of the time is spent in setNullAt calls when
whole-stage code generation is enabled and the -XX:-TieredCompilation JVM
parameter is not set.
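
For anyone who wants to compare the two settings discussed in this issue, the following is a minimal sketch of how they could be passed on the command line. It assumes the standard Spark configuration keys spark.sql.codegen.wholeStage and spark.executor.extraJavaOptions; the helper name workaround_conf_args and the file query.sql are hypothetical, and the issue does not say how the reporter actually applied these settings.

```python
# Sketch only: builds "--conf" arguments for the two workarounds in this issue.
# workaround_conf_args and query.sql are illustrative names, not from the issue.

def workaround_conf_args(disable_wholestage_codegen=False,
                         disable_tiered_compilation=False):
    """Return a list of --conf arguments for spark-submit / spark-sql."""
    args = []
    if disable_wholestage_codegen:
        # Turns off whole-stage code generation for Spark SQL.
        args += ["--conf", "spark.sql.codegen.wholeStage=false"]
    if disable_tiered_compilation:
        # Passes -XX:-TieredCompilation to the executor JVMs.
        args += ["--conf",
                 "spark.executor.extraJavaOptions=-XX:-TieredCompilation"]
    return args


if __name__ == "__main__":
    # Print a command line that applies only the JVM-flag workaround.
    conf = workaround_conf_args(disable_tiered_compilation=True)
    print(" ".join(["spark-sql", *conf, "-f", "query.sql"]))
```

Note that spark.executor.extraJavaOptions replaces any value already configured, so existing executor JVM options would need to be merged into the same string.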


was (Author: lifulong):
!https://wiki.in.zhihu.com/download/attachments/640447372/image2025-10-2_11-52-55.png?version=1&modificationDate=1759377176124&api=v2!

> Complex sql with expand operator and code gen enabled, very slow
> ----------------------------------------------------------------
>
>                 Key: SPARK-54000
>                 URL: https://issues.apache.org/jira/browse/SPARK-54000
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.5.2
>         Environment: spark sql 3.5.2
>            Reporter: lifulong
>            Priority: Major
>
> A complex SQL query that uses the Expand operator runs very slowly when
> code generation is enabled.
> The query has the form: select keya,keyb,count(distinct case when),...,
> count(distinct case when),sum(a),sum(b) from x group by keya,keyb
> Disabling whole-stage code generation gives a 20x speedup.
> Adding the executor JVM parameter -XX:-TieredCompilation gives a 20x
> speedup.
> Reducing the number of selected columns, for example from 28 to 27, gives a
> 10x speedup.
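
To make the query shape in the description concrete, here is a hedged sketch that generates a query of that form with a chosen number of select columns, which may help when trying to reproduce the 28 vs. 27 column difference. The table x, the keys keya/keyb, and sum(a)/sum(b) come from the description; the flag_N and id column names, the case-when conditions, and the function name build_query are assumptions made purely for illustration.

```python
# Sketch only: generates SQL shaped like the query in the issue description.
# flag_N and id are hypothetical columns; the real conditions are not given.

def build_query(total_columns, table="x", keys=("keya", "keyb")):
    """Build a query with `total_columns` select columns:
    the group-by keys, N count(distinct case when ...) columns, sum(a), sum(b)."""
    n_distinct = total_columns - len(keys) - 2  # 2 accounts for sum(a), sum(b)
    if n_distinct < 1:
        raise ValueError("total_columns is too small for this query shape")
    distinct_cols = [
        f"count(distinct case when flag_{i} = 1 then id end) AS cd_{i}"
        for i in range(n_distinct)
    ]
    select_list = list(keys) + distinct_cols + ["sum(a) AS sum_a", "sum(b) AS sum_b"]
    return (
        "SELECT " + ",\n       ".join(select_list)
        + f"\nFROM {table}"
        + "\nGROUP BY " + ", ".join(keys)
    )


if __name__ == "__main__":
    print(build_query(28))
```

Generating the 28-column and 27-column variants from the same template keeps everything else identical between the two runs.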



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

