[ 
https://issues.apache.org/jira/browse/SPARK-56908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang reopened SPARK-56908:
------------------------------------

> Reduce generated Java size in whole-stage codegen
> -------------------------------------------------
>
>                 Key: SPARK-56908
>                 URL: https://issues.apache.org/jira/browse/SPARK-56908
>             Project: Spark
>          Issue Type: Umbrella
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: Gengliang Wang
>            Priority: Major
>              Labels: pull-request-available
>
> Whole-stage codegen generates a fresh Java class per stage. Across many 
> operators the generated source contains (a) boilerplate that is 
> type-independent across stages and can be deduplicated into static Java 
> helpers, and (b) branches or variables that are statically dead at codegen 
> time but emitted anyway.
> These patterns cost us in three places:
> - The JVM's 64KB per-method bytecode limit and its constant-pool size limit,
> which force interpreted fallback on deep query plans.
> - Janino compile time per stage.
> - JIT compile work (each stage class carries its own copies of the method
> bodies).
> This umbrella tracks small, behavior-preserving cleanups across the generated 
> Java to address these issues. Each subtask is independently PR-able; behavior 
> is preserved end-to-end and verified by the relevant operator's existing test 
> suite with {{spark.sql.codegen.wholeStage}} forced both on and off.
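
To make the two patterns in the description concrete, here is a minimal sketch in plain Java. It is not Spark's code generator: the class CodegenSizeSketch, its methods, and the emitted variable names are all hypothetical, and the generated text is only assembled as strings, never compiled. It shows (a) replacing an inlined overflow check with a call to one shared static helper, and (b) skipping the null branch when a column is known to be non-nullable at codegen time.

```java
public class CodegenSizeSketch {

    /** Hypothetical shared helper: one compiled copy that every generated stage class could call. */
    public static long addExactOrThrow(long a, long b) {
        return Math.addExact(a, b); // throws ArithmeticException on overflow
    }

    /** Pattern (a), before: the overflow check is spelled out in every generated class. */
    public static String genAddInline(String out, String a, String b) {
        return "long " + out + " = " + a + " + " + b + ";\n"
             + "if (((" + a + " ^ " + out + ") & (" + b + " ^ " + out + ")) < 0) {\n"
             + "  throw new ArithmeticException(\"long overflow\");\n"
             + "}\n";
    }

    /** Pattern (a), after: the generated class only contains a call to the shared helper. */
    public static String genAddViaHelper(String out, String a, String b) {
        return "long " + out + " = CodegenSizeSketch.addExactOrThrow(" + a + ", " + b + ");\n";
    }

    /** Pattern (b): emit the isNull variable and branch only when the column can be null. */
    public static String genColumnRead(String var, int ordinal, boolean nullable) {
        String read = "row.getLong(" + ordinal + ")";
        if (!nullable) {
            // Nullability is known while generating code, so the dead branch is never emitted.
            return "long " + var + " = " + read + ";\n";
        }
        return "boolean " + var + "IsNull = row.isNullAt(" + ordinal + ");\n"
             + "long " + var + " = " + var + "IsNull ? -1L : " + read + ";\n";
    }

    public static void main(String[] args) {
        String inline = genAddInline("sum", "a", "b");
        String shared = genAddViaHelper("sum", "a", "b");
        System.out.println("inline chars: " + inline.length()
            + ", helper chars: " + shared.length());
        System.out.print(genColumnRead("a", 0, false)); // no null handling emitted
        System.out.print(genColumnRead("b", 1, true));  // null handling emitted
    }
}
```

In this sketch the saving per call site is small, but it repeats for every operator and every stage, which is where the method-size, Janino, and JIT costs listed above accumulate.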



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

