[
https://issues.apache.org/jira/browse/SPARK-56908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gengliang Wang reopened SPARK-56908:
------------------------------------
> Reduce generated Java size in whole-stage codegen
> -------------------------------------------------
>
> Key: SPARK-56908
> URL: https://issues.apache.org/jira/browse/SPARK-56908
> Project: Spark
> Issue Type: Umbrella
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: Gengliang Wang
> Priority: Major
> Labels: pull-request-available
>
> Whole-stage codegen generates a fresh Java class per stage. Across many
> operators, the generated source contains (a) boilerplate that is
> type-independent across stages and can be deduplicated into static Java
> helpers, and (b) branches or variables that are statically dead at codegen
> time but emitted anyway.
> These patterns cost us in three places:
> - The JVM's 64KB per-method bytecode limit and its constant-pool limit, which
> force interpreted fallback on deep query plans.
> - Janino compile time per stage.
> - JIT compile work (each stage class has its own method bodies).
> This umbrella tracks small, behavior-preserving cleanups across the generated
> Java to address these issues. Each subtask is independently PR-able; behavior
> is preserved end-to-end and verified by the relevant operator's existing test
> suite with {{spark.sql.codegen.wholeStage}} forced both on and off.
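
The two patterns in the description, (a) and (b), can be illustrated with a toy generator. This is a minimal sketch and not Spark's actual codegen: the class CodegenSizeSketch, the helper compareNullsFirst, and the emit* methods are all hypothetical names invented for this example. It assumes a generator that emits a null-aware comparison of two long values as Java source text.

```java
// Hypothetical sketch; none of these names exist in Spark.
final class CodegenSizeSketch {

    // (a) Type-independent logic moved into a static helper. It is compiled
    // once and can be called from every generated stage class.
    static int compareNullsFirst(boolean aNull, long a, boolean bNull, long b) {
        if (aNull && bNull) return 0;
        if (aNull) return -1;
        if (bNull) return 1;
        return Long.compare(a, b);
    }

    // Before: the whole comparison body is inlined into each stage's source.
    static String emitInlined(String a, String b) {
        return "int cmp;\n"
            + "if (" + a + "IsNull && " + b + "IsNull) { cmp = 0; }\n"
            + "else if (" + a + "IsNull) { cmp = -1; }\n"
            + "else if (" + b + "IsNull) { cmp = 1; }\n"
            + "else { cmp = Long.compare(" + a + ", " + b + "); }\n";
    }

    // After (a): the generated source contains only a call to the helper.
    static String emitHelperCall(String a, String b) {
        return "int cmp = CodegenSizeSketch.compareNullsFirst("
            + a + "IsNull, " + a + ", " + b + "IsNull, " + b + ");\n";
    }

    // After (b): if both inputs are known to be non-nullable when the code is
    // generated, the null branches are statically dead and are not emitted.
    static String emitCompare(String a, String b, boolean nullable) {
        if (!nullable) {
            return "int cmp = Long.compare(" + a + ", " + b + ");\n";
        }
        return emitHelperCall(a, b);
    }

    public static void main(String[] args) {
        System.out.println("inlined:      " + emitInlined("left", "right").length() + " chars");
        System.out.println("helper call:  " + emitHelperCall("left", "right").length() + " chars");
        System.out.println("non-nullable: " + emitCompare("left", "right", false).length() + " chars");
    }
}
```

Running main prints the size of each emitted fragment. In this toy case the helper call is shorter than the inlined body, and the non-nullable variant is shorter still because no null handling is emitted at all.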