[
https://issues.apache.org/jira/browse/SPARK-57027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SPARK-57027:
-----------------------------------
Labels: pull-request-available (was: )
> SortMergeJoinExec: skip statically-dead branches in codegen
> -----------------------------------------------------------
>
> Key: SPARK-57027
> URL: https://issues.apache.org/jira/browse/SPARK-57027
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 5.0.0
> Reporter: Gengliang Wang
> Priority: Major
> Labels: pull-request-available
>
> Two statically dead patterns in {{SortMergeJoinExec}} codegen:
> 1. {{genComparison}} emits {{comp = 0; if (comp == 0) { comp = compare(k1); } ...}}.
> The first {{if (comp == 0)}} is always true, so emit {{comp = compare(k1);}}
> directly and wrap only the subsequent keys. {{genComparison}} is called five
> times per SMJ stage (twice in {{genScanner}}, three times in
> {{codegenFullOuter}}). For single-key joins (the common case), each call
> collapses to one line.
> 2. {{genScanner}} and {{codegenFullOuter}} emit {{if (k1IsNull || k2IsNull ||
> ...) { handler }}}. When all key {{ExprValue}}s have {{isNull ==
> FalseLiteral}}, the disjunction is statically false and the whole block
> (including its {{handleStreamedAnyNull}} / "join with null row" handler) is
> dead. Detect this case and omit the block. This applies to fact/dimension
> joins on numeric keys where Spark has already proved non-nullability.
> Behavior is preserved: the JIT already eliminates this dead code at runtime,
> so the win is smaller generated source, more headroom under the 64KB method
> limit, and slightly faster Janino compilation.
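
A minimal sketch of the two changes described above, written as standalone Java string builders rather than the actual Scala code in SortMergeJoinExec. The class name SmjCodegenSketch, both method signatures, and the choice to represent a statically non-null flag as the literal string "false" are assumptions made for illustration; only the emitted shapes (a bare comp = compare(k1); for the first key, and no null-check block when every key is known non-null) come from the description.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Spark's SortMergeJoinExec code.
class SmjCodegenSketch {

    // Assumption: a key whose nullability is statically ruled out is
    // represented here by the Java literal "false".
    static final String FALSE_LITERAL = "false";

    // Emits the key comparison chain. The first key is assigned directly;
    // only later keys are guarded by "if (comp == 0)".
    static String genComparison(List<String> compareExprs) {
        if (compareExprs.isEmpty()) {
            return "comp = 0;\n"; // no keys: nothing to compare
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < compareExprs.size(); i++) {
            String assign = "comp = " + compareExprs.get(i) + ";";
            if (i == 0) {
                sb.append(assign).append("\n");
            } else {
                sb.append("if (comp == 0) { ").append(assign).append(" }\n");
            }
        }
        return sb.toString();
    }

    // Emits the "any key is null" block, or nothing when every isNull
    // expression is the false literal (the disjunction is statically false).
    static String genAnyNullCheck(List<String> isNullExprs, String handler) {
        boolean allNonNull = true;
        for (String e : isNullExprs) {
            if (!FALSE_LITERAL.equals(e)) {
                allNonNull = false;
                break;
            }
        }
        if (allNonNull) {
            return "";
        }
        return "if (" + String.join(" || ", isNullExprs) + ") { " + handler + " }\n";
    }

    public static void main(String[] args) {
        System.out.print(genComparison(Arrays.asList("compare(k1)", "compare(k2)")));
        System.out.print(genAnyNullCheck(Arrays.asList("k1IsNull", "k2IsNull"), "handler"));
        System.out.print(genAnyNullCheck(Arrays.asList("false", "false"), "handler"));
    }
}
```

In this sketch a single-key join produces one line from genComparison, and a join whose keys are all statically non-null produces an empty string from genAnyNullCheck, so the handler text never reaches the generated source.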