GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/20599
[SPARK-23407][SQL] add a config to try to inline all mutable states during
codegen
## What changes were proposed in this pull request?
This is a followup of https://github.com/apache/spark/pull/19811.
In #19811, we picked a sub-optimal solution that always compacts
non-primitive mutable states into arrays, to make primitive mutable states more
likely to get inlined.
This PR introduces a new config to not treat primitive states specially and
try to inline all states, to avoid any potential perf regression in Spark 2.3.
By default it's false.
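To illustrate the difference between the two policies, here is a minimal sketch. It is not Spark's actual `CodegenContext` code: the class `MutableStateAllocator`, the `inlineAll` flag (standing in for the new config), the `inlineLimit` cap, and the `stateArray_` naming are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Spark's CodegenContext.
class MutableStateAllocator {
    private static final List<String> PRIMITIVES = Arrays.asList(
        "boolean", "byte", "short", "char", "int", "long", "float", "double");

    private final boolean inlineAll;  // stands in for the new config (assumed name)
    private final int inlineLimit;    // assumed cap on the number of inlined fields
    private final List<String> inlinedFields = new ArrayList<>();
    private final Map<String, Integer> arraySizes = new LinkedHashMap<>();

    MutableStateAllocator(boolean inlineAll, int inlineLimit) {
        this.inlineAll = inlineAll;
        this.inlineLimit = inlineLimit;
    }

    // Returns the expression that generated code would use to access the state.
    String addState(String javaType, String name) {
        // Old behavior: only primitives are candidates for inlining.
        // With inlineAll set, every state is a candidate.
        boolean candidate = inlineAll || PRIMITIVES.contains(javaType);
        if (candidate && inlinedFields.size() < inlineLimit) {
            inlinedFields.add(javaType + " " + name + ";");
            return name;
        }
        // Otherwise compact the state into a per-type array slot.
        String arrayName = "stateArray_" + javaType.replaceAll("[^A-Za-z0-9]", "_");
        int index = arraySizes.merge(javaType, 1, Integer::sum) - 1;
        return arrayName + "[" + index + "]";
    }
}
```

With `inlineAll` set to false, a non-primitive state such as an `UnsafeRow` always lands in an array slot. With it set to true, the same state becomes a plain field as long as the cap has not been reached.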
In the future, we can remove this config and dynamically decide which
states to inline. For example, we can use placeholders during codegen, then
analyze all the mutable states at the end and replace the placeholders.
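The placeholder idea could look roughly like the sketch below. Nothing here exists in Spark: the class `PlaceholderStates`, the `<<stateN>>` token format, and the "first N states get inlined" rule are assumptions chosen to keep the example short. A real implementation would presumably pick states based on type or usage.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of deferred inlining decisions.
class PlaceholderStates {
    private final List<String> names = new ArrayList<>();

    // Register a state and return a placeholder token to embed in generated code.
    String newState(String name) {
        names.add(name);
        return "<<state" + (names.size() - 1) + ">>";
    }

    // Once all states are known, decide which to inline and substitute the tokens.
    // Assumed policy: the first inlineLimit states become fields, the rest array slots.
    String resolve(String code, int inlineLimit) {
        String out = code;
        for (int i = 0; i < names.size(); i++) {
            String ref = i < inlineLimit
                ? names.get(i)
                : "states[" + (i - inlineLimit) + "]";
            out = out.replace("<<state" + i + ">>", ref);
        }
        return out;
    }
}
```

Because the decision is made only in `resolve`, the code that emits expressions never needs to know whether a state ends up as a field or an array element.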
Note that there are no known regression cases, so this is not a blocker for
Spark 2.3.
## How was this patch tested?
A new test.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloud-fan/spark codegen
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20599.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20599
----
commit 013c02f215de85d50b4c7125ee571b14801bdb47
Author: Wenchen Fan <wenchen@...>
Date: 2018-02-13T12:23:47Z
add a config to try to inline all mutable states during codegen
----