Wan Kun created SPARK-44773:
-------------------------------
Summary: Code-gen CodegenFallback expression in WholeStageCodegen
if possible
Key: SPARK-44773
URL: https://issues.apache.org/jira/browse/SPARK-44773
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.5.0
Reporter: Wan Kun
Currently, neither the WholeStageCodegen framework nor the
SubExpressionElimination framework supports CodegenFallback expressions.
However, a CodegenFallback expression that implements a nullSafeEval method
could be code-generated just like ordinary expressions. At present such
expressions are always executed in a new SpecificUnsafeProjection class, so
their subexpressions cannot be eliminated.
For example:
SQL:
{code:sql}
SELECT from_json(regexp_replace(s, 'a', 'x'), 'x INT, b DOUBLE').x,
from_json(regexp_replace(s, 'a', 'x'), 'x INT, b DOUBLE').b
FROM values('{"a":1, "b":0.8}') t(s)
{code}
plan:
{code:java}
*(1) Project [from_json(StructField(x,IntegerType,true), regexp_replace(s#218,
a, x, 1), Some(America/Los_Angeles)).x AS from_json(regexp_replace(s, a, x,
1)).x#219, from_json(StructField(b,DoubleType,true), regexp_replace(s#218, a,
x, 1), Some(America/Los_Angeles)).b AS from_json(regexp_replace(s, a, x,
1)).b#220]
+- *(1) LocalTableScan [s#218]
{code}
Because org.apache.spark.sql.catalyst.expressions.JsonToStructs is a
CodegenFallback expression, the result of
{*}regexp_replace(s, 'a', 'x'){*} cannot be reused across the two from_json
calls.
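To make the cost concrete, here is a minimal, self-contained sketch in plain Java. It does not use Spark. The names regexpReplace, extractField, withoutReuse and withReuse are hypothetical stand-ins for the expressions in the query above, and the toy field extractor is only an assumption standing in for from_json. It counts how often the shared child is evaluated with and without reuse.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: plain Java stand-ins, not Spark classes.
class SubexprReuseSketch {
    // Counts how often the shared child expression is evaluated.
    static final AtomicInteger childEvals = new AtomicInteger();

    // Stand-in for regexp_replace(s, 'a', 'x').
    static String regexpReplace(String s) {
        childEvals.incrementAndGet();
        return s.replaceAll("a", "x");
    }

    // Toy stand-in for from_json(...).field: returns the raw text after
    // "field": up to the next ',' or '}', or null if the key is absent.
    static String extractField(String json, String field) {
        String key = "\"" + field + "\":";
        int start = json.indexOf(key);
        if (start < 0) {
            return null;
        }
        start += key.length();
        int end = start;
        while (end < json.length()
                && json.charAt(end) != ',' && json.charAt(end) != '}') {
            end++;
        }
        return json.substring(start, end).trim();
    }

    // Each output column re-evaluates the child: two evaluations per row.
    static String[] withoutReuse(String s) {
        return new String[] {
            extractField(regexpReplace(s), "x"),
            extractField(regexpReplace(s), "b")
        };
    }

    // The child is evaluated once and its result shared by both columns.
    static String[] withReuse(String s) {
        String replaced = regexpReplace(s);
        return new String[] {
            extractField(replaced, "x"),
            extractField(replaced, "b")
        };
    }

    public static void main(String[] args) {
        String row = "{\"a\":1, \"b\":0.8}";

        childEvals.set(0);
        String[] a = withoutReuse(row);
        System.out.println(a[0] + ", " + a[1] + " child evals: " + childEvals.get());

        childEvals.set(0);
        String[] b = withReuse(row);
        System.out.println(b[0] + ", " + b[1] + " child evals: " + childEvals.get());
    }
}
```

Both variants return the same values; only the number of child evaluations per row differs (two versus one).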
We can support code generation for
org.apache.spark.sql.catalyst.expressions.JsonToStructs in the
WholeStageCodegen framework, and then reuse the result of
{*}regexp_replace(s, 'a', 'x'){*}.
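One possible reading of this proposal, sketched in plain Java: the generated code obtains the child value itself (so that value can be shared), performs the null check, and then calls back into the expression object's interpreted nullSafeEval. The interface FallbackExpression and the method generatedEval are hypothetical names used only for illustration. They are not Spark's classes, and this is not necessarily how the change would be implemented.

```java
// Illustrative only: hypothetical names, not Spark's API.
class NullSafeCallSketch {
    // Stand-in for an expression that only has an interpreted nullSafeEval.
    interface FallbackExpression {
        Object nullSafeEval(Object input);
    }

    // Stand-in for what generated code could do for such an expression:
    // the child value is computed by the caller (and may be shared),
    // a null input short-circuits to null, and otherwise the interpreted
    // nullSafeEval is invoked on the non-null value.
    static Object generatedEval(FallbackExpression expr, Object childValue) {
        if (childValue == null) {
            return null;
        }
        return expr.nullSafeEval(childValue);
    }

    public static void main(String[] args) {
        // Toy expression: length of a string input.
        FallbackExpression length = input -> ((String) input).length();

        System.out.println(generatedEval(length, "abc")); // prints 3
        System.out.println(generatedEval(length, null));  // prints null
    }
}
```

Because the child value is passed in rather than recomputed inside the expression, the same value could be handed to several such calls, which is what makes subexpression reuse possible.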