Bruce Robbins created SPARK-41991:
-------------------------------------

             Summary: Interpreted mode subexpression elimination can throw 
exception during insert
                 Key: SPARK-41991
                 URL: https://issues.apache.org/jira/browse/SPARK-41991
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.4.0
            Reporter: Bruce Robbins


Example:
{noformat}
drop table if exists tbl1;
create table tbl1 (a int, b int) using parquet;

set spark.sql.codegen.wholeStage=false;
set spark.sql.codegen.factoryMode=NO_CODEGEN;

insert into tbl1
select id as a, id as b
from range(1, 5);
{noformat}
This results in the following exception:
{noformat}
java.lang.ClassCastException: 
org.apache.spark.sql.catalyst.expressions.ExpressionProxy cannot be cast to 
org.apache.spark.sql.catalyst.expressions.Cast
        at 
org.apache.spark.sql.catalyst.expressions.CheckOverflowInTableInsert.withNewChildInternal(Cast.scala:2514)
        at 
org.apache.spark.sql.catalyst.expressions.CheckOverflowInTableInsert.withNewChildInternal(Cast.scala:2512)
{noformat}
The query produces 2 bigint values, but the table's schema expects 2 int 
values, so Spark wraps each output field with a {{Cast}}.

Later, in {{InterpretedUnsafeProjection}}, {{prepareExpressions}} tries to wrap 
the two {{Cast}} expressions with an {{ExpressionProxy}}. However, the parent 
expression of each {{Cast}} is a {{CheckOverflowInTableInsert}} expression, 
which does not accept {{ExpressionProxy}} as a child.





--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to