Bruce Robbins created SPARK-41991: ------------------------------------- Summary: Interpreted mode subexpression elimination can throw exception during insert Key: SPARK-41991 URL: https://issues.apache.org/jira/browse/SPARK-41991 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.4.0 Reporter: Bruce Robbins
Example: {noformat} drop table if exists tbl1; create table tbl1 (a int, b int) using parquet; set spark.sql.codegen.wholeStage=false; set spark.sql.codegen.factoryMode=NO_CODEGEN; insert into tbl1 select id as a, id as b from range(1, 5); {noformat} This results in the following exception: {noformat} java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.ExpressionProxy cannot be cast to org.apache.spark.sql.catalyst.expressions.Cast at org.apache.spark.sql.catalyst.expressions.CheckOverflowInTableInsert.withNewChildInternal(Cast.scala:2514) at org.apache.spark.sql.catalyst.expressions.CheckOverflowInTableInsert.withNewChildInternal(Cast.scala:2512) {noformat} The query produces 2 bigint values, but the table's schema expects 2 int values, so Spark wraps each output field with a {{Cast}}. Later, in {{InterpretedUnsafeProjection}}, {{prepareExpressions}} tries to wrap the two {{Cast}} expressions with an {{ExpressionProxy}}. However, the parent expression of each {{Cast}} is a {{CheckOverflowInTableInsert}} expression, which does not accept {{ExpressionProxy}} as a child. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org