Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22512#discussion_r222530205
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala
---
@@ -238,7 +262,9 @@ object DecimalLiteral {
/**
* In order to do type checking, use Literal.create() instead of constructor
*/
-case class Literal (value: Any, dataType: DataType) extends LeafExpression {
+case class Literal(value: Any, dataType: DataType) extends LeafExpression {
+
+ Literal.validateLiteralValue(value, dataType)
--- End diff --
I'm not sure, though: is it OK for a `Literal` to hold a value whose Scala type
does not match its `dataType`, e.g., `new Literal(1 /* int value */, LongType)`?
In the current master, some places do so, e.g.,
https://github.com/apache/spark/blob/927e527934a882fab89ca661c4eb31f84c45d830/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala#L213
https://github.com/apache/spark/blob/927e527934a882fab89ca661c4eb31f84c45d830/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala#L796
In the codegen path, this is OK because we add the correct literal suffix in
`Literal.doGenCode` (e.g., `1L` for `new Literal(1, LongType)`):
https://github.com/apache/spark/blob/927e527934a882fab89ca661c4eb31f84c45d830/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala#L294
But in the non-codegen path (e.g.,
`spark.sql.codegen.factoryMode=NO_CODEGEN` and `ConstantFolding`), this case
throws an exception:
```
scala> import org.apache.spark.sql.Column
scala> import org.apache.spark.sql.catalyst.expressions.Literal
scala> import org.apache.spark.sql.types._
scala> val intOne: Int = 1
scala> val lit = Literal.create(intOne, LongType)
scala> spark.range(1).select(struct(new Column(lit))).collect
18/10/04 11:35:56 ERROR Executor: Exception in task 3.0 in stage 0.0 (TID 3)
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
    at scala.runtime.BoxesRunTime.unboxToLong(BoxesRunTime.java:105)
    at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getLong(rows.scala:42)
    at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getLong(rows.scala:195)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:619)
...
```
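To make the two paths concrete, here is a minimal standalone sketch with no Spark dependency. Every name in it (`SketchRow`, `SketchLongType`, `render`, `matches`) is hypothetical and only mimics the behavior described above; it is not the implementation of `Literal.doGenCode` or `Literal.validateLiteralValue`. It assumes the row stores boxed values and casts on read, which is what the stack trace suggests.

```scala
// Standalone illustration; all names are hypothetical, not Spark APIs.
object LiteralMismatchSketch {
  sealed trait SketchType
  case object SketchIntType extends SketchType
  case object SketchLongType extends SketchType

  // Mimics a generic row: values are stored boxed and cast on read.
  final class SketchRow(values: Array[Any]) {
    def getLong(i: Int): Long = values(i).asInstanceOf[Long]
  }

  // Codegen-style rendering: the suffix follows the declared type,
  // so the runtime class of the value does not matter.
  def render(value: Any, dataType: SketchType): String = dataType match {
    case SketchLongType => s"${value}L"
    case SketchIntType  => value.toString
  }

  // Hypothetical validation: does the runtime class match the declared type?
  def matches(value: Any, dataType: SketchType): Boolean = (value, dataType) match {
    case (_: Int, SketchIntType)   => true
    case (_: Long, SketchLongType) => true
    case _                         => false
  }

  def main(args: Array[String]): Unit = {
    // Codegen-like path: an Int value declared as long still renders as a long literal.
    assert(render(1, SketchLongType) == "1L")

    // Interpreted-like path: the boxed java.lang.Integer cannot be unboxed as a Long.
    val row = new SketchRow(Array[Any](1))
    val failed =
      try { row.getLong(0); false }
      catch { case _: ClassCastException => true }
    assert(failed)

    // A construction-time check would reject the mismatch up front.
    assert(!matches(1, SketchLongType))
    assert(matches(1L, SketchLongType))
  }
}
```

If a check along these lines ran in the constructor, call sites that pass an `Int` with `LongType` would fail at construction time instead of at evaluation time, so they would need to be updated first.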
WDYT? cc: @gatorsmile @cloud-fan