Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22512#discussion_r222530205
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ---
    @@ -238,7 +262,9 @@ object DecimalLiteral {
     /**
      * In order to do type checking, use Literal.create() instead of 
constructor
      */
    -case class Literal (value: Any, dataType: DataType) extends LeafExpression 
{
    +case class Literal(value: Any, dataType: DataType) extends LeafExpression {
    +
    +  Literal.validateLiteralValue(value, dataType)
    --- End diff --
    
    I'm not sure though: is it ok for `Literal` to hold a value whose Scala type doesn't match the given `dataType`, e.g., `new Literal(1 /* int value */, LongType)`? In the current master there are some places that do so, e.g.:
    
https://github.com/apache/spark/blob/927e527934a882fab89ca661c4eb31f84c45d830/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala#L213
    
https://github.com/apache/spark/blob/927e527934a882fab89ca661c4eb31f84c45d830/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala#L796
    
    In the codegen path, this is ok because we add the correct literal suffix in `Literal.doGenCode` (e.g., `1L` for `new Literal(1, LongType)`):
    
https://github.com/apache/spark/blob/927e527934a882fab89ca661c4eb31f84c45d830/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala#L294
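    As a simplified illustration (not Spark's actual `doGenCode`, and the object/method names below are made up for the sketch): the generated Java source spells the value out with the target type's literal suffix, so javac compiles it as a `long` no matter how the original Scala value was boxed.

    ```scala
    // Toy sketch of type-suffixing a literal for generated Java source.
    // This is why the codegen path tolerates new Literal(1, LongType):
    // the emitted source text is "1L", a valid Java long literal.
    object LiteralSuffix {
      def genJavaLiteral(value: Any, javaType: String): String = javaType match {
        case "long"  => s"${value}L"  // e.g. new Literal(1, LongType) -> "1L"
        case "float" => s"${value}F"
        case _       => value.toString
      }

      def main(args: Array[String]): Unit = {
        println(genJavaLiteral(1, "long"))  // prints "1L"
      }
    }
    ```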
    
    But, in the non-codegen path (e.g., `spark.sql.codegen.factoryMode=NO_CODEGEN` and `ConstantFolding`), this case throws an exception:
    ```
    scala> import org.apache.spark.sql.Column
    scala> import org.apache.spark.sql.catalyst.expressions.Literal
    scala> import org.apache.spark.sql.types._
    scala> val intOne: Int = 1
    scala> val lit = Literal.create(intOne, LongType)
    scala> spark.range(1).select(struct(new Column(lit))).collect
    18/10/04 11:35:56 ERROR Executor: Exception in task 3.0 in stage 0.0 (TID 3)
    java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
        at scala.runtime.BoxesRunTime.unboxToLong(BoxesRunTime.java:105)
        at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getLong(rows.scala:42)
        at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getLong(rows.scala:195)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:619)
        ...
    ```
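    The root cause can be reproduced without Spark at all (a standalone sketch; `UnboxMismatch` and `stored` are illustrative names): the row stores the literal as a boxed `java.lang.Integer`, but `getLong` unboxes it via `BoxesRunTime.unboxToLong`, which only accepts `java.lang.Long`.

    ```scala
    // Spark-free reproduction of the ClassCastException in the trace above:
    // asInstanceOf[Long] on an Any compiles to BoxesRunTime.unboxToLong,
    // which casts the boxed value to java.lang.Long and fails on Integer.
    object UnboxMismatch {
      def main(args: Array[String]): Unit = {
        val stored: Any = 1  // boxed as java.lang.Integer, like Literal(1, LongType)
        try {
          val asLong: Long = stored.asInstanceOf[Long]  // unboxToLong under the hood
          println(s"unexpected: $asLong")
        } catch {
          case e: ClassCastException => println(s"ClassCastException: ${e.getMessage}")
        }
      }
    }
    ```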
    
    WDYT? cc: @gatorsmile @cloud-fan

