Aris Vlasakakis created SPARK-17368:
---------------------------------------

             Summary: Scala value classes create encoder problems and break at runtime
                 Key: SPARK-17368
                 URL: https://issues.apache.org/jira/browse/SPARK-17368
             Project: Spark
          Issue Type: Bug
          Components: Spark Core, SQL
    Affects Versions: 2.0.0, 1.6.2
         Environment: Java 8 on MacOS
            Reporter: Aris Vlasakakis


Using a Scala value class as the element type of a Dataset breaks in Spark 2.0 and 1.6.x.

The simple Spark 2 application below compiles, but fails at runtime with the error shown here. The value class is *FeatureId*, since it extends AnyVal. Judging by the error message, the generated encoder treats the value class as its erased underlying Int and then fails trying to read the field {{v}} from it.

{noformat}
Exception in thread "main" java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: Couldn't find v on int
assertnotnull(input[0, int, true], top level non-flat input object).v AS v#0
+- assertnotnull(input[0, int, true], top level non-flat input object).v
   +- assertnotnull(input[0, int, true], top level non-flat input object)
      +- input[0, int, true]
        at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:279)
        at org.apache.spark.sql.SparkSession$$anonfun$3.apply(SparkSession.scala:421)
        at org.apache.spark.sql.SparkSession$$anonfun$3.apply(SparkSession.scala:421)
{noformat}

Test code:

{noformat}
import org.apache.spark.sql.{Dataset, SparkSession}

object BreakSpark {
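  // A value class (extends AnyVal): compiles fine, but its Dataset encoder fails at runtime.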
  case class FeatureId(v: Int) extends AnyVal

  def main(args: Array[String]): Unit = {
    val seq = Seq(FeatureId(1), FeatureId(2), FeatureId(3))
    val spark = SparkSession.builder.getOrCreate()
    import spark.implicits._
    spark.sparkContext.setLogLevel("warn")
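    // createDataset encodes the local Seq eagerly, so the
    // "Couldn't find v on int" exception is thrown right here.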
    val ds: Dataset[FeatureId] = spark.createDataset(seq)
    println(s"BREAK HERE: ${ds.count}")
  }
}

{noformat}
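A minimal workaround sketch, assuming the value-class semantics can be given up inside the Dataset: either drop {{extends AnyVal}} so Spark sees an ordinary case class, or keep the underlying primitive inside Spark and wrap it only at the edges. Both variants avoid the value-class encoder path; the names here ({{FeatureIdBoxed}}, {{Workaround}}) are illustrative only, not part of any proposed fix.

{noformat}
import org.apache.spark.sql.{Dataset, SparkSession}

object Workaround {
  // Ordinary case class (no AnyVal): the regular product encoder handles it.
  case class FeatureIdBoxed(v: Int)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.getOrCreate()
    import spark.implicits._

    // Variant 1: a boxed case class instead of a value class.
    val ds1: Dataset[FeatureIdBoxed] =
      spark.createDataset(Seq(FeatureIdBoxed(1), FeatureIdBoxed(2)))

    // Variant 2: keep the primitive inside Spark; wrap into the
    // value class only after collecting, outside the encoder path.
    val ds2: Dataset[Int] = spark.createDataset(Seq(1, 2, 3))

    println(s"OK: ${ds1.count} / ${ds2.count}")
  }
}
{noformat}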




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
