Aris Vlasakakis created SPARK-17368:
---------------------------------------
Summary: Scala value classes create encoder problems and break at runtime
Key: SPARK-17368
URL: https://issues.apache.org/jira/browse/SPARK-17368
Project: Spark
Issue Type: Bug
Components: Spark Core, SQL
Affects Versions: 2.0.0, 1.6.2
Environment: Java 8 on MacOS
Reporter: Aris Vlasakakis

Using a Scala value class as the element type of a Dataset breaks in Spark 2.0 and 1.6.X.

The simple Spark 2 application below compiles, but fails at runtime with the following error. The value class is *FeatureId*, as it extends AnyVal.

{noformat}
Exception in thread "main" java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: Couldn't find v on int
assertnotnull(input[0, int, true], top level non-flat input object).v AS v#0
+- assertnotnull(input[0, int, true], top level non-flat input object).v
   +- assertnotnull(input[0, int, true], top level non-flat input object)
      +- input[0, int, true]".
        at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:279)
        at org.apache.spark.sql.SparkSession$$anonfun$3.apply(SparkSession.scala:421)
        at org.apache.spark.sql.SparkSession$$anonfun$3.apply(SparkSession.scala:421)
{noformat}

Test code:

{noformat}
import org.apache.spark.sql.{Dataset, SparkSession}

object BreakSpark {
  case class FeatureId(v: Int) extends AnyVal

  def main(args: Array[String]): Unit = {
    val seq = Seq(FeatureId(1), FeatureId(2), FeatureId(3))
    val spark = SparkSession.builder.getOrCreate()
    import spark.implicits._
    spark.sparkContext.setLogLevel("warn")
    val ds: Dataset[FeatureId] = spark.createDataset(seq)
    println(s"BREAK HERE: ${ds.count}")
  }
}
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org