Cheng Lian created SPARK-13456:
----------------------------------

             Summary: Cannot create encoders for case classes defined in Spark shell after upgrading to Scala 2.11
                 Key: SPARK-13456
                 URL: https://issues.apache.org/jira/browse/SPARK-13456
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Cheng Lian
            Priority: Blocker


Spark 2.0 switched to Scala 2.11 by default in [PR #10608|https://github.com/apache/spark/pull/10608]. Unfortunately, after this upgrade, Spark fails to create encoders for case classes defined in the REPL:
{code}
import sqlContext.implicits._
case class T(a: Int, b: Double)
val ds = Seq(1 -> T(1, 1D), 2 -> T(2, 2D)).toDS()
{code}
Exception thrown:
{noformat}
org.apache.spark.sql.AnalysisException: Unable to generate an encoder for inner class `T` without access to the scope that this class was defined in.
Try moving this class out of its parent class.;
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$resolveDeserializer$1.applyOrElse(Analyzer.scala:565)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$resolveDeserializer$1.applyOrElse(Analyzer.scala:561)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:262)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:262)
  at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:261)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:267)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:267)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:304)
  at scala.collection.Iterator$$anon$11.next(Iterator.scala:370)
  at scala.collection.Iterator$class.foreach(Iterator.scala:742)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
  at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
  at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
  at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
  at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:308)
  at scala.collection.AbstractIterator.to(Iterator.scala:1194)
  at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:300)
  at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1194)
  at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:287)
  at scala.collection.AbstractIterator.toArray(Iterator.scala:1194)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:353)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:267)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:267)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:267)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5$$anonfun$apply$11.apply(TreeNode.scala:333)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
  at scala.collection.immutable.List.map(List.scala:285)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:331)
  at scala.collection.Iterator$$anon$11.next(Iterator.scala:370)
  at scala.collection.Iterator$class.foreach(Iterator.scala:742)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
  at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
  at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
  at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
  at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:308)
  at scala.collection.AbstractIterator.to(Iterator.scala:1194)
  at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:300)
  at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1194)
  at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:287)
  at scala.collection.AbstractIterator.toArray(Iterator.scala:1194)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:353)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:267)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:251)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.resolveDeserializer(Analyzer.scala:561)
  at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.resolve(ExpressionEncoder.scala:315)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:81)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:92)
  at org.apache.spark.sql.SQLContext.createDataset(SQLContext.scala:482)
  at org.apache.spark.sql.SQLImplicits.localSeqToDatasetHolder(SQLImplicits.scala:140)
  ... 51 elided
{noformat}
However, the existing Dataset REPL test case does pass:
{code}
  test("SPARK-2576 importing SQLContext.implicits._") {
    // We need to use local-cluster to test this case.
    val output = runInterpreter("local-cluster[1,1,1024]",
      """
        |val sqlContext = new org.apache.spark.sql.SQLContext(sc)
        |import sqlContext.implicits._
        |case class TestCaseClass(value: Int)
        |sc.parallelize(1 to 10).map(x => TestCaseClass(x)).toDF().collect()
        |
        |// Test Dataset Serialization in the REPL
        |Seq(TestCaseClass(1)).toDS().collect()
      """.stripMargin)
    assertDoesNotContain("error:", output)
    assertDoesNotContain("Exception", output)
  }
{code}
One possible clue is that {{ReplSuite}} calls {{SparkILoop}} directly, while the Spark shell is started by {{o.a.s.repl.Main}}, which also sets the compiler option {{-Yrepl-class-based}}; a sketch of why that option matters follows.
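With {{-Yrepl-class-based}}, the REPL wraps each input line in a class rather than an object, so a case class defined in the shell becomes an inner class whose instances carry a reference to an outer instance. A minimal sketch of why that trips up reflection-based code such as encoder generation (plain Scala, not Spark code; {{ReplWrapper}} is a hypothetical stand-in for the REPL's generated wrapper class):
{code}
// Hypothetical stand-in for the class-based wrapper that -Yrepl-class-based
// generates around each REPL line.
class ReplWrapper {
  // Inner case class: every instance holds a reference to its enclosing
  // ReplWrapper instance.
  case class T(a: Int, b: Double)
}

object Demo extends App {
  val wrapper = new ReplWrapper
  // Constructing T requires the outer instance; a bare `T(1, 1d)` would
  // not compile here.
  val t = wrapper.T(1, 1d)
  // Code that constructs T reflectively must likewise obtain the outer
  // instance, i.e. "the scope that this class was defined in".
  println(t)
}
{code}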



