[ https://issues.apache.org/jira/browse/SPARK-17042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Liang updated SPARK-17042:
-------------------------------
    Description: 
A simple fix is to erase the classTag when using the default serializer, since it is not needed in that case and was failing to deserialize on the remote end.

The proper fix is to use the right classloader when deserializing the classTags, but that is a much more invasive change for 2.0.
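The simple fix described above can be sketched roughly as follows. This is an illustrative standalone snippet, not Spark's actual code; the helper name and the boolean flag are hypothetical:

```scala
import scala.reflect.ClassTag

// Sketch of the "simple fix" (names hypothetical, not Spark's API):
// when the default serializer is in use, the element ClassTag is never
// consulted on the remote end, so it can be erased to ClassTag.Any
// instead of risking a deserialization failure for a REPL-defined class.
object ClassTagErasure {
  def eraseForDefaultSerializer[T](
      tag: ClassTag[T],
      usingDefaultSerializer: Boolean): ClassTag[_] =
    if (usingDefaultSerializer) ClassTag.Any else tag
}
```

The point of the sketch is that erasing the tag is safe exactly because nothing on the receiving side needs it when the default serializer is used.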

The following test can be added to ReplSuite to reproduce the bug:

{code}
  test("replicating blocks of object with class defined in repl") {
    val output = runInterpreter("local-cluster[2,1,1024]",
      """
        |import org.apache.spark.storage.StorageLevel._
        |case class Foo(i: Int)
        |val ret = sc.parallelize((1 to 100).map(Foo), 10).persist(MEMORY_ONLY_2)
        |ret.count()
        |sc.getExecutorStorageStatus.map(s => s.rddBlocksById(ret.id).size).sum
      """.stripMargin)
    assertDoesNotContain("error:", output)
    assertDoesNotContain("Exception", output)
    assertContains(": Int = 20", output)
  }
{code}
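The "proper fix" mentioned above — deserializing with the right classloader — could look roughly like the standard JDK pattern below: override {{resolveClass}} so classes are resolved against an explicitly supplied loader (e.g. the REPL classloader) rather than the JVM default. This is a hedged sketch of the general technique, not Spark's exact implementation:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream,
  ObjectInputStream, ObjectOutputStream, ObjectStreamClass}

// Resolve deserialized classes against a caller-supplied classloader.
// Standard JDK pattern; names here are illustrative, not Spark's API.
class LoaderAwareObjectInputStream(in: InputStream, loader: ClassLoader)
    extends ObjectInputStream(in) {
  override def resolveClass(desc: ObjectStreamClass): Class[_] =
    try Class.forName(desc.getName, false, loader)
    catch { case _: ClassNotFoundException => super.resolveClass(desc) }
}

object LoaderAwareDemo {
  // Round-trip an object through Java serialization, deserializing
  // with the chosen classloader.
  def roundTrip(obj: AnyRef, loader: ClassLoader): AnyRef = {
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(obj)
    oos.close()
    val ois = new LoaderAwareObjectInputStream(
      new ByteArrayInputStream(bos.toByteArray), loader)
    try ois.readObject() finally ois.close()
  }
}
```

With something like this on the receive path, a classTag referencing a REPL-defined class would resolve correctly instead of failing.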

  was:
The following test can be added to ReplSuite to reproduce the bug:

{code}
  test("replicating blocks of object with class defined in repl") {
    val output = runInterpreter("local-cluster[2,1,1024]",
      """
        |import org.apache.spark.storage.StorageLevel._
        |case class Foo(i: Int)
        |val ret = sc.parallelize((1 to 100).map(Foo), 10).persist(MEMORY_ONLY_2)
        |ret.count()
        |sc.getExecutorStorageStatus.map(s => s.rddBlocksById(ret.id).size).sum
      """.stripMargin)
    assertDoesNotContain("error:", output)
    assertDoesNotContain("Exception", output)
    assertContains(": Int = 20", output)
  }
{code}


> Repl-defined classes cannot be replicated
> -----------------------------------------
>
>                 Key: SPARK-17042
>                 URL: https://issues.apache.org/jira/browse/SPARK-17042
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Block Manager, Spark Core
>            Reporter: Eric Liang
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
