The Spark REPL is slightly modified from the normal Scala REPL to prevent
work from being done twice when closures are deserialized on the workers.
 I'm not sure exactly why this causes your problem, but it's probably worth
filing a JIRA about it.

Here is another issue with classes defined in the REPL. Not sure if it's
related, but I'd be curious whether the workaround helps you:
https://issues.apache.org/jira/browse/SPARK-1199

Michael


On Thu, Apr 24, 2014 at 3:14 AM, Piotr Kołaczkowski
<pkola...@datastax.com> wrote:

> Hi,
>
> I'm working on Cassandra-Spark integration and I've hit a pretty severe
> problem. One part of the provided functionality is mapping Cassandra rows to
> objects of user-defined classes, e.g. like this:
>
> class MyRow(val key: String, val data: Int)
> sc.cassandraTable("keyspace", "table").select("key", "data").as[MyRow]  //
> returns CassandraRDD[MyRow]
>
> In this example CassandraRDD creates MyRow instances by reflection, i.e. it
> matches the selected fields from the Cassandra table and passes them to the
> constructor.
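>
> To make that concrete, here is a minimal sketch of the kind of reflective
> instantiation involved (plain java.lang.reflect, not the actual connector
> code; the key and data values are made up):
>
> val ctor = classOf[MyRow].getConstructors()(0)
> val row = ctor.newInstance("some-key", Int.box(42)).asInstanceOf[MyRow]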
>
> Unfortunately this does not work in the Spark REPL.
> It turns out that any class declared in the REPL is an inner class, and to be
> successfully instantiated it needs a reference to the outer object, even
> though it doesn't actually use anything from the outer context.
>
> scala> class SomeClass
> defined class SomeClass
>
> scala> classOf[SomeClass].getConstructors()(0)
> res11: java.lang.reflect.Constructor[_] = public
> $iwC$$iwC$SomeClass($iwC$$iwC)
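>
> (For illustration - the synthetic outer parameter can also be seen with
> plain reflection; just a sketch, nothing Spark-specific:)
>
> val outerParam = classOf[SomeClass].getConstructors()(0).getParameterTypes.headOption
> // Some($iwC$$iwC) in the Spark REPL; None for a top-level no-arg class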
>
> I tried passing null for the outer reference as a temporary workaround, but
> that doesn't work either - I get an NPE (sketched below).
> How can I get a reference to the current outer object representing the
> context of the current line?
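>
> Roughly what I tried, sketched with plain reflection (assuming the
> constructor signature shown above, where the outer pointer is the first
> parameter):
>
> val ctor = classOf[MyRow].getConstructors()(0)
> // Passing null for the synthetic outer pointer throws, presumably from the
> // compiler-generated null check on $outer:
> val row = ctor.newInstance(null, "some-key", Int.box(42))  // NPE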
>
> Also, the plain non-Spark Scala REPL doesn't exhibit this behaviour -
> classes declared in the REPL there are proper top-level classes, not inner
> ones. Why?
>
> Thanks,
> Piotr
>
> --
> Piotr Kolaczkowski, Lead Software Engineer
> pkola...@datastax.com
>
> 777 Mariners Island Blvd., Suite 510
> San Mateo, CA 94404
>
