On a more serious note -- yes, Datasets break with Scala value classes in Spark 2.0.0 and Spark 1.6.1. I wrote up a JIRA bug and I hope some more knowledgeable people can look at it.
Sean Owen has commented on other code generation errors before, so I put him as shepherd in the JIRA. Michael Armbrust has expressed interest in code generation / encoder problems I have found recently.

Here is the problem: https://issues.apache.org/jira/browse/SPARK-17368

Thank you

On Thu, Sep 1, 2016 at 3:09 PM, Aris <arisofala...@gmail.com> wrote:
> Thank you Jakob on two counts:
>
> 1. Yes, thanks for pointing out that spark-shell cannot take value
> classes; that was an additional source of confusion for me!
>
> 2. We have a Spark 2.0 project which is definitely breaking at runtime
> with a Dataset of value classes. I am not sure whether this is also the
> case in Spark 1.6; I'm going to verify.
>
> Once I can make a trivial example of value classes breaking, I will open
> a JIRA for Spark.
>
> Is Martin Odersky your father? Does this mean Scala is your sister? And
> does that mean Spark is your cousin? ;-)
>
> On Thu, Sep 1, 2016 at 2:44 PM, Jakob Odersky <ja...@odersky.com> wrote:
>> Hi Aris,
>> thanks for sharing this issue. I can confirm that value classes
>> currently don't work; however, I can't think of a reason why they
>> shouldn't be supported. I would therefore recommend that you report
>> this as a bug.
>>
>> (By the way, value classes also currently aren't definable in the REPL.
>> See https://issues.apache.org/jira/browse/SPARK-17367)
>>
>> regards,
>> --Jakob
>>
>> On Thu, Sep 1, 2016 at 1:58 PM, Aris <arisofala...@gmail.com> wrote:
>>> Hello Spark community -
>>>
>>> Does Spark 2.0 Datasets *not support* Scala value classes (basically
>>> "extends AnyVal", with a bunch of limitations)?
>>>
>>> I am trying to do something like this:
>>>
>>> case class FeatureId(value: Int) extends AnyVal
>>> val seq = Seq(FeatureId(1), FeatureId(2), FeatureId(3))
>>> import spark.implicits._
>>> val ds = spark.createDataset(seq)
>>> ds.count
>>>
>>> This will compile, but then it will break at runtime with a cryptic
>>> error about "cannot find int at value". If I remove the "extends
>>> AnyVal" part, then everything works.
>>>
>>> Value classes are a great performance boost / static type checking
>>> feature in Scala, but are they prohibited in Spark Datasets?
>>>
>>> Thanks!
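[Editor's note: for readers hitting the same error, here is a minimal self-contained sketch of the workaround described in the thread (dropping `extends AnyVal`). The name `PlainFeatureId` is illustrative, not from the thread, and the Spark-specific `createDataset` call is omitted so the snippet compiles without a Spark session; plain Scala behavior is shown instead.]

```scala
// The value class from the thread: the compiler normally erases the wrapper
// and passes a bare Int at runtime, which is what trips up Spark's encoders.
case class FeatureId(value: Int) extends AnyVal

// The workaround: an ordinary case class with a single Int field. Spark's
// encoder machinery handles this as a normal product type.
case class PlainFeatureId(value: Int)

object ValueClassDemo {
  def main(args: Array[String]): Unit = {
    // Outside Spark, both forms behave identically at the source level.
    val ids = Seq(FeatureId(1), FeatureId(2), FeatureId(3))
    println(ids.map(_.value).sum)      // 6
    println(PlainFeatureId(42).value)  // 42
  }
}
```

With the plain case class, `spark.createDataset(seq)` from the original post runs without the runtime encoder error, at the cost of the allocation-free representation the value class provided.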