Re: Spark with HLists

2014-10-29 Thread Koert Kuipers
looks like a missing class issue? what makes you think it's serialization?

shapeless does indeed have a lot of helper classes that get sucked in and
are not serializable. see here:
https://groups.google.com/forum/#!topic/shapeless-dev/05_DXnoVnI4

and for a project that uses shapeless in spark see here:
https://github.com/tresata/spark-columnar
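
The trace quoted below is thrown in thread "main" from SimpleApp$.main, which fits a class missing from the driver's classpath rather than a serialization failure. As a rough sketch for telling the two cases apart, the snippet below uses only the JDK and the Scala standard library. The object name ClasspathCheck and its helper names are hypothetical, made up for illustration; they are not part of Spark or shapeless.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}
import scala.util.Try

// Hypothetical diagnostic helper; names are illustrative only.
object ClasspathCheck {

  // True if the named class can be loaded by this class loader.
  // The class is looked up without running its static initializers.
  def isOnClasspath(className: String): Boolean =
    Try(Class.forName(className, false, getClass.getClassLoader)).isSuccess

  // True if plain Java serialization accepts the value.
  def javaSerializable(value: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(value); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    // JVM name of shapeless's `::` class, as it appears in the stack trace.
    val hconsClass = "shapeless.$colon$colon"
    if (isOnClasspath(hconsClass))
      println(hconsClass + " is loadable; a classpath problem is unlikely here")
    else
      println(hconsClass + " is not on the classpath; this is a missing class, not a serialization failure")
  }
}
```

If the class turns out to be missing, the usual remedies are to bundle shapeless into the application jar or to pass its jar to spark-submit with the --jars option. The javaSerializable helper only becomes relevant once the class loads, for checking whether a given value survives Java serialization.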

On Wed, Oct 29, 2014 at 7:05 PM, Simon Hafner wrote:

> I tried using shapeless HLists as data storage for data inside spark.
> Unsurprisingly, it failed. The deserialization isn't well-defined because of
> all the implicits used by shapeless. How could I make it work?
>
> Sample Code:
>
> /* SimpleApp.scala */
> import org.apache.spark.SparkContext
> import org.apache.spark.SparkContext._
> import org.apache.spark.SparkConf
> import shapeless._
> import ops.hlist._
>
> object SimpleApp {
>   def main(args: Array[String]) {
>     val logFile = "/tmp/README.md" // Should be some file on your system
>     val conf = new SparkConf().setAppName("Simple Application")
>     val sc = new SparkContext(conf)
>     val logData = sc.textFile(logFile, 2).cache()
>     val numAs = logData
>       .map(line => line :: HNil)
>       .filter(_.select[String].contains("a"))
>       .count()
>     println("Lines with a: %s".format(numAs))
>   }
> }
>
> Error:
>
> Exception in thread "main" java.lang.NoClassDefFoundError: shapeless/$colon$colon
>     at SimpleApp$.main(SimpleApp.scala:15)
>     at SimpleApp.main(SimpleApp.scala)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>


Spark with HLists

2014-10-29 Thread Simon Hafner
I tried using shapeless HLists as data storage for data inside spark.
Unsurprisingly, it failed. The deserialization isn't well-defined because of
all the implicits used by shapeless. How could I make it work?

Sample Code:

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import shapeless._
import ops.hlist._

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/tmp/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData
      .map(line => line :: HNil)
      .filter(_.select[String].contains("a"))
      .count()
    println("Lines with a: %s".format(numAs))
  }
}

Error:

Exception in thread "main" java.lang.NoClassDefFoundError: shapeless/$colon$colon
    at SimpleApp$.main(SimpleApp.scala:15)
    at SimpleApp.main(SimpleApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
