Singletons aren't hacks; they can be an entirely appropriate pattern for
this. What exception do you get? From Spark or from your code? I think this
pattern is orthogonal to using Spark.
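For readers landing on this thread later, here is a minimal sketch of the singleton-holder pattern being discussed. The names (ExpensiveParser, ParserHolder) are illustrative only, not anything from the Spark API:

```scala
// Illustrative stand-in for a non-serializable, costly-to-build object.
class ExpensiveParser {
  def parse(line: String): String = line.toUpperCase
}

// A Scala `object` is one instance per JVM; `lazy val` defers construction
// to first use. On a Spark executor the parser is therefore built once per
// worker JVM instead of being shipped inside each task closure.
object ParserHolder {
  lazy val parser: ExpensiveParser = new ExpensiveParser
}
```

Inside an RDD operation one would call ParserHolder.parser.parse(line); only the closure is serialized, never the parser itself.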
On Jan 21, 2015 8:11 AM, octavian.ganea octavian.ga...@inf.ethz.ch
wrote:
In case someone has the same problem:
The singleton hack works for me only intermittently in Spark 1.2.0; sometimes
I get a NullPointerException. Anyway, if you really need to work with big
indexes and you want the smallest amount of communication between master and
We are currently migrating from 1.1 to 1.2 and found our program 3x slower,
maybe due to the singleton hack?
Could you explain in detail why or how the singleton hack works differently
in Spark 1.2.0?
Thanks!
2015-01-18 20:56 GMT+08:00 octavian.ganea octavian.ga...@inf.ethz.ch:
The singleton hack works very differently in Spark 1.2.0 (it does not work if
the application runs multiple map-reduce jobs). I guess there should be
official documentation on how to have each machine/node do an init step
locally before executing any other instructions (e.g.
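The per-node init step asked about here is often expressed with rdd.mapPartitions, building the expensive object once per partition and reusing it for every record. A plain-Scala sketch of the same idea (no Spark dependency; names are illustrative):

```scala
// Stand-in for a non-serializable object that is expensive to construct.
class Parser {
  def parse(s: String): String = s.trim
}

// Analogue of rdd.mapPartitions { iter => ... }: the parser is built
// once for the whole partition, then reused for every record in it.
def processPartition(records: Iterator[String]): Iterator[String] = {
  val parser = new Parser     // one instance per partition, not per record
  records.map(parser.parse)   // same instance applied to every record
}

val cleaned = processPartition(Iterator(" a ", " b ")).toList
```

With real Spark this would read rdd.mapPartitions(processPartition), and the init cost is paid once per partition rather than once per record.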
Spark caches the RDD in the JVM, so presumably, yes, the singleton trick
should work.
On Aug 9, 2014 11:00 AM, Kevin James Matzen kmat...@cs.cornell.edu
wrote:
I have a related question. With Hadoop, I would do the same thing for
non-serializable objects and setup().
Although nobody has answered the two questions, in my experience it seems the
answer to both is yes.
2014-08-04 19:50 GMT+08:00 Fengyun RAO raofeng...@gmail.com:
object LogParserWrapper {
  private val logParser = {
    val settings = new ...
    val builders = new ...
I have a related question. With Hadoop, I would do the same thing for
non-serializable objects and setup(). I also had a use case where it
was so expensive to initialize the non-serializable object that I
would make it a static member of the mapper, turn on JVM reuse across
tasks, and then
I think you're going to have to make it serializable by registering it with
the Kryo registrator. Multiple workers run as separate JVMs, so Spark may
need to serialize and deserialize broadcast variables to send them to the
different executors.
Thanks,
Ron
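Ron's suggestion, as a configuration sketch (MyParser is a hypothetical class standing in for the non-serializable object; SparkConf.registerKryoClasses is available as of Spark 1.2):

```scala
import org.apache.spark.SparkConf

// Hypothetical non-serializable class we want Kryo to handle.
class MyParser

val conf = new SparkConf()
  .setAppName("kryo-example")
  // Switch from Java serialization to Kryo.
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Registering classes lets Kryo write a small numeric id instead of the
  // full class name with every serialized instance.
  .registerKryoClasses(Array(classOf[MyParser]))
```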
On Aug 3, 2014, at 6:38
As shown here:
2 - Why Is My Spark Job so Slow and Only Using a Single Thread?
http://engineering.sharethrough.com/blog/2013/09/13/top-3-troubleshooting-tips-to-keep-you-sparking/
object JSONParser {
  def parse(raw: String): String = ...
}

object MyFirstSparkJob {
  def