Hi,
I’m getting some seemingly invalid results when I collect an RDD. This is
happening in both Spark 1.1.0 and 1.2.0, using Java 8 on Mac.
See the following code snippet:
JavaRDD<Thing> rdd = pairRDD.values();
rdd.foreach(e -> System.out.println("RDD Foreach: " + e));
rdd.collect().forEach(e -> System.out.println("Collect Foreach: " + e));
at 21:25, Sean Owen so...@cloudera.com wrote:
It sounds a lot like your values are mutable classes and you are
mutating or reusing them somewhere. It might work until you actually
try to materialize them all and find that many point to the same object.
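The pitfall Sean describes can be shown without Spark at all. A minimal plain-Java sketch (the `Thing` class is hypothetical, standing in for a mutable deserialized record): iterating prints the expected values, but because every element stored is the *same* reused object, the collected list shows only the last value written.

```java
import java.util.ArrayList;
import java.util.List;

public class ReusePitfall {
    // Hypothetical mutable class, standing in for an Avro-style record
    static class Thing {
        int value;
    }

    public static void main(String[] args) {
        Thing buffer = new Thing();           // one instance, reused per "record"
        List<Thing> collected = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            buffer.value = i;                 // deserializer overwrites the same object
            System.out.println("Foreach: " + buffer.value);   // prints 0, 1, 2
            collected.add(buffer);            // stores a reference, not a copy
        }
        // Every entry references the same object, so all show the last value
        collected.forEach(t -> System.out.println("Collected: " + t.value)); // prints 2, 2, 2
    }
}
```

This is exactly the symptom in the snippet above: `foreach` during iteration looks right, while `collect()` appears to return invalid duplicates.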
On Thu, Dec 18, 2014 at 10:06 AM, Tristan Blakers tris...@blackfrog.org wrote:
Suspected the same thing, but because the underlying data classes are
deserialised by Avro, I think they have to be mutable, as you need to
provide a no-args constructor with settable fields.
Nothing is being cached in my code anywhere.
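Even when the classes must stay mutable for Avro's sake, the usual workaround is to take a defensive copy of each record before it escapes the iterator (in Spark terms, inside a `map()` before `collect()`). A minimal sketch, with a hypothetical `Thing` class and a hand-written copy constructor:

```java
import java.util.ArrayList;
import java.util.List;

public class DefensiveCopy {
    // Hypothetical mutable Avro-style class: no-args constructor, settable field
    static class Thing {
        int value;
        Thing() {}
        Thing(Thing other) { this.value = other.value; } // copy constructor
    }

    public static void main(String[] args) {
        Thing buffer = new Thing();
        List<Thing> collected = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            buffer.value = i;
            collected.add(new Thing(buffer)); // snapshot, not a shared reference
        }
        collected.forEach(t -> System.out.println(t.value)); // prints 0, 1, 2
    }
}
```

For generated Avro classes, a deep-copy utility from the Avro library could play the role of the copy constructor; the point is that each collected element must own its own state.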
A search shows several historical threads for similar Kryo issues, but none
seem to have a definitive solution. Currently using Spark 1.2.0.
While collecting/broadcasting/grouping moderately sized data sets (~500MB -
1GB), I regularly see exceptions such as the one below.
I’ve tried increasing
I get the same exception simply by doing a large broadcast of about 6GB.
Note that I’m broadcasting a small number (~3m) of fat objects. There’s
plenty of free RAM. This and related Kryo exceptions seem to crop up
whenever an object graph of more than a couple of GB gets passed around.
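For reference, a sketch of the Kryo buffer settings as named in Spark 1.x (values in MB; the exact property names changed in later releases, so treat these as assumptions to check against your version's configuration docs):

```
# Initial per-core Kryo serialization buffer
spark.kryoserializer.buffer.mb      64

# Maximum size the buffer may grow to before Kryo fails
spark.kryoserializer.buffer.max.mb  512
```

Note that no setting lifts the fundamental cap: the buffer is backed by a Java byte array, which is indexed by `int`, so a single serialized object graph cannot exceed 2GB. That hard limit is consistent with the negative-array-size exceptions seen on multi-GB broadcasts.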
Perhaps we can get Kryo to
turn off that buffer, or we can at least get it to flush more often.
thanks,
Imran
On Thu, Feb 26, 2015 at 1:06 AM, Tristan Blakers tris...@blackfrog.org
wrote:
I get the same exception simply by doing a large broadcast of about 6GB.
Note that I’m broadcasting
...@gmail.com wrote:
Thanks Tristan for sharing this. Actually this happens when I am reading
a csv file of 3.5 GB.
best,
/Shahab
On Tue, May 5, 2015 at 9:15 AM, Tristan Blakers tris...@blackfrog.org
wrote:
Hi Shahab,
I’ve seen exceptions very similar to this (it also manifests as a negative
array size exception), and I believe it’s a real bug in Kryo.
See this thread:
You could use a map() operation, but the easiest way is probably to just
call the values() method on the JavaPairRDD<A, B> to get a JavaRDD<B>.
See this link:
https://www.safaribooksonline.com/library/view/learning-spark/9781449359034/ch04.html
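By analogy with plain Java streams (a sketch, not the Spark API itself), dropping the keys from key-value pairs looks like this; `pairRDD.values()` does the same thing in one call, and the `map()` line is the equivalent hand-rolled version:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ValuesDemo {
    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = Arrays.asList(
                new SimpleEntry<>("a", 1),
                new SimpleEntry<>("b", 2));
        // Analogue of pairRDD.values(): keep only the value of each pair
        List<Integer> values = pairs.stream()
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
        System.out.println(values); // prints [1, 2]
    }
}
```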
Tristan
On 13 May 2015 at 23:12, Yasemin Kaya
We have had excellent results operating on RDDs using Java 8 with Lambdas.
It’s slightly more verbose than Scala, but I haven’t found this an issue,
and haven’t missed any functionality.
The new DataFrame API makes the Spark platform even more language agnostic.
Tristan