Re: pyspark: Java null pointer exception when accessing broadcast variables

2015-02-13 Thread Davies Liu
> org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:137)
> at org.apache.spark.api.python.PythonRDD$$anon$1.<init>(PythonRDD.scala:174)
> at

Re: pyspark: Java null pointer exception when accessing broadcast variables

2015-02-12 Thread Rok Roskar
Hi again, I narrowed down the issue a bit more -- it seems to have to do with the Kryo serializer. When I use it, this results in a null pointer exception:

    from random import random

    rdd = sc.parallelize(range(10))
    d = {}
    for i in range(10):
        d[i] = random()
    rdd.map(lambda x: d[x]).collect()
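For anyone trying to reproduce this, here is a rough sketch of how the serializer could be toggled so the same job can be run with and without Kryo. It assumes the standard `spark.serializer` setting. The helper names `serializer_settings` and `run_repro` and the app name are hypothetical and do not come from the original message.

```python
# Sketch only: run the small repro above with or without the Kryo serializer.
# `serializer_settings`, `run_repro` and the app name are hypothetical names.

KRYO = "org.apache.spark.serializer.KryoSerializer"


def serializer_settings(use_kryo):
    """Return the (key, value) config pairs to hand to SparkConf.setAll()."""
    if use_kryo:
        return [("spark.serializer", KRYO)]
    # No entry: Spark falls back to its default (Java) serializer.
    return []


def run_repro(use_kryo):
    # Imported here so the helper above can be inspected without Spark installed.
    from random import random
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("kryo-npe-repro")
    conf = conf.setAll(serializer_settings(use_kryo))
    sc = SparkContext(conf=conf)
    try:
        # Same shape as the snippet above: a plain dict captured by the closure.
        d = {i: random() for i in range(10)}
        return sc.parallelize(range(10)).map(lambda x: d[x]).collect()
    finally:
        sc.stop()
```

Calling `run_repro(True)` and `run_repro(False)` against the same cluster should show whether the serializer setting is what makes the difference.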

Re: pyspark: Java null pointer exception when accessing broadcast variables

2015-02-11 Thread Rok Roskar
> org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(PythonRDD.scala:233)
> at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(PythonRDD.scala:

Re: pyspark: Java null pointer exception when accessing broadcast variables

2015-02-11 Thread Davies Liu
> )
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)

Re: pyspark: Java null pointer exception when accessing broadcast variables

2015-02-10 Thread Rok Roskar
> disk.
>
> On Feb 10, 2015, at 10:01 PM, Davies Liu wrote:
>
>> Could you paste the NPE stack trace here? It would be better to create a
>> JIRA for it, thanks!

Re: pyspark: Java null pointer exception when accessing broadcast variables

2015-02-10 Thread Davies Liu
> and am consistently getting Java null pointer exceptions. This is inside an IPython session connected to a standalone Spark cluster. I seem to recall being able to do this before, but at the moment I am at a loss as to what to try next. Is

Re: pyspark: Java null pointer exception when accessing broadcast variables

2015-02-10 Thread Rok Roskar
> park cluster. I seem to recall being able to do this before, but at the moment I am at a loss as to what to try next. Is there a limit to the size of broadcast variables? This one is rather large (a few GB dict). Thanks!
>
> Rok

Re: pyspark: Java null pointer exception when accessing broadcast variables

2015-02-10 Thread Davies Liu
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/pyspark-Java-null-pointer-exception-when-accessing-broadcast-variables-tp21580.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

pyspark: Java null pointer exception when accessing broadcast variables

2015-02-10 Thread rok
o try next. Is there a limit to the size of broadcast variables? This one is rather large (a few GB dict). Thanks!

Rok