Looks like the NPE came from this line:

  @transient protected lazy val rng = new XORShiftRandom(seed + TaskContext.get().partitionId())

Could TaskContext.get() be null?

On Wed, Jun 10, 2015 at 6:15 PM, Justin Yip <yipjus...@prediction.io> wrote:

> Hello,
>
> I am using 1.4.0 and found the following weird behavior.
>
> This case works fine:
>
> scala> sc.parallelize(Seq((1,2), (3, 100))).toDF.withColumn("index", rand(30)).show()
> +--+---+-------------------+
> |_1| _2|              index|
> +--+---+-------------------+
> | 1|  2| 0.6662967911724369|
> | 3|100|0.35734504984676396|
> +--+---+-------------------+
>
> However, when I use sqlContext.createDataFrame instead, I get a NPE:
>
> scala> sqlContext.createDataFrame(Seq((1,2), (3, 100))).withColumn("index", rand(30)).show()
> java.lang.NullPointerException
>     at org.apache.spark.sql.catalyst.expressions.RDG.rng$lzycompute(random.scala:39)
>     at org.apache.spark.sql.catalyst.expressions.RDG.rng(random.scala:39)
> ..
>
> Does any one know why?
>
> Thanks.
>
> Justin
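
On whether TaskContext.get() can be null: it returns null whenever it is called outside a running task, for example on the driver, so the lazy val above would throw if the expression is evaluated there. A minimal sketch of a null-safe lookup follows. It is only an illustration under stated assumptions: FakeTaskContext is a hypothetical stand-in for org.apache.spark.TaskContext (so the snippet runs without Spark), java.util.Random stands in for XORShiftRandom, and falling back to partition 0 is my assumption, not necessarily how this should be fixed in Spark.

```scala
import java.util.Random

// Hypothetical stand-in for org.apache.spark.TaskContext, so the sketch runs
// without Spark on the classpath. get() returns null when no task is active.
final case class FakeTaskContext(partitionId: Int)

object FakeTaskContext {
  private val current = new ThreadLocal[FakeTaskContext]
  def get(): FakeTaskContext = current.get()
  def set(ctx: FakeTaskContext): Unit = current.set(ctx)
  def unset(): Unit = current.remove()
}

object NullSafeSeed {
  // Wrap the possibly-null context in Option; fall back to partition 0
  // (an assumption made for this sketch) when no task is running.
  def partitionOffset(): Int =
    Option(FakeTaskContext.get()).map(_.partitionId).getOrElse(0)

  // java.util.Random is used here in place of Spark's XORShiftRandom.
  def newRng(seed: Long): Random = new Random(seed + partitionOffset())

  def main(args: Array[String]): Unit = {
    // Outside a task: get() is null, so the offset falls back to 0.
    println(partitionOffset())

    // Inside a (simulated) task: the partition id is used.
    FakeTaskContext.set(FakeTaskContext(3))
    println(partitionOffset())
    FakeTaskContext.unset()
  }
}
```

With the real API the same guard would read Option(TaskContext.get()).map(_.partitionId()).getOrElse(0). Whether a fixed fallback is acceptable depends on how the seed is expected to vary across partitions.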