Created a PR and verified that the example given by Justin works with the change: https://github.com/apache/spark/pull/6793
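For context: sqlContext.createDataFrame on a local Seq builds a LocalRelation, which show() can evaluate directly on the driver, where TaskContext.get() returns null; the sc.parallelize path evaluates the expression inside tasks, which is likely why it works. Below is a minimal sketch of the kind of null guard involved — the class name RandomColumn is made up for illustration, and scala.util.Random stands in for Spark's internal XORShiftRandom; see the PR above for the actual change:

    import org.apache.spark.TaskContext

    // Illustrative stand-in for the RDG expression class (name is hypothetical).
    class RandomColumn(seed: Long) extends Serializable {
      // TaskContext.get() returns null when the expression is initialized on
      // the driver (no running task), so fall back to partition id 0 there.
      @transient protected lazy val rng = {
        val partitionId = Option(TaskContext.get()).map(_.partitionId()).getOrElse(0)
        new scala.util.Random(seed + partitionId)
      }
    }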
Cheers

On Wed, Jun 10, 2015 at 7:15 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Looks like the NPE came from this line:
>
>   @transient protected lazy val rng = new XORShiftRandom(seed +
>     TaskContext.get().partitionId())
>
> Could TaskContext.get() be null ?
>
> On Wed, Jun 10, 2015 at 6:15 PM, Justin Yip <yipjus...@prediction.io> wrote:
>
>> Hello,
>>
>> I am using 1.4.0 and found the following weird behavior.
>>
>> This case works fine:
>>
>> scala> sc.parallelize(Seq((1,2), (3, 100))).toDF.withColumn("index", rand(30)).show()
>> +--+---+-------------------+
>> |_1| _2|              index|
>> +--+---+-------------------+
>> | 1|  2| 0.6662967911724369|
>> | 3|100|0.35734504984676396|
>> +--+---+-------------------+
>>
>> However, when I use sqlContext.createDataFrame instead, I get an NPE:
>>
>> scala> sqlContext.createDataFrame(Seq((1,2), (3, 100))).withColumn("index", rand(30)).show()
>> java.lang.NullPointerException
>>   at org.apache.spark.sql.catalyst.expressions.RDG.rng$lzycompute(random.scala:39)
>>   at org.apache.spark.sql.catalyst.expressions.RDG.rng(random.scala:39)
>>   ..
>>
>> Does anyone know why?
>>
>> Thanks.
>>
>> Justin