Hello, I am using Spark 1.4.0 and found the following weird behavior.

This case works fine:

scala> sc.parallelize(Seq((1,2), (3, 100))).toDF.withColumn("index", rand(30)).show()
+--+---+-------------------+
|_1| _2|              index|
+--+---+-------------------+
| 1|  2| 0.6662967911724369|
| 3|100|0.35734504984676396|
+--+---+-------------------+

However, when I use sqlContext.createDataFrame instead, I get an NPE:

scala> sqlContext.createDataFrame(Seq((1,2), (3, 100))).withColumn("index", rand(30)).show()
java.lang.NullPointerException
        at org.apache.spark.sql.catalyst.expressions.RDG.rng$lzycompute(random.scala:39)
        at org.apache.spark.sql.catalyst.expressions.RDG.rng(random.scala:39)
        ..

Does anyone know why?

Thanks.

Justin