You may want to try using df2.na.fill(…)
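
A minimal sketch of one likely cause, assuming the urls in that partition are not null: in the quoted code, when(...) only fires where url IS null and has no otherwise(...), so every other row falls through to null. The sample rows, the id column, and the object and method names below are hypothetical stand-ins for the real parquet data, not anything from the original job.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object AddColumnSketch {
  // Hypothetical sample data standing in for the parquet file:
  // one row with a url and one row where url is null.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq[(Int, Option[String])](
      (1, Some("http://mp.weixin.example/s?id=1")),
      (2, None)
    ).toDF("id", "url")
  }

  // Same shape as the quoted code: the branch is taken only when url IS null,
  // and there is no otherwise(...), so rows with a url also end up null.
  def original(df: DataFrame): DataFrame =
    df.withColumn("pa_bid", when(col("url").isNull, col("url").substr(3, 5)))

  // Option 1: flip the condition so the substring is taken when url is present.
  def guarded(df: DataFrame): DataFrame =
    df.withColumn("pa_bid", when(col("url").isNotNull, col("url").substr(3, 5)))

  // Option 2: drop the when(...) entirely; substr of a null url is simply null.
  def direct(df: DataFrame): DataFrame =
    df.withColumn("pa_bid", col("url").substr(3, 5))

  // na.fill replaces the nulls that remain with a constant (here an empty string).
  // It does not compute the substring for you.
  def filled(df: DataFrame): DataFrame =
    guarded(df).na.fill("", Seq("pa_bid"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("add-column-sketch")
      .master("local[*]")
      .getOrCreate()

    val df = sample(spark)
    original(df).show(false) // pa_bid is null in every row
    direct(df).show(false)   // pa_bid is set wherever url is present
    filled(df).show(false)   // remaining nulls replaced by ""

    spark.stop()
  }
}
```

So na.fill is useful for rows where url really is missing, but the substring only shows up once the condition is flipped or removed.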
From: lk_spark <[email protected]>
Date: Tuesday, 6 December 2016 at 3:05 PM
To: "user.spark" <[email protected]>
Subject: how to add column to dataframe
Hi all,
My Spark version is 2.0.
I have a parquet file with a single column named url, of type string. I want to
extract a substring from the url and add it to the DataFrame as a new column:
val df = spark.read.parquet("/parquetdata/weixin/page/month=201607")
val df2 = df.withColumn("pa_bid",when($"url".isNull,col("url").substr(3, 5)))
df2.select("pa_bid","url").show
+------+--------------------+
|pa_bid|                 url|
+------+--------------------+
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
Why is pa_bid null in every row?
2016-12-06
________________________________
lk_spark