Naga –
I believe the second argument to the withColumn method has to be a Column 
derived from the same DataFrame on which you call the method. The 
following will work:

df2.withColumn("age2", $"age"+10)
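
For reference, here is a minimal standalone sketch of the same idea. It assumes Spark 2.x or later (SparkSession) running in local mode; the object name WithColumnSketch and the column name "source" are made up for illustration and are not from the original code. Both new columns are built only from df2's own columns or from a literal, which is why the analyzer can resolve them.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit}

// Hypothetical object name, used only for this sketch.
object WithColumnSketch {
  case class Sunil(name: String, age: Long)

  def main(args: Array[String]): Unit = {
    // Assumption: Spark 2.x+ with a local master; the original thread used the 1.x shell.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("withColumn-sketch")
      .getOrCreate()
    import spark.implicits._

    val df2 = Seq(Sunil("Sunil", 33)).toDF()

    // A column computed from df2's own "age" column resolves against df2's plan.
    val withAge2 = df2.withColumn("age2", col("age") + 10)

    // A literal does not reference any DataFrame, so it is also accepted.
    val withSource = df2.withColumn("source", lit("df2"))

    withAge2.show()
    withSource.show()

    spark.stop()
  }
}
```

In the shell, $"age" and col("age") are interchangeable once the implicits are imported.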


Mohammed
Author: Big Data Analytics with 
Spark<http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>

From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Tuesday, January 26, 2016 1:45 PM
To: naga sharathrayapati
Cc: user
Subject: Re: withColumn

A brief search of the Spark source code showed no support for referencing 
a column the way your code does.

Are you trying to do a join?
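
If a join is what you are after, the following is a rough sketch rather than a definitive answer. It assumes Spark 2.1 or later, where crossJoin is available; on 1.x, df2.join(df1) with no join condition gives a Cartesian join instead. The object name CrossJoinSketch is hypothetical. Since the two DataFrames share no key, every row of df2 is paired with every row of df1, which is only sensible here because each has a single row.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, used only for this sketch.
object CrossJoinSketch {
  case class Sharath(name1: String, age1: Long)
  case class Sunil(name: String, age: Long)

  def main(args: Array[String]): Unit = {
    // Assumption: Spark 2.1+ with a local master.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("crossJoin-sketch")
      .getOrCreate()
    import spark.implicits._

    val df1 = Seq(Sharath("Sharath", 29)).toDF()
    val df2 = Seq(Sunil("Sunil", 33)).toDF()

    // Pick the column wanted from df1, rename it, then pair it with df2's rows.
    // Resulting columns: name, age, agess
    val combined = df2.crossJoin(df1.select($"name1".as("agess")))

    combined.show()

    spark.stop()
  }
}
```

With larger DataFrames you would normally join on a shared key instead, because a cross join produces rows(df2) x rows(df1) rows.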

Cheers

On Tue, Jan 26, 2016 at 1:04 PM, naga sharathrayapati 
<sharathrayap...@gmail.com> wrote:

I was trying to append a column to a DataFrame, df2, using 'withColumn' (as 
shown below). Can anyone help me understand what went wrong?



scala> case class Sharath(name1: String, age1: Long)

defined class Sharath

scala> val df1 = Seq(Sharath("Sharath", 29)).toDF()

df1: org.apache.spark.sql.DataFrame = [name1: string, age1: bigint]

scala> case class Sunil(name: String, age: Long)

defined class Sunil

scala> val df2 = Seq(Sunil("Sunil", 33)).toDF()

df2: org.apache.spark.sql.DataFrame = [name: string, age: bigint]

scala> df2.withColumn("agess",df1("name1"))

org.apache.spark.sql.AnalysisException: resolved attribute(s) name1#0 missing from name#2,age#3L in operator !Project [name#2,age#3L,name1#0 AS agess#4];

at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)

at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44)
