Hi Divya,

You can use the withColumn method from the DataFrame API. Here is the method signature:

def withColumn(colName: String, col: Column): DataFrame

(Column: http://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/Column.html)

Mohammed
Author: Big Data Analytics with Spark (http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/)

From: Divya Gehlot [mailto:divya.htco...@gmail.com]
Sent: Thursday, February 4, 2016 1:29 AM
To: user @spark
Subject: add new column in the schema + Dataframe

Hi,

I am a beginner in Spark, using Spark 1.5.2 on YARN (HDP 2.3.4). I have a use case where I have to read two input files and, based on certain conditions in the second input file, add a new column to the first input file and save it. I am using spark-csv to read my input files. I would really appreciate it if somebody could share their thoughts on the best/most feasible way of doing this (using the DataFrame API).

Thanks,
Divya
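To make the suggestion concrete, here is a minimal sketch of how withColumn could be combined with a join to cover Divya's use case. The file paths, column names ("id", "status"), and the condition are all assumptions for illustration, not details from the thread; it targets the Spark 1.5.x Scala API with the spark-csv package she mentioned.

```scala
// Hypothetical sketch: paths, column names, and the condition are
// placeholders, not from the original question.
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.{when, lit}

val sqlContext = new SQLContext(sc)  // sc: existing SparkContext

// Read both inputs with spark-csv.
val df1 = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .load("input1.csv")

val df2 = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .load("input2.csv")

// Join on a shared key column so rows from the second file can drive
// the condition; join(right, usingColumn) de-duplicates the key column.
val joined = df1.join(df2, "id")

// Add the new column via withColumn: its value depends on a condition
// evaluated against a column from the second file.
val result = joined.withColumn(
  "newCol",
  when(joined("status") === "active", lit("yes")).otherwise(lit("no"))
)

// Save the result back out with spark-csv.
result.write
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .save("output")
```

Because withColumn takes a Column expression, any combination of when/otherwise, arithmetic, or UDFs can be used to derive the new column once the two DataFrames are joined.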