Hi,
I am dynamically doing a unionAll in a loop, adding a new column on each iteration:
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.lit

val dfresult = dfAcStamp.select("Col1","Col2","Col3","Col4","Col5","Col6","Col7","Col8","Col9")
val schemaL = dfresult.schema
// empty accumulator DataFrame with the same schema as dfresult
var dffiltered = sqlContext.createDataFrame(sc.emptyRDD[Row], schemaL)

for ((key, values) <- lcrMap) {
  if (values(4) != null) {
    println("Condition=============" + values(4))
    val renameRepId = values(0) + "REP_ID"
    dffiltered.printSchema
    dfresult.printSchema
    dffiltered = dffiltered.unionAll(
        dfresult.withColumn(renameRepId, lit(values(3)))  // add the per-key literal column
          .drop("Col9")
          .select("Col1","Col2","Col3","Col4","Col5","Col6","Col7","Col8","Col9")
          .where(values(4)))                              // values(4) is a SQL filter condition
      .distinct()
  }
}
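Besides printSchema, I was also going to double-check the two schemas programmatically right before the union. A quick sketch of that check (StructType equality compares the fields, including nullability):

    // sanity check: compare the schemas and column lists programmatically
    println(dffiltered.schema == dfresult.schema)
    println(dffiltered.columns.mkString(",") + "  vs  " + dfresult.columns.mkString(","))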
When I print the schemas, this is what I get:
dfresult schema:
root
|-- Col1: date (nullable = true)
|-- Col2: date (nullable = true)
|-- Col3: string (nullable = false)
|-- Col4: string (nullable = false)
|-- Col5: string (nullable = false)
|-- Col6: string (nullable = true)
|-- Col7: string (nullable = true)
|-- Col8: string (nullable = true)
|-- Col9: null (nullable = true)
dffiltered schema:
root
|-- Col1: date (nullable = true)
|-- Col2: date (nullable = true)
|-- Col3: string (nullable = false)
|-- Col4: string (nullable = false)
|-- Col5: string (nullable = false)
|-- Col6: string (nullable = true)
|-- Col7: string (nullable = true)
|-- Col8: string (nullable = true)
|-- Col9: null (nullable = true)
Both print the same schema, yet when I perform the unionAll it gives me the
error below:
org.apache.spark.sql.AnalysisException: Union can only be performed on
tables with the same number of columns, but the left table has 9 columns
and the right has 8;
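Since the error says 9 columns vs 8, my next step is to inspect the actual right-hand side of the union (the transformed dfresult), not just dfresult itself. This is a sketch of the check I intend to add inside the loop, right before the unionAll (it reuses renameRepId and values from the loop body):

    // build the union's right-hand side separately so its columns can be printed
    val rhs = dfresult.withColumn(renameRepId, lit(values(3)))
      .drop("Col9")
      .select("Col1","Col2","Col3","Col4","Col5","Col6","Col7","Col8","Col9")
      .where(values(4))
    println("left  (" + dffiltered.columns.length + "): " + dffiltered.columns.mkString(", "))
    println("right (" + rhs.columns.length + "): " + rhs.columns.mkString(", "))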
Could somebody help me by pointing out my mistake?
Thanks,