I think a good idea would be to do a join:

outputDF = unlabelledDF.join(predictedDF.select("id", "predicted"), "id")
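To illustrate why the join approach works where withColumn does not: a join matches rows by a shared key ("id") rather than relying on the two DataFrames having compatible lineage. Here is a minimal plain-Java sketch of that key-based matching (hypothetical ids and values; this is not the Spark API, just the idea behind an inner join on "id"):

```java
import java.util.HashMap;
import java.util.Map;

public class KeyJoinSketch {

    // Match each row of 'features' with the row in 'predictions' that has
    // the same id, keeping only ids present in both (an inner join).
    static Map<String, String> joinOnId(Map<String, String> features,
                                        Map<String, String> predictions) {
        Map<String, String> joined = new HashMap<>();
        for (Map.Entry<String, String> e : features.entrySet()) {
            String pred = predictions.get(e.getKey()); // look up by id
            if (pred != null) {                        // drop unmatched ids
                joined.put(e.getKey(), e.getValue() + "," + pred);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, String> features = new HashMap<>();
        features.put("1", "f1");
        features.put("2", "f2");

        Map<String, String> predictions = new HashMap<>();
        predictions.put("1", "0.9");

        System.out.println(joinOnId(features, predictions));
    }
}
```

In Spark the same pairing happens per partition after a shuffle on the join key, so each unlabelled row picks up exactly the prediction computed for its own id.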
On 11 February 2016 at 10:12, Zsolt Tóth <toth.zsolt....@gmail.com> wrote:
> Hi,
>
> I'd like to append a column of a dataframe to another DF (using Spark 1.5.2):
>
> DataFrame outputDF = unlabelledDF.withColumn("predicted_label",
>     predictedDF.col("predicted"));
>
> I get the following exception:
>
> java.lang.IllegalArgumentException: requirement failed: DataFrame must
> have the same schema as the relation to which is inserted.
> DataFrame schema: StructType(StructField(predicted_label,DoubleType,true),
> ...<other 700 numerical (ByteType/ShortType) columns>
> Relation schema: StructType(StructField(predicted_label,DoubleType,true),
> ...<the same 700 columns>
>
> The interesting part is that the two schemas in the exception are exactly
> the same.
> The same code with other input data (with fewer columns, both numerical and
> non-numerical) succeeds.
> Any idea why this happens?