Seems like a bug. I'd suggest filing an issue with a reproducing code snippet if this can be reproduced on the 1.6 branch.
Cheers

On Fri, Feb 12, 2016 at 4:25 AM, Zsolt Tóth <toth.zsolt....@gmail.com> wrote:

> Sure. I ran the same job with fewer columns; the exception:
>
> java.lang.IllegalArgumentException: requirement failed: DataFrame must have
> the same schema as the relation to which is inserted.
> DataFrame schema: StructType(StructField(pixel0,ByteType,true),
> StructField(pixel1,ByteType,true), StructField(pixel10,ByteType,true),
> StructField(pixel100,ShortType,true), StructField(pixel101,ShortType,true),
> StructField(pixel102,ShortType,true), StructField(pixel103,ShortType,true),
> StructField(pixel105,ShortType,true), StructField(pixel106,ShortType,true),
> StructField(id,DoubleType,true), StructField(label,ByteType,true),
> StructField(predict,DoubleType,true))
> Relation schema: StructType(StructField(pixel0,ByteType,true),
> StructField(pixel1,ByteType,true), StructField(pixel10,ByteType,true),
> StructField(pixel100,ShortType,true), StructField(pixel101,ShortType,true),
> StructField(pixel102,ShortType,true), StructField(pixel103,ShortType,true),
> StructField(pixel105,ShortType,true), StructField(pixel106,ShortType,true),
> StructField(id,DoubleType,true), StructField(label,ByteType,true),
> StructField(predict,DoubleType,true))
>
> 	at scala.Predef$.require(Predef.scala:233)
> 	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelation.scala:113)
> 	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1.apply(InsertIntoHadoopFsRelation.scala:108)
> 	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1.apply(InsertIntoHadoopFsRelation.scala:108)
> 	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
> 	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation.run(InsertIntoHadoopFsRelation.scala:108)
> 	at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:57)
> 	at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:57)
> 	at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:69)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:140)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:138)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
> 	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:138)
> 	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:933)
> 	at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:933)
> 	at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:197)
> 	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:146)
> 	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:137)
> 	at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:304)
>
> Regards,
>
> Zsolt
>
>
> 2016-02-12 13:11 GMT+01:00 Ted Yu <yuzhih...@gmail.com>:
>
>> Can you pastebin the full error with all column types?
>>
>> There should be a difference between some column(s).
>>
>> Cheers
>>
>> > On Feb 11, 2016, at 2:12 AM, Zsolt Tóth <toth.zsolt....@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > I'd like to append a column of one DataFrame to another DF (using Spark 1.5.2):
>> >
>> > DataFrame outputDF = unlabelledDF.withColumn("predicted_label",
>> >     predictedDF.col("predicted"));
>> >
>> > I get the following exception:
>> >
>> > java.lang.IllegalArgumentException: requirement failed: DataFrame must
>> > have the same schema as the relation to which is inserted.
>> > DataFrame schema: StructType(StructField(predicted_label,DoubleType,true),
>> > ...<other 700 numerical (ByteType/ShortType) columns>
>> > Relation schema: StructType(StructField(predicted_label,DoubleType,true),
>> > ...<the same 700 columns>
>> >
>> > The interesting part is that the two schemas in the exception are
>> > exactly the same.
>> > The same code with other input data (with fewer columns, both numerical
>> > and non-numerical) succeeds.
>> > Any idea why this happens?
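One more sanity check before filing the issue: since the two schemas printed in the exception look identical to the eye, it can be worth diffing them mechanically to rule out a subtle difference (field order, nullability, a stray character) that's easy to miss across 700 columns. Below is a minimal sketch in plain Java, not part of Spark's API: the `parseFields` helper and the shortened inline schema strings are illustrative only, pasted from the exception message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SchemaDiff {
    // Extract each "name,type,nullable" triple from a printed StructType string.
    static List<String> parseFields(String schema) {
        List<String> fields = new ArrayList<>();
        Matcher m = Pattern.compile("StructField\\(([^)]*)\\)").matcher(schema);
        while (m.find()) {
            fields.add(m.group(1));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Shortened versions of the two schemas from the exception message;
        // in practice, paste the full "DataFrame schema" and "Relation schema" lines.
        String dfSchema = "StructType(StructField(pixel0,ByteType,true), "
                + "StructField(id,DoubleType,true), StructField(label,ByteType,true))";
        String relSchema = "StructType(StructField(pixel0,ByteType,true), "
                + "StructField(id,DoubleType,true), StructField(label,ByteType,true))";

        List<String> a = parseFields(dfSchema);
        List<String> b = parseFields(relSchema);

        // Compare position by position, since field order matters for inserts.
        for (int i = 0; i < Math.max(a.size(), b.size()); i++) {
            String left = i < a.size() ? a.get(i) : "<missing>";
            String right = i < b.size() ? b.get(i) : "<missing>";
            if (!left.equals(right)) {
                System.out.println("field " + i + ": " + left + " vs " + right);
            }
        }
        System.out.println(a.equals(b) ? "schemas identical" : "schemas differ");
    }
}
```

If this prints "schemas identical" on the full strings, the mismatch is not in the printed schemas themselves, which strengthens the case that this is a Spark bug worth a JIRA.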