Hi,

Thanks for the answers. If joining the DataFrames is the solution, then why
does the simple withColumn() succeed for some datasets and fail for others?

2016-02-11 11:53 GMT+01:00 Michał Zieliński <zielinski.mich...@gmail.com>:

> I think a good idea would be to do a join:
>
> outputDF = unlabelledDF.join(predictedDF.select("id","predicted"),"id")
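>
> For reference, a minimal sketch of that join with the Spark 1.5.2 Java
> API, assuming both DataFrames really do share an "id" key column:
>
> import org.apache.spark.sql.DataFrame;
>
> // Keep only the key and the prediction from predictedDF, then
> // equi-join on "id" so rows are matched explicitly by key instead of
> // relying on row order.
> DataFrame outputDF = unlabelledDF.join(
>         predictedDF.select("id", "predicted"), "id");
>
> The narrow select keeps the join from dragging the remaining columns of
> predictedDF into the result.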
>
> On 11 February 2016 at 10:12, Zsolt Tóth <toth.zsolt....@gmail.com> wrote:
>
>> Hi,
>>
>> I'd like to append a column of one DataFrame to another (using Spark
>> 1.5.2):
>>
>> DataFrame outputDF = unlabelledDF.withColumn("predicted_label",
>> predictedDF.col("predicted"));
>>
>> I get the following exception:
>>
>> java.lang.IllegalArgumentException: requirement failed: DataFrame must
>> have the same schema as the relation to which is inserted.
>> DataFrame schema:
>> StructType(StructField(predicted_label,DoubleType,true), ...<other 700
>> numerical (ByteType/ShortType) columns>
>> Relation schema: StructType(StructField(predicted_label,DoubleType,true),
>> ...<the same 700 columns>
>>
>> The interesting part is that the two schemas in the exception are exactly
>> the same.
>> The same code succeeds with other input data (fewer columns, both
>> numerical and non-numerical).
>> Any idea why this happens?
>>
>>
>
