Cant perform full outer join

2015-09-29 Thread Saif.A.Ellafi
Hi all,

So I have two DataFrames, each with two columns: DATE and VALUE.

Performing this:
data = data.join(cur_data, data("DATE") === cur_data("DATE"), "outer")

returns
Exception in thread "main" org.apache.spark.sql.AnalysisException: Reference 
'DATE' is ambiguous, could be: DATE#0, DATE#3.;

But if I rename one of the columns, I get two separate date columns instead of 
the single merged "DATE" column I want. Any ideas that avoid non-trivial 
procedures?

Thanks,
Saif



Re: Cant perform full outer join

2015-09-29 Thread Terry Hoo
Saif,

Maybe you can rename the columns of one of the DataFrames first, then
do an outer join and a select, like this:

val cur_d = cur_data.toDF("Date_1", "Value_1")
val r = data.join(cur_d, data("DATE") === cur_d("Date_1"),
"outer").select($"DATE", $"VALUE", $"Value_1")
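
One caveat with the select above: after a full outer join, rows that exist only in cur_d come back with a null DATE. A minimal sketch of one way to get a single merged date column is to coalesce the two date columns. The object and method names (OuterJoinSketch, mergeOnDate) are hypothetical, and the sketch assumes both DataFrames have exactly the two columns DATE and VALUE:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.coalesce

object OuterJoinSketch {
  // Hypothetical helper: full outer join on DATE that returns one merged DATE column.
  // Assumes both inputs have exactly two columns, in the order DATE, VALUE.
  def mergeOnDate(data: DataFrame, curData: DataFrame): DataFrame = {
    // Rename the right-hand columns so no column reference is ambiguous after the join.
    val curD = curData.toDF("Date_1", "Value_1")
    data
      .join(curD, data("DATE") === curD("Date_1"), "outer")
      .select(
        // Take the date from whichever side actually has the row.
        coalesce(data("DATE"), curD("Date_1")).as("DATE"),
        data("VALUE"),
        curD("Value_1"))
  }
}
```

Calling OuterJoinSketch.mergeOnDate(data, cur_data) should give three columns, DATE, VALUE and Value_1, with VALUE or Value_1 left null where a date appears on only one side.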

Thanks,
Terry

On Tue, Sep 29, 2015 at 9:56 PM,  wrote:

> Hi all,
>
> So I have two DataFrames, each with two columns: DATE and VALUE.
>
> Performing this:
> data = data.join(cur_data, data("DATE") === cur_data("DATE"), "outer")
>
> returns
> Exception in thread "main" org.apache.spark.sql.AnalysisException:
> Reference 'DATE' is ambiguous, could be: DATE#0, DATE#3.;
>
> But if I change one of the column names, I will get two columns and won’t
> really merge “DATE” column as I wish. Any ideas without going to non
> trivial procedures?
>
> Thanks,
> Saif
>
>