Based on my example, how about renaming the columns?

val df1 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")
val df2 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")
val df3 = df1.join(df2, "a").select($"a", df1("b").as("1-b"), df2("b").as("2-b"))
val df4 = df3.join(df2, df3("2-b") === df2("b"))
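An alternative sketch, in case renaming is not convenient: give each DataFrame an alias and qualify the ambiguous column through the alias. (This is only a hedged illustration assuming Spark 1.5+ DataFrame syntax; the alias names "t1" and "t2" and the output column names "b1"/"b2" are my own choices, not from the thread.)

```scala
// Assumes an active SparkSession/SQLContext and `import sqlContext.implicits._`.
val df1 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")
val df2 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")

// Alias each side so every column reference can be qualified explicitly.
val joined = df1.as("t1").join(df2.as("t2"), $"t1.a" === $"t2.a")

// Qualifying each "b" through its alias avoids the "Reference 'b' is
// ambiguous" AnalysisException on later joins or selects.
val result = joined.select($"t1.a", $"t1.b".as("b1"), $"t2.b".as("b2"))
```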

// maropu

On Wed, Apr 27, 2016 at 1:58 PM, Divya Gehlot <divya.htco...@gmail.com>
wrote:

> Correct, Takeshi.
> I am facing the same issue.
>
> How can I avoid the ambiguity?
>
>
> On 27 April 2016 at 11:54, Takeshi Yamamuro <linguin....@gmail.com> wrote:
>
>> Hi,
>>
>> I tried;
>> val df1 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")
>> val df2 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")
>> val df3 = df1.join(df2, "a")
>> val df4 = df3.join(df2, "b")
>>
>> And I got: org.apache.spark.sql.AnalysisException: Reference 'b' is
>> ambiguous, could be: b#6, b#14.;
>> If this is the same case, the message makes sense and the cause is clear.
>>
>> Thoughts?
>>
>> // maropu
>>
>>
>> On Wed, Apr 27, 2016 at 6:09 AM, Prasad Ravilla <pras...@slalom.com>
>> wrote:
>>
>>> Also, check the column names of df1 ( after joining df2 and df3 ).
>>>
>>> Prasad.
>>>
>>> From: Ted Yu
>>> Date: Monday, April 25, 2016 at 8:35 PM
>>> To: Divya Gehlot
>>> Cc: "user @spark"
>>> Subject: Re: Cant join same dataframe twice ?
>>>
>>> Can you show us the structure of df2 and df3 ?
>>>
>>> Thanks
>>>
>>> On Mon, Apr 25, 2016 at 8:23 PM, Divya Gehlot <divya.htco...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>> I am using Spark 1.5.2.
>>>> I have a use case where I need to join the same dataframe twice, on two
>>>> different columns, and I am getting a "missing columns" error.
>>>>
>>>> For instance:
>>>> val df1 = df2.join(df3, "Column1")
>>>> // The line below throws the missing-columns error:
>>>> val df4 = df1.join(df3, "Column2")
>>>>
>>>> Is this a bug or a valid scenario?
>>>>
>>>>
>>>> Thanks,
>>>> Divya
>>>>
>>>
>>>
>>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>
>


-- 
---
Takeshi Yamamuro
