
RE: Saprk 1.5 - How to join 3 RDDs in a SQL DF?

2015-10-11 Thread Cheng, Hao

Thank you Ted, that’s very informative. From the DB optimization point of view, 
cost-based join reordering and multi-way joins do provide better performance.

But from the API design point of view, two arguments (relations) for JOIN in the 
DF API are probably enough for multi-table joins, as we can always nest 2-way 
joins to represent a multi-way join.
For example, A join B join C join D (a multi-way join) can be written as any of:
((A join B) join C) join D
(A join (B join C)) join D
(A join B) join (C join D)
etc.
That also means we leave the optimization work to Spark SQL rather than to 
users, and we believe Spark SQL can do most of the dirty work for us.
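
As a rough illustration of the nesting argument, here is a plain-Python sketch with no Spark dependency. The helpers `natural_join` and `as_set` and the four tiny relations are made up for this example; they only model inner joins on shared column names, which is the case where every nesting returns the same rows.

```python
def natural_join(left, right):
    """Inner-join two lists of dicts on every column name they share.

    Hypothetical helper for illustration only; this is not a Spark API.
    """
    result = []
    for lrow in left:
        for rrow in right:
            shared = set(lrow) & set(rrow)
            # Keep the pair only if all shared columns agree.
            if all(lrow[col] == rrow[col] for col in shared):
                result.append({**lrow, **rrow})
    return result


def as_set(rows):
    """Order-insensitive view of a result, used to compare join plans."""
    return {tuple(sorted(row.items())) for row in rows}


# Four made-up relations chained by shared columns: A-b-B-c-C-d-D.
A = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
B = [{"b": 10, "c": 100}, {"b": 20, "c": 200}]
C = [{"c": 100, "d": 7}, {"c": 200, "d": 8}]
D = [{"d": 7, "e": "x"}]

# ((A join B) join C) join D
plan1 = natural_join(natural_join(natural_join(A, B), C), D)
# (A join (B join C)) join D
plan2 = natural_join(natural_join(A, natural_join(B, C)), D)
# (A join B) join (C join D)
plan3 = natural_join(natural_join(A, B), natural_join(C, D))

# All three nestings produce the same set of rows for inner joins.
assert as_set(plan1) == as_set(plan2) == as_set(plan3)
print(plan1)
```

The three plans differ only in how much intermediate data they build, which is the part an optimizer is expected to choose.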

However, sometimes people do write more optimal SQL (e.g. with a better join 
order) than the Spark SQL optimizer produces; in that case we’d better resort 
to the API SQLContext.sql("").

Cheers
Hao

From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Monday, October 12, 2015 8:37 AM
To: Cheng, Hao
Cc: Richard Eggert; Subhajit Purkayastha; User
Subject: Re: Saprk 1.5 - How to join 3 RDDs in a SQL DF?

Some weekend reading:
http://stackoverflow.com/questions/20022196/are-left-outer-joins-associative

Cheers

On Sun, Oct 11, 2015 at 5:32 PM, Cheng, Hao <hao.ch...@intel.com> wrote:
A join B join C === (A join B) join C
Semantically they are equivalent, right?

From: Richard Eggert [mailto:richard.egg...@gmail.com]
Sent: Monday, October 12, 2015 5:12 AM
To: Subhajit Purkayastha
Cc: User
Subject: Re: Saprk 1.5 - How to join 3 RDDs in a SQL DF?


It's the same as joining 2. Join two together, and then join the third one to 
the result of that.
On Oct 11, 2015 2:57 PM, "Subhajit Purkayastha" <spurk...@p3si.net> wrote:
Can I join 3 different RDDs together in a Spark SQL DF? I can find examples for 
2 RDDs but not 3.

Thanks

Re: Saprk 1.5 - How to join 3 RDDs in a SQL DF?

2015-10-11 Thread Ted Yu

Some weekend reading:
http://stackoverflow.com/questions/20022196/are-left-outer-joins-associative

Cheers


RE: Saprk 1.5 - How to join 3 RDDs in a SQL DF?

2015-10-11 Thread Cheng, Hao

A join B join C === (A join B) join C
Semantically they are equivalent, right?
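
For inner joins the grouping does not change the result. The plain-Python sketch below, which does not use Spark, illustrates why the question gets less trivial once an outer join is mixed in. The helper `join`, its `how` parameter, and the three one-column relations are assumptions made up for this example.

```python
def join(left, right, lkey, rkey, how="inner"):
    """Join two lists of dicts where left[lkey] == right[rkey].

    how="left" keeps unmatched left rows and pads the right columns with None.
    A None key never matches anything, mirroring SQL NULL semantics.
    Hypothetical helper for illustration only; this is not a Spark API.
    """
    right_cols = sorted({col for row in right for col in row})
    out = []
    for lrow in left:
        matches = [rrow for rrow in right
                   if lrow[lkey] is not None and lrow[lkey] == rrow[rkey]]
        if matches:
            out.extend({**lrow, **rrow} for rrow in matches)
        elif how == "left":
            out.append({**lrow, **{col: None for col in right_cols}})
    return out


A = [{"a": 1}, {"a": 2}]
B = [{"b": 1}]
C = [{"c": 1}]

# Inner joins only: both groupings agree.
inner_left = join(join(A, B, "a", "b"), C, "b", "c")
inner_right = join(A, join(B, C, "b", "c"), "a", "b")
assert inner_left == inner_right

# (A LEFT JOIN B ON a = b) INNER JOIN C ON b = c
left_first = join(join(A, B, "a", "b", how="left"), C, "b", "c")
# A LEFT JOIN (B INNER JOIN C ON b = c) ON a = b
right_first = join(A, join(B, C, "b", "c"), "a", "b", how="left")

# The unmatched row a = 2 survives only in the second grouping.
print(len(left_first), len(right_first))
```

So regrouping is safe for inner joins, while a plan that contains outer joins can only be regrouped when the join predicates allow it.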


Re: Saprk 1.5 - How to join 3 RDDs in a SQL DF?

2015-10-11 Thread Richard Eggert

It's the same as joining 2. Join two together, and then join the third one
to the result of that.
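
A minimal sketch of that two-step approach in plain Python, with no Spark dependency. The helper `inner_join` and the sample rows are hypothetical and only model an equi-join on one shared key; with Spark DataFrames the same shape would be two chained `join` calls.

```python
def inner_join(left, right, key):
    """Equi-join two lists of dicts on `key`.

    Hypothetical helper for illustration only; this is not a Spark API.
    """
    # Index the right side by key so each left row can look up its matches.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            merged = dict(lrow)
            merged.update(rrow)
            joined.append(merged)
    return joined


# Three small made-up datasets that share the column "id".
users = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
orders = [{"id": 1, "item": "book"}, {"id": 2, "item": "pen"}]
cities = [{"id": 1, "city": "Oslo"}]

# Join two together, then join the third one to the result of that.
users_orders = inner_join(users, orders, "id")
all_three = inner_join(users_orders, cities, "id")
print(all_three)
```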
On Oct 11, 2015 2:57 PM, "Subhajit Purkayastha" wrote:

> Can I join 3 different RDDs together in a Spark SQL DF? I can find
> examples for 2 RDDs but not 3.
>
> Thanks
