Yes for RDD: both sides are materialized. No for DataFrame/SQL: one side
streams.
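
To make the distinction concrete, here is a rough sketch in plain Python.
It is not Spark code and does not reflect how Spark implements either
path; the function names cogroup_join and streamed_join are made up for
illustration, and spilling to disk is left out entirely. The first
function buffers both inputs per key before emitting anything, the second
buffers only one input and consumes the other a record at a time.

```python
from collections import defaultdict


def cogroup_join(left, right):
    """Buffer BOTH sides per key, then emit the per-key cross product."""
    groups = defaultdict(lambda: ([], []))
    for k, v in left:
        groups[k][0].append(v)
    for k, w in right:
        groups[k][1].append(w)
    return [
        (k, (v, w))
        for k, (vs, ws) in groups.items()
        for v in vs
        for w in ws
    ]


def streamed_join(buffered, streamed):
    """Buffer only one side; the other side is read one record at a time."""
    table = defaultdict(list)
    for k, v in buffered:
        table[k].append(v)
    for k, w in streamed:
        # Only the current streamed record is held in memory here.
        for v in table.get(k, ()):
            yield (k, (v, w))


small = [(1, "a"), (2, "b")]
big = [(1, "x"), (1, "y"), (3, "z")]

# Both strategies produce the same inner-join result.
both_buffered = sorted(cogroup_join(small, big))
one_streamed = sorted(streamed_join(small, iter(big)))
```

In the second form, memory use is bounded by the buffered side alone,
which is why the order of the inputs matters there and not in the first.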


On Thu, Sep 17, 2015 at 11:21 AM, Koert Kuipers <ko...@tresata.com> wrote:

> in scalding we join with the smaller side on the left, since the smaller
> side will get buffered while the bigger side streams through the join.
>
> looking at CoGroupedRDD i do not get the impression such a distinction is
> made. it seems both sides are put into a map that can spill to disk. is
> this correct?
>
> thanks
>
