I'm sorry, but I don't really understand what you mean by "wide" in this context. For a HashJoin, the only dependencies of the produced RDD are the two input RDDs. For BroadcastNestedLoopJoin, the only dependency is on the streamed RDD; the other RDD is distributed to all nodes using a broadcast variable.
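
To make the difference concrete, here is a toy model in plain Python. It is not Spark code and not how the operators are actually implemented; the function names (hash_partition, hash_join, broadcast_nested_loop_join) are made up for illustration, and it assumes both join inputs have already been partitioned the same way. It only sketches which inputs each output partition reads.

```python
from collections import defaultdict


def hash_partition(rows, num_partitions):
    """Split (key, value) rows into partitions by the hash of the key."""
    parts = [[] for _ in range(num_partitions)]
    for key, value in rows:
        parts[hash(key) % num_partitions].append((key, value))
    return parts


def hash_join_partition(left_part, right_part):
    """Join one pair of partitions: build a table on the left, probe with the right."""
    table = defaultdict(list)
    for key, value in left_part:
        table[key].append(value)
    return [
        (key, (left_value, right_value))
        for key, right_value in right_part
        for left_value in table.get(key, [])
    ]


def hash_join(left_parts, right_parts):
    """Output partition i reads partition i of the left and right inputs only."""
    return [hash_join_partition(l, r) for l, r in zip(left_parts, right_parts)]


def broadcast_nested_loop_join(streamed_parts, broadcast_rows, condition):
    """Output partition i reads streamed partition i plus a full copy of the other side.

    broadcast_rows stands in for the data shipped to every node.
    """
    return [
        [(s, b) for s in part for b in broadcast_rows if condition(s, b)]
        for part in streamed_parts
    ]
```

In this sketch the hash join touches both inputs, while the nested loop join iterates over the partitions of the streamed side only and treats the other side as a plain in-memory list.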

Michael

On Thu, Apr 3, 2014 at 12:59 PM, Jan-Paul Bultmann <[email protected]> wrote:

> Hey,
> Does somebody know the kinds of dependencies that the new SQL operators
> produce?
> I'm specifically interested in the relational join operation as it seems
> substantially more optimized.
>
> The old join was narrow on two RDDs with the same partitioner.
> Is the relational join narrow as well?
>
> Cheers Jan
