Not sure what you mean about “raw” Spark sql, but there is one parameter which 
will impact the optimizer choose broadcast join automatically or not :

spark.sql.autoBroadcastJoinThreshold

You can read Spark doc about above parameter setting and using explain to check 
your join using broadcast or not.

Make sure you gather statistics for tables.
 
There is broadcast hint also. Please be aware if the table being broadcasted to 
all worker nodes is fairly big, it will not be a good option always.

Kathleen

Sent from my iPhone

> On Oct 3, 2018, at 4:37 PM, kant kodali <kanth...@gmail.com> wrote:
> 
> Hi All,
> 
> How to do a broadcast join using raw Spark SQL 2.3.1 or 2.3.2? 
> 
> Thanks
> 

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Reply via email to