Hello,
Any pointers on what is causing the optimizer to convert a broadcast join
into a shuffle join?
The join is with a file that is only 4 KB in size.
Complete plan -->
https://www.dropbox.com/s/apuomw1dg0t1jtc/plan_with_select.txt?dl=0
DAG from UI -->
https://www.dropbox.com/s/4xc9d0rdkx2fun8/DAG_with_se
Michael,
Output of DF.queryExecution is saved to
https://www.dropbox.com/s/1vizuwpswza1e3x/plan.txt?dl=0
I don't see anything in this to suggest a switch in strategy. Hopefully you
find this helpful.
Srikanth
On Thu, Jan 28, 2016 at 4:43 PM, Michael Armbrust
wrote:
Can you provide the analyzed and optimized plans (explain(true))?
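For anyone following the thread, here is a minimal sketch of how those plans could be printed for a join that carries a broadcast hint. The helper name explainBroadcastJoin and its parameters are hypothetical and do not come from the original code; it assumes both inputs are DataFrames that share a join column named by key.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.broadcast

// Hypothetical helper (not from the original post): joins a large DataFrame
// to a small one with a broadcast hint and prints the query plans.
def explainBroadcastJoin(large: DataFrame, small: DataFrame, key: String): DataFrame = {
  // broadcast() marks the small side as a candidate for a broadcast join.
  val joined = large.join(broadcast(small), key)

  // explain(true) prints the parsed, analyzed, and optimized logical plans
  // together with the physical plan, so the chosen join strategy is visible.
  joined.explain(true)

  joined
}
```

The physical plan in that output names the join operator that was selected, which makes it a natural place to check whether the hint survived.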
On Thu, Jan 28, 2016 at 12:26 PM, Srikanth wrote:
Hello,
I have a use case where one large table has to be joined with several
smaller tables.
I've added a broadcast hint for all the small tables in the joins.
val largeTableDF = sqlContext.read.format("com.databricks.spark.csv")
val metaActionDF = sqlContext.read.format("json")
val cidOrg