Hi,

It seems that when doing a broadcast join, the entire DataFrame is re-sent even if part of it has already been broadcast.
Consider the following case:

    val df1 = ???
    val df2 = ???
    val df3 = ???
    df3.join(broadcast(df1), cond, "left_outer")

followed by

    df4.join(broadcast(df1.union(df2)), cond, "left_outer")

I would expect the second broadcast to send only the difference. However, when I call explain(true), I see that the entire union is broadcast.

My use case is that I have a series of DataFrames that I need to enrich by joining each of them with a small DataFrame. The small DataFrame gets additional information (as the result of each aggregation).

Is there an efficient way of doing this?

Thanks,
Assaf.