GitHub user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/20368#discussion_r163610413
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala ---
@@ -126,6 +126,22 @@ class BroadcastJoinSuite extends QueryTest with SQLTestUtils {
}
}
+  test("broadcast hint is retained in a cached plan") {
+    Seq(true, false).foreach { materialized =>
+      withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
+        val df1 = spark.createDataFrame(Seq((1, "4"), (2, "2"))).toDF("key", "value")
+        val df2 = spark.createDataFrame(Seq((1, "1"), (2, "2"))).toDF("key", "value")
+        broadcast(df2).cache()
+        if (materialized) df2.collect()
+        val df3 = df1.join(df2, Seq("key"), "inner")
--- End diff --
All the other test cases create DataFrames this way. Anyway, I changed
all of them in the new PR.
---