Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20407#discussion_r167392674

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -156,6 +156,15 @@ object SQLConf {
         .booleanConf
         .createWithDefault(true)

    +  val ALLOW_NESTEDJOIN_FALLBACK = buildConf("spark.sql.join.broadcastJoinFallback.enabled")
    +    .internal()
    +    .doc("When true (default), if the other options are not available, fall back to trying " +
    +      "BroadcastNestedLoopJoin as the join strategy. This can cause OOM, which is a problem " +
    +      "in some scenarios, e.g. when running the thriftserver. Set this to false to disable " +
    +      "the fallback: an AnalysisException will be thrown instead.")
--- End diff --

I see, and in general I agree with you, but in this case `BroadcastNestedLoopJoin` is used as a fallback even though we are over the threshold for broadcast joins, so it is very likely to throw an OOM. The rationale of the current behavior seems to be: there is no other working join implementation, so just go with this one even though an OOM is likely; if the OOM doesn't occur we got lucky and the query ran, otherwise it fails with an OOM instead of an `AnalysisException` (so the scenario cannot run either way).

That trade-off is fine for a normal Spark application, but for long-running ones like the STS an OOM is much worse than an `AnalysisException`. I know this is not the solution to every OOM problem, but the reason for this PR is that I think when this rule was introduced, the choice was made "forgetting" use cases like the STS. Do you agree with me?
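For illustration, the behavior argued for above could be sketched roughly as follows. This is a hypothetical sketch, not the PR's actual code: the conf name `ALLOW_NESTEDJOIN_FALLBACK` comes from the diff above, but the `NestedLoopFallbackGuard` object and its wiring into join planning are assumptions made for the example.

```scala
import org.apache.spark.sql.AnalysisException
import org.apache.spark.sql.internal.SQLConf

// Hypothetical guard a join-strategy planner could call once every other join
// strategy (broadcast hash, sort-merge, etc.) has been ruled out, instead of
// silently falling back to BroadcastNestedLoopJoin.
object NestedLoopFallbackGuard {
  def checkFallbackAllowed(conf: SQLConf): Unit = {
    // ALLOW_NESTEDJOIN_FALLBACK is the entry added in the diff above;
    // it defaults to true, preserving the current fallback behavior.
    if (!conf.getConf(SQLConf.ALLOW_NESTEDJOIN_FALLBACK)) {
      throw new AnalysisException(
        "No other join strategy is applicable and " +
          s"${SQLConf.ALLOW_NESTEDJOIN_FALLBACK.key} is false; " +
          "refusing the BroadcastNestedLoopJoin fallback to avoid a likely OOM.")
    }
  }
}
```

With the flag left at its default the query plans as before; a long-running service such as the STS could set it to false and get a fast, recoverable `AnalysisException` rather than risking the whole JVM.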