GitHub user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12668#discussion_r61089122
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
    @@ -107,16 +80,28 @@ private[sql] abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
        * - Shuffle hash join: if the average size of a single partition is small enough to build a hash
        *     table.
        * - Sort merge: if the matching join keys are sortable.
    +   *
    +   * If there is no joining keys, Join implementations are chosen with the following precedence:
    +   * - BroadcastNestedLoopJoin: if one side of the join could be broadcasted
    +   * - CartesianProduct: for Inner join
    +   * - BroadcastNestedLoopJoin
        */
    -  object EquiJoinSelection extends Strategy with PredicateHelper {
    +  object JoinSelection extends Strategy with PredicateHelper {
    +
    +    /**
    +     * Matches a plan whose output should be small enough to be used in broadcast join.
    +     */
    +    private def canBroadcast(plan: LogicalPlan): Boolean = {
    +      plan.statistics.sizeInBytes <= conf.autoBroadcastJoinThreshold
    +    }
     
         /**
          * Matches a plan whose single partition should be small enough to build a hash table.
          *
          * Note: this assume that the number of partition is fixed, requires additional work if it's
          * dynamic.
          */
    -    def canBuildHashMap(plan: LogicalPlan): Boolean = {
    +    private def canBuildHashMap(plan: LogicalPlan): Boolean = {
    --- End diff --
    
    Nit: how about `canBuildLocalHashMap`? The point of this check is to establish whether we can safely build a hash map on each partition.
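
To illustrate why the "Local" in `canBuildLocalHashMap` matters, here is a minimal sketch of how the per-partition check differs from `canBroadcast`. It is not the code in this PR: `PlanStats`, `Conf`, and `JoinSelectionSketch` are hypothetical stand-ins, and the body of the per-partition check is an assumption (total size compared against the broadcast threshold times the number of shuffle partitions), since the diff above does not show it.

```scala
// Hypothetical stand-ins for plan statistics and SQL configuration;
// these types are assumptions for illustration, not Spark classes.
final case class PlanStats(sizeInBytes: BigInt)
final case class Conf(autoBroadcastJoinThreshold: Long, numShufflePartitions: Int)

object JoinSelectionSketch {

  /** True if the whole plan output fits under the broadcast threshold (as in the diff). */
  def canBroadcast(stats: PlanStats, conf: Conf): Boolean =
    stats.sizeInBytes <= conf.autoBroadcastJoinThreshold

  /**
   * True if a single partition is expected to be small enough for a local hash map.
   * Assumed body: the average partition size (total / partitions) must be under the
   * threshold, written as total < threshold * partitions to avoid integer division.
   */
  def canBuildLocalHashMap(stats: PlanStats, conf: Conf): Boolean =
    stats.sizeInBytes < BigInt(conf.autoBroadcastJoinThreshold) * conf.numShufflePartitions
}
```

With a 10 MB threshold and 200 shuffle partitions, a 50 MB plan would be too large to broadcast yet small enough per partition for a local hash map, which is the distinction the suggested name tries to convey.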

