bart-samwel commented on pull request #29709:
URL: https://github.com/apache/spark/pull/29709#issuecomment-691034060


   On Thu, Sep 10, 2020 at 4:31 PM Yuming Wang <[email protected]>
   wrote:
   
   > Can't we just do this automatically by applying the "would this be small
   > enough to broadcast" criterion instead of looking at the "is this actually
   > selected for broadcast"?
   >
   > First of all, this is a SortMergeJoin. To use DPP, you need to disable
   > spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly. But
   > disabling spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly
   > may affect other Join.
   >
   That just means that
   spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly needs a
   second setting next to it,
   spark.sql.optimizer.dynamicPartitionPruning.enableForBroadcastSizedJoinInputsOnly,
   which is slightly more lenient because it also applies to things that are
   broadcast-sized but that don't actually get broadcast because of <reasons>.
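   
   To make the proposed semantics concrete, here is a rough sketch of the
   decision I have in mind. It is a toy model in plain Python, not Spark code:
   JoinSide, dpp_allowed and the broadcast_sized_inputs_only parameter are
   hypothetical names, the way the two flags interact is my assumption, and
   any other checks Spark applies before pruning are ignored. The only real
   value is the 10 MB default of spark.sql.autoBroadcastJoinThreshold.

```python
from dataclasses import dataclass

# Default value of spark.sql.autoBroadcastJoinThreshold (10 MB).
AUTO_BROADCAST_JOIN_THRESHOLD = 10 * 1024 * 1024


@dataclass
class JoinSide:
    """Hypothetical stand-in for the filtering side of a join."""
    estimated_size_bytes: int   # size estimate taken from plan statistics
    planned_as_broadcast: bool  # True if the planner actually chose a broadcast


def dpp_allowed(side: JoinSide,
                reuse_broadcast_only: bool = True,
                broadcast_sized_inputs_only: bool = False,
                threshold: int = AUTO_BROADCAST_JOIN_THRESHOLD) -> bool:
    """Toy model of whether dynamic partition pruning may be applied.

    reuse_broadcast_only mirrors the existing reuseBroadcastOnly flag;
    broadcast_sized_inputs_only mirrors the proposed (hypothetical)
    enableForBroadcastSizedJoinInputsOnly flag.
    """
    if not reuse_broadcast_only:
        # Existing escape hatch: pruning is no longer tied to broadcast reuse,
        # which is why turning it off can affect other joins as well.
        return True
    if side.planned_as_broadcast:
        # Existing behaviour: the broadcast can be reused for pruning.
        return True
    # Proposed leniency: small enough to broadcast, even though the join
    # was not planned as a broadcast join (e.g. a SortMergeJoin).
    return broadcast_sized_inputs_only and side.estimated_size_bytes <= threshold


small_smj = JoinSide(estimated_size_bytes=1 * 1024 * 1024, planned_as_broadcast=False)
large_smj = JoinSide(estimated_size_bytes=1024 ** 3, planned_as_broadcast=False)

print(dpp_allowed(small_smj))                                    # False
print(dpp_allowed(small_smj, broadcast_sized_inputs_only=True))  # True
print(dpp_allowed(large_smj, broadcast_sized_inputs_only=True))  # False
```

   The point is that a large SortMergeJoin input stays excluded, so the new
   setting widens DPP much less than disabling reuseBroadcastOnly does.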
   
   
   > Second, the statistics of the plan are usually inaccurate, which makes us
   > unable to determine whether it is suitable for broadcasting in the plan
   > phase.
   >
   But we already use statistics for that *now*, inaccurate as they are. And
   your argument is that this is for inputs that would be broadcast if not
   for the join type, which implies that the statistics would have worked
   here?
   
   
   > So, I think it’s most appropriate to add hint.
   >
   
   
   -- 
   Bart Samwel
   [email protected]
   

