mingmwang commented on issue #4139:
URL: 
https://github.com/apache/arrow-datafusion/issues/4139#issuecomment-1306856449

   @alamb @andygrove @isidentical @Dandandan @yahoNanJing 
   I plan to add a new physical rule for physical join implementation selection 
based on stats.
   
   The basic idea is if one of side(left or right) is small enough(threshold on 
size or row count, for example 10M), will choose HashJoin:CollectLeft.
   Else if either side exceed that threshold but still smaller than hash join 
threshold(estimated_size/partition_count <= 128M for example), will choose  
HashJoin: Partitioned.
   Otherwise choose SortMergeJoin.
   
   Please share your thoughts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to