jonathanc-n commented on issue #17267:
URL: https://github.com/apache/datafusion/issues/17267#issuecomment-3290012901

   @alamb @2010YOUY01 There is a problem with the HJ -> SMJ runtime switch. I 
may be missing something, but as far as I can tell there is no way to enforce 
a sort and repartition at runtime. 
   
   The solution I was thinking of was:
    - During physical planning, pass in a flag that says whether the hash join 
is allowed to switch to SMJ at runtime. 
    - If that flag is true, the hash join's required input ordering enforces a 
sort on its inputs.
    - If the hash join detects that it needs to spill during 
`collect_left_input`, it spills and lets SMJ pick up the work. If no spill is 
needed, it continues as usual.
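
A rough sketch of the decision logic in the steps above, to make the trade-off concrete. Everything here is hypothetical: `JoinFlags`, `JoinOutcome`, `run_join`, and the byte counts are illustrative names and numbers, not DataFusion APIs. I am also assuming one up-front sort per input, and modelling a hash join that cannot switch as simply failing when the build side does not fit.

```rust
// Hypothetical sketch: none of these types or functions are DataFusion APIs.

/// Which path ends up producing the join result.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JoinOutcome {
    HashJoin,
    SortMergeJoin,
    /// Assumption: without the switch, an oversized build side is a failure.
    OutOfMemory,
}

/// Flag the physical planner would pass down to the hash join.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct JoinFlags {
    pub allow_runtime_smj_switch: bool,
}

impl JoinFlags {
    /// A sort cannot be injected once execution has started, so the
    /// ordering requirement has to be declared at planning time.
    pub fn requires_sorted_inputs(self) -> bool {
        self.allow_runtime_smj_switch
    }

    /// Number of sorts paid before the join runs (assumed: one per input).
    pub fn upfront_sorts(self) -> usize {
        if self.requires_sorted_inputs() { 2 } else { 0 }
    }
}

/// Decide the path once the build side has been measured against the limit.
pub fn run_join(
    flags: JoinFlags,
    build_side_bytes: usize,
    memory_limit_bytes: usize,
) -> JoinOutcome {
    let must_spill = build_side_bytes > memory_limit_bytes;
    match (must_spill, flags.allow_runtime_smj_switch) {
        (false, _) => JoinOutcome::HashJoin,
        (true, true) => JoinOutcome::SortMergeJoin,
        (true, false) => JoinOutcome::OutOfMemory,
    }
}
```

With the flag on, `upfront_sorts()` is 2 even when `run_join` returns `JoinOutcome::HashJoin`, i.e. the sorts are paid whether or not the switch ever happens.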
   
   This solution doesn't make much sense, though: whenever no spill is needed, 
we would still pay the cost of sorting on top of the regular hash join 
execution. We should instead just tell users to set the `prefer_hash_join` 
flag to false if they need spilling for joins. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

