Github user jinfengni commented on the issue:

    https://github.com/apache/drill/pull/905
  
    @vvysotskyi , what I have in mind is not just comparing the ratio of column 
counts against the ratio of row counts.
    
    Let's take a step back. This SwapHashJoinVisitor tries to correct the join 
decision made by either VolcanoPlanner or HepPlanner, since the original join 
order may put the bigger dataset on the build side. The swap is needed because 
we want to avoid a high memory requirement for the hash join operator. The 
question is how we define a "big dataset". For now, SwapHashJoinVisitor simply 
uses rowCount, which is not sufficient.
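    Roughly, today's check amounts to something like the following (illustrative 
names only, not Drill's exact code):
    
    ```java
    class SwapCriterionSketch {
      // Paraphrase of the rowCount-only criterion: swap whenever the build
      // (right) input is estimated to have more rows than the probe (left) input.
      static boolean shouldSwap(double probeRowCount, double buildRowCount) {
        return buildRowCount > probeRowCount;
      }
    }
    ```
    
    This treats a 10-row table with 1000 columns as "smaller" than a 100-row 
table with one column, which is exactly the problem being discussed here.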
    
    Instead, we should probably add a method `getSelfMemCost` to every `Prel` 
node. For a non-blocking operator, it would simply return either 0 or some 
constant (enough to hold a single batch). For a blocking operator such as 
HashJoin, it would return a value proportional to rowCount x columnCount (more 
precisely, the total number of bytes per row, taking the different column data 
types into account).
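    As a sketch of what this could look like (toy classes, not Drill's actual 
`Prel` API; `bytesPerRow` stands in for the per-column byte estimate):
    
    ```java
    import java.util.Arrays;
    import java.util.List;
    
    abstract class PrelSketch {
      final List<PrelSketch> inputs;
      PrelSketch(PrelSketch... inputs) { this.inputs = Arrays.asList(inputs); }
    
      abstract double rowCount();
      abstract double bytesPerRow();      // columnCount x average column width
    
      // Self memory cost: ~0 or one batch for non-blocking operators,
      // proportional to rowCount x bytesPerRow for blocking ones.
      abstract double getSelfMemCost();
    
      // Cumulative memory cost of the subtree rooted at this node,
      // mirroring how computeSelfCost is accumulated today.
      double getCumulativeMemCost() {
        double cost = getSelfMemCost();
        for (PrelSketch input : inputs) {
          cost += input.getCumulativeMemCost();
        }
        return cost;
      }
    }
    
    class ScanSketch extends PrelSketch {
      final double rows, rowBytes;
      ScanSketch(double rows, double rowBytes) { this.rows = rows; this.rowBytes = rowBytes; }
      double rowCount() { return rows; }
      double bytesPerRow() { return rowBytes; }
      double getSelfMemCost() { return 0; }  // streaming scan buffers nothing
    }
    
    class HashJoinSketch extends PrelSketch {
      HashJoinSketch(PrelSketch probe, PrelSketch build) { super(probe, build); }
      PrelSketch build() { return inputs.get(1); }
      double rowCount() { return inputs.get(0).rowCount(); }  // crude estimate
      double bytesPerRow() { return inputs.get(0).bytesPerRow() + build().bytesPerRow(); }
      // HashJoin must hold the entire build side in its hash table.
      double getSelfMemCost() { return build().rowCount() * build().bytesPerRow(); }
    }
    ```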
    
    Analogous to the existing `computeSelfCost` method, we also need a 
`getCumulativeMemCost` that returns the cumulative cost of the subtree rooted 
at a `Prel` node. With `getSelfMemCost` and `getCumulativeMemCost` defined for 
HashJoin, and a HashJoin with input1 and input2 as inputs, we could estimate 
the cumulative memory cost for both HashJoin(input1, input2) and 
HashJoin(input2, input1), and use that comparison to decide whether we have to 
swap them.
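    Using the toy classes above, the swap decision then falls out of a cost 
comparison (the numbers are made up purely to show the effect of row width):
    
    ```java
    public class SwapByMemCost {
      public static void main(String[] args) {
        PrelSketch input1 = new ScanSketch(1_000_000, 8);    // many rows, narrow
        PrelSketch input2 = new ScanSketch(10_000, 4_000);   // few rows, very wide
    
        double asIs    = new HashJoinSketch(input1, input2).getCumulativeMemCost();
        double swapped = new HashJoinSketch(input2, input1).getCumulativeMemCost();
    
        // rowCount alone would build on input2, but its width makes that
        // hash table 40 MB versus 8 MB for building on input1.
        System.out.printf("as-is=%.0f bytes, swapped=%.0f bytes -> %s%n",
            asIs, swapped, swapped < asIs ? "swap" : "keep");
      }
    }
    ```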
    
    This idea is not trying to adjust the row count estimation. Instead, it 
changes the criterion for deciding when a swap is necessary, based on the 
observation that we want to swap only when doing so reduces the memory 
requirement of the query.
    
    Will the above idea work? If it cannot address this issue, it's probably 
fine to go with what you proposed. Before we take that option, though, please 
give some thought to the idea above.
    


