RE: RE: Performance question with joins

John Walker Wed, 03 Apr 2024 03:55:53 -0700

Hi Lorenz,

> So for the current query its way faster to force the join being done on all
> matching variables (X, Y) - but I don't know implications for other queries.
>


So, in this case, it proves more efficient to make a JoinKey from the 
intersection of two sets of variables, rather than just a single variable.

It's worth noting that, for this query, that artificial join key will
correspond to a single binding on both the left and right side of the join.
Does it then follow that it is always "better" to use the most specific
(composite) key for hash joins? That is, should the join key be based on all
variables shared between the left and right side of the join, not just the
first one that is found?
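
To make the difference concrete, here is a minimal, self-contained sketch of
the idea. It is not Jena code and does not use JoinKey or
AbstractIterHashJoin; the class name, the method names and the made-up data
are all assumptions for illustration. It builds a hash table on the left side
and counts how many candidate pairs the probe phase has to inspect when the
key is a single shared variable versus all shared variables.

```java
import java.util.*;

// Hypothetical illustration only: rows are plain maps from variable name to value.
class JoinKeySketch {

    // Build 100 rows in which X and Y each take 10 distinct values and the
    // pair (X, Y) is unique per row. extraVar is a variable private to one side.
    static List<Map<String, String>> rows(String extraVar) {
        List<Map<String, String>> out = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            out.add(Map.of("X", "x" + (i / 10), "Y", "y" + (i % 10), extraVar, "v" + i));
        }
        return out;
    }

    // Project a row onto the chosen key variables.
    static List<String> key(Map<String, String> row, List<String> keyVars) {
        List<String> k = new ArrayList<>();
        for (String v : keyVars) {
            k.add(row.get(v));
        }
        return k;
    }

    // Hash the left side on keyVars, probe with the right side, and count the
    // candidate pairs that would still need a full compatibility check.
    static long candidatePairs(List<Map<String, String>> left,
                               List<Map<String, String>> right,
                               List<String> keyVars) {
        Map<List<String>, List<Map<String, String>>> table = new HashMap<>();
        for (Map<String, String> row : left) {
            table.computeIfAbsent(key(row, keyVars), k -> new ArrayList<>()).add(row);
        }
        long pairs = 0;
        for (Map<String, String> row : right) {
            pairs += table.getOrDefault(key(row, keyVars), List.of()).size();
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Map<String, String>> left = rows("Z");
        List<Map<String, String>> right = rows("W");
        // Key on the first shared variable only: buckets hold 10 rows each.
        System.out.println(candidatePairs(left, right, List.of("X")));      // 1000
        // Key on all shared variables: buckets hold exactly one row each.
        System.out.println(candidatePairs(left, right, List.of("X", "Y"))); // 100
    }
}
```

With this data only 100 pairs actually join, so the single-variable key
inspects ten times as many candidates as needed. The sketch assumes every
shared variable is bound in every row; if a shared variable can be unbound on
one side, hashing on it would need extra care.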

Are there cases where that would not hold true?

Would it be straightforward to try modifying AbstractIterHashJoin to call the
JoinKey.create method instead of the JoinKey.createVarKey method?
Or to make it possible to control which strategy is used for choosing the join
key?

> 
> Cheers,
> Lorenz
> 

John
