Hi Lorenz

> So for the current query it's way faster to force the join being done on all
> matching variables (X, Y) - but I don't know implications for other queries.
>
So, in this case, it proves more efficient to build the JoinKey from the intersection of the two sets of variables rather than from a single variable. It's worth noting that, for this query, that artificial join key will correspond to a single binding on both the left and right side of the join.

Does it then follow that it is always "better" to use the most specific (composite) key for hash joins? That is, should the join key be based on all variables shared between the left and right side of the join, not just the first one that is found? Are there cases where that would not hold true?

Is it straightforward to try modifying AbstractIterHashJoin to call the JoinKey.create method instead of the JoinKey.createVarKey method? Or to make it possible to control which strategy is used for choosing the join key?

>
> Cheers,
> Lorenz
>

John
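
To make the question concrete, here is a minimal sketch in plain Java of a hash join keyed either on one shared variable or on all shared variables. It is not Jena code: JoinKeyDemo, rows, key, compatible and hashJoin are hypothetical names, rows are simple maps from variable name to value, and the data is made up so that X has only two distinct values while (X, Y) is unique per row. The sketch also assumes every row binds all key variables; rows with unbound shared variables (e.g. from OPTIONAL) would probably need extra handling with a composite key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinKeyDemo {
    // Hypothetical test data: X takes 2 distinct values, Y is unique per row.
    static List<Map<String, String>> rows(int n, boolean withZ) {
        List<Map<String, String>> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Map<String, String> row = new HashMap<>();
            row.put("X", "x" + (i % 2));
            row.put("Y", "y" + i);
            if (withZ) row.put("Z", "z" + i);
            out.add(row);
        }
        return out;
    }

    // Build a hash key from the values a row binds for the chosen variables.
    static List<String> key(Map<String, String> row, List<String> keyVars) {
        List<String> k = new ArrayList<>();
        for (String v : keyVars) k.add(row.get(v));
        return k;
    }

    // Two rows are compatible if they agree on every variable they share.
    static boolean compatible(Map<String, String> a, Map<String, String> b) {
        for (Map.Entry<String, String> e : a.entrySet()) {
            String other = b.get(e.getKey());
            if (other != null && !other.equals(e.getValue())) return false;
        }
        return true;
    }

    // Hash the left side on keyVars, probe with the right side.
    // Returns {number of joined rows, number of candidate pairs checked}.
    static int[] hashJoin(List<Map<String, String>> left,
                          List<Map<String, String>> right,
                          List<String> keyVars) {
        Map<List<String>, List<Map<String, String>>> table = new HashMap<>();
        for (Map<String, String> l : left) {
            table.computeIfAbsent(key(l, keyVars), k -> new ArrayList<>()).add(l);
        }
        int results = 0;
        int checked = 0;
        for (Map<String, String> r : right) {
            List<Map<String, String>> bucket =
                table.getOrDefault(key(r, keyVars), new ArrayList<>());
            for (Map<String, String> l : bucket) {
                checked++;                      // every bucket entry must be verified
                if (compatible(l, r)) results++;
            }
        }
        return new int[] {results, checked};
    }

    public static void main(String[] args) {
        List<Map<String, String>> left = rows(100, false);
        List<Map<String, String>> right = rows(100, true);
        int[] single = hashJoin(left, right, List.of("X"));
        int[] composite = hashJoin(left, right, List.of("X", "Y"));
        // prints "key on X:     100 results, 5000 candidates checked"
        System.out.println("key on X:     " + single[0] + " results, " + single[1] + " candidates checked");
        // prints "key on (X,Y): 100 results, 100 candidates checked"
        System.out.println("key on (X,Y): " + composite[0] + " results, " + composite[1] + " candidates checked");
    }
}
```

Both variants return the same rows, because the compatibility check still runs on every candidate; the composite key only shrinks the buckets that have to be scanned. On data where the first shared variable is already highly selective, the difference would be much smaller.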
