I've been looking into two OUTER JOIN tests in the TCK tests, and it appears to
me that the expected results are incorrect. The two tests are:
org.apache.jackrabbit.test.api.query.qom.EquiJoinConditionTest#testRightOuterJoin1
org.apache.jackrabbit.test.api.query.qom.EquiJoinConditionTest#testLeftOuterJoin2
I'll do my best to explain my logic in a concise manner.
Both tests are set up the same way: two nodes are created:
/testroot/workarea/node1 {jcr:primaryType=nt:unstructured, prop1=yikqysrwur}
/testroot/workarea/node1/node2 {jcr:primaryType=nt:unstructured,
prop1=yikqysrwur, prop2=yikqysrwur, jcr:mixinTypes=[mix:referenceable],
jcr:uuid=c9118bb2-922e-4612-acd7-7152105f5684}
where a single string is randomly generated and used for the values for "prop1"
and "prop2", and only the second node is made to be "mix:referenceable".
The "testRightOuterJoin1" test runs this query:
SELECT * FROM [nt:unstructured] AS left
RIGHT OUTER JOIN [nt:unstructured] AS right
ON left.prop1 = right.prop2
WHERE ISDESCENDANTNODE(right,'/testroot/workarea')
The left side of the join has at least two tuples (one for "node1", one for
"node2", and other nodes which do not have a 'prop1' value), and column of
interest is the "prop1" column. Thus the left side tuples (or the parts we care
about for the join) look like:
[ node1, yikqysrwur ]
[ node2, yikqysrwur ]
[ …, <null> ]
The right side of the join has only two tuples ("node1" and "node2") because of
the "ISDESCENDANTNODE" criteria, and the only column of interest is the "prop2"
column. Thus, the right side tuples (or the parts we care about for the join)
look like:
[ node1, <null> ]
[ node2, yikqysrwur ]
When we perform a RIGHT OUTER JOIN, we have to **include all the tuples on the
right** even if they don't match a value on the left tuples. Thus, "node1" must
be included in the results, and because it has a null value for the "prop2"
column will not match any of the tuples on the left (since a null value is not
equal to another null value in the case of join criteria). So the result set
should contain these combinations of nodes:
[ null, node1 ]
[ node1, node2 ]
[ node2, node2 ]
However, the test expects the following result:
[ node1, node2 ]
[ node2, node2 ]
This seems incorrect to me, because it is missing the [node1, null] tuple that
was on the right side of the join. Can anyone explain why my reasoning is
wrong, or do you agree that the test is incorrect?
The "testLeftOuterJoin2" test uses a LEFT OUTER JOIN but instead reverses the
properties in the query, and thus fails for a similar reason.
Best regards,
Randall Hauch