I've been looking into two OUTER JOIN tests in the TCK, and it appears to 
me that their expected results are incorrect. The two tests are:
org.apache.jackrabbit.test.api.query.qom.EquiJoinConditionTest#testRightOuterJoin1
org.apache.jackrabbit.test.api.query.qom.EquiJoinConditionTest#testLeftOuterJoin2

I'll do my best to explain my logic in a concise manner.

Both tests are set up the same way: two nodes are created:

/testroot/workarea/node1 {jcr:primaryType=nt:unstructured, prop1=yikqysrwur}
/testroot/workarea/node1/node2 {jcr:primaryType=nt:unstructured, 
prop1=yikqysrwur, prop2=yikqysrwur, jcr:mixinTypes=[mix:referenceable], 
jcr:uuid=c9118bb2-922e-4612-acd7-7152105f5684}

where a single string is randomly generated and used for the values for "prop1" 
and "prop2", and only the second node is made to be "mix:referenceable".  

The "testRightOuterJoin1" test runs this query:

SELECT * FROM [nt:unstructured] AS left  
RIGHT OUTER JOIN [nt:unstructured] AS right  
ON left.prop1 = right.prop2  
WHERE ISDESCENDANTNODE(right,'/testroot/workarea')

The left side of the join has at least two tuples (one for "node1", one for 
"node2", plus any other nodes, which do not have a 'prop1' value), and the only 
column of interest is the "prop1" column. Thus, the left side tuples (or the 
parts we care about for the join) look like:

[ node1, yikqysrwur ]
[ node2, yikqysrwur ]
[ …, <null> ]

The right side of the join has only two tuples ("node1" and "node2") because of 
the "ISDESCENDANTNODE" constraint, and the only column of interest is the 
"prop2" column. Thus, the right side tuples (or the parts we care about for the 
join) look like:

[ node1, <null> ]
[ node2, yikqysrwur ]

When we perform a RIGHT OUTER JOIN, we have to **include all the tuples on the 
right** even if they don't match any of the left tuples. Thus, "node1" must be 
included in the results. Because it has a null value for the "prop2" column, it 
will not match any of the tuples on the left (in join criteria, a null value is 
not equal to anything, not even another null value). So the result set should 
contain these combinations of [left, right] nodes:

[ null, node1 ]
[ node1, node2 ]
[ node2, node2 ]
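
To double-check the reasoning above, here is a minimal Python sketch that models the join in memory. It is not how Jackrabbit or the TCK evaluates the query; the row dictionaries, the extra "other" row, and the right_outer_join helper are all hypothetical names introduced purely for illustration, and None stands in for a missing property.

```python
# Hypothetical in-memory model of the test data; None means "property not set".
node1 = {"name": "node1", "prop1": "yikqysrwur", "prop2": None}
node2 = {"name": "node2", "prop1": "yikqysrwur", "prop2": "yikqysrwur"}
# Stand-in for any other nt:unstructured node that has no 'prop1' value.
other = {"name": "other", "prop1": None, "prop2": None}

left_rows = [node1, node2, other]   # left selector is not constrained
right_rows = [node1, node2]         # right selector is limited by ISDESCENDANTNODE


def right_outer_join(left, right, left_col, right_col):
    """Return (left_name, right_name) pairs, keeping every right row."""
    result = []
    for r in right:
        # A null never satisfies an equality join condition, not even
        # against another null, so null values are excluded explicitly.
        matches = [
            l for l in left
            if l[left_col] is not None and l[left_col] == r[right_col]
        ]
        if matches:
            result.extend((l["name"], r["name"]) for l in matches)
        else:
            # Unmatched right row: keep it and pad the left side with null.
            result.append((None, r["name"]))
    return result


rows = right_outer_join(left_rows, right_rows, "prop1", "prop2")
for row in rows:
    print(row)
```

Under these assumptions the sketch produces three rows, including the null-padded row for "node1" on the right side.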

However, the test expects the following result:

[ node1, node2 ]
[ node2, node2 ]


This seems incorrect to me, because it is missing the [ null, node1 ] tuple for 
the unmatched "node1" row on the right side of the join. Can anyone explain why 
my reasoning is wrong, or do you agree that the test is incorrect?

The "testLeftOuterJoin2" test uses a LEFT OUTER JOIN with the properties in the 
join condition reversed, and thus fails for a similar reason.

Best regards,

Randall Hauch  
