Petr Kadlec created OPENJPA-2841:
------------------------------------

             Summary: Broken superfluous joins with subqueries
                 Key: OPENJPA-2841
                 URL: https://issues.apache.org/jira/browse/OPENJPA-2841
             Project: OpenJPA
          Issue Type: Bug
          Components: query, sql
    Affects Versions: 3.1.2, 2.4.2
            Reporter: Petr Kadlec


OpenJPA generates wrong SQL for a query with nested subqueries, causing the 
result to be wrong. My attempts to rewrite the query helped to get the correct 
results, even though the generated alternate queries seem not to be perfect. 
(There might a way to write them better.)

The following query:

{{select u from User u where u.email=:firstEmail and u<>:firstUser and exists 
(select o1 from Order o1 where o1.user=u and exists(select o2 from Order o2 
where o2.user=:firstUser and o1.productCode=o2.productCode))}}

generates the following SQL:

{{SELECT t0.id, t0.email, t0.name FROM users t0 WHERE (t0.email = ? AND t0.id 
<> ? AND EXISTS (SELECT t1.id FROM orders t1, orders t3 WHERE (t1.user_id = 
t0.id AND EXISTS (SELECT t2.id FROM orders t2 WHERE (t2.user_id = ? AND 
t3.product_code = t2.product_code)))))}}

Notice the superfluous join to “orders t3” which causes the whole query result 
to be wrong.

When the query is modified to replace the second nested subquery with a join to

{{select u from User u where u.email=:firstEmail and u<>:firstUser and exists 
(select o1 from Order o1, Order o2 where o1.user=u and o2.user=:firstUser and 
o1.productCode=o2.productCode)}}

the generated SQL is:

{{SELECT t0.id, t0.email, t0.name FROM users t0 WHERE (t0.email = ? AND t0.id 
<> ? AND EXISTS (SELECT t3.id FROM users t1 CROSS JOIN orders t2, orders t3 
WHERE (t3.user_id = t0.id AND t2.user_id = ? AND t3.product_code = 
t2.product_code)))}}

which contains a strange construction of doing a cross join to „users t1“ 
(which is completely unused later). Still, this seems to return the correct 
result, at least. (In my short testing; to be honest, I do not understand what 
exactly is this construction doing, and if its semantics is better, equal, or 
worse than not doing the cross join.) However, the unnecessary join does seem 
to affect performance.

Finally, when I rewrote the query from the other side, to:

{{select o from Order o join o.user u where u.email=:firstEmail and 
u<>:firstUser and o.productCode in (select o1.productCode from Order o1 where 
o1.user=:firstUser)}}

The resulting SQL is:

{{SELECT t0.id, t0.product_code, t3.id, t3.email, t3.name FROM orders t0 INNER 
JOIN users t1 ON t0.user_id = t1.id LEFT OUTER JOIN users t3 ON t0.user_id = 
t3.id WHERE (t1.email = ? AND t0.user_id <> ? AND t0.product_code IN (SELECT 
t2.product_code FROM orders t2 WHERE (t2.user_id = ?)))}}

Which generates correct results, even though it _still_ contains a superfluous 
outer join to „users t3“ (used only in the result projection) which is 
identical to the previous inner join, which again affects performance.

A complete reproducible project is available [at my 
Github|https://github.com/mormegil-cz/openjpa-bug-repro].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to