Hi all, Currently we handle the JPQL DISTINCT keyword as a 1:1 mapping to the SQL DISTINCT keyword. So the following JPQL: "SELECT DISTINCT b FROM Book b WHERE b.title = 'Gone Sailing'" may result in the following SQL SQL : "SELECT DISTINCT t0.id, t0.dueDate, t0.title, . . . FROM Book t0 WHERE t0.title = 'Gone Sailing'"
This works fine for most queries, but when I was looking into OPENJPA-894 I noticed a problem with some relationships and the JOIN FETCH clause. A JOIN FETCH looking like this : JPQL: "SELECT b from Book b JOIN FETCH b.subjects WHERE b.title = 'Gone Sailing'" Has two effects : 1. It eagerly loads b.subjects 2. It returns (potentially) multiple references to the same book. One reference to Book(id=1) for every subject associated with Book(id=1). The resulting SQL may look like this (in this case Book is MxM with Subject) : SQL "SELECT t0.oid, t0.dueDate, t0.title, t1.BOOKS_OID, t2.oid, t2.name FROM LIBBOOK t0 INNER JOIN LIBSUBJECT_LIBBOOK t1 ON t0.oid = t1.BOOKS_OID INNER JOIN LIBSUBJECT t2 ON t1.SUBJECTS_OID = t2.oid WHERE t0.title = 'Gone Fishing'" Lets say that the Book with title = 'Gone Sailing' has two subjects : Outdoors and Sportsman. In that case the SQL will return two rows that look something like this : OID DUEDATE TITLE BOOKS_OID OID NAME ---- ---------- --------- ---------- ---- ----------- 3 2009-11-11 Gone Fishing 3 12 Outdoors 3 2009-11-11 Gone Fishing 3 13 Sportsman The fix for OPENJPA-894 generates a result list with two references to the same Book(id=3). If you only wanted eager fetching of b.subjects, and didn't want duplicates a good first guess would be to add the DISTINCT keyword (I'm finally getting back to the subject) JPQL : "SELECT DISTINCT b from Book b JOIN FETCH b.subjects WHERE b.title = 'Gone Sailing'" SQL : "SELECT DISTINCT t0.oid, t0.dueDate, t0.title, t1.BOOKS_OID, t2.oid, t2.name FROM LIBBOOK t0 INNER JOIN LIBSUBJECT_LIBBOOK t1 ON t0.oid = t1.BOOKS_OID INNER JOIN LIBSUBJECT t2 ON t1.SUBJECTS_OID = t2.oid WHERE t0.title = 'Gone Fishing'" We'll still get the same two rows because the SQL DISTINCT keyword is applied to all permutations of the columns - not just the oid (and if it did only apply to the oid we wouldn't be eagerly loading b.subjects anyway). As a result I think we'll have to use a SetBackedList (or otherwise enforce distinct results after getting rows from SQL) as our ResultList. I have a patch that does this, and resolves the use case I described above (it's in the TestLibService unit test), but I'm open to any other ideas for ways to resolve the problem. Thanks, -mike
