Hi Mike, Very good description of the problem. JOIN FETCH is one place where the assumption that a object-oriented query is same as a row-based query shows its strain. DISTINCT in JPQL referred to b, while DISTINCT in SQL referred to the row it selects -- and b is not a row but the root of an object graph!
SetBackedList is indeed a good idea. Other option is to drop the JOIN FETCH clauses altogether from part of the query. But to add them to the FetchPlan. Then query JPQL: "SELECT b from Book b JOIN FETCH b.subjects WHERE b.title = 'Gone Sailing'" is effectively "SELECT b from Book b WHERE b.title = 'Gone Sailing'" fetch.add(Book.class, "subjects"); Do not how much trouble to tweak OpenJPA that way though! Michael Dick wrote: > > Hi all, > > Currently we handle the JPQL DISTINCT keyword as a 1:1 mapping to the SQL > DISTINCT keyword. So the following > JPQL: "SELECT DISTINCT b FROM Book b WHERE b.title = 'Gone Sailing'" > may result in the following SQL > SQL : "SELECT DISTINCT t0.id, t0.dueDate, t0.title, . . . FROM Book t0 > WHERE > t0.title = 'Gone Sailing'" > > This works fine for most queries, but when I was looking into OPENJPA-894 > I > noticed a problem with some relationships and the JOIN FETCH clause. A > JOIN > FETCH looking like this : > JPQL: "SELECT b from Book b JOIN FETCH b.subjects WHERE b.title = 'Gone > Sailing'" > Has two effects : > 1. It eagerly loads b.subjects > 2. It returns (potentially) multiple references to the same book. One > reference to Book(id=1) for every subject associated with Book(id=1). > > The resulting SQL may look like this (in this case Book is MxM with > Subject) > : > SQL "SELECT t0.oid, t0.dueDate, t0.title, t1.BOOKS_OID, t2.oid, t2.name > FROM > LIBBOOK t0 INNER JOIN LIBSUBJECT_LIBBOOK t1 ON t0.oid = t1.BOOKS_OID INNER > JOIN LIBSUBJECT t2 ON t1.SUBJECTS_OID = t2.oid WHERE t0.title = 'Gone > Fishing'" > > Lets say that the Book with title = 'Gone Sailing' has two subjects : > Outdoors and Sportsman. In that case the SQL will return two rows that > look > something like this : > > OID DUEDATE TITLE BOOKS_OID OID NAME > ---- ---------- --------- ---------- ---- ----------- > 3 2009-11-11 Gone Fishing 3 12 Outdoors > 3 2009-11-11 Gone Fishing 3 13 Sportsman > > The fix for OPENJPA-894 generates a result list with two references to the > same Book(id=3). > > If you only wanted eager fetching of b.subjects, and didn't want > duplicates > a good first guess would be to add the DISTINCT keyword (I'm finally > getting > back to the subject) > > JPQL : "SELECT DISTINCT b from Book b JOIN FETCH b.subjects WHERE b.title > = > 'Gone Sailing'" > SQL : "SELECT DISTINCT t0.oid, t0.dueDate, t0.title, t1.BOOKS_OID, > t2.oid, > t2.name FROM LIBBOOK t0 INNER JOIN LIBSUBJECT_LIBBOOK t1 ON t0.oid = > t1.BOOKS_OID INNER JOIN LIBSUBJECT t2 ON t1.SUBJECTS_OID = t2.oid WHERE > t0.title = 'Gone Fishing'" > > We'll still get the same two rows because the SQL DISTINCT keyword is > applied to all permutations of the columns - not just the oid (and if it > did > only apply to the oid we wouldn't be eagerly loading b.subjects anyway). > As > a result I think we'll have to use a SetBackedList (or otherwise enforce > distinct results after getting rows from SQL) as our ResultList. > > I have a patch that does this, and resolves the use case I described above > (it's in the TestLibService unit test), but I'm open to any other ideas > for > ways to resolve the problem. > > Thanks, > -mike > > ----- Pinaki -- View this message in context: http://n2.nabble.com/How-should-we-handle-the-JPQL-DISTINCT-keyword-tp3908400p3909427.html Sent from the OpenJPA Developers mailing list archive at Nabble.com.
