Re: How should we handle the JPQL DISTINCT keyword

Pinaki Poddar Wed, 28 Oct 2009 19:11:47 -0700

Hi Mike,
  Very good description of the problem. 
  JOIN FETCH is one place where the assumption that a object-oriented query
is same as a row-based query shows its strain. DISTINCT in JPQL referred to
b, while DISTINCT in SQL referred to the row it selects -- and b is not a
row but the root of an object graph!


  SetBackedList is indeed a good idea. 

  Other option is to drop the JOIN FETCH clauses altogether from part of the
query. But to add them to the FetchPlan. Then query   
JPQL: "SELECT b from Book b JOIN FETCH b.subjects WHERE b.title = 'Gone
Sailing'"
is effectively
  "SELECT b from Book b WHERE b.title = 'Gone Sailing'"
  fetch.add(Book.class, "subjects");

  Do not how much trouble to tweak OpenJPA that way though!




Michael Dick wrote:
> 
> Hi all,
> 
> Currently we handle the JPQL DISTINCT keyword as a 1:1 mapping to the SQL
> DISTINCT keyword. So the following
> JPQL:  "SELECT DISTINCT b FROM Book b WHERE b.title = 'Gone Sailing'"
> may result in the following SQL
> SQL : "SELECT DISTINCT t0.id, t0.dueDate, t0.title, . . . FROM Book t0
> WHERE
> t0.title = 'Gone Sailing'"
> 
> This works fine for most queries, but when I was looking into OPENJPA-894
> I
> noticed a problem with some relationships and the JOIN FETCH clause. A
> JOIN
> FETCH looking like this :
> JPQL: "SELECT b from Book b JOIN FETCH b.subjects WHERE b.title = 'Gone
> Sailing'"
> Has two effects :
>    1. It eagerly loads b.subjects
>    2. It returns (potentially) multiple references to the same book. One
> reference to Book(id=1) for every subject associated with Book(id=1).
> 
> The resulting SQL may look like this (in this case Book is MxM with
> Subject)
> :
> SQL "SELECT t0.oid, t0.dueDate, t0.title, t1.BOOKS_OID, t2.oid, t2.name
> FROM
> LIBBOOK t0 INNER JOIN LIBSUBJECT_LIBBOOK t1 ON t0.oid = t1.BOOKS_OID INNER
> JOIN LIBSUBJECT t2 ON t1.SUBJECTS_OID = t2.oid WHERE t0.title = 'Gone
> Fishing'"
> 
> Lets say that the Book with title = 'Gone Sailing' has two subjects :
> Outdoors and Sportsman. In that case the SQL will return two rows that
> look
> something like this :
> 
> OID    DUEDATE    TITLE          BOOKS_OID  OID  NAME
> ----   ---------- ---------      ---------- ---- -----------
> 3      2009-11-11 Gone Fishing   3          12   Outdoors
> 3      2009-11-11 Gone Fishing   3          13   Sportsman
> 
> The fix for OPENJPA-894 generates a result list with two references to the
> same Book(id=3).
> 
> If you only wanted eager fetching of b.subjects, and didn't want
> duplicates
> a good first guess would be to add the DISTINCT keyword (I'm finally
> getting
> back to the subject)
> 
> JPQL : "SELECT DISTINCT b from Book b JOIN FETCH b.subjects WHERE b.title
> =
> 'Gone Sailing'"
> SQL :  "SELECT DISTINCT t0.oid, t0.dueDate, t0.title, t1.BOOKS_OID,
> t2.oid,
> t2.name FROM LIBBOOK t0 INNER JOIN LIBSUBJECT_LIBBOOK t1 ON t0.oid =
> t1.BOOKS_OID INNER JOIN LIBSUBJECT t2 ON t1.SUBJECTS_OID = t2.oid WHERE
> t0.title = 'Gone Fishing'"
> 
> We'll still get the same two rows because the SQL DISTINCT keyword is
> applied to all permutations of the columns - not just the oid (and if it
> did
> only apply to the oid we wouldn't be eagerly loading b.subjects anyway).
> As
> a result I think we'll have to use a SetBackedList (or otherwise enforce
> distinct results after getting rows from SQL) as our ResultList.
> 
> I have a patch that does this, and resolves the use case I described above
> (it's in the TestLibService unit test), but I'm open to any other ideas
> for
> ways to resolve the problem.
> 
> Thanks,
> -mike
> 
> 


-----
Pinaki 
-- 
View this message in context: 
http://n2.nabble.com/How-should-we-handle-the-JPQL-DISTINCT-keyword-tp3908400p3909427.html
Sent from the OpenJPA Developers mailing list archive at Nabble.com.

Re: How should we handle the JPQL DISTINCT keyword

Reply via email to