[jira] Commented: (DERBY-4331) Join returns results in wrong order

Mamta A. Satoor (JIRA) Thu, 06 Aug 2009 09:52:39 -0700

    [ 
https://issues.apache.org/jira/browse/DERBY-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740149#action_12740149
 ]


Mamta A. Satoor commented on DERBY-4331:
----------------------------------------

The bug behind the query below is as follows
SELECT CS.ID 
FROM --DERBY-PROPERTIES joinOrder=FIXED 
    FILES F --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH 
    , FILECHANGES FC --DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET 
    , CHANGESETS CS --DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID 
WHERE 
 CS.ID = FC.CHANGESET AND 
 F.ID = FC.FILE 
 ORDER BY CS.ID DESC; 

Optimizer while considering any join order collects the sorting provided by 
that join order. For instance, for the query above, optimizer will first start 
with [F, -1, -1] as the join order and for this join order,  it knows that it 
can guarantee row ordering on FILES.REPOSITORY AND FILE.PATH. Next in the join 
order will be [F, FC, -1] and now available sorting is F.REPOSITORY, F.PATH, 
FC.FILE and FC.CHANGESET. The important thing to notice here is that the 
sorting available at any point is in the order collected by optimizer and this 
piece of information was not being considered by the code removed by Mike from 
FromBaseTable. This behavior can be seen when optimizer is considering the join 
order [F, FC, CS]. While considering this join order, the removed code from 
FromBaseTable kicks in for the predicate CS.ID=FC.CHANGESET. The removed code 
looks at equijoin CS.ID=FC.CHANGESET and it looks at the available ordering so 
far and sees FC.CHANGESET in there and incorrectly concludes that required 
sorting is available on CS.ID indirectly through FC.CHANGSET. It fails to take 
into account that FC.CHANGSET is in the 4th position in the ordering. The rows 
returned from [F, FC, -1] are ordered F.REPOSITORY, F.PATH, FC.FILE and 
FC.CHANGSET and not just FC.CHANGESET. Because the code incorrectly decides 
that required sorting is available, it decides to avoid the sorting for the 
given query. 
The result of the query above without Mike's changes are as follows
ID 
----------- 
1 
2 
3 
2 
2 
3 
As we can see from above that FC.CHANGSET is sorted for every row selected from 
the outermost table F. 
a)The first row in the result above is for the first row qualified from F
b)The next 2 rows in the result above are for the second qualified row from F
c) the 4th row in the result above is for the 3rd qualified row from F 
d)and last 2 rows in the result above are for the 4th qualified row from F. 

If the outermost optimizable F was only one row resultset, we could safely 
assume that the required ordering is provided by indexes being considered for 
[F, FC, -1] but that is not the case. 

In future, if we ever decide to enhance sort avoidance code to qualify more 
queries for sort avoidance, we should consider the position of the sorted 
columns in available sorting list of columns and then decide if the query is 
pre-ordered on required ordering.

> Join returns results in wrong order
> -----------------------------------
>
>                 Key: DERBY-4331
>                 URL: https://issues.apache.org/jira/browse/DERBY-4331
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.5.2.0, 10.6.0.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Mike Matrigali
>         Attachments: createDBsimpler.txt, createDBsimplerVer2.txt, 
> DERBY4331_additionalTests_diff.txt, derby4331_do_not_commit.diff, 
> derby_4331_patch_2.diff, notorderby4331.zip, orderby4331.zip, repro.sql, 
> repro2.sql, repro2_qryplan.txt, wisconsin.diff
>
>
> In Derby 10.5.2.0, the query below started returning results in wrong order 
> (note the ORDER BY clause). Derby 10.5.1.1 gives the expected ordering.
> ij> SELECT CS.ID
> FROM
>     CHANGESETS CS, FILECHANGES FC, REPOSITORIES R, FILES F, AUTHORS A
> WHERE
>     R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND 
> F.REPOSITORY = R.ID AND A.REPOSITORY = R.ID AND
>     CS.REPOSITORY = R.ID AND CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND
>     A.ID = CS.AUTHOR AND
>     EXISTS (
>         SELECT 1
>         FROM FILES F2
>         WHERE
>             F2.ID = FC.FILE AND F2.REPOSITORY = R.ID AND
>             F2.PATH LIKE '/%' ESCAPE '#')
> ORDER BY CS.ID DESC;
> ID         
> -----------
> 1          
> 2          
> 3          
> 2          
> 2          
> 3          
> 6 rows selected

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (DERBY-4331) Join returns results in wrong order

Reply via email to