[ 
https://issues.apache.org/jira/browse/DERBY-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mamta A. Satoor updated DERBY-4631:
-----------------------------------

    Attachment: DERBY_4631_patch3_stat.txt
                DERBY_4631_patch3_diff.txt

DERBY-4631 Wrong join column returned by right outer join with NATURAL or USING 
and territory-based collation

I have a patch(DERBY_4631_patch3_diff.txt) which is ready for review and 
commit. To recap the issue, SQL:2003 says that the join columns in a natural 
join or in a named columns join should be added to the select list by 
coalescing the column from the left table with the column from the right table

Derby has it's on logic to retrieve the join column values. It always picks up 
join column's value from the left table when we are working with natural left 
outer join and it picks up the join column's value from the right table when we 
are working with natural right outer join. 

But this logic does not work for all cases for right outer join. The fix 
provided in this patch will pick the join column's value based on following 
logic(this logic mimics the functionality of COALESCE) 
1)if the left table's column value is null then pick up the right table's 
column's value. 
2)If the left table's column value is non-null, then pick up that value 

Following are the files impacted by this patch
$ svn stat -q
M       java\engine\org\apache\derby\impl\sql\compile\ResultColumn.java
M       java\engine\org\apache\derby\impl\sql\compile\JoinNode.java
M       java\engine\org\apache\derby\impl\sql\compile\ResultColumnList.java
M       
java\testing\org\apache\derbyTesting\functionTests\tests\lang\CollationTest.java

The changes are as follows
Two additional fields have been added to ResultColumn.java 
rightOuterJoinUsingClause  and joinResultSet 
rightOuterJoinUsingClause will be set to true for following 2 cases
1)if this column represents the join column which is part of the SELECT list of 
a RIGHT OUTER JOIN with USING/NATURAL. eg
 select c from t1 right join t2 using (c)
This case is talking about column c as in "select c"
2)if this column represents the join column from the right table for predicates 
generated for the USING/NATURAL of RIGHT OUTER JOIN eg
  select c from t1 right join t2 using (c)
For "using(c)", a join predicate will be created as follows
   t1.c=t2.c
This case is talking about column t2.c of the join predicate.

joinResultSet will be non-null for the case 1) above. It will show the 
association of this result column to the join resultset created for the RIGHT 
OUTER JOIN with USING/NATURAL. This information along with 
rightOuterJoinUsingClause will be used during the code generation time.
These 2 additional fields will be used to identify ResultColumn which belong to 
a join column in the SELECT
list and identify ResultColumn which belong to right join column for the 
predicate generated for USING/NATURAL
columns. Additionally, ResultColumn which belong to a join column in the SELECT 
list will also know about the
JoinNode which they belong to. These 2 pieces of information will then be used 
at the code generation time
for join column for RIGHT OUTER JOIN with USING/NATURAL based on following logic
1)if the left table's column value is null then pick up the right table's 
column's value.
2)If the left table's column value is non-null, then pick up that value 

Changes in JoinNode.java just identifies the ResultColumn which represent the 
join column from the right table 
for predicates generated for the USING/NATURAL of RIGHT OUTER JOIN eg
     select c from t1 right join t2 using (c)
For "using(c)", a join predicate will be created, t1.c=t2.c. JoinNode changes 
will set 
ResultColumn.rightOuterJoinUsingClause flag to true for t2.c

The code generation changes have gone into ResultColumnList.java


                
> Wrong join column returned by right outer join with NATURAL or USING and 
> territory-based collation
> --------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4631
>                 URL: https://issues.apache.org/jira/browse/DERBY-4631
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.6.1.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Mamta A. Satoor
>              Labels: derby_triage10_8
>         Attachments: DERBY_4631_not_for_commit_patch1_diff.txt, 
> DERBY_4631_not_for_commit_patch1_stat.txt, 
> DERBY_4631_not_for_commit_patch2_diff.txt, 
> DERBY_4631_not_for_commit_patch2_stat.txt, DERBY_4631_patch3_diff.txt, 
> DERBY_4631_patch3_stat.txt
>
>
> SQL:2003 says that the join columns in a natural join or in a named
> columns join should be added to the select list by coalescing the
> column from the left table with the column from the right table.
> Section 7.7, <joined table>, syntax rules:
> > 1) Let TR1 be the first <table reference>, and let TR2 be the <table
> > reference> or <table factor> that is the second operand of the
> > <joined table>. Let RT1 and RT2 be the row types of TR1 and TR2,
> > respectively. Let TA and TB be the range variables of TR1 and TR2,
> > respectively. (...)
> and
> > 7) If NATURAL is specified or if a <join specification> immediately
> > containing a <named columns join> is specified, then:
> (...)
> > d) If there is at least one corresponding join column, then let SLCC
> > be a <select list> of <derived column>s of the form
> >
> > COALESCE ( TA.C, TB.C ) AS C
> >
> > for every column C that is a corresponding join column, taken in
> > order of their ordinal positions in RT1.
> For a right outer join, Derby doesn't use COALESCE(TA.C, TB.C), but
> rather just TB.C (the column in the right table) directly.
> This is in most cases OK, because COALESCE(TA.C, TB.C) = TB.C is an
> invariant in a right outer join. (Because TA.C is either NULL or equal
> to TB.C.)
> However, in a database with territory-based collation, equality
> between two values does not mean they are identical, especially now
> that the strength of the collator can be specified (DERBY-1748).
> Take for instance this join:
> ij> connect 
> 'jdbc:derby:testdb;create=true;territory=en_US;collation=TERRITORY_BASED:SECONDARY';
> ij> create table big(x varchar(5));
> 0 rows inserted/updated/deleted
> ij> insert into big values 'A','B','C';
> 3 rows inserted/updated/deleted
> ij> create table small(x varchar(5));
> 0 rows inserted/updated/deleted
> ij> insert into small values 'b','c','d';
> 3 rows inserted/updated/deleted
> ij> select x, t1.x, t2.x, coalesce(t1.x, t2.x) from small t1 natural right 
> outer join big t2;
> X    |X    |X    |4    
> -----------------------
> A    |NULL |A    |A    
> B    |b    |B    |b    
> C    |c    |C    |c    
> 3 rows selected
> I believe that the expected result from the above query is that the
> first column should have the same values as the last column. That is,
> the first column should contain {'A', 'b', 'c'}, not {'A', 'B', 'C'}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to