[jira] [Commented] (DERBY-4631) Wrong join column returned by right outer join with NATURAL or USING and territory-based collation

Mamta A. Satoor (Commented) (JIRA) Mon, 16 Jan 2012 21:11:16 -0800

    [ 
https://issues.apache.org/jira/browse/DERBY-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187453#comment-13187453
 ]


Mamta A. Satoor commented on DERBY-4631:
----------------------------------------

*********************************************************************
Mike's comment
(With solution 2) I think my most basic question is in a 1 to 1 million row 
join will you be adding 1 million new function calls for the coalesce, or is 
this somehow optimized after bind?
*********************************************************************

Yes, if the join column is part of the join query (say as SELECT list or GROUP 
BY, WHERE clause etc), then we will be calling it for every qualified row in 
join. At this point, I think if join column is only part of USING clause or 
used internally through NATURAL join to do equi join, then I do not think we 
will have to generate coalesce function for the join columns.

I debugged through the existing code more and found that to do equijoin for 
USING clause or NATURAL join, we generate merged row which includes the 
necessary join columns values from both sides (for USING clause, the necessary 
join column(s) will be what was specified with the USING clause, for NATURAL 
joins, it will be all the join columns). If we find that the join column is 
getting referenced outside of the equi join, then in order to get the join 
column's value, we do the mapping of the join column to the merged row's column 
(the mapping happens to either the left table's column value in merged row(if 
we are dealing with inner join or left outer join) or to the right table's 
column value(if we are dealing with right outer join)).

If we go with Solution 1), then this mapping of join column to merge row will 
need to change so that we look at teft table's column value first. If it is 
null, then we should pick up the right table's column value. This should happen 
no matter if we are dealing with inner join, left outer join or right outer 
join with natural or uisng clause.

                
> Wrong join column returned by right outer join with NATURAL or USING and 
> territory-based collation
> --------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4631
>                 URL: https://issues.apache.org/jira/browse/DERBY-4631
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.6.1.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Mamta A. Satoor
>              Labels: derby_triage10_8
>
> SQL:2003 says that the join columns in a natural join or in a named
> columns join should be added to the select list by coalescing the
> column from the left table with the column from the right table.
> Section 7.7, <joined table>, syntax rules:
> > 1) Let TR1 be the first <table reference>, and let TR2 be the <table
> > reference> or <table factor> that is the second operand of the
> > <joined table>. Let RT1 and RT2 be the row types of TR1 and TR2,
> > respectively. Let TA and TB be the range variables of TR1 and TR2,
> > respectively. (...)
> and
> > 7) If NATURAL is specified or if a <join specification> immediately
> > containing a <named columns join> is specified, then:
> (...)
> > d) If there is at least one corresponding join column, then let SLCC
> > be a <select list> of <derived column>s of the form
> >
> > COALESCE ( TA.C, TB.C ) AS C
> >
> > for every column C that is a corresponding join column, taken in
> > order of their ordinal positions in RT1.
> For a right outer join, Derby doesn't use COALESCE(TA.C, TB.C), but
> rather just TB.C (the column in the right table) directly.
> This is in most cases OK, because COALESCE(TA.C, TB.C) = TB.C is an
> invariant in a right outer join. (Because TA.C is either NULL or equal
> to TB.C.)
> However, in a database with territory-based collation, equality
> between two values does not mean they are identical, especially now
> that the strength of the collator can be specified (DERBY-1748).
> Take for instance this join:
> ij> connect 
> 'jdbc:derby:testdb;create=true;territory=en_US;collation=TERRITORY_BASED:SECONDARY';
> ij> create table big(x varchar(5));
> 0 rows inserted/updated/deleted
> ij> insert into big values 'A','B','C';
> 3 rows inserted/updated/deleted
> ij> create table small(x varchar(5));
> 0 rows inserted/updated/deleted
> ij> insert into small values 'b','c','d';
> 3 rows inserted/updated/deleted
> ij> select x, t1.x, t2.x, coalesce(t1.x, t2.x) from small t1 natural right 
> outer join big t2;
> X    |X    |X    |4    
> -----------------------
> A    |NULL |A    |A    
> B    |b    |B    |b    
> C    |c    |C    |c    
> 3 rows selected
> I believe that the expected result from the above query is that the
> first column should have the same values as the last column. That is,
> the first column should contain {'A', 'b', 'c'}, not {'A', 'B', 'C'}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-4631) Wrong join column returned by right outer join with NATURAL or USING and territory-based collation

Reply via email to