[ 
https://issues.apache.org/jira/browse/DERBY-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093016#comment-13093016
 ] 

Mike Matrigali commented on DERBY-5367:
---------------------------------------

I looked at the patch, and I am leaning toward it is not worth putting in the 
special case code for collation.  I think it would be better if you just got 
rid of the optimization of only updating the row location, and just do the 
whole row update in the case of matching row always.  I think this path is not 
a very normal path and better to make the code 
simple and not make the fast insert path for all inserts deal with checking for 
collation or not.   Extra benefit is we will get better testing with one code 
path
and can optimize later if we need to.

Do leave a comment why you are not doing the optimization, as it will seem 
strange without at least a reference to collation and why 2 things that compare 
the same might not actually be same value.  

I do agree that in general the fix is the right one, in that you need to update 
the whole row rather than part in the collation case.

Either as part of this fix or log another jira I do believe the other block 
that handles all fields being the same also should be fixed.  I posted a guess 
at those code paths, but to be sure probably easiest is to add a stack trace 
dump to standard out and run the full test suite.  Again I think it would be 
reasonable to 
just remove the optimization always and do a full update with a comment.

> Stale data retrieved when using new collation=TERRITORY_BASED:PRIMARY feature
> -----------------------------------------------------------------------------
>
>                 Key: DERBY-5367
>                 URL: https://issues.apache.org/jira/browse/DERBY-5367
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.8.1.2, 10.9.0.0
>         Environment: Mac OS X, Windows
>            Reporter: Brett Wooldridge
>            Assignee: Kristian Waagan
>            Priority: Critical
>         Attachments: derby-5367-1a-update_row_fully.diff, 
> derby-5367-1b-update_row_fully.diff, derby-5367-1b-update_row_fully.stat
>
>
> Our product recently upgraded to version 10.8.1.2 in order to take advantage 
> of the new 'case-insensitive' mode offered by Derby in the form of the 
> "collation=TERRITORY_BASED:PRIMARY" connection parameter.
> Unfortunately, we have run into an issue whereby stale data appears to be 
> retrieved from an index, even though the data in the table itself has changed.
> You can see this issue in the IJ session below.  The database in question was 
> created using this Java parameter when invoking IJ:
> -Dij.database=jdbc:derby:test;create=true;collation=TERRITORY_BASED:PRIMARY
> Here is the IJ session:
> CONNECTION0* -        jdbc:derby:test
> * = current connection
> ij> CREATE TABLE tag (
>     tag_id INTEGER GENERATED BY DEFAULT AS IDENTITY NOT NULL,
>     tag VARCHAR(255) NOT NULL,
>     CONSTRAINT tag_pk PRIMARY KEY (tag_id),
>     CONSTRAINT tag_tag_unique UNIQUE (tag)
> );
> 0 rows inserted/updated/deleted
> ij> -- first insert a value 'Test', note the upper-case 'T' in 'Test'
> ij> INSERT INTO tag (tag) VALUES ('Test');
> 1 row inserted/updated/deleted
> ij> SELECT * FROM tag;
> TAG_ID     |TAG                                                               
>                                                               
> --------------------------------------------------------------------------------------------------------------------------------------------
> 1          |Test                                                              
>                                                               
> 1 row selected
> ij> -- Now delete the row
> ij> DELETE FROM tag WHERE tag='Test';
> 1 row inserted/updated/deleted
> ij> -- You could run another SELECT here to verify it is gone, but it is.
> ij> -- Now insert a new value 'test', note the lower-case 't' in 'test'
> ij> INSERT INTO tag (tag) VALUES ('test');
> 1 row inserted/updated/deleted
> ij> -- Now verify that the table contains only the lower-case version: 'test'
> ij> SELECT * FROM tag;
> TAG_ID     |TAG                                                               
>                                                               
> --------------------------------------------------------------------------------------------------------------------------------------------
> 2          |test                                                              
>                                                               
> 1 row selected
> ij> -- Now, here is the bug.
> ij> SELECT tag FROM tag;
> TAG                                                                           
>                                                   
> --------------------------------------------------------------------------------------------------------------------------------
> Test                                                                          
>                                                   
> 1 row selected
> ij> 
> Note in the last SELECT we specify the 'tag' column specifically.  When we 
> 'SELECT *', Derby performs a table-scan and the result is correct.  However, 
> when we 'SELECT tag', Derby appears to use the index created for the 
> 'tag_tag_unique' unique constraint.  As an optimization Derby, like many 
> databases, will use values directly from the index in the case where the 
> index covers all requested columns.
> The bigger question is, why doesn't the DELETE action cause the entry in the 
> tag_tag_unique index to be deleted?  Is this a further optimization?  If so, 
> it is imperative that the index at least be updated when the new value is 
> inserted.
> This is rather a severe bug for us that causes stale data to be returned.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to