Mike Matrigali created DERBY-6780:
-------------------------------------

             Summary: change conglomerate cache to handle concurrency and alter table add column calls.
                 Key: DERBY-6780
                 URL: https://issues.apache.org/jira/browse/DERBY-6780
             Project: Derby
          Issue Type: Bug
          Components: Store
    Affects Versions: 10.12.0.0
            Reporter: Mike Matrigali


The store maintains a "conglomerate cache" for performance reasons, to
avoid having to go to disk and rebuild the Conglomerate structure every
time there is some interaction with a table.  It maintains this cache across
all users in the db and across transactions.

The store's conglomerate cache as originally designed expected the
"Conglomerate" data structure to be static.  The issue is that support was
later added to this data structure to track the number and types of columns
in the conglomerate.  Initially alter table add column always resulted in
a new underlying "Conglomerate" being created, so the original expectation
was still valid.

At some point some versions of alter table add column were implemented
that did not require a rebuild of all of the underlying conglomerates,
including the base table.  This meant that the data in the conglomerate
cache could go out of date.  The fix implemented for this was to have
alter table get an exclusive lock on the table, update the cache as it did
its work, and invalidate the entire cache on abort.  The whole cache was
invalidated because, when the abort happened, the code did not have enough
information to invalidate just what it needed.
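
The behavior described above can be sketched in a few lines of Java.  This
is a simplified model, not Derby's code: the class name, the maps and the
diskLoads counter are all hypothetical, a conglomerate is reduced to a column
count, and the exclusive table lock is assumed to be held by the caller.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the cache: conglomerate id -> column count.
class WholeCacheInvalidationSketch {
    final Map<Long, Integer> committed = new HashMap<>(); // stands in for disk
    final Map<Long, Integer> cache = new HashMap<>();     // shared by all users
    int diskLoads = 0;                                    // counts cache misses

    // Return the cached entry, loading it from "disk" on a miss.
    Integer find(long id) {
        Integer cols = cache.get(id);
        if (cols == null) {
            cols = committed.get(id);
            if (cols != null) {
                diskLoads++;
                cache.put(id, cols);
            }
        }
        return cols;
    }

    // ALTER TABLE ADD COLUMN without a rebuild: the cached entry is updated
    // in place.  The caller is assumed to hold an exclusive table lock.
    void alterAddColumn(long id) {
        cache.put(id, find(id) + 1);
    }

    // Commit makes the cached, altered entry the committed one.
    void commit(long id) {
        committed.put(id, cache.get(id));
    }

    // Abort cannot tell which entries it changed, so it drops all of them.
    void abort() {
        cache.clear();
    }
}
```

In this model an abort also evicts entries for tables the alter never
touched, so every later lookup pays for a reload.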

DERBY-4057 added more concurrency to the testing of alter table and
showed a problem with the current code.  The current normal path
for store interaction is to first get the Conglomerate from the cache and then
use information in the data structure to "open" the table with proper locking.
This leaves a small window, between the cache lookup and the lock being
granted, in which a concurrent alter table add column (and maybe drop column)
can change or invalidate what was read.

Various problems come to mind:
1) a concurrent thread might get a conglomerate with uncommitted alter
    table info, and then wait on the alter table to finish.  If the alter
    table aborts, the thread will be left with wrong information.
2) a concurrent thread might come in after the alter table abort invalidates
     the cache, but before the abort finishes.  It will then load a bad
     version into the cache for others to see later.  This is what was
     happening with background concurrent issues in DERBY-6774, and fixed
     only for the background threads.
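
Both interleavings can be replayed deterministically in a small Java
sketch.  It is only an illustration under stated assumptions: all names are
hypothetical, a conglomerate is reduced to a column count, the uncommitted
change is assumed to be visible on "disk" until the abort undoes it, and the
threads are simulated by calling the steps in the problematic order.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: conglomerate id -> column count.
class AbortWindowSketch {
    final Map<Long, Integer> disk = new HashMap<>();  // may hold uncommitted changes
    final Map<Long, Integer> cache = new HashMap<>(); // shared across transactions

    // Normal path: consult the cache first, load from disk on a miss.
    Integer find(long id) {
        return cache.computeIfAbsent(id, disk::get);
    }

    // ALTER TABLE ADD COLUMN: updates disk and cache, uncommitted.
    void addColumn(long id) {
        int cols = find(id) + 1;
        disk.put(id, cols);
        cache.put(id, cols);
    }

    // Abort, step 1: invalidate the entire cache.
    void abortInvalidateCache() {
        cache.clear();
    }

    // Abort, step 2: undo the uncommitted change on disk.
    void abortUndo(long id, int oldCols) {
        disk.put(id, oldCols);
    }

    // Problem 1: a reader takes the entry before the alter aborts.
    static int replayProblem1() {
        AbortWindowSketch s = new AbortWindowSketch();
        s.disk.put(1L, 3);
        s.addColumn(1L);          // uncommitted: 4 columns
        int stale = s.find(1L);   // reader gets 4, then waits for the table lock
        s.abortInvalidateCache();
        s.abortUndo(1L, 3);       // committed state is 3 again
        return stale;             // reader still holds the aborted value
    }

    // Problem 2: a reader reloads the cache between the two abort steps.
    static int replayProblem2() {
        AbortWindowSketch s = new AbortWindowSketch();
        s.disk.put(1L, 3);
        s.addColumn(1L);          // uncommitted: 4 columns
        s.abortInvalidateCache(); // cache is now empty
        s.find(1L);               // reader reloads the not-yet-undone entry
        s.abortUndo(1L, 3);       // disk is back to 3
        return s.find(1L);        // cache keeps serving the bad entry
    }
}
```

In both replays the value returned is the aborted column count rather than
the committed one; in the second it stays in the shared cache for every
later caller.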




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
