[jira] [Commented] (DERBY-6542) Improve the concurrency of identity columns by using SYS.SYSSEQUENCES

Rick Hillegas (JIRA) Mon, 14 Apr 2014 10:35:51 -0700

    [ 
https://issues.apache.org/jira/browse/DERBY-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968551#comment-13968551
 ]


Rick Hillegas commented on DERBY-6542:
--------------------------------------

Thanks for raising these issues, Mike. Some responses follow:

MM> it has been so many years since this discussion. Can you remind me what the 
MM> architectural difference is between sequences and identity columns 
currently. Or
MM> if it is written up in one place point me there. It is hard working through 
the 
MM> various old issues to tell what still applies or not after all the
MM> fixes have gone in.

The behavior of sequence generators is described by the header comment in 
SequenceUpdater. The key differences in behavior between sequences and identity 
columns are these:

i) A sequence generator pre-allocates a chunk of new values instead of updating 
a catalog row every time a new value is requested. During orderly shutdown, the 
unused values are flushed and the catalog row is updated so that shutdown won't 
leak chunks of unused values.

ii) The sequence generator always updates the catalog in a subtransaction of 
the user's current execution transaction, expecting to get the lock 
immediately. The write is committed immediately. If the update fails, then the 
generator raises a TOO_MUCH_CONTENTION exception. It is believed that this can 
only happen if someone is scanning SYS.SYSSEQUENCES directly, rather than using 
the SYSCS_UTIL.SYSCS_PEEK_AT_SEQUENCE function. In contrast, identity columns 
try to use a subtransaction, but if that doesn't work, they escalate to using 
the parent transaction. There is substantial lock contention because the 
volatile identity counter is maintained in a core catalog.


MM> Scanning the old topics on this issue there seemed to be the following 
issues, could you comment
MM> how your proposal affects identity column behavior for these:
MM> 1) correctness issue DERBY-5493 (i think the answer is that sequences 
behave better than identity)

It appears to me that DERBY-5493 and  DERBY-5494 were both fixed. I am not 
aware of any outstanding correctness problems with sequence generators.

MM> 2) is there a worst case where the change will reduce current identity 
performance if sequences are used, 
MM> likely some case with only preallocation=1. There was some discussion about 
performance difference of identity and sequences,
MM> could you note if there is any downside to switching to sequences. I think 
the
MM> areas of concern were performance with preallocation 1 being the worst 
case. There
MM> were changes both in sequences and identity columns since that discussion 
so not
MM> clear what the current state is.

I don't think we've measured this. There's certainly a possibility that a 
degenerate preallocation range would result in reduced performance. I think 
that the big downside is the possibility of leaking chunks of unused identity 
values if the database crashes or is brought down ungracefully.
 
MM> 3) is there a behavior change for "lost" values by changing to sequences?

Other than the leakage mentioned above, I'm not aware of any behavior change 
for lost values.

Thanks,
-Rick


> Improve the concurrency of identity columns by using SYS.SYSSEQUENCES
> ---------------------------------------------------------------------
>
>                 Key: DERBY-6542
>                 URL: https://issues.apache.org/jira/browse/DERBY-6542
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.11.0.0
>            Reporter: Rick Hillegas
>            Assignee: Rick Hillegas
>
> This is another attempt to improve the concurrency of identity columns. The 
> previous attempt was tracked by DERBY-4437.
> This new attempt will try out Mike's last suggestion: use system-created 
> sequences managed by SYS.SYSSEQUENCES. This should reduce the contention on 
> the core catalogs.
> I'm hopeful about this approach because of the experiments tracked by 
> DERBY-6533. There we are not seeing any problems related to sequence 
> generators but we are seeing lots of identity-related lock timeouts.
> Here is the general shape of this approach:
> 1) When adding an identity column to a table, Derby will create a sequence 
> generator for the column. The sequence generator will live in the SYS schema 
> and its name will be the table's UUID.
> 2) DROP SEQUENCE will not operate on system-created sequences. System-created 
> sequences will be dropped/modified by DROP/ALTER TABLE commands.
> 3) We will add a new system function for inspecting the current, in-memory 
> value of an identity generator without getting a lock on SYS.SYSSEQUENCES: 
> SYSCS_UTIL.SYSCS_PEEK_AT_IDENTITY( tableSchemaName, tableName ).
> 4) Derby will create a sequence for every legacy identity column after 
> hard-upgrade.
> 5) These changes will take place only after hard-upgrade. Soft-upgrade will 
> not change the behavior of identity columns.
> Comments on this proposal are welcome. Thanks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (DERBY-6542) Improve the concurrency of identity columns by using SYS.SYSSEQUENCES

Reply via email to