[jira] [Updated] (DERBY-4437) Concurrent inserts into table with identity column perform poorly

Rick Hillegas (JIRA) Fri, 24 Jun 2011 09:32:10 -0700

     [ https://issues.apache.org/jira/browse/DERBY-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Rick Hillegas updated DERBY-4437:
---------------------------------

    Attachment: Experiments_4437.html
                derby-4437-06-aa-selfTuning.diff

Attaching a couple of files:

1) derby-4437-06-aa-selfTuning - This is an experimental patch, not intended 
for commit. This patch adds a crude heuristic to the default range 
preallocator. The heuristic attempts to tune the size of the preallocation 
range based on the rate at which identity values are being requested.

2) Experiments_4437.html - This is a webpage of results from some experiments 
which I ran, measuring the throughput of Knut's experiment with various 
hardcoded range lengths and with the crude heuristic.

Based on my experiments, I believe that I can offer the following modest 
conclusions:

i) I don't know how to write useful self-tuning logic which will accomplish 
what Mike wants. This feels like a research project to me. Someone else may 
want to pick up this project but I do not feel I can spend any more time on it.

ii) Derby is able to keep boosting the throughput as you keep boosting the size 
of the preallocated range. Throughput keeps improving as you boost the size of 
that range well past your tolerance for leaked values.

iii) I can't offer the customer anything better than a knob which declares how 
many values the app is willing to leak.
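
The trade-off described in (ii) and (iii) can be illustrated with a minimal 
sketch. It is a hypothetical model, not Derby's preallocator and not the 
attached patch: the class name, the AtomicLong standing in for the persisted 
counter, and the rangeSize constructor argument (the "knob") are all 
assumptions made for illustration. A larger range means fewer trips to the 
persisted counter, but more reserved values are lost if the engine stops 
before handing them out.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of range preallocation; not Derby's actual code. */
final class RangePreallocatorSketch {
    /** Stands in for the counter persisted in the system catalog (assumption). */
    private final AtomicLong persistedCounter;
    /** The "knob": how many values to reserve per trip to the persisted counter. */
    private final int rangeSize;

    private long next;          // next value to hand out from the cached range
    private long rangeEnd;      // exclusive upper bound of the cached range
    private int catalogUpdates; // number of trips to the persisted counter

    RangePreallocatorSketch(long start, int rangeSize) {
        if (rangeSize < 1) {
            throw new IllegalArgumentException("rangeSize must be >= 1");
        }
        this.persistedCounter = new AtomicLong(start);
        this.rangeSize = rangeSize;
        this.next = start;
        this.rangeEnd = start; // empty range forces a reservation on first use
    }

    /** Returns the next identity value, reserving a new range when needed. */
    synchronized long nextValue() {
        if (next == rangeEnd) {
            // The expensive step: one catalog update reserves rangeSize values.
            next = persistedCounter.getAndAdd(rangeSize);
            rangeEnd = next + rangeSize;
            catalogUpdates++;
        }
        return next++;
    }

    /** Values already reserved but not yet handed out; lost on a crash. */
    synchronized long leakedOnCrash() {
        return rangeEnd - next;
    }

    synchronized int catalogUpdates() {
        return catalogUpdates;
    }

    public static void main(String[] args) {
        for (int size : new int[] {1, 5, 100}) {
            RangePreallocatorSketch p = new RangePreallocatorSketch(1, size);
            for (int i = 0; i < 12; i++) {
                p.nextValue();
            }
            System.out.println("rangeSize=" + size
                    + " catalogUpdates=" + p.catalogUpdates()
                    + " leakedOnCrash=" + p.leakedOnCrash());
        }
        // rangeSize=1 catalogUpdates=12 leakedOnCrash=0
        // rangeSize=5 catalogUpdates=3 leakedOnCrash=3
        // rangeSize=100 catalogUpdates=1 leakedOnCrash=88
    }
}
```

In this model, twelve inserts cost twelve counter updates with a range of 1 
but only one update with a range of 100, at the price of up to 88 leaked 
values.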

I can do the following additional work on this issue. Let me know if you think 
I should do this work:

A) Add a knob so that apps can tune the size of the default preallocated range.

B) Change the current default range size of 5 to some other number. If you 
think this is useful, let me know what a better number would be.

Devising self-tuning logic sounds like an interesting project but one which 
should happen under another JIRA.


> Concurrent inserts into table with identity column perform poorly
> -----------------------------------------------------------------
>
>                 Key: DERBY-4437
>                 URL: https://issues.apache.org/jira/browse/DERBY-4437
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Rick Hillegas
>         Attachments: D4437PerfTest.java, D4437PerfTest2.java, 
> Experiments_4437.html, derby-4437-01-aj-allTestsPass.diff, 
> derby-4437-02-ac-alterTable-bulkImport-deferredInsert.diff, 
> derby-4437-03-aa-upgradeTest.diff, 
> derby-4437-04-aa-reclaimUnusedValuesOnShutdown.diff, 
> derby-4437-05-aa-pluggablePreallocation.diff, 
> derby-4437-06-aa-selfTuning.diff, insertperf.png, insertperf2.png, 
> prealloc.png
>
>
> I have a multi-threaded application which is very insert-intensive. I've 
> noticed that it sometimes can come into a state where it slows down 
> considerably and basically becomes single-threaded. This is especially 
> harmful on modern multi-core machines since most of the available resources 
> are left idle.
> The problematic tables contain identity columns, and here's my understanding 
> of what happens:
> 1) Identity columns are generated from a counter that's stored in a row in 
> SYS.SYSCOLUMNS. During normal operation, the counter is maintained in a 
> nested transaction within the transaction that performs the insert. This 
> allows the nested transaction to commit the changes to SYS.SYSCOLUMNS 
> separately from the main transaction, and the exclusive lock that it needs to 
> obtain on the row holding the counter can be released after a relatively 
> short time. Concurrent transactions can therefore insert into the same table 
> at the same time, without needing to wait for the others to commit or abort.
> 2) However, if the nested transaction cannot lock the row in SYS.SYSCOLUMNS 
> immediately, it will give up and retry the operation in the main transaction. 
> This prevents self-deadlocks in the case where the main transaction already 
> owns a lock on SYS.SYSCOLUMNS. Unfortunately, this also increases the time 
> the row is locked, since the exclusive lock cannot be released until the main 
> transaction commits. So as soon as there is one lock collision, the waiting 
> transaction changes to a locking mode that increases the chances of others 
> having to wait, which seems to result in all insert threads having to obtain 
> the SYSCOLUMNS locks in the main transaction. The end result is that only one 
> of the insert threads can execute at any given time as long as the 
> application is in this state.
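
The locking behaviour described in points 1) and 2) above can be modelled 
with a small sketch. This is a simplified, hypothetical model and not Derby's 
code: a ReentrantLock stands in for the exclusive lock on the counter row, a 
non-blocking tryLock() stands in for the nested transaction's no-wait lock 
request, and all class and method names are invented for illustration. The 
demo shows one collision pushing a transaction into the long-held lock mode, 
which then forces the next concurrent transaction into the same mode.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model of the nested-transaction fallback; not Derby's code. */
final class IdentityLockFallbackSketch {
    // Stands in for the exclusive lock on the counter row (assumption).
    private final ReentrantLock counterRowLock = new ReentrantLock();
    private long counter; // guarded by counterRowLock

    /** One user ("main") transaction; use each instance from a single thread. */
    final class Txn {
        boolean fellBack; // true once the counter was updated in the main transaction

        long nextIdentity() {
            if (counterRowLock.isHeldByCurrentThread()) {
                return ++counter; // already holding the row lock until commit
            }
            if (counterRowLock.tryLock()) {
                // "Nested transaction" path: lock without waiting, release at once.
                try {
                    return ++counter;
                } finally {
                    counterRowLock.unlock();
                }
            }
            // Fallback path: wait for the lock and keep it until commit().
            counterRowLock.lock();
            fellBack = true;
            return ++counter;
        }

        void commit() {
            if (counterRowLock.isHeldByCurrentThread()) {
                counterRowLock.unlock();
            }
        }
    }

    private void awaitWaiter() {
        while (!counterRowLock.hasQueuedThreads()) {
            Thread.yield(); // spin until some thread is blocked on the row lock
        }
    }

    /** Returns whether transactions a, b and c had to use the fallback path. */
    static boolean[] demo() {
        IdentityLockFallbackSketch db = new IdentityLockFallbackSketch();
        Txn a = db.new Txn(), b = db.new Txn(), c = db.new Txn();
        CountDownLatch aHasId = new CountDownLatch(1);
        CountDownLatch letACommit = new CountDownLatch(1);
        try {
            db.counterRowLock.lock(); // another short counter update is in progress
            Thread ta = new Thread(() -> {
                a.nextIdentity(); // collides, so it falls back and keeps the lock
                aHasId.countDown();
                try {
                    letACommit.await(); // the rest of a's work before commit
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                a.commit();
            });
            ta.start();
            db.awaitWaiter();
            db.counterRowLock.unlock();
            aHasId.await();
            Thread tb = new Thread(() -> {
                b.nextIdentity(); // collides with a's long-held lock
                b.commit();
            });
            tb.start();
            db.awaitWaiter();
            letACommit.countDown();
            ta.join();
            tb.join();
            c.nextIdentity(); // no contention: the short nested path works again
            c.commit();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new boolean[] {a.fellBack, b.fellBack, c.fellBack};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [true, true, false]
    }
}
```

Transaction b never saw the original short-lived lock; it was pushed into the 
fallback mode only because a was already holding the row lock until commit, 
which is the cascading effect described above.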

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
