[
https://issues.apache.org/jira/browse/DERBY-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653364#action_12653364
]
Knut Anders Hatlen commented on DERBY-2991:
-------------------------------------------
Thanks to all of you for looking at this.
To Kathey's questions, I will get back to working on this issue next
week and hope to have optimized the patch further and tested it some
more by the end of that week. It's difficult to say when a fix can be
committed to trunk, as it depends on whether we find the suggested
approach acceptable or need to find another way to fix it. And as
Mike mentioned, more testing is needed in any case. I do hope we can
come up with an approach everyone feels comfortable with in the next
couple of weeks, before the holiday season, and then get it
implemented and properly tested in January. I agree with Mike,
though, that it's probably too risky to backport the fix, unless we
end up with
something simpler than the current proposal. Not that I'm planning to
introduce any bugs, but since it changes some fundamental assumptions
in a critical part of the system, I wouldn't feel all that comfortable
with backporting it.
To Mike's questions:
Yes, I think it is possible to only save the position in situations
that could cause deadlocks. That would mean that we only save the
position if we need to wait for a lock. One possible complication is
that the scan waiting for a lock is not necessarily the one holding
the scan lock that causes the deadlock. Therefore we would need to
save the position in all the open B-tree scans in the transaction
when a lock cannot be granted immediately. And this must
be done for all locks that we wait on, not only locks in B-tree scans,
so it will add complexity to code outside the B-tree code as
well. Given that this increases the complexity and removing the scan
locks reduces the complexity, I'd prefer a solution where the scan
locks are removed completely if it can be done with acceptable
performance.
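
For the record, here is a minimal sketch of what I mean by saving the
positions when a lock cannot be granted immediately. This is not
Derby's actual lock manager or B-tree code; all of the class and
method names (LockTableSketch, saveAllScanPositions() and so on) are
just placeholders to show where the hook would go:

import java.util.List;

// Placeholder types, not Derby's real lock manager or B-tree classes.
final class LockTableSketch {
    // Try to grant the lock; if we have to wait, save all scan positions first.
    boolean lockRecord(TransactionSketch xact, Object lockable, long timeoutMillis)
            throws InterruptedException {
        if (grantImmediately(lockable)) {
            return true;
        }
        // We are about to block, so pages latched by open scans in this
        // transaction may be split/merged/reclaimed while we wait.
        xact.saveAllScanPositions();
        return waitForLock(lockable, timeoutMillis);
    }

    private boolean grantImmediately(Object lockable) { return false; }
    private boolean waitForLock(Object lockable, long timeoutMillis) { return true; }
}

final class TransactionSketch {
    private final List<BTreeScanSketch> openScans;

    TransactionSketch(List<BTreeScanSketch> openScans) {
        this.openScans = openScans;
    }

    // Ask every open B-tree scan in the transaction to remember its
    // position by key value before we start waiting.
    void saveAllScanPositions() {
        for (BTreeScanSketch scan : openScans) {
            scan.savePositionByKey();
        }
    }
}

interface BTreeScanSketch {
    // Copy the current key so the scan can reposition after a split/merge/reclaim.
    void savePositionByKey();
}
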
As to the performance, I hope that it will be possible to measure the
actual cost of copying the key once I have removed the expensive and
unnecessary renavigation. You are right that the suggested
optimization will only save the renavigation, not the copying. Some
thoughts/guesses:
- The current code that saves the key always allocates a new
DataValueDescriptor array and populates it with empty
DataValueDescriptors of the correct type. It shouldn't be
necessary to do this more than once per scan (see the sketch after
this list), and reusing the array could save a fair amount of
allocation/GC work.
- When we save the key, we retrieve it by calling fetchFromSlot(),
which deserializes the value of the key from the page. I think
this causes the same work to be performed twice when we save the
position in methods like BTreeForwardScan.fetchRows(), at least
when the scan returns the full key, so we could probably make
this more efficient.
- What we save by not obtaining/releasing the scan lock includes 2-3
lookups in a global hashtable per leaf page, which could give an
extra benefit in a multi-threaded scenario.
- It's not clear to me that the size of the key is significant. The
cost of the scan lock is constant per page. Since a bigger key
means fewer keys per page, the cost of copying keys should also be
more or less constant per page (for example, copying roughly one
hundred 40-byte keys or ten 400-byte keys from a 4K leaf is about
the same number of bytes). So the ratio between the cost saved and
the cost added may not be much affected by the size of the key.
- One of the worst cases is probably when extra qualification of
the row is needed, for example "SELECT index_column FROM t WHERE
length(index_column) = 10". In that case I think we'll release
the latch, and therefore also save the key, for every row in the
index.
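
Regarding the first bullet, the sketch below shows the kind of reuse
I have in mind: allocate the template array lazily, once per scan,
and then just overwrite its values on every later save. The
KeyValue/ScanPositionSketch types are made-up stand-ins for the
DataValueDescriptor[] template, not the real store classes:

// Hypothetical stand-ins for the DataValueDescriptor[] key template.
interface KeyValue {
    KeyValue newEmptyOfSameType();   // an empty value of the same type, like getNewNull()
    void copyFrom(KeyValue source);  // overwrite this value with the current key column
}

final class ScanPositionSketch {
    private KeyValue[] savedKeyTemplate;   // allocated at most once per scan

    // Save the current key into the reusable template instead of
    // allocating a new array (and new empty values) on every save.
    void saveKey(KeyValue[] currentKey) {
        if (savedKeyTemplate == null) {
            savedKeyTemplate = new KeyValue[currentKey.length];
            for (int i = 0; i < currentKey.length; i++) {
                savedKeyTemplate[i] = currentKey[i].newEmptyOfSameType();
            }
        }
        for (int i = 0; i < currentKey.length; i++) {
            savedKeyTemplate[i].copyFrom(currentKey[i]);
        }
    }
}
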
The suggestions for new tests look reasonable to me. The current
regression tests probably don't cover the case where the current
position has moved to another page while we didn't hold the latch,
since those situations would have resulted in a lock timeout
before. So we basically need tests for split/merge/reclaim for every
call to reposition() in the code. I will need to think more about how
to do that. Thanks for raising the concern about the previous key
locking as well.
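
To make that a bit more concrete, here is a rough sketch of the kind
of test I have in mind. It is plain JDBC, not wired into the
regression suite, and the table name, key counts and class name are
made up. The idea is to position a scan on a leaf page, let another
connection fill in enough keys to split that leaf before the scan
fetches the next row, and then check that no committed row is skipped
or returned twice. On an unfixed Derby this is roughly the situation
that would end in a lock timeout; whether the chosen plan really
streams rows from the index would of course have to be verified:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.HashSet;
import java.util.Set;

public class SplitWhileScanningSketch {
    public static void main(String[] args) throws Exception {
        // Load the embedded driver explicitly (optional with JDBC 4 auto-loading).
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");

        try (Connection c1 = DriverManager.getConnection(
                     "jdbc:derby:splitdb;create=true");
             Connection c2 = DriverManager.getConnection("jdbc:derby:splitdb")) {

            try (Statement s = c1.createStatement()) {
                s.execute("CREATE TABLE T (ID INT PRIMARY KEY)");
            }
            // Commit every tenth key so there are gaps to fill in later.
            try (PreparedStatement ins =
                         c1.prepareStatement("INSERT INTO T VALUES (?)")) {
                for (int i = 0; i < 1000; i += 10) {
                    ins.setInt(1, i);
                    ins.executeUpdate();
                }
            }

            c1.setAutoCommit(false);
            Set<Integer> returned = new HashSet<Integer>();
            try (Statement s = c1.createStatement();
                 ResultSet rs = s.executeQuery("SELECT ID FROM T WHERE ID >= 0")) {
                int fetched = 0;
                while (rs.next()) {
                    int id = rs.getInt(1);
                    if (!returned.add(id)) {
                        throw new AssertionError("row returned twice: " + id);
                    }
                    if (++fetched == 5) {
                        // Fill the gaps from the other connection so the leaf
                        // page the scan is positioned on gets split before c1
                        // fetches the next row.
                        try (PreparedStatement ins =
                                     c2.prepareStatement("INSERT INTO T VALUES (?)")) {
                            for (int i = 1; i < 1000; i += 10) {
                                ins.setInt(1, i);
                                ins.executeUpdate();
                            }
                        }
                    }
                }
            }
            c1.commit();

            // None of the originally committed rows may be skipped by the scan.
            for (int i = 0; i < 1000; i += 10) {
                if (!returned.contains(i)) {
                    throw new AssertionError("row skipped by the scan: " + i);
                }
            }
            System.out.println("scan returned " + returned.size() + " rows");
        }
    }
}
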
> Index split deadlock
> --------------------
>
> Key: DERBY-2991
> URL: https://issues.apache.org/jira/browse/DERBY-2991
> Project: Derby
> Issue Type: Bug
> Components: Store
> Affects Versions: 10.2.2.0, 10.3.1.4
> Environment: Windows XP, Java 6
> Reporter: Bogdan Calmac
> Assignee: Knut Anders Hatlen
> Attachments: d2991-preview-1a.diff, d2991-preview-1a.stat,
> d2991-preview-1b.diff, d2991-preview-1b.stat, d2991-preview-1c.diff,
> d2991-preview-1c.stat, derby.log, InsertSelectDeadlock.java, Repro2991.java,
> stacktraces_during_deadlock.txt
>
>
> After doing some research on the mailing list, it appears that the index
> split deadlock is a known behaviour, so I will start by describing the
> theoretical problem first and then follow with the details of my test case.
> If you have concurrent select and insert transactions on the same table, the
> observed locking behaviour is as follows:
> - the select transaction acquires an S lock on the root block of the index
> and then waits for an S lock on some uncommitted row of the insert transaction
> - the insert transaction acquires X locks on the inserted records and if it
> needs to do an index split creates a sub-transaction that tries to acquire an
> X lock on the root block of the index
> In summary: INDEX LOCK followed by ROW LOCK + ROW LOCK followed by INDEX LOCK
> = deadlock
> In the case of my project this is an important issue (lack of concurrency
> after being forced to use table level locking) and I would like to contribute
> to the project and fix this issue (if possible). I was wondering if someone
> that knows the code can give me a few pointers on the implications of this
> issue:
> - Is this a limitation of the top-down algorithm used?
> - Would fixing it require using a bottom-up algorithm for better
> concurrency (which is certainly non-trivial)?
> - Trying to break the circular locking above, I would first question why
> the select transaction needs to acquire (and hold) a lock on the root
> block of the index. Would it be possible to ensure the consistency of the
> select without locking the index?
> -----
> The attached test (InsertSelectDeadlock.java) tries to simulate a typical
> data collection application; it consists of:
> - an insert thread that inserts records in batch
> - a select thread that 'processes' the records inserted by the other thread:
> 'select * from table where id > ?'
> The derby log provides detail about the deadlock trace and
> stacktraces_during_deadlock.txt shows that the insert thread is doing an index
> split.
> The test was run on 10.2.2.0 and 10.3.1.4 with identical behaviour.
> Thanks,
> Bogdan Calmac.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.