I tried your repro on a build from the trunk and got a deadlock message
from the embedded case, when I ran it on a clean database. I reran after
that and inconsistently got sometimes deadlock and sometimes timeout -
note each run will add another uncommitted row to each of the tables in
the db and thus change the timing. As you say it seems like the amount
of time waited is the deadlock wait time, not the lock timeout time in
all cases.
This seems like a bug to me. I know there are some "time" estimates
used by the lock manager to avoid expensive time request calls in java
for every lock request. Maybe this estimate is going wrong for some
reason.
Maybe Dan may have some advice here?
/mikem
ps. I guess any bug numbered 666 is destined to cause problems :-)
Bryan Pendleton (JIRA) wrote:
[ http://issues.apache.org/jira/browse/DERBY-666?page=comments#action_12357762 ]
Bryan Pendleton commented on DERBY-666:
---------------------------------------
The strange thing is, when I run the program in Embedded mode, and I get the lock
timeout, the program *is* sensitive to the value of derby.locks.deadlockTimeout! If I set
derby.locks.deadlockTimeout to 5, the program gives the "A lock could not be
obtained within the time requested." error after 5 seconds, rather than after the
standard 20 seconds. So it seems that the Embedded configuration is detecting the problem
as a deadlock, fundamentally; it's just reporting it back out to my program as a lock
timeout rather than as a deadlock.
I think I'd better stop now. I've succeeded in getting myself thoroughly
confused :)
Enhance derby.locks.deadlockTrace to print stack traces for all threads
involved in a deadlock
----------------------------------------------------------------------------------------------
Key: DERBY-666
URL: http://issues.apache.org/jira/browse/DERBY-666
Project: Derby
Type: Improvement
Components: Store
Versions: 10.1.1.0
Reporter: Bryan Pendleton
Priority: Minor
Attachments: repro.java
I was reading http://www.linux-mag.com/content/view/2134/ (good article, btw!),
and it says:
The next two properties are needed to diagnose concurrency (locking and
deadlock) problems.
*derby.locks.monitor=true logs all deadlocks that occur in the system.
*derby.locks.deadlockTrace=true log a stack trace of all threads involved
in lock-related rollbacks.
It seems, that, in my environment, the deadlockTrace property does not log a
stack trace of *all* threads involved in the deadlock.
Instead, it only logs a stack trace of the *victim* thread involved in the
deadlock.
I think it would be very useful if the derby.locks.deadlockTrace setting could
in fact log a stack trace of all involved threads.
In a posting to derby-dev, Mike Matrigali noted that an earlier implementation
of a similar feature had to be removed because it was too expensive in both
time and space, but he suggested that there might be several possible ways to
implement this in an acceptably efficient manner:
A long time ago there use to be room in each lock to point at a
stack trace for each lock, but that was removed to optimize the size
of the lock data structure which can have many objects outstanding.
And creating and storing the stack for every lock was incredibly slow
and just was not very useful for any very active application. I think
I was the only one who ever used it.
The plan was sometime to add a per user data structure which could be
filled in when it was about to wait on a lock, which would give most of what is
interesting in a deadlock.
The current deadlockTrace is meant to dump the lock table out to derby.log when
a deadlock is encountered.
I agree getting a dump of all stack traces would be very useful, and
with the later jvm debug interfaces may now be possible - in earlier
JVM's there weren't any java interfaces to do so. Does anyone have
the code to donate to dump all thread stacks to a buffer?
Mike also suggested a manual technique as a workaround; it would be useful to
put this into the documentation somewhere, perhaps on the page which documents
derby.locks.deadlockTrace? Here's Mike's suggestion:
What I do if I can reproduce easily is set try to catch the wait by
hand and then depending on the environment either send the magic
signal or hit ctrl-break in the server window which will send the JVM
specific thread dumps to derby.log.
The magic signal, btw, is 'kill -QUIT', at least with Sun JVMs in my experience.