[
https://issues.apache.org/jira/browse/DERBY-6510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935272#comment-13935272
]
Brett Bergquist commented on DERBY-6510:
----------------------------------------
In reply to "in your source code how do you execute this query? For instance,
does it prepare it once at system startup and then execute it multiple times?
Or does it just execute it without preparing, counting on the Derby statement
cache to compile it once and then reuse it."
The actual query is generated by EclipseLink version 1.13 (JPA). It is not
prepared beforehand and does count on Derby caching it once it is compiled.
Yesterday that query had been executed (as part of the web service request
processing) 125 times before the system was hard reset. So out of the 125
times, 7 times it had an issue (1 request eventually completed after 3 hours
and 6 were stuck in this state).
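For reference, the call path on our side looks roughly like the sketch below
(the entity and query names are placeholders, not our real mapping). The
application does not hold a PreparedStatement across requests; EclipseLink
regenerates and executes the SQL on every call, so any reuse of the compiled
plan comes from Derby's statement cache:

    // Rough sketch only; RequestRecord and the JPQL are illustrative.
    import javax.persistence.EntityManager;
    import javax.persistence.Query;
    import java.util.List;

    public class RequestDao {
        public List<?> findForRequest(EntityManager em, long elementId) {
            // EclipseLink translates this JPQL to SQL and executes it on each
            // call; Derby is expected to find the generated SQL in its
            // statement cache and reuse the compiled plan rather than
            // recompiling it.
            Query q = em.createQuery(
                "SELECT r FROM RequestRecord r WHERE r.elementId = :id");
            q.setParameter("id", elementId);
            return q.getResultList();
        }
    }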
As to your comment: "does your application use auto-incrementing columns in
any of the tables involved in this query? I have no idea if that is a problem,
just trying to think of ways that we might get lock conflicts during the
optimizer run. I actually think we get all the locks up front before we start
looping, so very likely not an issue. The optimizer will do I/O to the user
indexes to get cost estimates. These are mostly just single probes to indexes,
so I would usually assume they are in cache. But if for some reason they were
not, and the system got I/O bound on the user table disk, then the optimizer
run could take longer on the user's system than in your test system, and might
be intermittent."
No auto-increment columns are used. We had performance problems with those, so
we went with a JPA-supported table generator. As for being I/O bound, the
system has a 1 TB Oracle ZFS storage array connected by Fibre Channel (I don't
remember the model, maybe a 7620), so I don't think that is an issue.
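For context, the generator is wired up roughly as in the sketch below (the
table and column names, and the allocation size, are illustrative rather than
our exact configuration). Keys are handed out in blocks from a dedicated
sequence table instead of relying on a Derby identity/auto-increment column:

    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.GenerationType;
    import javax.persistence.Id;
    import javax.persistence.TableGenerator;

    @Entity
    public class RequestRecord {

        // Hypothetical generator: ids come from the ID_GEN table in blocks of
        // 50, avoiding the performance problems we saw with identity columns.
        @TableGenerator(
            name = "requestGen",
            table = "ID_GEN",
            pkColumnName = "GEN_NAME",
            valueColumnName = "GEN_VALUE",
            pkColumnValue = "REQUEST_ID",
            allocationSize = 50)
        @Id
        @GeneratedValue(strategy = GenerationType.TABLE, generator = "requestGen")
        private long id;

        // other mapped columns omitted
    }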
I will attach the output of a "prstat" that was executed for 1 minute while the
issue was occurring. The PID for Derby is 4551, and you will see that about 11%
to 12% of the CPU was being utilized by Derby. Note that Derby was also
performing about 100 inserts/second continuously into another table at the same
time, so that will account for some of that CPU percentage.
> Derby engine threads not making progress
> ---------------------------------------
>
> Key: DERBY-6510
> URL: https://issues.apache.org/jira/browse/DERBY-6510
> Project: Derby
> Issue Type: Bug
> Components: Network Server
> Affects Versions: 10.9.1.0
> Environment: Oracle Solaris 10/9, Oracle M5000 32 CPU, 128GB memory,
> 8GB allocated to Derby Network Server
> Reporter: Brett Bergquist
> Priority: Critical
> Attachments: dbstate.log, derbystacktrace.txt, queryplan.txt
>
>
> We had an issue today in a production environment at a large customer site.
> Basically 5 database interactions became stuck and are not progressing.
> Part of the system dump performs a stack trace every few seconds for a period
> of a minute on the Glassfish application server and the Derby database engine
> (running in network server mode). Also, the dump captures the current
> transactions and the current lock table (i.e. syscs_diag.transactions and
> syscs_diag.lock_table). We had to restart the system and, in doing so, the
> Derby database engine would not shut down and had to be killed.
> The stack traces of the Derby engine show 5 threads that are making no
> progress: at each sample they are at the same point, waiting.
> I will attach the stack traces as well as the state of the transactions and
> locks.
> Interestingly, "derby.jdbc.xaTransactionTimeout=1800" is set, yet the
> transactions did not time out. The timeout is for 30 minutes but the
> transactions were in process for hours.