[ 
https://issues.apache.org/jira/browse/DERBY-6510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935190#comment-13935190
 ] 

Mike Matrigali commented on DERBY-6510:
---------------------------------------

The XA timeout issue may be easier to reproduce and fix.  I am not sure
exactly where that code is implemented, but it is likely at a high level in
the code rather than in store.  Maybe someone else can help here.  In general,
Derby does not implement timeouts as one might expect: it does not set timers
that then interrupt at the given time.  This is because there are a lot of
places in the code that just can't handle a random interrupt.

So what is usually done is that checks are placed in code that is "expected"
to be visited every so often, and the time is checked manually there.  At
least that is what was done for query timeout when I last looked.  I think
these checks are placed high up, where there is communication between client
and server, such as when rows move back and forth.  So it may be that we are
not checking while the optimizer spins; I am not sure.
It should be easy to make a reproducible case for this part:
o create a query that is going to cause hell for the optimizer: a many-way
  join with many indexes
o set your timeout and see what happens.
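
As a starting point for such a repro, here is a minimal sketch that only
generates the SQL text.  The class name, table names, and column layout are
all made up for illustration, and whether a join of this shape actually keeps
Derby's optimizer busy long enough is an assumption that would need to be
tested.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for building a repro; not part of Derby. */
final class OptimizerStressSql {

    /** DDL for tableCount tables T0..Tn-1, each with one index per column. */
    static List<String> ddl(int tableCount) {
        List<String> stmts = new ArrayList<>();
        for (int i = 0; i < tableCount; i++) {
            String t = "T" + i;
            stmts.add("CREATE TABLE " + t + " (A INT, B INT, C INT)");
            for (String col : new String[] {"A", "B", "C"}) {
                stmts.add("CREATE INDEX " + t + "_" + col
                        + " ON " + t + " (" + col + ")");
            }
        }
        return stmts;
    }

    /** A chain join over all tables: T0.A = T1.B AND T1.A = T2.B ... */
    static String joinQuery(int tableCount) {
        StringBuilder from = new StringBuilder();
        StringBuilder where = new StringBuilder();
        for (int i = 0; i < tableCount; i++) {
            if (i > 0) {
                from.append(", ");
                if (where.length() > 0) {
                    where.append(" AND ");
                }
                where.append("T").append(i - 1).append(".A = T")
                     .append(i).append(".B");
            }
            from.append("T").append(i);
        }
        String sql = "SELECT COUNT(*) FROM " + from;
        return where.length() > 0 ? sql + " WHERE " + where : sql;
    }

    public static void main(String[] args) {
        ddl(3).forEach(System.out::println);
        System.out.println(joinQuery(3));
    }
}
```

The statements would then be run over JDBC with a timeout in effect, for
example via the standard Statement.setQueryTimeout(int seconds) call, or
inside an XA transaction with derby.jdbc.xaTransactionTimeout set as in the
report, using a table count large enough that compilation takes longer than
the timeout.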

If this reproduces, it should not be too hard to pick a good place in the
optimizer to check as it loops.  You don't want to check on every iteration,
as the get-time interfaces are expensive compared to the loop, but checking
every N iterations for some N that makes sense should be fine.
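
The "check every N times" idea could look roughly like the sketch below.  It
is not Derby code: the class and method names are hypothetical, and the
interval of 1024 in the demo is an arbitrary choice.  The clock is passed in
as a parameter only so the behavior can be exercised without real waiting.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper; not a Derby class. */
final class PeriodicDeadlineCheck {
    private final LongSupplier clockNanos;
    private final long deadlineNanos;
    private final int checkInterval;
    private int sinceLastCheck = 0;
    private long clockReads = 0;

    PeriodicDeadlineCheck(LongSupplier clockNanos, long timeoutNanos,
                          int checkInterval) {
        if (checkInterval < 1) {
            throw new IllegalArgumentException("checkInterval must be >= 1");
        }
        this.clockNanos = clockNanos;
        this.deadlineNanos = clockNanos.getAsLong() + timeoutNanos;
        this.checkInterval = checkInterval;
    }

    /**
     * Call once per loop iteration.  The clock is read only on every
     * checkInterval-th call; all other calls return false cheaply.
     */
    boolean expired() {
        if (++sinceLastCheck < checkInterval) {
            return false;
        }
        sinceLastCheck = 0;
        clockReads++;
        // Subtract-then-compare stays correct if nanoTime values wrap.
        return clockNanos.getAsLong() - deadlineNanos >= 0;
    }

    /** Number of clock reads performed by expired(). */
    long clockReads() {
        return clockReads;
    }

    public static void main(String[] args) {
        // 50 ms budget, clock consulted once per 1024 iterations.
        PeriodicDeadlineCheck check =
                new PeriodicDeadlineCheck(System::nanoTime, 50_000_000L, 1024);
        long iterations = 0;
        while (!check.expired()) {
            iterations++; // stand-in for one step of the optimizer loop
        }
        System.out.println("stopped after " + iterations + " iterations, "
                + check.clockReads() + " clock reads");
    }
}
```

The trade-off is that the timeout can be noticed up to N iterations late, so
N has to be small enough relative to the cost of one loop iteration that the
overshoot stays acceptable.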

> Deby engine threads not making progress
> ---------------------------------------
>
>                 Key: DERBY-6510
>                 URL: https://issues.apache.org/jira/browse/DERBY-6510
>             Project: Derby
>          Issue Type: Bug
>          Components: Network Server
>    Affects Versions: 10.9.1.0
>         Environment: Oracle Solaris 10/9, Oracle M5000 32 CPU, 128GB memory, 
> 8GB allocated to Derby Network Server
>            Reporter: Brett Bergquist
>            Priority: Critical
>         Attachments: dbstate.log, derbystacktrace.txt
>
>
> We had an issue today in a production environment at a large customer site.   
> Basically 5 database interactions became stuck and are not progressing.   
> Part of the system dump performs a stack trace every few seconds for a period 
> of a minute on the Glassfish application server and the Derby database engine 
> (running in network server mode).   Also, the dump captures the current 
> transactions and the current lock table (ie. syscs_diag.transactions and 
> syscs_diag.lock_table).   We had to restart the system and in doing so, the 
> Derby database engine would not shutdown and had to be killed.
> The stack traces of the Derby engine show 5 threads that are basically making 
> no progress in that at each sample, they are at the same point, waiting.
> I will attach the stack traces as well as the state of the transactions and 
> locks.   
> Interesting is that the "derby.jdbc.xaTransactionTimeout =1800" is set, yet 
> the transactions did not timeout.  The timeout is for 30 minutes but the 
> transactions were in process for hours.


