[ 
https://issues.apache.org/jira/browse/DERBY-6510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935519#comment-13935519
 ] 

Brett Bergquist commented on DERBY-6510:
----------------------------------------

I ran a copy of the production system (recent database, single user accessing 
the system, not receiving the stream of 100 inserts/second) with 
"derby.optimizer.noTimeout=true" and "derby.language.logStatementText=true" 
set.  I then exercised the web service that triggers the queries which appear 
to be stuck, and the system operated correctly with little degradation in 
performance.

Because of this, I don't think there is any query in the processing path of the 
web service that could account for hours of processing, even if the optimizer 
went through all of the plans.

So believing that, does it make sense at this point to discount the possibility 
that the optimizer computed bad costs?  Even if it computed a bad cost because 
of something like bad statistics or infinite cost estimates, and had to go 
through all of the plans before picking one, I don't see that taking hours 
based on this test.
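If stale statistics were the suspect, one thing that could be tried (a sketch of a possible check, not something from this run) is forcing a statistics refresh with Derby's built-in procedure; the schema and table names below are placeholders, not names from this issue:

```sql
-- Hypothetical example: refresh cardinality statistics for one table,
-- so the optimizer's cost estimates are recomputed from current data.
-- 'APP' and 'MYTABLE' are placeholder names; NULL means all indexes.
CALL SYSCS_UTIL.SYSCS_UPDATE_STATISTICS('APP', 'MYTABLE', NULL);
```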

So this seems to lead me to the question of how the optimizer determines that 
all of the plans have been checked, and why it would try to recompute the 
optimization over and over again without making much progress.



> Derby engine threads not making progress
> ---------------------------------------
>
>                 Key: DERBY-6510
>                 URL: https://issues.apache.org/jira/browse/DERBY-6510
>             Project: Derby
>          Issue Type: Bug
>          Components: Network Server
>    Affects Versions: 10.9.1.0
>         Environment: Oracle Solaris 10/9, Oracle M5000 32 CPU, 128GB memory, 
> 8GB allocated to Derby Network Server
>            Reporter: Brett Bergquist
>            Priority: Critical
>         Attachments: dbstate.log, derbystacktrace.txt, prstat.log, 
> prstat_normal.log, queryplan.txt, queryplan_nooptimizerTimeout.txt
>
>
> We had an issue today in a production environment at a large customer site.   
> Basically 5 database interactions became stuck and are not progressing.   
> Part of the system dump performs a stack trace every few seconds for a period 
> of a minute on the Glassfish application server and the Derby database engine 
> (running in network server mode).   Also, the dump captures the current 
> transactions and the current lock table (i.e. syscs_diag.transactions and 
> syscs_diag.lock_table).   We had to restart the system and in doing so, the 
> Derby database engine would not shutdown and had to be killed.
> The stack traces of the Derby engine show 5 threads that are basically making 
> no progress in that at each sample, they are at the same point, waiting.
> I will attach the stack traces as well as the state of the transactions and 
> locks.   
> Interesting is that "derby.jdbc.xaTransactionTimeout=1800" is set, yet 
> the transactions did not timeout.  The timeout is for 30 minutes but the 
> transactions were in process for hours.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
