Kim,
Inspektr inability to insert records causing actions undertaken as part
of Spring Web Flow conversations to hang (causing subsequent requests to
access those conversations to wait on the lock for the conversation,
thereby tying up Tomcat request processing threads out to exhaustion and
thereby causing the CAS server to become nonresponsive) is an
interesting hypothesis.
However, I'm wondering why Inspektr's use of asynchronous database
access doesn't prevent this from tying up any conversations?
Inspektr's JDBC-using audit trail manager:
https://github.com/dima767/inspektr/blob/inspektr-1.0.0.GA/inspektr-support-spring/src/main/java/com/github/inspektr/audit/support/JdbcAuditTrailManager.java
uses
/** ExecutorService that has one thread to asynchronously save requests. */
private final ExecutorService executorService =
Executors.newSingleThreadExecutor();
to execute the logging to the database in a single asynchronous thread.
Since JdbcAuditTrailManager will in practice be a singleton, singly
instantiated from auditTrailContext.xml , and since it instantiates a
single thread ExecutorService once, there should be a singleton such
ExecutorService such that at any given time only one thread will be
working on writing Inspektr audit log entries to the database.
Is the hypothesis that these threads terminate abnormally without
returning their database connection to the pool? That's at least plausible:
http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/Executors.html#newSingleThreadExecutor()
I get how that could exhaust the database connection pool (sort of;
shouldn't it be testing these connections?), but I guess I still don't
see how this locks any conversations, since the database access is
happening in a separate thread from that operating on the conversation.
Mostly, audit trail logging should be able to fail in arbitrarily
horrible ways and only have the consequence of failing to log the audit
trail to the database. [1]
Is this a matter of clean()
public void clean() {
this.transactionTemplate.execute(new
TransactionCallbackWithoutResult() { ...
not using that executor, such that clean() happens synchronously? (What
calls clean(), anyway? That happening asynchronously in its own thread?
Looks like doubly an exercise left to the reader -- firstly to get
clean() to be called, and secondly default clean select SQL will select
no records to be cleaned.) I don't see offhand why clean() would be
tying up the conversation -- if it's being called, how is that happening
in a thread holding a SWF conversation?
Andrew
[1] (Maybe that's a disturbing thought, having a CAS server happily
churning along servicing requests and *not* logging the audit trail as
intended?)
On 02/14/2011 05:06 PM, Cary, Kim wrote:
Math, hi.
Our stuff is running on OSX server. I've seen several mysql databases (we put
some log entries into mysql) stop accepting new entries when the db gets to the
10Gb mark. No limit imposed by us, just empirically observed.
My wild guess is that maybe the inability to insert new records at that point
caused a deadlock.
We're using queries for the audit data from Inspektr. Saved queries are handy.
There is no built in managment gui for the data, but I imagine Crystal Reports
would be handy.
Best,
KC
On Feb 14, 2011, at 11:38 AM, Matthew Selwood wrote:
Hi Kim,
I’m in the process of implemented Inspektr on our CAS build here. What 10Gb
threshold are you referred to? Is this a DB limit you’ve put in place?
On a similar note, is there a ‘user friendly’ way of viewing the database audit
and stats, or do you just write queries?
Thanks,
Matt
-------------------------------
Matthew Selwood
Programmer Analyst – Web Services
University of Victoria
250.472.5565 | [email protected] | CLE C006
On 2/13/11 9:36 PM, "Cary, Kim"<[email protected]> wrote:
Our CAS service came up unresponsive (connection timed out) after throwing
about 6 of these errors tonite.
Google doesn't have much on this, but one case shows some involvement with c3po
failing to talk to the database. We use inspektr and our audit database hit a
10Gb threshold tonite. Restarting tomcat without making any other changes put
the service back into operation.
Can anyone share any clues on pursuing the root cause of the outage?
Thanks,
KC
Feb 13, 2011 7:12:03 PM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet cas threw exception
org.springframework.webflow.conversation.impl.LockTimeoutException: Unable to
acquire conversation lock after 30 seconds
at
org.springframework.webflow.conversation.impl.JdkConcurrentConversationLock.lock(JdkConcurrentConversationLock.java:44)
at
org.springframework.webflow.conversation.impl.ContainedConversation.lock(ContainedConversation.java:69)
at
org.springframework.webflow.execution.repository.support.ConversationBackedFlowExecutionLock.lock(ConversationBackedFlowExecutionLock.java:51)
at
org.springframework.webflow.executor.FlowExecutorImpl.resumeExecution(FlowExecutorImpl.java:160)
at
org.springframework.webflow.mvc.servlet.FlowHandlerAdapter.handle(FlowHandlerAdapter.java:183)
at
org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:771)
at
org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:716)
at
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:647)
at
org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:563)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at
org.jasig.cas.web.init.SafeDispatcherServlet.service_aroundBody2(SafeDispatcherServlet.java:115)
at
org.jasig.cas.web.init.SafeDispatcherServlet.service_aroundBody3$advice(SafeDispatcherServlet.java:44)
at
org.jasig.cas.web.init.SafeDispatcherServlet.service(SafeDispatcherServlet.java:1)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
com.github.inspektr.common.web.ClientInfoThreadLocalFilter.doFilter(ClientInfoThreadLocalFilter.java:46)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:568)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:613)
Feb 13, 2011 7:12:14 PM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet cas threw exception
org.springframework.webflow.conversation.impl.LockTimeoutException: Unable to
acquire conversation lock after 30 seconds
at
org.springframework.webflow.conversation.impl.JdkConcurrentConversationLock.lock(JdkConcurrentConversationLock.java:44)
at
org.springframework.webflow.conversation.impl.ContainedConversation.lock(ContainedConversation.java:69)
at
org.springframework.webflow.execution.repository.support.ConversationBackedFlowExecutionLock.lock(ConversationBackedFlowExecutionLock.java:51)
at
org.springframework.webflow.executor.FlowExecutorImpl.resumeExecution(FlowExecutorImpl.java:160)
at
org.springframework.webflow.mvc.servlet.FlowHandlerAdapter.handle(FlowHandlerAdapter.java:183)
at
org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:771)
at
org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:716)
at
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:647)
at
org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:563)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at
org.jasig.cas.web.init.SafeDispatcherServlet.service_aroundBody2(SafeDispatcherServlet.java:115)
at
org.jasig.cas.web.init.SafeDispatcherServlet.service_aroundBody3$advice(SafeDispatcherServlet.java:44)
at
org.jasig.cas.web.init.SafeDispatcherServlet.service(SafeDispatcherServlet.java:1)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
com.github.inspektr.common.web.ClientInfoThreadLocalFilter.doFilter(ClientInfoThreadLocalFilter.java:46)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:568)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:613)
--
You are currently subscribed to [email protected] as: [email protected]
To unsubscribe, change settings or access archives, see
http://www.ja-sig.org/wiki/display/JSG/cas-user
--
You are currently subscribed to [email protected] as:
[email protected]
To unsubscribe, change settings or access archives, see
http://www.ja-sig.org/wiki/display/JSG/cas-user