https://issues.apache.org/bugzilla/show_bug.cgi?id=56648

            Bug ID: 56648
           Summary: ContainerBase.addChild blocks all user requests while
                    single webapp is being deployed
           Product: Tomcat 7
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: major
          Priority: P2
         Component: Catalina
          Assignee: dev@tomcat.apache.org
          Reporter: vkleinschm...@blackboard.com

While deploying a complex webapp that took a very long time to initialize
(using Spring, lots of filesystem overhead, don't ask), we found that all user
requests got blocked in ApplicationContext.getContext() while trying to
determine the current web application via ContainerBase.findChild().
findChild() synchronizes on the HashMap ContainerBase.children, which was
locked by the addChild() call that was doing the webapp deployment:

...
    - locked <0x000000055070f778> (a java.util.HashMap)
    at org.apache.catalina.core.ContainerBase.access$000(ContainerBase.java:124)
    at org.apache.catalina.core.ContainerBase$PrivilegedAddChild.run(ContainerBase.java:146)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:777)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:601)
    at blackboard.tomcat.servletcontainer.TomcatContainerAdapter.registerWebApp(TomcatContainerAdapter.java:200)

I found that addChild() holds the lock on this collection for so long because
it wants to know whether the child container deployed successfully before
adding it to the collection, and it checks at the beginning whether the child
was already in the collection before trying to deploy it. Both of those seem
like valid reasons. However, it is clearly unacceptable to block the collection
(and thus all user requests!) while deploying a new webapp, which can take any
amount of time, including hanging on OS/filesystem locks.
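
To make the pattern concrete, here is a simplified sketch of the locking shape
described above. It is not Tomcat's actual source: WebApp and LockedChildren
are hypothetical stand-ins for the child container and ContainerBase, and only
the locking structure is meant to match.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a child container; not Tomcat's Container interface.
interface WebApp {
    String getName();
    void start(); // deployment; may take arbitrarily long
}

// Simplified illustration of the problematic pattern, not the real ContainerBase.
class LockedChildren {
    final Map<String, WebApp> children = new HashMap<>();

    WebApp findChild(String name) {
        synchronized (children) { // request threads queue up here
            return children.get(name);
        }
    }

    void addChild(WebApp child) {
        synchronized (children) {
            if (children.containsKey(child.getName())) {
                throw new IllegalArgumentException(
                        "Child name '" + child.getName() + "' is not unique");
            }
            child.start(); // slow deployment runs while the monitor is held
            children.put(child.getName(), child);
        }
    }
}
```

Any thread calling findChild() while start() is running has to wait for the
monitor, which is what the BLOCKED threads in the dump show.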

Here's what all those other threads looked like:
<thread details here> waiting for monitor entry [0x0000000058b89000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.catalina.core.ContainerBase.findChild(ContainerBase.java:855)
    - waiting to lock <0x000000055070f778> (a java.util.HashMap)
    at org.apache.catalina.core.ApplicationContext.getContext(ApplicationContext.java:211)
    at sun.reflect.GeneratedMethodAccessor524.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.catalina.core.ApplicationContextFacade$1.run(ApplicationContextFacade.java:456)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.catalina.core.ApplicationContextFacade.executeMethod(ApplicationContextFacade.java:454)
    at org.apache.catalina.core.ApplicationContextFacade.invokeMethod(ApplicationContextFacade.java:402)
    at org.apache.catalina.core.ApplicationContextFacade.doPrivileged(ApplicationContextFacade.java:374)
    at org.apache.catalina.core.ApplicationContextFacade.getContext(ApplicationContextFacade.java:122)
...<varying application code here>...


I see two possible approaches to address this:

A) Use a thread-safe ConcurrentHashMap instead of synchronizing on a plain old
HashMap. That would require confirming that the view of which child contexts
are available at a given time does not always have to be consistent across all
threads. I cannot judge that.
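
A rough sketch of what approach A might look like, under the assumption that
readers may briefly see a slightly stale set of children. Child and
ConcurrentChildren are hypothetical names for illustration, not Tomcat classes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a child container; not Tomcat's Container interface.
interface Child {
    String getName();
    void start(); // deployment; may take arbitrarily long
}

class ConcurrentChildren {
    private final ConcurrentMap<String, Child> children = new ConcurrentHashMap<>();

    // Lookups never wait on a deployment in progress.
    Child findChild(String name) {
        return name == null ? null : children.get(name);
    }

    void addChild(Child child) {
        String name = child.getName();
        if (children.containsKey(name)) {
            throw new IllegalArgumentException("Child name '" + name + "' is not unique");
        }
        child.start(); // slow deployment, no lock held
        // Publish only after a successful start; putIfAbsent catches a racing add.
        if (children.putIfAbsent(name, child) != null) {
            throw new IllegalArgumentException("Child name '" + name + "' is not unique");
        }
    }
}
```

The trade-off in this sketch is that two threads adding the same name at the
same time could both run start() before one of them loses at putIfAbsent(),
which is the kind of consistency question raised above.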

B) Fix addChild to use a flag to mark the child as initializing, and check the
flag at the beginning, after verifying that it's not yet in the collection. If
it's not there yet, and the flag isn't set yet, set the flag, then try to
deploy the webapp. Once that's successful, add it to the collection, then unset
the flag. Here we'd need to synchronize on the collection only very briefly -
for the check at the start and for the addition at the end. The rest of the
code would just need to synchronize on the flag, or on the child object itself,
not on the collection, which it is not interacting with in any way while
deploying the webapp.
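
A minimal sketch of approach B, assuming the "initializing" flag is kept as a
set of names guarded by the same monitor as the collection. Deployable and
FlaggedChildren are hypothetical names, not Tomcat classes; a per-child flag
or locking on the child object itself would be a variation on the same idea.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a child container; not Tomcat's Container interface.
interface Deployable {
    String getName();
    void start(); // deployment; may take arbitrarily long
}

class FlaggedChildren {
    private final Map<String, Deployable> children = new HashMap<>();
    // Names currently being deployed; guarded by the monitor of 'children'.
    private final Set<String> initializing = new HashSet<>();

    Deployable findChild(String name) {
        synchronized (children) { // held only for the lookup
            return children.get(name);
        }
    }

    void addChild(Deployable child) {
        String name = child.getName();
        synchronized (children) { // brief: uniqueness check and set the flag
            if (children.containsKey(name) || !initializing.add(name)) {
                throw new IllegalArgumentException("Child name '" + name + "' is not unique");
            }
        }
        boolean started = false;
        try {
            child.start(); // slow deployment, collection not locked
            started = true;
        } finally {
            synchronized (children) { // brief: publish on success, clear the flag
                if (started) {
                    children.put(name, child);
                }
                initializing.remove(name);
            }
        }
    }
}
```

In this sketch a failed start() leaves nothing in the collection and clears
the flag, so the same name can be deployed again later.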

