Rich-- Having looked into this more now, I've got a little more detail about what's happening:
There are two issues intertwined here. The first issue is related to the ControlBeanContext implementation class and its usage of a global, VM-wide lock. At issue is the fact that all ControlContainerContext implementations inherit from a BeanContextServicesSupport base class. This class (and another in that hierarchy) use a the VM-wide object BeanContext.globalHierarchyLock for synchronization in two situations: - adding / removing a control to / from a context - adding / removing / getting a "service" from a context This acts as a choke point for all Control references and is what shows up in the locking traces in a previous mail. This is problematic, but it's also orthogonal (I believe) to the issue with the "destroy" lifecycle method on a JPF. Ultimately, I think that the JDK assumed that the classes in java.beans.beancontext.* would be used to run a single hierarchy of controls on a VM. Beehive's usage pattern is a little different in that in enterprise applications, many hierarchies of beans run on a VM (one per JPF, in NetUI's case). As such, a global lock doesn't fit well. So, hopefully that provides some background on the BeanContext.globalHierarchyLock. That being said, I think that the deadlock is orthogonal to this problem as the "destroy" lifecycle method on a JPF can be invoked without waiting for a lock on the JPF instance itself. The result is that two threads can actually run through a JPF simultaneously. With Daryl's changes in February, the ControlContainerContext object for a Page Flow is a class-level field, and locking is performed on this object at: - create time - action execution time - page rendering time (JSP and Faces) Guess that the question is whether this situation is protected: Thread A is execution an action on JPF X Thread B is destroying JPF X en route to running JPF Y >From the code in PageFlowController.persistInSession (transitively down to StorageHandler.removeAttribute), it seems like a lock is held on the "current" JPF that would prevent this. Right? If that's true, then I've got a little more info to go on in investigating this problem. Thanks for any info... :) Eddie On 3/31/06, Eddie O'Neil <[EMAIL PROTECTED]> wrote: > Good questions -- let's see if I can explain the threads more > clearly in ASCII. :) > > What's happening is this (note, both threads operate on the *same* > instance of a JPF): > > Thread 1 (executing a page flow) Thread2 (destroying a page flow) > acquire lock on FooPage Flow > acquire BC.gHL > acquire lock on BarControlBean > wait for > lock on BarControlBean > wait for lock on BC.gHL > > Because the Controls infrastructure is based on the BeanContext > (Services|Support|etc) code in the java.beans package of the JDK, it > uses the BeanContextServicesSupport class as a base class for the > ControlBeanContext object in a Control. This JDK class (BCSS) uses > the BeanContext.globalHierarchyLock (also in the JDK) to serialize > access to shared data structures like the list of "service" objects in > a BeanContext, etc. This is a lock that is static throughout the JDK > (!). > > To answer your questions: > > > What grabs the lock on BeanContextServicesSupport.globalHierarchyLock? > The base classes for the ControlBeanContext object which delegates up > to it's "super" in order to serialize changes and service requests to > the classes that implement event listener support, etc. > > > What prevents deadlock in general between locking on > > BeanContextServicesSupport.globalHierarchyLock and BarControlBean? > > Good question -- this seems like the crux of the issue. In general, > this is protected by a Control container serializing access to the > container's ControlContainerContext object. In this case, the problem > occurs because two threads are trying to setup and destroy the same > context object simultaneously. If those setup and destroy calls were > serialized against the ControlContainerContext object stored in the > HttpSession, presumably, the semantics would be maintained. > > My guess is that this deadlock occurred with a double page submit or > quick refresh that caught the tail end of the previous thread and the > beginning of the next request such that the threads interleaved in > this way. > > So, hopefully that helps explain what's happening. > > Ultimately, I'm not sure how to fix this yet and would appreciate > thoughts on the topic. Would it be safe to acquire a lock on the JPF > instance during onDestroy? That would serialize the access to the > session scoped CCC object. > > Another way to go (and it's a lot of work) is to build a NetUI / > web-tier specific control container that doesn't leverage the *Support > classes in the JDK. This would allow the container's implementation > to be tailored to the single-threaded nature of the web tier and would > remove a lot of the performance implications around locking and > synchronized data structures. But, I've not thought through this yet, > either. :) > > Eddie > > > > On 3/30/06, Rich Feit <[EMAIL PROTECTED]> wrote: > > Looks like a tough one. First of all, the problem with synchronizing in > > onDestroy is also a deadlock issue: since onDestroy is called from an > > HttpSessionBindingListener, the Servlet container may have a lock on the > > session itself when onDestroy is called. I don't think it's mandated by > > the Servlet spec, but I don't think it's forbidden either. > > Unfortunately, this means that a deadlock can occur if code in another > > thread synchronizes on the page flow instance first (as in an action > > method invocation), then calls a method on HttpSession that synchronizes > > on the session object. Again, I don't think this is mandated by the > > spec, but it happens. > > > > Nothing strikes me off the bat, but I actually don't understand the > > deadlock sequence below (I don't disagree -- just don't have enough info > > to see it). What grabs the lock on > > BeanContextServicesSupport.globalHierarchyLock? What prevents deadlock > > in general between locking on > > BeanContextServicesSupport.globalHierarchyLock and BarControlBean? > > > > Also, how is the same JPF being *created* by two separate threads? > > > > Rich > > > > Eddie O'Neil wrote: > > > Rich/Daryl-- > > > > > > Hey, I've run into a thread deadlock problem in the interaction > > > between Controls and Page Flow that happens in the following > > > circumstance: > > > > > > thread1: acquire lock on FooPageFlow (during FlowController.execute) > > > thread2: acquire lock on > > > BeanContextServicesSupport.globalHierarchyLock (JDK class) > > > thread1: acquire lock on BarControlBean (ControlBean.ensureControl()) > > > thread2: wait for lock on BarControlBean (BeanContextSupport.remove()) > > > thread1: wait for lock on > > > BeanContextServicesSupport.globalHierarchyLock (JDK class) > > > > > > So, the problem is that the same JPF is being both created and > > > destroyed by two different threads. It appears that the "destroy" > > > phase of the JPF lifecycle is entirely unsynchronized and can freely > > > acquire Control locks without having serialized access to the Page > > > Flow itself. > > > > > > My thought is to add a synchronization point in > > > JavaControlUtils.uninitJavaControls that locks on the > > > PageFlowManagedObject as this would serialize access to the Page Flow. > > > But, I seem to recall some threading issues with locking on a JPF > > > during the "destroy" part of the lifecycle and don't want to resurrect > > > those. > > > > > > Any suggestions about how best to make this fix? > > > > > > Eddie > > > > > > > > >
