Re: page flow destroy, controls, and deadlock

Eddie O'Neil Fri, 07 Apr 2006 07:33:53 -0700

  Yeah -- that's right the deadlock is caused by lack of
synchronization during "destroy" on the second valueUnbound event.  I
think (but haven't confirmed yet) that the same thing could happen
with different implementations of the
ServletContainerAdapter.ensureFailover method if it calls
"session.setAttribute(...)".  Just a hunch, though, and I need to
confirm this.


  Back to the deadlock, absence of synchronization on the "destroy"
lifecycle call allows a second thread to enter into the
ControlBeanContext which can cause the thread to deadlock with a
second thread that's already in the ControlBeanContext.

> I don't think that's an option here, since it voids the whole purpose of
> DeferredSessionStorageHandler
> (http://mail-archives.apache.org/mod_mbox/beehive-dev/200601.mbox/[EMAIL 
> PROTECTED]
> ) -- am I understanding you here?

Yeah, that makes sense -- just wanted to rule this out as an option.  :)

>
> Otherwise, I wonder if there's a way to track this without adding state
> to the page flow instance.  Possibly in a request attribute?  Just a
> thought...

Sure -- this would certainly work.  It could take the form of a set of
attribute names that should be persisted / failed over, but that
should ignore lifecycle methods.  Will look at this and get back to
you.

Thanks for the comments...

Eddie


>
> Rich
>
> Eddie O'Neil wrote:
> >   Gotcha -- thanks for the info; glad to know I wasn't entirely out in
> > left field.  :)
> >
> >   Given this and putting the issue of the session expiration aside for
> > a second, I think that root cause of the problem is that it is
> > possible for the DeferredSessionStorageHandler to run the
> > "valueUnbound" method code on a PageFlowController twice in these two
> > circumstances:
> >
> >   - DeferredSessionStorageHandler::setAttribute / remove Attribute
> >   - DeferredSessionStorageHandler::applyChanges @ line 221
> >
> > The first runs in "real time" with the removal or replacement of a
> > Page Flow in the session and runs inside of the lock, but the second
> > is the deferred part that runs at the end of a request to persist
> > changed attributes in the session.  The "applyChanges" then makes
> > calls to set / remove the Page Flow attribute from the real
> > HttpSession, and this triggers the "valueUnbound" event and ultimately
> > an "onDestroy".  This latter step is unsynchronized and thus causes
> > the deadlock.
> >
> >   If the Page Flow is stored in the session and is an
> > HttpSessionBindingListener, any deferred calls are going to trigger
> > this event.  In order to keep consecutive JPF invocations in the same
> > order, the onDestroy of the previous JPF needs to run before onCreate
> > of the next JPF.  The cheap way to fix this would be to track the
> > state of "valueUnbound" as a private boolean inside of the
> > PageFlowController.  Then, when the second valueUnbound call is made,
> > the event will be a no-op on the JPF.
> >
> >   Another solution would be to tunnel down into the real HttpSession
> > and make JPF changes "live" against the session rather than deferring
> > them.
> >
> >   Have an opinion either way?
> >
> > Eddie
> >
> >
> >
> > On 4/6/06, Rich Feit <[EMAIL PROTECTED]> wrote:
> >
> >> Funny, I'd drafted a longish email earlier asking about the BeanContext
> >> lock, because it sounded like the *fundamental* issue (ordering problems
> >> in locking that and the control bean instance itself).  But it's not, or
> >> at least it's unrelated.  In a way, that's good.  :)
> >>
> >> Regarding this:
> >>
> >>> Thread A is execution an action on JPF X
> >>> Thread B is destroying JPF X en route to running JPF Y
> >>>
> >>> From the code in PageFlowController.persistInSession (transitively
> >>> down to StorageHandler.removeAttribute), it seems like a lock is held
> >>> on the "current" JPF that would prevent this.  Right?
> >>>
> >> That's correct.  The only time onDestroy() should ever run without a
> >> lock on the page flow instance is when the session is expiring.  That's
> >> the situation where the session itself is already locked, so it's
> >> dangerous to subsequently lock the page flow instance (while another
> >> thread may be accessing the session through synchronized get() after
> >> already locking the page flow instance).  The user could also simulate
> >> this situation by explicitly removing the page flow session attribute,
> >> but of course that's not something we should try to support.
> >>
> >> Let me know if you see/find any holes in that, though...
> >>
> >> Rich
> >>
> >> Eddie O'Neil wrote:
> >>
> >>> Rich--
> >>>
> >>>   Having looked into this more now, I've got a little more detail
> >>> about what's happening:
> >>>
> >>>   There are two issues intertwined here.  The first issue is related
> >>> to the ControlBeanContext implementation class and its usage of a
> >>> global, VM-wide lock.  At issue is the fact that all
> >>> ControlContainerContext implementations inherit from a
> >>> BeanContextServicesSupport base class.  This class (and another in
> >>> that hierarchy) use a the VM-wide object
> >>> BeanContext.globalHierarchyLock for synchronization in two situations:
> >>>
> >>>   - adding / removing a control to / from a context
> >>>   - adding / removing / getting a "service" from a context
> >>>
> >>> This acts as a choke point for all Control references and is what
> >>> shows up in the locking traces in a previous mail.  This is
> >>> problematic, but it's also orthogonal (I believe) to the issue with
> >>> the "destroy" lifecycle method on a JPF.  Ultimately, I think that the
> >>> JDK assumed that the classes in java.beans.beancontext.* would be used
> >>> to run a single hierarchy of controls on a VM.  Beehive's usage
> >>> pattern is a little different in that in enterprise applications, many
> >>> hierarchies of beans run on a VM (one per JPF, in NetUI's case).  As
> >>> such, a global lock doesn't fit well.
> >>>
> >>>   So, hopefully that provides some background on the
> >>> BeanContext.globalHierarchyLock.
> >>>
> >>>   That being said, I think that the deadlock is orthogonal to this
> >>> problem as the "destroy" lifecycle method on a JPF can be invoked
> >>> without waiting for a lock on the JPF instance itself.  The result is
> >>> that two threads can actually run through a JPF simultaneously.
> >>>
> >>>   With Daryl's changes in February, the ControlContainerContext object
> >>> for a Page Flow is a class-level field, and locking is performed on
> >>> this object at:
> >>>
> >>>   - create time
> >>>   - action execution time
> >>>   - page rendering time (JSP and Faces)
> >>>
> >>> Guess that the question is whether this situation is protected:
> >>>
> >>> Thread A is execution an action on JPF X
> >>> Thread B is destroying JPF X en route to running JPF Y
> >>>
> >>> >From the code in PageFlowController.persistInSession (transitively
> >>> down to StorageHandler.removeAttribute), it seems like a lock is held
> >>> on the "current" JPF that would prevent this.  Right?
> >>>
> >>>   If that's true, then I've got a little more info to go on in
> >>> investigating this problem.
> >>>
> >>>   Thanks for any info...  :)
> >>>
> >>> Eddie
> >>>
> >>>
> >>>
> >>>
> >>> On 3/31/06, Eddie O'Neil <[EMAIL PROTECTED]> wrote:
> >>>
> >>>
> >>>>   Good questions -- let's see if I can explain the threads more
> >>>> clearly in ASCII.  :)
> >>>>
> >>>>   What's happening is this (note, both threads operate on the *same*
> >>>> instance of a JPF):
> >>>>
> >>>> Thread 1 (executing a page flow)              Thread2 (destroying a page 
> >>>> flow)
> >>>> acquire lock on FooPage Flow
> >>>>                                                              acquire 
> >>>> BC.gHL
> >>>> acquire lock on BarControlBean
> >>>>                                                              wait for
> >>>> lock on BarControlBean
> >>>> wait for lock on BC.gHL
> >>>>
> >>>> Because the Controls infrastructure is based on the BeanContext
> >>>> (Services|Support|etc) code in the java.beans package of the JDK, it
> >>>> uses the BeanContextServicesSupport class as a base class for the
> >>>> ControlBeanContext object in a Control.  This JDK class (BCSS) uses
> >>>> the BeanContext.globalHierarchyLock (also in the JDK) to serialize
> >>>> access to shared data structures like the list of "service" objects in
> >>>> a BeanContext, etc.  This is a lock that is static throughout the JDK
> >>>> (!).
> >>>>
> >>>> To answer your questions:
> >>>>
> >>>>
> >>>>
> >>>>> What grabs the lock on BeanContextServicesSupport.globalHierarchyLock?
> >>>>>
> >>>>>
> >>>> The base classes for the ControlBeanContext object which delegates up
> >>>> to it's "super" in order to serialize changes and service requests to
> >>>> the classes that implement event listener support, etc.
> >>>>
> >>>>
> >>>>
> >>>>> What prevents deadlock in general between locking on
> >>>>> BeanContextServicesSupport.globalHierarchyLock and BarControlBean?
> >>>>>
> >>>>>
> >>>> Good question -- this seems like the crux of the issue.  In general,
> >>>> this is protected by a Control container serializing access to the
> >>>> container's ControlContainerContext object.  In this case, the problem
> >>>> occurs because two threads are trying to setup and destroy the same
> >>>> context object simultaneously.  If those setup and destroy calls were
> >>>> serialized against the ControlContainerContext object stored in the
> >>>> HttpSession, presumably, the semantics would be maintained.
> >>>>
> >>>> My guess is that this deadlock occurred with a double page submit or
> >>>> quick refresh that caught the tail end of the previous thread and the
> >>>> beginning of the next request such that the threads interleaved in
> >>>> this way.
> >>>>
> >>>> So, hopefully that helps explain what's happening.
> >>>>
> >>>> Ultimately, I'm not sure how to fix this yet and would appreciate
> >>>> thoughts on the topic.  Would it be safe to acquire a lock on the JPF
> >>>> instance during onDestroy?  That would serialize the access to the
> >>>> session scoped CCC object.
> >>>>
> >>>> Another way to go (and it's a lot of work) is to build a NetUI /
> >>>> web-tier specific control container that doesn't leverage the *Support
> >>>> classes in the JDK.  This would allow the container's implementation
> >>>> to be tailored to the single-threaded nature of the web tier and would
> >>>> remove a lot of the performance implications around locking and
> >>>> synchronized data structures.  But, I've not thought through this yet,
> >>>> either.  :)
> >>>>
> >>>> Eddie
> >>>>
> >>>>
> >>>>
> >>>> On 3/30/06, Rich Feit <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>
> >>>>> Looks like a tough one.  First of all, the problem with synchronizing in
> >>>>> onDestroy is also a deadlock issue: since onDestroy is called from an
> >>>>> HttpSessionBindingListener, the Servlet container may have a lock on the
> >>>>> session itself when onDestroy is called.  I don't think it's mandated by
> >>>>> the Servlet spec, but I don't think it's forbidden either.
> >>>>> Unfortunately, this means that a deadlock can occur if code in another
> >>>>> thread synchronizes on the page flow instance first (as in an action
> >>>>> method invocation), then calls a method on HttpSession that synchronizes
> >>>>> on the session object.  Again, I don't think this is mandated by the
> >>>>> spec, but it happens.
> >>>>>
> >>>>> Nothing strikes me off the bat, but I actually don't understand the
> >>>>> deadlock sequence below (I don't disagree -- just don't have enough info
> >>>>> to see it).  What grabs the lock on
> >>>>> BeanContextServicesSupport.globalHierarchyLock?  What prevents deadlock
> >>>>> in general between locking on
> >>>>> BeanContextServicesSupport.globalHierarchyLock and BarControlBean?
> >>>>>
> >>>>> Also, how is the same JPF being *created* by two separate threads?
> >>>>>
> >>>>> Rich
> >>>>>
> >>>>> Eddie O'Neil wrote:
> >>>>>
> >>>>>
> >>>>>> Rich/Daryl--
> >>>>>>
> >>>>>>   Hey, I've run into a thread deadlock problem in the interaction
> >>>>>> between Controls and Page Flow that happens in the following
> >>>>>> circumstance:
> >>>>>>
> >>>>>> thread1: acquire lock on FooPageFlow (during FlowController.execute)
> >>>>>> thread2: acquire lock on
> >>>>>> BeanContextServicesSupport.globalHierarchyLock (JDK class)
> >>>>>> thread1: acquire lock on BarControlBean (ControlBean.ensureControl())
> >>>>>> thread2: wait for lock on BarControlBean (BeanContextSupport.remove())
> >>>>>> thread1: wait for lock on
> >>>>>> BeanContextServicesSupport.globalHierarchyLock (JDK class)
> >>>>>>
> >>>>>> So, the problem is that the same JPF is being both created and
> >>>>>> destroyed by two different threads.  It appears that the "destroy"
> >>>>>> phase of the JPF lifecycle is entirely unsynchronized and can freely
> >>>>>> acquire Control locks without having serialized access to the Page
> >>>>>> Flow itself.
> >>>>>>
> >>>>>>   My thought is to add a synchronization point in
> >>>>>> JavaControlUtils.uninitJavaControls that locks on the
> >>>>>> PageFlowManagedObject as this would serialize access to the Page Flow.
> >>>>>>  But, I seem to recall some threading issues with locking on a JPF
> >>>>>> during the "destroy" part of the lifecycle and don't want to resurrect
> >>>>>> those.
> >>>>>>
> >>>>>>   Any suggestions about how best to make this fix?
> >>>>>>
> >>>>>> Eddie
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>
> >
> >
>

Re: page flow destroy, controls, and deadlock

Reply via email to