ahgittin commented on PR #1307:
URL: https://github.com/apache/brooklyn-server/pull/1307#issuecomment-1165356164

   this seems to be triggering a race condition where the DMG is being 
destroyed while a policy runs trying to create a child bucket.  for me it is 
non-deterministic, happening about 1 in 4 runs.
   
   it is rare that this would occur IRL; basically you need to destroy a DMG in 
the same sub-second that it is trying to add a bucket.  fortunately it is easier 
to re-create in a test, because the test can destroy the DMG right after 
creating the entity which triggers the bucket, while the policy is still running.
   
   it would still be nice to fix though!  i think there are several ways we 
could do this:
   
   (1) if persisting a BrooklynObject (source BO, eg entity, policy) which has 
a reference to a target BO which is being destroyed or is destroyed, warn 
during persistence; this will at least help us to track down the cause if we 
subsequently see a rebind problem - EASY but NOT A FIX
   
   (1') - as (1), but either omit the source BO from persistence or destroy it 
when one is detected with a missing target - FIX but MESSY, it's basically a 
clean-up process rather than a fix for the root cause, and RISKY if ever there 
are valid cases for persisting something that might have a dangling reference 
(but I don't think there are)
   
   (2) when creating a source BO, check if its parent/entity is no longer 
managed, and fail at that point before it is recorded so that it doesn't end up 
in persisted state - FIX but NEED TO BE CAREFUL of races and deadlocks
   
   (3) where target BOs create source BO entities/policies/etc directly, set a 
flag and do a check in `onManagementStopping` to block until such creation is 
completed - FIX but PER ENTITY so tedious and easy to miss and BLOCKS DELETION 
which is not ideal
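
   as a rough illustration of the check in (1), here is a minimal sketch in 
plain Java.  none of these names are Brooklyn APIs: `DanglingReferenceCheck`, 
`findDangling` and the id-based maps are hypothetical stand-ins for whatever the 
persistence code actually has to hand.  it only produces warnings, so like (1) 
it is not a fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of option (1); these names are NOT Brooklyn APIs. */
class DanglingReferenceCheck {

    /**
     * Returns one warning per reference from a source BO to a target BO
     * that is not in the set of currently managed ids.
     *
     * @param references map from source BO id to the ids of the target BOs it refers to
     * @param managedIds ids of BOs that are managed and not being destroyed
     */
    static List<String> findDangling(Map<String, List<String>> references,
                                     Set<String> managedIds) {
        List<String> warnings = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : references.entrySet()) {
            for (String targetId : entry.getValue()) {
                if (!managedIds.contains(targetId)) {
                    // Option (1) only warns; option (1') would omit or destroy the source here.
                    warnings.add("persisting " + entry.getKey()
                            + " with reference to unmanaged " + targetId);
                }
            }
        }
        return warnings;
    }
}
```

   the same loop is where (1') would hook in, by omitting or destroying the 
source BO instead of only recording a warning.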
   
   i think (2) will be the best, maybe also with (1) -- keeping (1') in mind for 
the future.  for (2), i think we currently set a flag on the target BO early in 
the management stopping process, before it tells the persistence store the item 
is being deleted, so even if they are racing, the source BO checking for that 
flag should work.  even if the flag is not yet set when the source BO checks, 
the source BO's reference will immediately be added to the target BO, giving 
"plenty of time" (milliseconds) for that to complete before the target BO goes 
from management stopping to telling persistence to delete.  so by the time the 
target BO tells the persistence store it is gone, it will almost certainly have 
the reference to the source BO, and so both will be deleted
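
   the ordering argument for (2) can be sketched roughly as below.  this is 
plain Java with hypothetical names (`ManagedTarget`, `beginStopping`, 
`addChild`, `childrenToDeleteWithTarget`), not the actual Brooklyn management 
code, and it assumes the stopping flag really is set before persistence is 
told about the delete.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of option (2); these names are NOT Brooklyn APIs. */
class ManagedTarget {
    private final AtomicBoolean stopping = new AtomicBoolean(false);
    private final List<String> children = new CopyOnWriteArrayList<>();

    /** Step 1 of stopping: set the flag early, before persistence is told anything. */
    void beginStopping() {
        stopping.set(true);
    }

    /** Called when creating a source BO; fails fast if the target is already stopping. */
    void addChild(String childId) {
        if (stopping.get()) {
            throw new IllegalStateException(
                    "target is no longer managed; refusing to create " + childId);
        }
        // There is still a small window between the check above and this add.
        // The sketch relies on the gap between step 1 and step 2 being long
        // enough for the add to complete; it does not close the window.
        children.add(childId);
    }

    /** Step 2 of stopping: everything recorded so far is deleted with the target. */
    List<String> childrenToDeleteWithTarget() {
        return List.copyOf(children);
    }
}
```

   a creation that loses the race is rejected before anything is recorded, and 
one that wins it is already in the list when the target reports what to delete.  
making the check and the add atomic would need a lock, which is where the care 
about races and deadlocks comes in.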


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
