Aled Sage created BROOKLYN-580:
----------------------------------

             Summary: Rebinding to MachineEntity: sometimes fails to reconnect sensor feeds
                 Key: BROOKLYN-580
                 URL: https://issues.apache.org/jira/browse/BROOKLYN-580
             Project: Brooklyn
          Issue Type: Bug
    Affects Versions: 0.12.0
            Reporter: Aled Sage


On rebind, {{MachineEntity}} instances sometimes do not have their feeds 
recreated. This shows up as a non-deterministic test failure in 
{{MachineEntityJcloudsRebindTest}}.

The problem is that {{SoftwareProcessImpl.callRebindHooks}} schedules a task 
to call {{connectSensors}} after a random delay of between 0 and 10 seconds; 
that call tries to recreate the feeds. However, if the task executes too soon 
(while rebind is still in progress), the {{SshMachineLocation}} may not yet be 
managed. In that case, the feed is not created.

This is most likely to happen when there are many entities/locations, because 
iterating over them during rebind takes longer. The failure is intermittent 
because the delay before calling {{connectSensors}} is random and can 
sometimes be extremely short (the randomness is there to avoid the thundering 
herd problem on rebind).
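
The race described above can be reproduced in miniature with plain 
java.util.concurrent classes. The following is only a sketch under stated 
assumptions, not Brooklyn's code: {{FakeLocation}}, {{connectSensorsOnce}}, 
{{connectSensorsWithRetry}} and {{simulate}} are hypothetical names invented 
for illustration, and the retry variant is just one possible mitigation, not 
necessarily the fix that will be applied.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class RebindFeedRaceSketch {

    /** Hypothetical stand-in for a location that becomes managed part-way through rebind. */
    static final class FakeLocation {
        private final AtomicBoolean managed = new AtomicBoolean(false);
        boolean isManaged() { return managed.get(); }
        void markManaged() { managed.set(true); }
    }

    /** Mimics the reported failure mode: if the location is not yet managed, no feed is created. */
    static boolean connectSensorsOnce(FakeLocation loc) {
        return loc.isManaged();
    }

    /** Possible mitigation (an assumption): poll until the location is managed or the timeout expires. */
    static boolean connectSensorsWithRetry(FakeLocation loc, long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!loc.isManaged()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: location never became managed
            }
            Thread.sleep(10);
        }
        return true;
    }

    /**
     * Simulates one rebind. The location becomes managed after rebindDurationMs,
     * and the connect task fires after connectDelayMs. Returns true if the feed
     * would have been created.
     */
    static boolean simulate(long connectDelayMs, long rebindDurationMs, boolean retry) throws Exception {
        FakeLocation loc = new FakeLocation();
        // Two threads, so a polling connect task cannot block the "rebind" task.
        ScheduledExecutorService exec = Executors.newScheduledThreadPool(2);
        try {
            exec.schedule(() -> loc.markManaged(), rebindDurationMs, TimeUnit.MILLISECONDS);
            Future<Boolean> feedCreated = exec.schedule(
                    () -> retry ? connectSensorsWithRetry(loc, 5000) : connectSensorsOnce(loc),
                    connectDelayMs, TimeUnit.MILLISECONDS);
            return feedCreated.get();
        } finally {
            exec.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("connect before rebind finishes, no retry: " + simulate(0, 200, false));
        System.out.println("connect after rebind finishes, no retry:  " + simulate(300, 50, false));
        System.out.println("connect before rebind finishes, retry:    " + simulate(0, 200, true));
    }
}
```

In this sketch a very short connect delay combined with a slow rebind loses 
the feed, while the same timings with the retry variant do not. Keeping the 
random delay and adding a wait-until-managed step would preserve the 
thundering-herd protection.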

Although the symptoms are similar to 
https://issues.apache.org/jira/browse/BROOKLYN-425, the underlying cause is 
different, so this is being tracked as a new issue rather than by reopening 
the old one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
