[
https://issues.apache.org/jira/browse/BROOKLYN-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666704#comment-15666704
]
Aled Sage commented on BROOKLYN-386:
------------------------------------
On rebind, the following happens:
* Entity is instantiated, but not yet managed; it therefore has a
{{QueueingSubscriptionManager}}
* Entity's locations are re-added via {{AbstractEntity.addLocations()}}; this calls
{{sensors().emit(AbstractEntity.LOCATION_ADDED, loc)}}
* These {{LOCATION_ADDED}} sensor events are queued inside the
{{QueueingSubscriptionManager}}
* The Entity's policies are added; {{CreateUserPolicy.setEntity}} subscribes to
{{LOCATION_ADDED}} events (which is registered in the
{{QueueingSubscriptionManager}})
* The entity becomes managed, and the {{QueueingSubscriptionManager}} is drained
in two steps:
** {{EntityManagementSupport.onManagementStarting}} replays the
{{QueueingSubscriptionManager.queuedSubscriptions}} (so now the policy's
subscription is active)
** {{EntityManagementSupport.onManagementStarted}} replays the
{{QueueingSubscriptionManager.queuedSensorEvents}} (including the locationAdded
event, which is received by the policy)
* The policy executes {{onEvent}}, which triggers (asynchronously) adding the
user to the machine.
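
The replay order above can be reproduced with a toy model. This is only a
sketch: {{RebindReplaySketch}} and {{ToyQueueingManager}} are hypothetical names
made up for illustration, not Brooklyn classes, and events are plain strings.
It shows that when all queued subscriptions are activated before all queued
events are delivered, the policy receives an event that was published before it
subscribed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Toy model of the replay order described above; not Brooklyn's real classes. */
class RebindReplaySketch {

    /** Queues subscriptions and events while the entity is unmanaged. */
    static class ToyQueueingManager {
        final List<Consumer<String>> queuedSubscriptions = new ArrayList<>();
        final List<String> queuedSensorEvents = new ArrayList<>();

        void subscribe(Consumer<String> listener) { queuedSubscriptions.add(listener); }
        void publish(String event) { queuedSensorEvents.add(event); }
    }

    /** Runs the rebind sequence and returns the events the policy received. */
    static List<String> simulateRebind() {
        ToyQueueingManager queue = new ToyQueueingManager();
        List<String> receivedByPolicy = new ArrayList<>();

        // 1. Locations are re-added before the policy exists: the event is queued.
        queue.publish("LOCATION_ADDED:machine-1");

        // 2. The policy is attached afterwards and subscribes: also queued.
        queue.subscribe(receivedByPolicy::add);

        // 3. Management starting: every queued subscription becomes active.
        List<Consumer<String>> active = new ArrayList<>(queue.queuedSubscriptions);

        // 4. Management started: every queued event goes to the active listeners,
        //    even though this event was published before the policy subscribed.
        for (String event : queue.queuedSensorEvents) {
            for (Consumer<String> listener : active) {
                listener.accept(event);
            }
        }
        return receivedByPolicy;
    }

    public static void main(String[] args) {
        System.out.println(simulateRebind());
    }
}
```

In this model the policy ends up with the {{machine-1}} event, which corresponds
to the unwanted second {{addUser}} call on rebind.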
Four ways I see to fix this are:
1. Avoid publishing location-changed events when just re-adding the locations
on rebind.
2. Replay the queued subscribe/publish in a smarter way, so they correctly
interleave (i.e. events 'published' before the 'subscribe' would not be
received)
3. Change when {{policy.setEntity()}} is called, so that it is only done when
the entity is managed.
4. Guard against duplicate events inside {{CreateUserPolicy}}.
For (1), that makes sense to me. We already do something similar when
re-setting the attributes on rebind (calling
{{entity.setAttributeWithoutPublishing()}}).
For (2), it becomes less of an issue if we've done (3), but still would be nice
to have as it feels like a more general problem. However, this is fiddly! The
subscribe/publish replays are called from different methods. Changing the order
may have subtle consequences on other entity/policy implementations!
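
To make the idea in (2) concrete, here is a minimal sketch that assumes
subscribes and publishes can be kept in a single ordered queue and replayed in
one pass. {{ToyOrderedQueue}} is a hypothetical class, not an existing Brooklyn
API, and it ignores the fact that the real replays happen in two different
methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Toy sketch of option (2); none of these types exist in Brooklyn. */
class OrderedReplaySketch {

    /** Holds subscribes and publishes in one list so their relative order survives. */
    static class ToyOrderedQueue {
        private final List<Consumer<List<Consumer<String>>>> actions = new ArrayList<>();

        void subscribe(Consumer<String> listener) {
            actions.add(active -> active.add(listener));
        }

        void publish(String event) {
            actions.add(active -> {
                // Only listeners that were subscribed earlier in the queue see the event.
                for (Consumer<String> listener : active) {
                    listener.accept(event);
                }
            });
        }

        /** Replays every queued action in the order it was originally made. */
        void drain() {
            List<Consumer<String>> active = new ArrayList<>();
            for (Consumer<List<Consumer<String>>> action : actions) {
                action.accept(active);
            }
            actions.clear();
        }
    }

    /** Returns the events the policy received after an ordered replay. */
    static List<String> simulateRebind() {
        ToyOrderedQueue queue = new ToyOrderedQueue();
        List<String> receivedByPolicy = new ArrayList<>();
        queue.publish("LOCATION_ADDED:machine-1");   // before the policy subscribes
        queue.subscribe(receivedByPolicy::add);
        queue.publish("LOCATION_ADDED:machine-2");   // after the policy subscribes
        queue.drain();
        return receivedByPolicy;
    }

    public static void main(String[] args) {
        System.out.println(simulateRebind());
    }
}
```

Here the policy only sees the {{machine-2}} event; the {{machine-1}} event,
published before the subscription, is dropped during replay.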
For (3), I'll discuss that on the dev@brooklyn mailing list.
For (4), that is the easiest thing to do. But other policies/enrichers could
still hit the same problem.
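
A guard like the one in (4) might look roughly like the following.
{{ToyCreateUserPolicy}} is a hypothetical stand-in, not the real
{{CreateUserPolicy}}, and the sketch assumes the set of already-handled
machines is somehow restored from persisted state on rebind (here it is simply
passed to the constructor).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy sketch of option (4): ignore a location event that was already handled. */
class DuplicateGuardSketch {

    static class ToyCreateUserPolicy {
        // Assumption: this set would have to be restored from persisted state on rebind.
        private final Set<String> handledMachineIds = ConcurrentHashMap.newKeySet();
        final List<String> usersCreatedOn = new ArrayList<>();

        ToyCreateUserPolicy(Set<String> alreadyHandled) {
            handledMachineIds.addAll(alreadyHandled);
        }

        void onLocationAdded(String machineId) {
            // Set.add returns false when the id is already present, so duplicates are skipped.
            if (!handledMachineIds.add(machineId)) {
                return;
            }
            usersCreatedOn.add(machineId); // stands in for the real add-user work
        }
    }

    /** Returns the machines on which the toy policy actually created a user. */
    static List<String> simulate() {
        ToyCreateUserPolicy policy = new ToyCreateUserPolicy(Collections.singleton("machine-1"));
        policy.onLocationAdded("machine-1"); // replayed on rebind: ignored
        policy.onLocationAdded("machine-2"); // genuinely new: handled
        policy.onLocationAdded("machine-2"); // duplicate: ignored
        return policy.usersCreatedOn;
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

Every policy or enricher with the same exposure would need its own guard, which
is why this option does not address the general problem.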
> NPE on rebind calling CreateUserPolicy.addUser
> ----------------------------------------------
>
> Key: BROOKLYN-386
> URL: https://issues.apache.org/jira/browse/BROOKLYN-386
> Project: Brooklyn
> Issue Type: Bug
> Reporter: Aled Sage
>
> I found this NullPointerException in the log:
> {noformat}
> 2016-09-07 13:50:40,633 INFO o.a.b.c.m.r.RebindIteration
> [brooklyn-execmanager-EVQzoN78-0]: Rebind complete (MASTER) in 41.0s: 6 apps,
> 16 entities, 56 locations, 2 policies, 88 enrichers, 0 feeds, 162 catalog
> items
> 2016-09-07 13:50:40,633 DEBUG o.a.b.c.m.r.RebindIteration
> [brooklyn-execmanager-EVQzoN78-0]: RebindManager complete; apps: [fxky5xbx0z,
> vt864wmzpn, u3ohrxr21o, X0UTBSWZ, sJslLEBo, eb95zYiG]
> 2016-09-07 13:50:40,634 INFO o.a.b.p.j.os.CreateUserPolicy
> [brooklyn-execmanager-EVQzoN78-0]: Adding auto-generated user myname @
> 1.2.3.4:11071
> 2016-09-07 13:50:40,667 DEBUG o.a.b.c.m.r.RebindManagerImpl [main]: Starting
> persistence
> (org.apache.brooklyn.core.mgmt.rebind.RebindManagerImpl@19d095d5[mgmt=EVQzoN78]),
> mgmt EVQzoN78
> 2016-09-07 13:50:40,668 DEBUG o.a.b.l.j.JcloudsSshMachineLocation
> [brooklyn-execmanager-EVQzoN78-0]: Problem getting node-metadata for
> SshMachineLocation[MyVcloudDirector(Test):[email protected]/1.1.1.1:11071(id=N1UFSoVb)],
> node id urn:vcloud:vm:be3270fd-698f-4be3-b855-d379505ac95a (continuing)
> java.lang.NullPointerException: null
> at
> org.apache.brooklyn.location.jclouds.JcloudsSshMachineLocation.getOptionalNode(JcloudsSshMachineLocation.java:225)
> [brooklyn-locations-jclouds-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
> at
> org.apache.brooklyn.location.jclouds.JcloudsSshMachineLocation.getOptionalOperatingSystem(JcloudsSshMachineLocation.java:519)
> [brooklyn-locations-jclouds-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
> at
> org.apache.brooklyn.location.jclouds.JcloudsSshMachineLocation.inferMachineDetails(JcloudsSshMachineLocation.java:543)
> [brooklyn-locations-jclouds-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
> at
> org.apache.brooklyn.location.ssh.SshMachineLocation.getMachineDetails(SshMachineLocation.java:1058)
> [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
> at
> org.apache.brooklyn.policy.jclouds.os.CreateUserPolicy.addUser(CreateUserPolicy.java:145)
> [brooklyn-locations-jclouds-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
> at
> org.apache.brooklyn.policy.jclouds.os.CreateUserPolicy$1.run(CreateUserPolicy.java:114)
> [brooklyn-locations-jclouds-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
> at
> org.apache.brooklyn.util.concurrent.CallableFromRunnable.call(CallableFromRunnable.java:43)
> [brooklyn-utils-common-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
> at
> org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519)
> [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> [na:1.7.0_95]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> [na:1.7.0_95]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> [na:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_95]
> {noformat}
> It shouldn't try to create the user again on rebind. And we should check to
> avoid the NPE as well.
> But this is benign, given that we don't want it to be executing the
> create-user code again anyway.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)