GitHub user ahgittin opened a pull request:

    https://github.com/apache/incubator-brooklyn/pull/101

    Fix enrichers too much rebinded

    There is a nasty issue that enrichers (and policies) often get duplicated 
on a rebind.  The reason for this is that many times, the places where 
enrichers are added is near where sensors are added, and because we don't 
persist sensor feeds we count on those being added on rebind (eg in 
`SoftwareProcessImpl.callRebindHooks()` invoking `connectSensors()`).
    
    Normally you'd think it's fine if an enricher gets added again.  But you'd 
be wrong, here's why it's nasty:  say we have an enricher on a JBoss (e.g. to 
take a time average), and then another at a web-cluster parent (to average 
across nodes), and another at the controlled-dwac parent (to propagate).  After 
rebind, we have two of each of these.  And after another we have three of each. 
 Now when the original sensor is published, we get the enriched value published 
3 times from JBoss.  And each of these 3 values is handled now by 3 enrichers 
at the web-cluster parent, so the enriched sensor at web-cluster is published 9 
times.  And you guessed it, the simple propagator has 3 instances each running 
9 times so is publishing 27 times for initial event.  Persist a few more times, 
and you're now publishing N^3, queueing lots and lots of tasks and bringing 
down the server.
    
    There are a lot of things we can do to improve it.
    
    This PR fixes several of the places where enrichers might be added multiple 
times (moving them to `init()` or similar as that's generally safe for 
enrichers).
    
    This PR also does a somewhat messy check in 
`AbstractEntity.addEnricher(...)` and `addPolicy(...)` to see whether there is 
already an enricher/policy which seems similar, and refuses to add it and warns 
if encountered.  This should help flag other places this becomes an issue (and 
will clean up servers which have too many enrichers already).
    
    These two techniques seem pretty good for now.  Ideally long-term however 
we can get rid of the second, having cleaned these things up so it isn't a 
problem.  As a point-in-time brain dump, I'd like to see:
    
    * feeds which can ignore duplicate values do so
    * feeds which can drop intermediate values do so under duress (or we know 
which tasks to drop under heavy load)
    * **we have a cleaner model for feeds** where we can tell if the populator 
of a sensor is registered, recognising sensor feeds and enrichers are often 
quite similar, and both are general cases of subscriptions (maybe that's what 
we need to be persisting?) along with policy
    * although some of the software-process sensor feed definitions should 
support multiple implementations (e.g. configure the built-in brooklyn 
collector or an external collectd)
    * allows sensor feeds to be loaded at init time also, and do this in 
general, with a `whenServiceUp` or `whenServiceExpectedStateIsRunning` guard 
(cleaning up `SERVICE_STATE` to be `EXPECTED_STATE` and `ACTUAL_STATE`)
    * allowing a caller to set multiple inputs for a sensor, or even 
dynamically add inputs which should be used (esp to compute service up, someone 
could easily add a new requirement for service_up)
    * persist sensor feed info and subscriptions
    
    /cc @aledsage @neykov

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ahgittin/incubator-brooklyn 
fix-enrichers-too-much-rebinded

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-brooklyn/pull/101.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #101
    
----
commit e6b7af4c3489a6be616937d4a0d54db6e653e9a9
Author: Alex Heneveld <[email protected]>
Date:   2014-07-31T05:23:35Z

    add some more tests for enricher rebind (but none of these are the culprit 
for the multiple-enrichers-being-added)

commit e58b744f0274d0c2e9587c78e8eeb786507067cd
Author: Alex Heneveld <[email protected]>
Date:   2014-07-31T05:25:20Z

    fix vanilla-java integration tests, and clarify the API and fix paths used 
in ArchiveBuilder which caused the problems (jars can't use "./" prefix, and 
these jars had "./test-classes/" !);
    also clean up warnings and javadoc

commit 424bda6845aeb67a42e4da2038598a34929493ce
Author: Alex Heneveld <[email protected]>
Date:   2014-07-31T06:13:52Z

    increase JMX poll frequency slightly (to 3s rather than 500ms) as it's a 
bit expensive

commit d135df402c3b93f69772418babec004678cebd94
Author: Alex Heneveld <[email protected]>
Date:   2014-07-31T06:14:27Z

    add failing test for enrichers being added multiple times on rebind

commit cc64c47935fcbeab694fae6a659f5766e7da8ee2
Author: Alex Heneveld <[email protected]>
Date:   2014-07-31T06:54:48Z

    add a fix for enrichers/policies being added multiple times on rebind, by 
declining to add an enricher which looks too much like one already present

commit 51145520aced1b20a482f78bde4b43cfa4697b8d
Author: Alex Heneveld <[email protected]>
Date:   2014-07-31T06:55:35Z

    explicitly fix two of the instances where enrichers and policies were being 
added again on rebind

commit 6416d446196bdfb0e0723414e524fc0b673d8fa3
Author: Alex Heneveld <[email protected]>
Date:   2014-07-31T07:38:34Z

    fix more of the explicit places where enrichers could be re-added on rebind

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to