[
https://issues.apache.org/jira/browse/BROOKLYN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571252#comment-14571252
]
Svetoslav Neykov commented on BROOKLYN-149:
-------------------------------------------
Some thoughts:
* The id of the catalog item is "OJ081XYKT_0=", including the ending equals sign
* Catalog item IDs are not auto-generated, so this ID must have come from an
external source
* Deleting the catalog item after starting the app would definitely cause the
described behaviour
* Do the logs contain any previous mentions of the ID?
* Do the persistence backups contain the catalog item? Do any items in the
backups reference it?
* The easiest fix is to get the catalog item back into the persistence files
(e.g. by copying it from a backup or from another instance, or even by
hand-crafting it)
* The latest Brooklyn behaviour is to keep the process running despite rebind
errors, so one can poke around (a warning message is shown on entry). This
can be disabled with a startup option.
* The Riak entities are part of core Brooklyn, so they don't actually need the
catalog item to rebind. For such entities one can remove the catalog item id
reference from the persistence files, so that Brooklyn doesn't try to locate
the missing catalog item.
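
A rough sketch of the checks and the workaround above, in case it helps. It
is not Brooklyn code: the class name CatalogItemRefTool and its methods are
made up for illustration, and it assumes the persisted entity files are XML
that holds the reference in a <catalogItemId> element. Please verify that
against the actual files, and only run it against a copy of the persistence
directory. findMentions can be pointed at the log or backup directories to
answer the "any previous mentions?" questions; stripCatalogItemId removes
the reference from one file's text.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Brooklyn.
final class CatalogItemRefTool {

    /** Returns every regular file under root whose text contains the given id. */
    static List<Path> findMentions(Path root, String catalogItemId) throws IOException {
        List<Path> hits = new ArrayList<>();
        try (Stream<Path> walk = Files.walk(root)) {
            List<Path> files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
            for (Path file : files) {
                String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                if (text.contains(catalogItemId)) {
                    hits.add(file);
                }
            }
        }
        return hits;
    }

    /**
     * Removes the line holding <catalogItemId>id</catalogItemId> from the text.
     * The element name is an assumption about the persisted format.
     */
    static String stripCatalogItemId(String xml, String catalogItemId) {
        Pattern reference = Pattern.compile(
                "[ \\t]*<catalogItemId>\\s*" + Pattern.quote(catalogItemId)
                        + "\\s*</catalogItemId>[ \\t]*\\r?\\n?");
        return reference.matcher(xml).replaceAll("");
    }

    /** Usage: CatalogItemRefTool <directory> <catalogItemId> [--strip] */
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: CatalogItemRefTool <directory> <catalogItemId> [--strip]");
            return;
        }
        Path root = Paths.get(args[0]);
        String id = args[1];
        boolean strip = args.length > 2 && "--strip".equals(args[2]);
        for (Path file : findMentions(root, id)) {
            System.out.println("mentions " + id + ": " + file);
            if (strip) {
                // Rewrites the file in place, so work on a copy of the directory.
                String original = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                String stripped = stripCatalogItemId(original, id);
                Files.write(file, stripped.getBytes(StandardCharsets.UTF_8));
            }
        }
    }
}
```

Without --strip this only lists the files that mention the id, which is the
safer first step. The id to pass here would be the versioned form from the
error message, i.e. "OJ081XYKT_0=:1.0".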
> Rebind failed when entity's catalog item not found
> --------------------------------------------------
>
> Key: BROOKLYN-149
> URL: https://issues.apache.org/jira/browse/BROOKLYN-149
> Project: Brooklyn
> Issue Type: Bug
> Affects Versions: 0.7.0-SNAPSHOT
> Reporter: Aled Sage
> Fix For: 0.7.0-SNAPSHOT
>
>
> A customer's Brooklyn instance failed to rebind on restart. The error was:
> {noformat}
> vcompose1476-compose-amp.console-v1.5.3.log:2015-05-15 06:57:12,808 ERROR
> Management node zdJa2A7Y enountered problem during rebind when promoting self
> to master; demoting to FAILED and rethrowing:
> brooklyn.util.exceptions.PropagatedRuntimeException: Failure rebinding, 71
> errors including: problem creating ENTITY Ocs2eaWX of type
> brooklyn.entity.nosql.riak.RiakClusterImpl: Failed to load catalog item
> OJ081XYKT_0=:1.0 required for rebinding.
> oklyn-Allow-Non-Master-Access' to force)
> {noformat}
> The full exception was:
> {noformat}
> 2015-05-15 06:40:31,369 WARN b.e.r.RebindExceptionHandlerImpl
> [brooklyn-execmanager-boo0I83w-0]: No catalog item found with id
> OJ081XYKT_0=:1.0; returning null
> 2015-05-15 06:40:31,395 WARN b.e.r.RebindExceptionHandlerImpl
> [brooklyn-execmanager-boo0I83w-0]: Rebind: continuing after problem creating
> ENTITY Ocs2eaWX of type brooklyn.entity.nosql.riak.RiakClusterImpl
> java.lang.IllegalStateException: Failed to load catalog item OJ081XYKT_0=:1.0
> required for rebinding.
> at
> brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.getLoadingContextFromCatalogItemId(RebindIteration.java:903)
> ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
> at
> brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.load(RebindIteration.java:869)
> ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
> at
> brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.newEntity(RebindIteration.java:814)
> ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
> at
> brooklyn.entity.rebind.RebindIteration.instantiateLocationsAndEntities(RebindIteration.java:407)
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
> at
> brooklyn.entity.rebind.RebindIteration.doRun(RebindIteration.java:234)
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
> at
> brooklyn.entity.rebind.InitialFullRebindIteration.doRun(InitialFullRebindIteration.java:69)
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
> at
> brooklyn.entity.rebind.RebindIteration.run(RebindIteration.java:260)
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
> at
> brooklyn.entity.rebind.RebindManagerImpl.rebindImpl(RebindManagerImpl.java:545)
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
> at
> brooklyn.entity.rebind.RebindManagerImpl$3.call(RebindManagerImpl.java:496)
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
> at
> brooklyn.entity.rebind.RebindManagerImpl$3.call(RebindManagerImpl.java:494)
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
> at
> brooklyn.util.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:469)
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> [na:1.7.0_71]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> [na:1.7.0_71]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> [na:1.7.0_71]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> {noformat}
> In the persisted state, there is no mention of the catalog item OJ081XYKT_0.
> My assumption is that the customer manually added a catalog item (via the
> web-console), deployed an app (entitled "<snip> riak", of type RiakCluster),
> and then deleted the catalog item (or that "deletion" could have been an
> issue with persistence of catalog items - see
> https://github.com/apache/incubator-brooklyn/pull/555).
> The desired behaviour is that this does not cause the entire Brooklyn
> instance to fail to rebind/start.
> ---
> There are several potential things to investigate/improve:
> * Test (manually, and then perhaps automated tests?):
> * adding a catalog item (via web-console), deploying an app, and
> restarting AMP
> * adding a catalog item (via web-console), deploying an app, deleting the
> catalog item (but not the app), and restarting AMP
> * Investigate what catalog ids are used when adding through the web-console
> (or did they manually choose the name OJ081XYKT_0?)
> * Make it configurable whether to continue startup after onCreateFailed
> (e.g. the web-console pops up with "there was an error...", but the user
> can click continue).
> * Brooklyn web-console to have a page showing all errors
> * Support "quick fixes" such as deleting the item(s).