[ https://issues.apache.org/jira/browse/BROOKLYN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aled Sage updated BROOKLYN-149:
-------------------------------
    Description: 
A customer's Brooklyn instance failed to rebind on restart. The error was:

{noformat}
vcompose1476-compose-amp.console-v1.5.3.log:2015-05-15 06:57:12,808 ERROR 
Management node zdJa2A7Y enountered problem during rebind when promoting self 
to master; demoting to FAILED and rethrowing: 
brooklyn.util.exceptions.PropagatedRuntimeException: Failure rebinding, 71 
errors including: problem creating ENTITY Ocs2eaWX of type 
brooklyn.entity.nosql.riak.RiakClusterImpl: Failed to load catalog item 
OJ081XYKT_0=:1.0 required for rebinding.
oklyn-Allow-Non-Master-Access' to force)
{noformat}

The full exception was:

{noformat}
2015-05-15 06:40:31,369 WARN  b.e.r.RebindExceptionHandlerImpl 
[brooklyn-execmanager-boo0I83w-0]: No catalog item found with id 
OJ081XYKT_0=:1.0; returning null
2015-05-15 06:40:31,395 WARN  b.e.r.RebindExceptionHandlerImpl 
[brooklyn-execmanager-boo0I83w-0]: Rebind: continuing after problem creating 
ENTITY Ocs2eaWX of type brooklyn.entity.nosql.riak.RiakClusterImpl
java.lang.IllegalStateException: Failed to load catalog item OJ081XYKT_0=:1.0 
required for rebinding.
        at 
brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.getLoadingContextFromCatalogItemId(RebindIteration.java:903)
 ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.load(RebindIteration.java:869)
 ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.newEntity(RebindIteration.java:814)
 ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindIteration.instantiateLocationsAndEntities(RebindIteration.java:407)
 [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindIteration.doRun(RebindIteration.java:234) 
[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.InitialFullRebindIteration.doRun(InitialFullRebindIteration.java:69)
 [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at brooklyn.entity.rebind.RebindIteration.run(RebindIteration.java:260) 
[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindManagerImpl.rebindImpl(RebindManagerImpl.java:545) 
[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindManagerImpl$3.call(RebindManagerImpl.java:496) 
[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindManagerImpl$3.call(RebindManagerImpl.java:494) 
[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.util.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:469)
 [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
[na:1.7.0_71]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_71]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_71]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
{noformat}

In the persisted state, there is no mention of the catalog item OJ081XYKT_0.

My assumption is that the customer manually added a catalog item (via the 
web-console), deployed an app (entitled "<snip> riak", of type RiakCluster), 
and then deleted the catalog item (or that "deletion" could have been an issue 
with persistence of catalog items - see 
https://github.com/apache/incubator-brooklyn/pull/555).

The desired behaviour is that this does not cause the entire Brooklyn instance 
to fail to rebind/start.
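
As an illustration of that desired behaviour, here is a minimal sketch in which a missing catalog item is recorded as a problem for that one entity while the remaining entities are still created. It is not Brooklyn's rebind code: the class `TolerantRebindSketch`, its `Result` type, the `rebind` method, and the ids `entityB` and `my-item` are all hypothetical, and a plain String stands in for the loading context.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names are Brooklyn's real API. */
class TolerantRebindSketch {

    /** Result of one rebind pass: the entities created, plus the problems seen. */
    static final class Result {
        final List<String> createdEntityIds = new ArrayList<>();
        final List<String> problems = new ArrayList<>();
    }

    /**
     * @param entityToCatalogItem persisted entity id -> id of the catalog item it came from
     * @param catalog             catalog item id -> loading context (a String stands in here)
     */
    static Result rebind(Map<String, String> entityToCatalogItem,
                         Map<String, String> catalog) {
        Result result = new Result();
        for (Map.Entry<String, String> e : entityToCatalogItem.entrySet()) {
            String context = catalog.get(e.getValue());
            if (context == null) {
                // Record the problem and carry on, instead of aborting the whole rebind.
                result.problems.add("problem creating ENTITY " + e.getKey()
                        + ": catalog item " + e.getValue() + " not found");
                continue;
            }
            result.createdEntityIds.add(e.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> entities = new LinkedHashMap<>();
        entities.put("Ocs2eaWX", "OJ081XYKT_0"); // catalog item no longer present
        entities.put("entityB", "my-item");      // hypothetical healthy entity
        Map<String, String> catalog = new LinkedHashMap<>();
        catalog.put("my-item", "context-for-my-item");

        Result r = rebind(entities, catalog);
        System.out.println("created=" + r.createdEntityIds
                + " problems=" + r.problems.size());
    }
}
```

In this sketch the entity whose catalog item is missing is skipped and reported, and the other entity is unaffected.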

---
There are several potential things to investigate/improve:

* Test (manually, and then perhaps automated tests?):
   * adding a catalog item (via web-console), deploying an app, and restarting 
AMP
   * adding a catalog item (via web-console), deploying an app, deleting the 
catalog item (but not the app), and restarting AMP

* Investigate what catalog ids are used when adding through the web-console  
  (or did they manually choose the name OJ081XYKT_0?)

* Make it configurable whether to continue startup after onCreateFailed  
  (e.g. the web-console pops up with "there was an error...", but the user can 
click continue).

* Brooklyn web-console to have a page showing all errors
  * Support "quick fixes" such as deleting the item(s).
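
The configurable-continue idea in the list above could look roughly like the following sketch. Everything here is an assumption made for illustration: the enum `OnCreateFailed`, its values `FAIL` and `CONTINUE`, the `Outcome` type and the `finishRebind` method are hypothetical names, not existing Brooklyn configuration or API.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a configurable "continue on create failure" switch. */
class CreateFailurePolicySketch {

    /** Hypothetical modes; the names are illustrative only. */
    enum OnCreateFailed { FAIL, CONTINUE }

    /** Outcome of startup once all create errors have been collected. */
    static final class Outcome {
        final boolean started;
        final List<String> errorsToShow;

        Outcome(boolean started, List<String> errorsToShow) {
            this.started = started;
            this.errorsToShow = errorsToShow;
        }
    }

    static Outcome finishRebind(OnCreateFailed mode, List<String> createErrors) {
        if (createErrors.isEmpty()) {
            return new Outcome(true, createErrors);
        }
        // In CONTINUE mode the node still starts; the errors are kept so that a
        // console page could list them and offer fixes such as deleting the item.
        return new Outcome(mode == OnCreateFailed.CONTINUE, createErrors);
    }

    public static void main(String[] args) {
        List<String> errors = Arrays.asList("problem creating ENTITY Ocs2eaWX");
        System.out.println(finishRebind(OnCreateFailed.FAIL, errors).started);
        System.out.println(finishRebind(OnCreateFailed.CONTINUE, errors).started);
    }
}
```

With one create error, `main` prints `false` for `FAIL` and `true` for `CONTINUE`. In both cases the collected errors remain available for an errors page.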


  was:
A customer's Brooklyn instance failed to rebind on restart. The error was:

{noformat}
vcompose1476-compose-amp.console-v1.5.3.log:2015-05-15 06:57:12,808 ERROR 
Management node zdJa2A7Y enountered problem during rebind when promoting self 
to master; demoting to FAILED and rethrowing: 
brooklyn.util.exceptions.PropagatedRuntimeException: Failure rebinding, 71 
errors including: problem creating ENTITY Ocs2eaWX of type 
brooklyn.entity.nosql.riak.RiakClusterImpl: Failed to load catalog item 
OJ081XYKT_0=:1.0 required for rebinding.
oklyn-Allow-Non-Master-Access' to force)
{noformat}

The full exception was:

{noformat}
2015-05-15 06:40:31,369 WARN  b.e.r.RebindExceptionHandlerImpl 
[brooklyn-execmanager-boo0I83w-0]: No catalog item found with id 
OJ081XYKT_0=:1.0; returning null
2015-05-15 06:40:31,395 WARN  b.e.r.RebindExceptionHandlerImpl 
[brooklyn-execmanager-boo0I83w-0]: Rebind: continuing after problem creating 
ENTITY Ocs2eaWX of type brooklyn.entity.nosql.riak.RiakClusterImpl
java.lang.IllegalStateException: Failed to load catalog item OJ081XYKT_0=:1.0 
required for rebinding.
        at 
brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.getLoadingContextFromCatalogItemId(RebindIteration.java:903)
 ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.load(RebindIteration.java:869)
 ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.newEntity(RebindIteration.java:814)
 ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindIteration.instantiateLocationsAndEntities(RebindIteration.java:407)
 [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindIteration.doRun(RebindIteration.java:234) 
[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.InitialFullRebindIteration.doRun(InitialFullRebindIteration.java:69)
 [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at brooklyn.entity.rebind.RebindIteration.run(RebindIteration.java:260) 
[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindManagerImpl.rebindImpl(RebindManagerImpl.java:545) 
[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindManagerImpl$3.call(RebindManagerImpl.java:496) 
[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.entity.rebind.RebindManagerImpl$3.call(RebindManagerImpl.java:494) 
[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at 
brooklyn.util.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:469)
 [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
        at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
[na:1.7.0_71]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_71]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_71]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
{noformat}

In the persisted state, there is no mention of the catalog item OJ081XYKT_0.

My assumption is that the customer manually added a catalog item (via the 
web-console), deployed an app (entitled "<snip> riak", of type RiakCluster), 
and then deleted the catalog item (or that "deletion" could have been an issue 
with persistence of catalog items - see 
https://github.com/apache/incubator-brooklyn/pull/555).

The desired behaviour is that this does not cause the entire Brooklyn instance 
to fail to rebind/start.

---
There are several potential things to investigate/improve:

* Test (manually, and then perhaps automated tests?):
  * adding a catalog item (via web-console), deploying an app, and restarting 
AMP
  * adding a catalog item (via web-console), deploying an app, deleting the 
catalog item (but not the app), and restarting AMP

* Investigate what catalog ids are used when adding through the web-console  
  (or did they manually choose the name OJ081XYKT_0?)

* Make it configurable whether to continue startup after onCreateFailed  
  (e.g. the web-console pops up with "there was an error...", but the user can 
click continue).

* Brooklyn web-console to have a page showing all errors
  * Support "quick fixes" such as deleting the item(s).



> Rebind failed when entity's catalog item not found
> --------------------------------------------------
>
>                 Key: BROOKLYN-149
>                 URL: https://issues.apache.org/jira/browse/BROOKLYN-149
>             Project: Brooklyn
>          Issue Type: Bug
>    Affects Versions: 0.7.0-SNAPSHOT
>            Reporter: Aled Sage
>             Fix For: 0.7.0-SNAPSHOT
>
>
> A customer's Brooklyn instance failed to rebind on restart. The error was:
> {noformat}
> vcompose1476-compose-amp.console-v1.5.3.log:2015-05-15 06:57:12,808 ERROR 
> Management node zdJa2A7Y enountered problem during rebind when promoting self 
> to master; demoting to FAILED and rethrowing: 
> brooklyn.util.exceptions.PropagatedRuntimeException: Failure rebinding, 71 
> errors including: problem creating ENTITY Ocs2eaWX of type 
> brooklyn.entity.nosql.riak.RiakClusterImpl: Failed to load catalog item 
> OJ081XYKT_0=:1.0 required for rebinding.
> oklyn-Allow-Non-Master-Access' to force)
> {noformat}
> The full exception was:
> {noformat}
> 2015-05-15 06:40:31,369 WARN  b.e.r.RebindExceptionHandlerImpl 
> [brooklyn-execmanager-boo0I83w-0]: No catalog item found with id 
> OJ081XYKT_0=:1.0; returning null
> 2015-05-15 06:40:31,395 WARN  b.e.r.RebindExceptionHandlerImpl 
> [brooklyn-execmanager-boo0I83w-0]: Rebind: continuing after problem creating 
> ENTITY Ocs2eaWX of type brooklyn.entity.nosql.riak.RiakClusterImpl
> java.lang.IllegalStateException: Failed to load catalog item OJ081XYKT_0=:1.0 
> required for rebinding.
>         at 
> brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.getLoadingContextFromCatalogItemId(RebindIteration.java:903)
>  ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
>         at 
> brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.load(RebindIteration.java:869)
>  ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
>         at 
> brooklyn.entity.rebind.RebindIteration$BrooklynObjectInstantiator.newEntity(RebindIteration.java:814)
>  ~[brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
>         at 
> brooklyn.entity.rebind.RebindIteration.instantiateLocationsAndEntities(RebindIteration.java:407)
>  [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
>         at 
> brooklyn.entity.rebind.RebindIteration.doRun(RebindIteration.java:234) 
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
>         at 
> brooklyn.entity.rebind.InitialFullRebindIteration.doRun(InitialFullRebindIteration.java:69)
>  [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
>         at 
> brooklyn.entity.rebind.RebindIteration.run(RebindIteration.java:260) 
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
>         at 
> brooklyn.entity.rebind.RebindManagerImpl.rebindImpl(RebindManagerImpl.java:545)
>  [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
>         at 
> brooklyn.entity.rebind.RebindManagerImpl$3.call(RebindManagerImpl.java:496) 
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
>         at 
> brooklyn.entity.rebind.RebindManagerImpl$3.call(RebindManagerImpl.java:494) 
> [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
>         at 
> brooklyn.util.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:469)
>  [brooklyn-core-0.7.0-20150509.1751.jar:0.7.0-20150509.1751]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> [na:1.7.0_71]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> {noformat}
> In the persisted state, there is no mention of the catalog item OJ081XYKT_0.
> My assumption is that the customer manually added a catalog item (via the 
> web-console), deployed an app (entitled "<snip> riak", of type RiakCluster), 
> and then deleted the catalog item (or that "deletion" could have been an 
> issue with persistence of catalog items - see 
> https://github.com/apache/incubator-brooklyn/pull/555).
> The desired behaviour is that this does not cause the entire Brooklyn 
> instance to fail to rebind/start.
> ---
> There are several potential things to investigate/improve:
> * Test (manually, and then perhaps automated tests?):
>    * adding a catalog item (via web-console), deploying an app, and 
> restarting AMP
>    * adding a catalog item (via web-console), deploying an app, deleting the 
> catalog item (but not the app), and restarting AMP
> * Investigate what catalog ids are used when adding through the web-console  
>   (or did they manually choose the name OJ081XYKT_0?)
> * Make it configurable whether to continue startup after onCreateFailed  
>   (e.g. the web-console pops up with "there was an error...", but the user 
> can click continue).
> * Brooklyn web-console to have a page showing all errors
>   * Support "quick fixes" such as deleting the item(s).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
