Taylor,

Have you considered the use of an external object store for persistence [1]?
Robert

[1] http://brooklyn.apache.org/v/latest/ops/persistence/index.html#object-store-persistence

On 31 July 2017 at 15:04, Taylor <[email protected]> wrote:

> Duncan,
>
> Thanks so much for the thorough response!
>
> I will be reviewing the links you sent today.
>
> With respect to the snapshot: I am running brooklyn on a CentOS VM hosted
> on XenServer. Since the original email I have been experimenting with
> snapshots to try and diagnose what the issue is. The only way I can take a
> snapshot and revert is if I stop the service and power off the vm before
> taking a snapshot (disk only, no memory). If I take the snapshot while the
> service is running or after the service is stopped, the persisted state
> will get corrupted.
>
> This has me worried for the case of a production outage.
>
> Are there any tools to aid in fixing the persisted state manually?
>
> What mechanism is safe for backing up the persisted state? Can I back up
> while the service is running?
>
> Thanks,
>
> Taylor
>
> ________________________________
> From: Duncan Grant <[email protected]>
> Sent: Monday, July 31, 2017 3:31 AM
> To: [email protected]
> Cc: Taylor
> Subject: Re: Booklyn fails to start
>
> Taylor,
>
> The error you're seeing is with Brooklyn failing to rebind to persisted
> state [1]. Could you explain what you mean when you are talking about
> taking a snapshot and then reverting to the snapshot (do you mean the VM
> image where you are running brooklyn)?
>
> There are a couple of ways to deal with problems with persisted state.
> You can either fix the persisted state manually [2] or you can have brooklyn
> ignore errors with persisted state when it starts [4]. Both of these run
> the risk of brooklyn becoming detached from existing applications, so back
> up your persistence directory (or object store) first.
>
> Let me know if this helps (or doesn't) or I'm on IRC just now if you'd
> like some answers in real-time.
>
> Regards
>
> Duncan
>
> [1] https://brooklyn.apache.org/v/latest/ops/persistence/index.html#rebinding-to-state
> Persistence - Apache Brooklyn
> brooklyn.apache.org
> Persistence. Brooklyn can be configured to persist its state so that the
> Brooklyn server can be restarted, or so that a high availability standby
> server can take over.
>
> [2] https://brooklyn.apache.org/v/latest/ops/persistence/index.html#determine-underlying-cause
>
> [3] https://brooklyn.apache.org/v/latest/ops/persistence/index.html#fix-up-the-state
>
> [4] https://brooklyn.apache.org/v/latest/ops/persistence/index.html#ignore-errors
>
> On Mon, 31 Jul 2017 at 07:57 Taylor <[email protected]> wrote:
> I am having a problem with brooklyn. If I start/stop the service things
> are ok.
> If I snapshot and revert to snapshot I see the following:
>
>
> [root@localhost ~]# systemctl status brooklyn
> brooklyn.service - Apache Brooklyn Service
>    Loaded: loaded (/etc/systemd/system/multi-user.target.wants/brooklyn.service)
>    Active: active (running) since Sun 2017-07-30 17:19:55 EDT; 44s ago
>      Docs: https://brooklyn.apache.org/documentation/index.html
>  Main PID: 651 (java)
>    CGroup: /system.slice/brooklyn.service
>            └─651 /usr/bin/java -Dbrooklyn.location.localhost.address=127.0.0.1 -XX:SoftRefLRUPolicyMSPerMB=1 -Dlogback.configurationFile=/etc/brooklyn/logback.xml -Xms256m -Xmx1g -XX:MaxP...
>
> Jul 30 17:20:11 localhost.localdomain java[651]: 2017-07-30 17:20:11,553 INFO Started Brooklyn console at http://127.0.0.1:8081/, running classpath://brooklyn.war@/
> Jul 30 17:20:13 localhost.localdomain java[651]: 2017-07-30 17:20:13,401 INFO Geo info lookup for 127.0.0.1/127.0.0.1 returned: HostGeoInfo[RCN Corporation, Chicago (US): 127...4096374512)]
> Jul 30 17:20:13 localhost.localdomain java[651]: 2017-07-30 17:20:13,736 ERROR Subsystem for persistence had startup error (continuing with startup): java.lang.IllegalStateExc...was scanning
> Jul 30 17:20:13 localhost.localdomain java[651]: java.lang.IllegalStateException: Node record nodes/vmL5HEpG could not be read when upxGnvJq was scanning
> Jul 30 17:20:13 localhost.localdomain java[651]: at org.apache.brooklyn.core.mgmt.ha.ManagementPlaneSyncRecordPersisterToObjectStore.loadSyncRecord(ManagementPlaneSyncRecordPe....jar:0.11.0]
> Jul 30 17:20:13 localhost.localdomain java[651]: 2017-07-30 17:20:13,736 WARN Loading catalog for INITIALIZING as part of launch sequence (it was not loaded as part of the rebind sequence)
> Jul 30 17:20:18 localhost.localdomain java[651]: 2017-07-30 17:20:18,851 INFO Launched Brooklyn; will now block until shutdown command received via GUI/API (recommended) or p...s interrupt.
> Jul 30 17:20:28 localhost.localdomain java[651]: 2017-07-30 17:20:28,309 WARN Disallowing web request as server not in required HA hot state: http://192.168.1.14:8081/v1/catalog/applicat...
> Jul 30 17:20:28 localhost.localdomain java[651]: 2017-07-30 17:20:28,309 WARN Disallowing web request as server not in required HA hot state: http://192.168.1.14:8081/v1/loca...s' to force)
> Jul 30 17:20:28 localhost.localdomain java[651]: 2017-07-30 17:20:28,309 WARN Disallowing web request as server not in required HA hot state: http://192.168.1.14:8081/v1/catalog/entities...
> Hint: Some lines were ellipsized, use -l to show in full.
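
Duncan's advice in the quoted thread is to back up the persistence directory before trying either fix. A minimal sketch of that step follows, under loudly stated assumptions: the thread never gives the location of the persistence directory, so the paths here are hypothetical, and the function name `backup_persistence_dir` is made up for illustration. It is a plain file copy, not a Brooklyn tool. The thread also does not settle whether copying while the service is running is safe, so this assumes Brooklyn has been stopped first.

```python
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def backup_persistence_dir(persistence_dir: Path, backup_root: Path) -> Path:
    """Copy persistence_dir into a new timestamped directory under backup_root.

    Assumption: the Brooklyn service is stopped before this is called.
    The copy fails rather than overwriting if the target already exists.
    """
    if not persistence_dir.is_dir():
        raise FileNotFoundError(f"no persistence directory at {persistence_dir}")
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_root / f"{persistence_dir.name}-{stamp}"
    # copytree creates target (and missing parents) and raises if it exists
    shutil.copytree(persistence_dir, target)
    return target


if __name__ == "__main__":
    # Demo against a throwaway directory; real paths are site-specific.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "persisted-state"  # hypothetical directory name
        (src / "nodes").mkdir(parents=True)
        (src / "nodes" / "example").write_text("placeholder")
        copy = backup_persistence_dir(src, Path(tmp) / "backups")
        print("backup written to", copy)
```

Keeping each backup in its own timestamped directory means a failed manual fix can be undone by restoring the copy taken just before it.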
