I would prefer if PBE was enabled at system start and never touched again. That I think should be our goal. In the next release we can see if we can reach it.
I guess this is only an issue with 1PBE IMM?
So just moving the enable to an earlier point in time seems fine to me and in line with the (my) end goal.

/Hans

> -----Original Message-----
> From: Ingvar Bergström [mailto:ingvar.bergst...@ericsson.com]
> Sent: 19 December 2013 15:57
> To: Anders Björnerstedt; Bertil Engelholm
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] [PATCH 0 of 1] Review Request for SMF #677
>
> This patch is only about moving the point where the IMM content becomes
> persistent. This will improve system robustness during the test period.
> On this I think we can agree.
>
> What I want to know is whether we should let the quite few delete operations
> during commit be executed with PBE on, or whether it would be better to
> temporarily turn the PBE off and on again during the commit operation.
>
> The work to remove/handle the PRTO in another way is another enhancement
> ticket.
>
> So which one would you recommend?
> 1) Let the (quite few) PRTO delete operations during commit execute with PBE
>    enabled.
> 2) Turn the PBE off and on again for a short time for the few
>    PRTO delete operations during commit.
>
> /Ingvar
>
> -----Original Message-----
> From: Anders Björnerstedt
> Sent: 19 December 2013 15:26
> To: Ingvar Bergström; Bertil Engelholm
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: RE: [devel] [PATCH 0 of 1] Review Request for SMF #677
>
> I am not worried about performance. That's not the issue.
>
> I am worried about robustness (service unavailability), the degeneration of
> other services.
> You say you are not convinced that there is a difference in robustness.
> Well, I just explained the difference in robustness. The difference is
> ERR_TIMEOUT versus ERR_TRY_AGAIN. That translates to a
> difference in robustness.
>
> Of course, what I would really like to see is that you change the SMF design
> so that it does not use PRTOs at all. You could either
> convert to config data.
> But more fundamentally, the campaign objects don't really need to be
> persistent.
> As I understand it, you start with a campaign file in some XML format (that's
> the persistent start state).
> This is then converted to campaign runtime objects, which are derivative
> objects that contain the same information plus some
> runtime data that almost never needs to be persistent.
> For the few cases where you do need to perform a cluster restart step, you
> should be able to persist the limited amount of
> runtime data that you really need to have persistent.
> That should be possible to do either in a CCB or (possibly) in ONE PRTO.
>
> /AndersBj
>
> -----Original Message-----
> From: Ingvar Bergström
> Sent: 19 December 2013 15:14
> To: Anders Björnerstedt; Bertil Engelholm
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: RE: [devel] [PATCH 0 of 1] Review Request for SMF #677
>
> Campaigns I have seen typically have one or two procedures. So from SMF there
> will be one PRTO delete operation for each procedure.
> That said, I'm not convinced the system will gain performance/robustness if
> the PBE is turned off/on (again) during the SMF commit operation.
>
> /Ingvar
>
> -----Original Message-----
> From: Anders Björnerstedt
> Sent: 19 December 2013 14:57
> To: Ingvar Bergström; Bertil Engelholm
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: RE: [devel] [PATCH 0 of 1] Review Request for SMF #677
>
> How many subtrees would you typically have?
>
> If the entire campaign is ONE tree, you could delete the entire campaign as
> one cascading delete.
> That would actually be a better solution than deleting lots of objects
> individually, since one cascading delete will be one SQLite
> write/transaction. Of course, the only reason you need to delete the
> persistent structure is because you made it persistent.
> The irony is that the persistence is a total waste of resources for all
> campaigns except where there is a step that has a planned cluster restart.
>
> The term "safe" is relative. What I am worried about is the risk that a burst
> of individual PRTO writes, on the order of say 100 or more,
> could hog the PBE indefinitely. The problem here is the queue buildup,
> where the application would get ERR_TIMEOUT instead of
> ERR_TRY_AGAIN, particularly if an SC is restarting and DRBD is syncing at the
> same time.
>
> Yes, enabling PBE will regenerate the database and that also takes time, but
> services will get TRY_AGAIN in this state when
> attempting persistent writes. So this is much better behaved, and services
> must be designed so that they can deal with this. But few
> services are equipped to deal well with getting repeated ERR_TIMEOUT (on
> persistent write requests).
>
> /AndersBj
>
> -----Original Message-----
> From: Ingvar Bergström
> Sent: 19 December 2013 14:04
> To: Anders Björnerstedt; Bertil Engelholm
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: RE: [devel] [PATCH 0 of 1] Review Request for SMF #677
>
> If we switch on PBE in completed state, then switch off PBE at start of
> commit, then again switch on PBE at end of commit,
> is this more effective than keeping the PBE on during commit? I guess the IMM
> content is dumped twice if turned off during commit?
> I hope IMM is safe, so what would be safer? I thought the performance was the
> main concern.
>
> I don't know if it matters, but SMF PRTOs are deleted subtree by subtree, where
> the tops of the trees are the procedure objects. So
> there is a very limited number of delete operations to delete SMF internal
> objects.
>
> You see no problem switching PBE on/off/on in sequence? If the user commits
> the campaign fast, the first PBE on may not be finished
> before PBE off and then PBE on are received again.
>
> /Ingvar
>
> -----Original Message-----
> From: Anders Björnerstedt
> Sent: 19 December 2013 11:20
> To: Ingvar Bergström; Bertil Engelholm
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: RE: [devel] [PATCH 0 of 1] Review Request for SMF #677
>
> Well, the problem I am worried about is if there is a large number of PRTOs to
> be deleted. Then that is definitely a risky operation in
> itself with PBE turned on.
>
> But if you have deleted the bulk of the SMF PRTOs before the PBE is turned
> on, then no problem.
> The advantage of turning off the PBE again would then be that, IF there may
> be a substantial number of PRTOs to be deleted, that
> would be safer (less risk of service unavailability).
>
> /AndersBj
>
> -----Original Message-----
> From: Ingvar Bergström
> Sent: 19 December 2013 11:06
> To: Anders Björnerstedt; Bertil Engelholm
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: RE: [devel] [PATCH 0 of 1] Review Request for SMF #677
>
> That's a valid question. The difference from the existing solution is that the
> wrapup actions (if any) which are executed at commit will be
> executed with PBE turned on.
> Normally a limited number of IMM operations are defined in the
> wrapup portion of the campaign. That's the reason PBE in
> the proposal is not turned off again for the commit.
> To me the advantage of turning off/on PBE for the commit operation is very
> limited.
>
> /Ingvar
>
> -----Original Message-----
> From: Anders Björnerstedt
> Sent: 19 December 2013 09:36
> To: Ingvar Bergström; Bertil Engelholm
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: RE: [devel] [PATCH 0 of 1] Review Request for SMF #677
>
> One question on this SMF enhancement.
>
> I understand that PBE will be switched on towards the end of a campaign, but
> earlier than before.
>
> Does this mean that there remains a phase of cleanup/deletes of lots of PRTOs
> to be done after testing?
> If so then *maybe* PBE should be turned off again, during the final cleanup
> of campaign PRTOs, then on again.
>
> /AndersBj
>
> -----Original Message-----
> From: Ingvar Bergstrom [mailto:ingvar.bergst...@ericsson.com]
> Sent: 19 December 2013 08:49
> To: Bertil Engelholm
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: [devel] [PATCH 0 of 1] Review Request for SMF #677
>
> Summary: smfd: turn on PBE in state completed
> Review request for Trac Ticket(s): 677
> Peer Reviewer(s): bertil
> Pull request to:
> Affected branch(es): default
> Development branch: default
>
> --------------------------------
> Impacted area       Impact y/n
> --------------------------------
>  Docs                    n
>  Build system            n
>  RPM/packaging           n
>  Configuration files     n
>  Startup scripts         n
>  SAF services            n
>  OpenSAF services        y
>  Core libraries          n
>  Samples                 n
>  Tests                   n
>  Other                   n
>
>
> Comments (indicate scope for each "y" above):
> ---------------------------------------------
>  smfd: turn on PBE in state completed
>
> changeset bf8be15d6ab7daa218e2cf530d0064a0efabf0f5
> Author: Ingvar Bergstrom <ingvar.bergst...@ericsson.com>
> Date:   Thu, 19 Dec 2013 08:45:32 +0100
>
>     smfd: turn on PBE in state completed [677]
>
>     To avoid the need of restore in case of cluster reboot, SMF turn on PBE in
>     completed state.
>
>
> Complete diffstat:
> ------------------
>  osaf/services/saf/smfsv/smfd/SmfCampState.cc       |  14 +++++++++++---
>  osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc |   2 +-
>  2 files changed, 12 insertions(+), 4 deletions(-)
>
>
> Testing Commands:
> -----------------
>
>
> Testing, Expected Results:
> --------------------------
>
>
> Conditions of Submission:
> -------------------------
>
>
> Arch        Built     Started    Linux distro
> -------------------------------------------
> mips          n          n
> mips64        n          n
> x86           n          n
> x86_64        y          y
> powerpc       n          n
> powerpc64     n          n
>
>
> Reviewer Checklist:
> -------------------
> [Submitters: make sure that your review doesn't trigger any checkmarks!]
>
>
> Your checkin has not passed review because (see checked entries):
>
> ___ Your RR template is generally incomplete; it has too many blank entries
>     that need proper data filled in.
>
> ___ You have failed to nominate the proper persons for review and push.
>
> ___ Your patches do not have proper short+long header
>
> ___ You have grammar/spelling in your header that is unacceptable.
>
> ___ You have exceeded a sensible line length in your headers/comments/text.
>
> ___ You have failed to put in a proper Trac Ticket # into your commits.
>
> ___ You have incorrectly put/left internal data in your comments/files
>     (i.e. internal bug tracking tool IDs, product names etc)
>
> ___ You have not given any evidence of testing beyond basic build tests.
>     Demonstrate some level of runtime or other sanity testing.
>
> ___ You have ^M present in some of your files. These have to be removed.
>
> ___ You have needlessly changed whitespace or added whitespace crimes
>     like trailing spaces, or spaces before tabs.
>
> ___ You have mixed real technical changes with whitespace and other
>     cosmetic code cleanup changes. These have to be separate commits.
>
> ___ You need to refactor your submission into logical chunks; there is
>     too much content into a single commit.
>
> ___ You have extraneous garbage in your review (merge commits etc)
>
> ___ You have giant attachments which should never have been sent;
>     Instead you should place your content in a public tree to be pulled.
>
> ___ You have too many commits attached to an e-mail; resend as threaded
>     commits, or place in a public tree for a pull.
>
> ___ You have resent this content multiple times without a clear indication
>     of what has changed between each re-send.
>
> ___ You have failed to adequately and individually address all of the
>     comments and change requests that were proposed in the initial review.
>
> ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
>
> ___ Your computer has a badly configured date and time; confusing the
>     threaded patch review.
>
> ___ Your changes affect IPC mechanism, and you don't present any results
>     for in-service upgradability test.
>
> ___ Your changes affect user manual and documentation, your patch series
>     do not contain the patch that updates the Doxygen manual.
>
> _______________________________________________
> Opensaf-devel mailing list
> Opensaf-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
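
Anders's point in the thread is that a service can cope with ERR_TRY_AGAIN (retry later) much more easily than with repeated ERR_TIMEOUT on persistent writes. A minimal C++ sketch of that distinction follows. It is not taken from the SMF patch: the Rc enum, WriteResult and write_with_retry are hypothetical names used only for illustration, and a real service would use SaAisErrorT and the IMM API calls instead. The sketch also assumes the usual AIS reading of a timeout, namely that it is unspecified whether the call succeeded, so it does not retry in that case.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for the return codes discussed in the thread.
// A real service would use SaAisErrorT from the SAF headers instead.
enum class Rc { Ok, TryAgain, Timeout };

// Final return code plus the number of attempts that were made.
struct WriteResult {
  Rc rc;
  int attempts;
};

// Retry a persistent write while the callee answers TryAgain (for example
// while PBE is being enabled and the database is regenerated).
// Any other code, including Timeout, is handed back to the caller at once:
// after a timeout the outcome of the request is unknown, so a blind retry
// is not safe in general.
WriteResult write_with_retry(const std::function<Rc()>& write_op,
                             int max_attempts,
                             std::chrono::milliseconds delay) {
  assert(max_attempts > 0);
  Rc rc = Rc::TryAgain;
  int attempts = 0;
  while (attempts < max_attempts) {
    rc = write_op();
    ++attempts;
    if (rc != Rc::TryAgain) break;  // Ok or Timeout: stop retrying
    if (attempts < max_attempts) std::this_thread::sleep_for(delay);
  }
  return {rc, attempts};
}
```

A caller would wrap its own write request in write_op and decide separately what to do when Rc::Timeout comes back, for example by checking whether the object already exists before trying again.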