[tickets] [opensaf:tickets] #2674 pyosaf: add README for high level python interfaces

2017-11-20 Thread Long H Buu Nguyen via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

commit 89c560f5694007cde72f54e6af0834cd137bf506 (HEAD, origin/develop, develop)
Author: Long H Buu Nguyen 
Date:   Fri Nov 10 10:44:59 2017 +0700

pyosaf: add README for high level python interfaces [#2674]

commit 0aca746777bd4e5454cdd5dabdc04375ebe1f675 (HEAD, origin/release, release)
Author: Long H Buu Nguyen 
Date:   Fri Nov 10 10:44:59 2017 +0700

pyosaf: add README for high level python interfaces [#2674]



---

** [tickets:#2674] pyosaf: add README for high level python interfaces**

**Status:** fixed
**Milestone:** 5.17.11
**Created:** Fri Nov 10, 2017 03:40 AM UTC by Long H Buu Nguyen
**Last Updated:** Fri Nov 10, 2017 12:02 PM UTC
**Owner:** Long H Buu Nguyen


This ticket adds a README file for the high-level Python interfaces.


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #2688 NTF: Cold sync misses to sync alarm logged notification

2017-11-20 Thread Minh Hon Chau via Opensaf-tickets



---

** [tickets:#2688] NTF: Cold sync misses to sync alarm logged notification**

**Status:** accepted
**Milestone:** 5.18.01
**Created:** Tue Nov 21, 2017 04:29 AM UTC by Minh Hon Chau
**Last Updated:** Tue Nov 21, 2017 04:29 AM UTC
**Owner:** Minh Hon Chau


Steps to reproduce:
- Lock SI to raise alarm
- Unlock SI to cease alarm
- ntfread, alarms are printed
- reboot standby sc
- swap 2N Opensaf SI
- ntfread, no alarm is printed

After the standby SC is rebooted, the alarms that were logged on the active SC
are not synced to the standby SC, so after a switchover ntfread shows no alarms.


---



[tickets] [opensaf:tickets] #1398 smf: Add capability to redo CCBs that fail

2017-11-20 Thread elunlen via Opensaf-tickets
Work is ongoing on a handler that will contain all the IMM handling needed to 
modify the IMM model. This includes:
* Create, Modify and Delete of objects
* An easy-to-use generic C++ API where no IMM APIs have to be handled
* Handling all needed IMM (C) APIs
* Handling all rules associated with usage of the IMM APIs
* Handling all possible recovery when IMM APIs return something other than OK
* Etc...

Attached is a .h file with a proposed API for this handling
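
To make the idea of such a handler more concrete, here is a minimal sketch in Python (not the proposed C++ API from the attachment). It assumes that a CCB can be described as plain data, so that the same description can be submitted again later. All names (`CreateOp`, `ModifyOp`, `DeleteOp`, `CcbDescriptor`, `replay`) are hypothetical and do not come from immccb.h, and the create/modify/delete ordering is only a choice made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CreateOp:
    class_name: str             # hypothetical: IMM class of the new object
    parent_dn: str              # hypothetical: DN of the parent object
    attributes: Dict[str, list]


@dataclass
class ModifyOp:
    object_dn: str
    changes: Dict[str, list]


@dataclass
class DeleteOp:
    object_dn: str


@dataclass
class CcbDescriptor:
    """Plain-data description of one CCB, independent of any IMM handle."""
    creates: List[CreateOp] = field(default_factory=list)
    modifies: List[ModifyOp] = field(default_factory=list)
    deletes: List[DeleteOp] = field(default_factory=list)

    def operations(self) -> list:
        # Ordering is an assumption of this sketch only.
        return [*self.creates, *self.modifies, *self.deletes]


def replay(descriptor: CcbDescriptor, submit: Callable[[object], None]) -> int:
    """Hand every operation to `submit`; return the number of operations."""
    ops = descriptor.operations()
    for op in ops:
        submit(op)
    return len(ops)
```

Because the descriptor holds no IMM state, the caller can pass the same object to `replay` again after a failure; `submit` stands in for whatever code would actually talk to IMM.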


Attachments:

- [immccb.h](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/2179c610/dd3c/attachment/immccb.h) (14.1 kB; application/octet-stream)


---

** [tickets:#1398] smf: Add capability to redo CCBs that fail **

**Status:** accepted
**Milestone:** 5.18.01
**Created:** Wed Jul 01, 2015 02:07 PM UTC by Rafael Odzakow
**Last Updated:** Mon Nov 20, 2017 03:46 PM UTC
**Owner:** elunlen


CCBs may fail for a variety of resource-related reasons. SMF campaigns can
be made more robust if they are capable of redoing/replaying a CCB that has 
been aborted. A CCB that is aborted due to a validation error will not succeed
when replayed, but no damage will be done either. A CCB that is aborted due to
resource reasons may succeed when replayed, avoiding the abandonment of the
whole campaign.
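
The distinction between the two kinds of abort can be sketched as follows. This is an illustration in Python, not SMF code: `AbortReason`, `CcbAborted` and `apply_with_redo` are hypothetical names, and the sketch assumes the caller can tell a validation abort from a resource abort.

```python
from enum import Enum


class AbortReason(Enum):
    VALIDATION = "validation"   # replaying cannot help
    RESOURCE = "resource"       # replaying may succeed


class CcbAborted(Exception):
    """Hypothetical exception raised when a CCB is aborted."""

    def __init__(self, reason):
        super().__init__("CCB aborted: " + reason.value)
        self.reason = reason


def apply_with_redo(apply_ccb, max_attempts=3):
    """Call `apply_ccb`, redoing it only after resource-related aborts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return apply_ccb()
        except CcbAborted as exc:
            # A validation abort would fail again, so give up immediately.
            if exc.reason is AbortReason.VALIDATION:
                raise
            # Resource abort: redo unless the attempts are used up.
            if attempt == max_attempts:
                raise
```

Here `apply_ccb` is any callable that builds and commits the whole CCB from scratch, since an aborted CCB cannot be resumed.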


During the final stages of an upgrade campaign PBE is enabled. PBE is not ready 
until it attaches, so CCB operations will get TRY_AGAIN in that window. Once the
PBE has attached the IMM is persistent-write-available and CCB operations are
allowed again.

Any CCB that was started and added operations *before* the PBE was enabled by a
CCB is doomed. Because the CCB generated its operations before the PBE was
enabled, and thus before the PBE had even started, the PBE is unaware of these
pre-PBE-enable operations. Such a CCB would fail an op-count check during its
commit processing in the PBE. 

In 4.7-tentative an enhancement #1261 was implemented in the IMM service
to make this abort cleaner, i.e. to avoid the ugly op-count error in the PBE.
The PBE generates an admin-operation to abort *all* open CCBs (all CCBs that
are active but not critical), just before attaching. The problem was that the
first implementation of #1261 resulted in the PBE often attaching as OI *before*
the abort of non-critical CCBs had been processed. When the abort requested by 
the PBE was finally processed, it also aborted "innocent" CCBs that had actually
started *after* the PBE was attached as PBE-OI.

The syndrome as such, i.e. attach of PBE causing the abort of a valid CCB,
could still happen on earlier releases but was quite rare. The syslog
would then show the op-count error reported by the PBE. 

A possible improvement in SMF is to read the runtime-attribute:

   opensafImmNostdFlags

in the OpenSAF IMM object opensafImm=opensafImm,safApp=safImmService

and check that it is not  which would mean that PBE is attached.
But it is not really clear why this is needed in 4.7-tentative when it was
not needed earlier. 

CCBs may actually get aborted due to resource error at any time and not only in
conjunction with PBE enable. A general increase of the robustness of SMF campaigns
could be achieved by adding logic for redoing CCBs that fail unexpectedly.
If such a CCB was valid, i.e. it was aborted due to resource error and not
validation error, then it has a high probability of succeeding when retried.


IMM ticket related to this: #1261


Jun 29 10:36:35 SC-2-2 osafimmpbed: IN Admop for aborting CCBs result: 1, immsv returned 1
Jun 29 10:36:35 SC-2-2 osafimmpbed: NO Update epoch 63 committing with ccbId:10185/4294967685
Jun 29 10:36:36 SC-2-2 osafsmfd[4726]: NO CAMP: Start campaign complete actions (95)
Jun 29 10:36:36 SC-2-2 osafimmnd[4476]: NO Create of PERSISTENT runtime object 'smfRollbackElement=CampComplete,safSmfCampaign=ERIC-CMWUpgrade,safApp=safSmfService' (safSmfCampaign).
Jun 29 10:36:36 SC-2-2 osafimmnd[4476]: NO Ccb 305 COMMITTED (immcfg_SC-2-1_14718)
Jun 29 10:36:36 SC-2-2 osafimmnd[4476]: NO Ccb 306 COMMITTED (immcfg_SC-2-1_14741)
Jun 29 10:36:36 SC-2-2 osafimmnd[4476]: NO Ccb 307 COMMITTED (immcfg_SC-2-1_14764)
Jun 29 10:36:36 SC-2-2 osafimmnd[4476]: NO Ccb 308 COMMITTED (immcfg_SC-2-1_14787)
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Ccb 309 COMMITTED (immcfg_SC-2-1_14810)
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Ccb 310 COMMITTED (immcfg_SC-2-1_14833)
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Ccb 311 COMMITTED (immcfg_SC-2-1_14856)
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Ccb 312 COMMITTED (immcfg_SC-2-1_14879)
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Create of PERSISTENT runtime object 'smfRollbackElement=ccb_0002,smfRollbackElement=CampComplete,safSmfCampaign=ERIC-CMWUpgrade,safApp=safSmfService' (safSmfCampaign).
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO PBE-OI established on this SC. Dumping incrementally to file imm.db
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO CCB 313 aborted 

[tickets] [opensaf:tickets] #2646 dtm: Add a tool for syncing log messages to disk

2017-11-20 Thread Anders Widell via Opensaf-tickets
- **status**: unassigned --> accepted
- **assigned_to**: Anders Widell



---

** [tickets:#2646] dtm: Add a tool for syncing log messages to disk**

**Status:** accepted
**Milestone:** 5.18.01
**Created:** Thu Oct 19, 2017 11:20 AM UTC by Anders Widell
**Last Updated:** Mon Oct 30, 2017 05:42 PM UTC
**Owner:** Anders Widell


Add a command-line tool that can flush and sync all log streams to disk. Use 
this tool in the opensaf_reboot script before rebooting.
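
The difference between flushing and syncing can be illustrated with a short Python sketch. It is not the proposed tool and says nothing about how the tool would reach the log streams; it only shows the two steps for ordinary files this process has open. The helper name `flush_and_sync` is hypothetical.

```python
import os


def flush_and_sync(streams):
    """Push buffered data of each open file object down to the disk."""
    for stream in streams:
        # Step 1: move data from the user-space buffer to the kernel.
        stream.flush()
        # Step 2: ask the kernel to write its cached data for this file
        # to the storage device before returning.
        os.fsync(stream.fileno())
```

Calling only `flush()` leaves the data in the kernel's page cache, which is exactly what can be lost when a node is rebooted right afterwards.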


---



[tickets] [opensaf:tickets] #1398 smf: Add capability to redo CCBs that fail

2017-11-20 Thread elunlen via Opensaf-tickets
- **status**: unassigned --> accepted
- **assigned_to**: elunlen
- **Blocker**:  --> False
- **Milestone**: future --> 5.18.01



---

** [tickets:#1398] smf: Add capability to redo CCBs that fail **

**Status:** accepted
**Milestone:** 5.18.01
**Created:** Wed Jul 01, 2015 02:07 PM UTC by Rafael Odzakow
**Last Updated:** Wed Jul 15, 2015 12:02 PM UTC
**Owner:** elunlen


CCBs may fail for a variety of resource-related reasons. SMF campaigns can
be made more robust if they are capable of redoing/replaying a CCB that has 
been aborted. A CCB that is aborted due to a validation error will not succeed
when replayed, but no damage will be done either. A CCB that is aborted due to
resource reasons may succeed when replayed, avoiding the abandonment of the
whole campaign.


During the final stages of an upgrade campaign PBE is enabled. PBE is not ready 
until it attaches, so CCB operations will get TRY_AGAIN in that window. Once the
PBE has attached the IMM is persistent-write-available and CCB operations are
allowed again.
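
A caller can ride out such a TRY_AGAIN window by repeating the operation until it returns something else or a deadline passes. The Python sketch below assumes the operation is wrapped in a callable that returns a SAF AIS return code; `call_with_try_again` and its parameters are hypothetical names, and the timeout and interval values are arbitrary.

```python
import time

SA_AIS_OK = 1              # SaAisErrorT value for success
SA_AIS_ERR_TRY_AGAIN = 6   # SaAisErrorT value for "try again later"


def call_with_try_again(operation, timeout=10.0, interval=0.1,
                        sleep=time.sleep, clock=time.monotonic):
    """Repeat `operation` while it returns TRY_AGAIN, up to `timeout` seconds.

    Returns the last return code, which is still SA_AIS_ERR_TRY_AGAIN if the
    window did not close in time.
    """
    deadline = clock() + timeout
    while True:
        rc = operation()
        if rc != SA_AIS_ERR_TRY_AGAIN:
            return rc
        if clock() >= deadline:
            return rc
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.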

Any CCB that was started and added operations *before* the PBE was enabled by a
CCB is doomed. Because the CCB generated its operations before the PBE was
enabled, and thus before the PBE had even started, the PBE is unaware of these
pre-PBE-enable operations. Such a CCB would fail an op-count check during its
commit processing in the PBE. 

In 4.7-tentative an enhancement #1261 was implemented in the IMM service
to make this abort cleaner, i.e. to avoid the ugly op-count error in the PBE.
The PBE generates an admin-operation to abort *all* open CCBs (all CCBs that
are active but not critical), just before attaching. The problem was that the
first implementation of #1261 resulted in the PBE often attaching as OI *before*
the abort of non-critical CCBs had been processed. When the abort requested by 
the PBE was finally processed, it also aborted "innocent" CCBs that had actually
started *after* the PBE was attached as PBE-OI.

The syndrome as such, i.e. attach of PBE causing the abort of a valid CCB,
could still happen on earlier releases but was quite rare. The syslog
would then show the op-count error reported by the PBE. 

A possible improvement in SMF is to read the runtime-attribute:

   opensafImmNostdFlags

in the OpenSAF IMM object opensafImm=opensafImm,safApp=safImmService

and check that it is not  which would mean that PBE is attached.
But it is not really clear why this is needed in 4.7-tentative when it was
not needed earlier. 

CCBs may actually get aborted due to resource error at any time and not only in
conjunction with PBE enable. A general increase of the robustness of SMF campaigns
could be achieved by adding logic for redoing CCBs that fail unexpectedly.
If such a CCB was valid, i.e. it was aborted due to resource error and not
validation error, then it has a high probability of succeeding when retried.


IMM ticket related to this: #1261


Jun 29 10:36:35 SC-2-2 osafimmpbed: IN Admop for aborting CCBs result: 1, immsv returned 1
Jun 29 10:36:35 SC-2-2 osafimmpbed: NO Update epoch 63 committing with ccbId:10185/4294967685
Jun 29 10:36:36 SC-2-2 osafsmfd[4726]: NO CAMP: Start campaign complete actions (95)
Jun 29 10:36:36 SC-2-2 osafimmnd[4476]: NO Create of PERSISTENT runtime object 'smfRollbackElement=CampComplete,safSmfCampaign=ERIC-CMWUpgrade,safApp=safSmfService' (safSmfCampaign).
Jun 29 10:36:36 SC-2-2 osafimmnd[4476]: NO Ccb 305 COMMITTED (immcfg_SC-2-1_14718)
Jun 29 10:36:36 SC-2-2 osafimmnd[4476]: NO Ccb 306 COMMITTED (immcfg_SC-2-1_14741)
Jun 29 10:36:36 SC-2-2 osafimmnd[4476]: NO Ccb 307 COMMITTED (immcfg_SC-2-1_14764)
Jun 29 10:36:36 SC-2-2 osafimmnd[4476]: NO Ccb 308 COMMITTED (immcfg_SC-2-1_14787)
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Ccb 309 COMMITTED (immcfg_SC-2-1_14810)
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Ccb 310 COMMITTED (immcfg_SC-2-1_14833)
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Ccb 311 COMMITTED (immcfg_SC-2-1_14856)
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Ccb 312 COMMITTED (immcfg_SC-2-1_14879)
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Create of PERSISTENT runtime object 'smfRollbackElement=ccb_0002,smfRollbackElement=CampComplete,safSmfCampaign=ERIC-CMWUpgrade,safApp=safSmfService' (safSmfCampaign).
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO PBE-OI established on this SC. Dumping incrementally to file imm.db
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO CCB 313 aborted by: immadm -o 202 safRdn=immManagement,safApp=safImmService
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: WA Timeout while waiting for implementer, aborting ccb:313
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Ccb 313 ABORTED (SMFSERVICE)
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: WA >>s_info->to_svc == 0<< reply context destroyed before this reply could be made
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: WA Failed to send response to agent/client over MDS
Jun 29 10:36:37 SC-2-2 osafimmnd[4476]: NO Ccb <313> not in 

[tickets] [opensaf:tickets] #2637 base: Trace messages can be dropped

2017-11-20 Thread Anders Widell via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

commit 37bd253efec52d924eb0757fe6ce049f5a264d86 (HEAD -> release, origin/release)
Author: Anders Widell 
Date:   Mon Nov 20 13:46:17 2017 +0100

base: Send logtrace message in blocking mode to avoid dropped messages [#2637]

Add the possibility to select between blocking and non-blocking send and
receive operations on the UnixSocket, and use blocking mode when sending
logtrace trace messages. Trace messages will thus not be lost, but excessive
tracing can slow down the service being traced. This patch also contains a small
optimization to reduce the size of each trace log message, by avoiding fixed
field widths and redundant characters.




---

** [tickets:#2637] base: Trace messages can be dropped**

**Status:** fixed
**Milestone:** 5.18.01
**Created:** Thu Oct 19, 2017 10:13 AM UTC by Anders Widell
**Last Updated:** Mon Nov 13, 2017 03:26 PM UTC
**Owner:** Anders Widell


Since the new logtrace implementation uses non-blocking send, trace messages 
may be dropped when a large number of trace messages is generated in a 
burst. The suggestion is to use blocking send instead, to avoid loss of trace 
messages.
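
The effect can be reproduced outside OpenSAF with a pair of Unix datagram sockets. The Python sketch below is not the logtrace code; it assumes Linux semantics, where a non-blocking send to a full receive queue fails immediately and a blocking send waits for space. `send_burst`, `PAYLOAD` and `QUEUE_FULL` are names made up for this illustration.

```python
import errno
import socket
import threading

# errno values that mean "the receiver's queue is full right now".
QUEUE_FULL = {errno.EAGAIN, errno.EWOULDBLOCK, errno.ENOBUFS}
PAYLOAD = b"t" * 256  # stand-in for one trace message


def send_burst(count, blocking):
    """Send `count` datagrams in a burst; return (received, dropped)."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    sender.setblocking(blocking)
    receiver.settimeout(5.0)  # safety net so the reader cannot hang forever
    received = 0
    dropped = 0

    def drain(expected):
        nonlocal received
        try:
            for _ in range(expected):
                receiver.recv(4096)
                received += 1
        except socket.timeout:
            pass

    reader = None
    if blocking:
        # A concurrent reader is required, otherwise the sender would wait
        # forever once the queue fills up.
        reader = threading.Thread(target=drain, args=(count,))
        reader.start()

    for _ in range(count):
        try:
            sender.send(PAYLOAD)
        except OSError as exc:
            if exc.errno not in QUEUE_FULL:
                raise
            dropped += 1  # non-blocking send: this message is lost

    if reader is None:
        # Model a reader that was busy during the burst: it only sees the
        # messages that fitted in the queue.
        drain(count - dropped)
    else:
        reader.join()
    sender.close()
    receiver.close()
    return received, dropped
```

With `blocking=False` a large burst loses every message that does not fit in the socket queue, while `blocking=True` delivers all of them at the cost of stalling the sender whenever the reader falls behind.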


---
