TC-4
Attachments:
-
[TC-4.rar](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/a760/attachment/TC-4.rar)
(353.9 kB; application/octet-stream)
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** fixed
**Milestone:** 5.1.FC
Hi Praveen, Nagu,
Thanks for reminding me of this; I will add these points to the PR doc and README.
In the #1725 part 2 series, there is a patch that tries to detect inappropriate
RTAs read from IMM after headless. It could happen for AdminState also, since
the IMM update is queued at AMFD. Decision is
Yes, these points need to be mentioned in the Amf PR doc.
Minh, you can add some more scenarios if you want, and please provide some
details when writing these points.
Thanks
-Nagu
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** fixed
**Milestone:**
Hi Minh,
Since this ticket is marked fixed, we need to document some discussed
conclusions (limitations) where the application will not be recovered. These are
based on the following cases discussed in this ticket.
1) When the system becomes headless when AMFD sends some assignment message because
of admin
The attached file is a rebased version of the pending patch on top of cs 8091, plus one
bug fix, 08_donot_add_su_list_if_no_pending_susi.diff.
Apply order:
03_V4_failover_absent_susi_longDn.diff
04_V2_headless_validation.diff
05_V2_resend_oper_state.diff
06a_fullscope_escalation_headless.diff
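A minimal sketch of turning the apply order above into shell commands, in case it helps when scripting the setup. The use of `patch -p1 -i` is an assumption (the thread does not say which tool is used to apply the patches), and `build_apply_commands` is a hypothetical helper name.

```python
import shlex


def build_apply_commands(patches, tool=("patch", "-p1", "-i")):
    """Return one shell command string per patch, preserving the given order.

    The default tool is an assumption; pass a different tuple if the
    patches are applied another way.
    """
    return [
        " ".join(shlex.quote(part) for part in (*tool, name))
        for name in patches
    ]


# Order copied from the list above.
APPLY_ORDER = [
    "03_V4_failover_absent_susi_longDn.diff",
    "04_V2_headless_validation.diff",
    "05_V2_resend_oper_state.diff",
    "06a_fullscope_escalation_headless.diff",
]

for command in build_apply_commands(APPLY_ORDER):
    print(command)
```

The list is kept explicit rather than sorted by file name, because the order is what the author specifies, not something derived from the names.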
Attached patch to fix the following tests from Nagu:
Configuration: SU1(act) and SU2(standby) both on PL-3.
TC #1: Start SC-1, PL-3 and PL-5: Unlock SU1 and SU2. Stop SC-1 and stop PL-3,
start PL-3 and start SC-1.
After SC-1 and PL-3 come back, ideally SU1 and SU2 should get assignments as
Act
Hi Nagu
The node restart has been sent out for review; the recovery faults patches have
been attached to this ticket.
https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/3408/attachment/1725_pending_review.tgz
The test cases have been attached to this ticket as well.
1. Do you mean all the faults and recovery during admin operations in 2.b or
just node restart ?
2. Can you please provide the patches in a tar file for @2? Can you please upload the test
cases for phase @2?
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** fixed
I copy the implementation phases as mentioned in prior comments
@1. Admin op continuation without required recovery on faults during headless
@1.a) All CSI(s) callback completes during headless, but SUSI states are still
QUIESCED/QUIESCING
@1.b) One of CSI(s) callback is still ongoing after
Hi Minh,
Your phase #3 should include the following (without SI Dep), please confirm:
3. Without SI Dep : Admin Op + node restart faults/escalations during headless.
4. Without SI Dep :
a.) All faults in normal flows.
b.) All faults during admin operation (minus node reboot during headless as
Attached are all patches pending review.
The saAmfClusterStartupTimeout issue and nodegroup support are agreed to be fixed
after FC.
Attachments:
-
[1725_pending_review.tgz](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/3408/attachment/1725_pending_review.tgz)
(12.1 kB;
- **status**: review --> fixed
- **assigned_to**: Minh Hon Chau --> nobody
- **Comment**:
Only admin continuation for 2N without SI dep is supported, with a known issue when
saAmfClusterStartupTimeout is less than 10 seconds.
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
Attached is a patch to restore NG state, which can help to pass TC NG from Nagu.
But it's not a complete solution to restore NG, because
- the patch assumes all SUs belong to one NG (ng_admin runs under
AVD_SG_FSM_ADMIN). If not all SUs are in one nodegroup, then ng_admin runs under
AVD_SG_FSM_SU_OPER.
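The rule in the comment above can be restated as a small sketch. It only mirrors the condition described in this message and is not AMFD code; the function name and the set-based inputs are hypothetical, while the two state names are taken from the message.

```python
def ng_admin_fsm_state(sg_sus, ng_sus):
    """Return the SG FSM state under which ng_admin would run.

    sg_sus: set of SU names in the service group (hypothetical input)
    ng_sus: set of SU names hosted on the node group's nodes (hypothetical input)
    """
    # All SUs of the SG are covered by the node group: the case the patch assumes.
    if sg_sus <= ng_sus:
        return "AVD_SG_FSM_ADMIN"
    # Only some SUs are in the node group: the case the patch does not handle.
    return "AVD_SG_FSM_SU_OPER"


print(ng_admin_fsm_state({"SU1", "SU2"}, {"SU1", "SU2"}))
print(ng_admin_fsm_state({"SU1", "SU2"}, {"SU1"}))
```

A restore of NG state after headless would need to cover the second branch as well, which is why the patch is described as incomplete.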
Node Group TC
Attachments:
- [TC
NG.rar](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/21c5/attachment/TC%20NG.rar)
(282.0 kB; application/octet-stream)
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** review
**Milestone:**
One bug fix from Praveen; please apply it for the failed TC.
Attachments:
-
[1725_02_V2_bugfix_1_honor_cluster_sync_timer.diff](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/d5a5/attachment/1725_02_V2_bugfix_1_honor_cluster_sync_timer.diff)
(1.7 kB; text/x-patch)
---
**
With 1725_phase_1_V2.tgz also, the TC failed.
Attachments:
- [TC
#2.rar](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/c9e5/attachment/TC%20%232.rar)
(398.3 kB; application/octet-stream)
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
Attached is the series for fixing TC#1. Apply order:
1725_01_V5_intro_new_rta_states_longDn.diff
1725_02_V2_resend_su_si_assign_msg_longDn.diff
1725_02_V2_bugfix_resend_buffer_in_set_leds.diff
Attachments:
-
Attached is the longDN rebased version of #1725 part 2: Support recovery for node
restart/power off while headless.
The attached patch is equivalent to: [PATCH 3 of 4] AMFD: Failover absent
assignment due to node restart or powered off while headless [#1725 part 2] +
some bug fixes
Attachments:
-
Hi Minh,
I am going through the patches in 1725_phase1.tgz. Some initial comments:
1) In patch 2, avnd_diq_rec_send_buffered_msg() checks for the presence of the SUSI
and only then sends the buffered message to AMFD. If removal of assignments
completes during headless, AMFND deletes the SUSIs in
Yes, these changes have been made recently. They are commented in the patch floated
for review (on 18/08/2016).
I am just copying it here in case something is wrong with the review request email:
"
If there's an admin operation running and at that time cluster goes into
headless stage, the normal admin
In 1725_phase1.tgz, I am seeing the following extra changes compared to the V2
version:
1)Restoring suswitch.
2)Taking assignment state from AMFND. But it is not used and gets overwritten
from IMM state.
3)MDS version update because of assignment being taken from AMFND.
Thanks,
Praveen
---
**
TC #2
Attachments:
- [TC
#2.rar](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/342c/attachment/TC%20%232.rar)
(345.0 kB; application/octet-stream)
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** review
**Milestone:** 5.1.FC
- **Comment**:
Logs
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** review
**Milestone:** 5.1.FC
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon Chau
**Last Updated:** Tue Aug 23, 2016 08:00 AM UTC
**Owner:** Minh Hon Chau
This ticket is
V3 for review
Attachments:
-
[1725_V3.tgz](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/b2d4/attachment/1725_V3.tgz)
(22.5 kB; application/x-compressed)
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** review
**Milestone:**
Hi Praveen,
I'd like to hear from you about "AMF self triggering susi_success". Option (2)
does not look easy at all.
Thanks,
Minh
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** review
**Milestone:** 5.1.FC
**Created:** Wed Apr 06, 2016 07:16 AM UTC
Hi Minh,
The case "1)When comp completes assignments in headless state" can still occur
in some cases when AMFNDs have responded for assignments to AMFD and the cluster
becomes headless before AMFD processes these mailbox messages. In this case,
after headless state, AMFD will get SUSI FSM states
Hi Minh,
I understand that working with SUSI states deduced from AMFND required some
code addition at AMFD, as they will be updated states and the SG FSM needs to be
adjusted for that.
Since we are now reading from IMM, we have to be ready for any missing SUSI
information from IMM when the AMFDs die
Hi Praveen,
There should have been a description on the second patch of part 1, which I missed.
In case 1), AMFD can self trigger susi_success() to continue the admin
operation. However, AMFD needs to know the third parameter of susi_success(),
which is NULL or an existing SUSI. These 2 cases come from
Hi Minh,
In the second patch, assignment messages are buffered. These will be sent to
AMFD after headless state.
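As a rough illustration of the buffering described above, here is a minimal sketch in which a node-side sender queues messages while no director is reachable and flushes them in order once it returns. This is not the AMFND implementation; `HeadlessBuffer` and all of its members are hypothetical names, and `delivered` merely stands in for the link to AMFD.

```python
from collections import deque


class HeadlessBuffer:
    """Queue messages while headless; deliver them in order afterwards."""

    def __init__(self):
        self.director_up = True
        self.pending = deque()  # messages held back during headless
        self.delivered = []     # stand-in for messages that reached the director

    def send(self, msg):
        # Deliver immediately when a director is present, otherwise buffer.
        if self.director_up:
            self.delivered.append(msg)
        else:
            self.pending.append(msg)

    def on_headless(self):
        self.director_up = False

    def on_director_up(self):
        # Flush buffered messages in the order they were produced.
        self.director_up = True
        while self.pending:
            self.delivered.append(self.pending.popleft())


buf = HeadlessBuffer()
buf.on_headless()
buf.send("assignment response 1")
buf.send("assignment response 2")
buf.on_director_up()
print(buf.delivered)
```

The sketch deliberately leaves out any filtering of buffered messages at flush time, since the thread is still discussing which messages should be sent after headless.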
There are two broad cases (discussed earlier also):
1)When comp completes assignments in headless state.
2)When at least one callback is pending after headless.
In case 1), AMFD can
Hi Praveen,
I guess I will update the naming of the new attributes in the next version soon.
I have just realized that test 144 fails in headless mode because
AVD_SU::su_switch affects the si-swap sequence, and it's not stored in IMM
together with the other new attrs. I think I will float this
Hi Minh,
To differentiate with SAF attributes, "saAmf-" should be replaced by "osafAmf-"
for AMF internal attributes.
This can be taken care of before pushing the patches.
Thanks,
Praveen
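The naming suggestion in this message amounts to a prefix swap. A minimal sketch, assuming the attribute names are plain strings; `to_internal_name` and the example attribute name are hypothetical and not taken from the patches.

```python
SAF_PREFIX = "saAmf"
INTERNAL_PREFIX = "osafAmf"


def to_internal_name(attr):
    """Rename an attribute from the SAF prefix to the AMF-internal prefix."""
    if attr.startswith(SAF_PREFIX):
        return INTERNAL_PREFIX + attr[len(SAF_PREFIX):]
    # Names without the SAF prefix are left unchanged.
    return attr


# "saAmfExampleAttr" is a made-up name used only for illustration.
print(to_internal_name("saAmfExampleAttr"))
```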
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** review
V2 version
Attachments:
-
[1725_01_intro_new_rta_states.diff](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/4529/attachment/1725_01_intro_new_rta_states.diff)
(44.2 kB; text/x-patch)
-
- **status**: accepted --> review
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** review
**Milestone:** 5.1.FC
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon Chau
**Last Updated:** Thu Aug 04, 2016 01:00 PM UTC
**Owner:** Minh Hon Chau
This
Test report for #1725 part 1: Admin operation continuation
Attachments:
-
[test_report_1725_p1.ods](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/5869/attachment/test_report_1725_p1.ods)
(15.5 kB; application/vnd.oasis.opendocument.spreadsheet)
---
** [tickets:#1725]
Thanks Minh
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** accepted
**Milestone:** 5.1.FC
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon Chau
**Last Updated:** Thu Jul 28, 2016 07:04 AM UTC
**Owner:** Minh Hon Chau
This ticket is more likely
Just to help us reduce our testing time, please send the list of tests
that are executed in each patch series.
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** accepted
**Milestone:** 5.1.FC
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon
Great Minh.
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** accepted
**Milestone:** 5.1.FC
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon Chau
**Last Updated:** Thu Jul 28, 2016 06:49 AM UTC
**Owner:** Minh Hon Chau
This ticket is more likely
I agree if you combine #3 and #4.
If I am not mistaken: #0 of yours maps to #1 of mine. #1 of yours maps to #2 of
mine. #2 of yours maps to #3 and #4 of mine.
Please confirm.
Also, please send the patches in order, as it helps us to review in that
sequence. And after ack you can keep pushing
- **Comment**:
For 2N red model, implementation can be done in the following phased manner.
It has advantages of being logically segregated and it continues from where we
left in 5.0.
(Phases #1, #2 and #3 are more related to ticket #1725 and phases #4 and #5 are
related to #1902)
1. Node
log/trace being tested with 1725_dummy_susi.diff and app3_twon3su3si_3sidep.xml
The test scenario is as below:
Test case:
- Setup assignment before headless so that SU4 has ACTIVE, SU5 has STANDBY,
SU5B is spare SU
- Power off SC
- Reboot PL4 and PL5
- Restart SC1
- Result: After
model being tested
Attachments:
-
[app3_twon3su3si_3sidep.xml](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/4972/attachment/app3_twon3su3si_3sidep.xml)
(14.5 kB; text/xml)
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:**
basic patch for dummy susi approach
Attachments:
-
[1725_dummy_susi.diff](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/7b203666/1d50/attachment/1725_dummy_susi.diff)
(9.9 kB; text/x-patch)
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
- Description has changed:
Diff:
--- old
+++ new
@@ -1,5 +1,5 @@
This ticket is more likely an enhancement that targets on how AMFD detect and
recover the transients SUSI left over from headless. There are three major
situations:
-(1) - Cluster goes headless, su/node failover on any
This ticket was originally a defect. It was changed to enhancement since the
release of 5.0 has documented this as a limitation
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** accepted
**Milestone:** 5.1.FC
**Created:** Wed Apr 06, 2016 07:16 AM UTC
For a fault during headless, AMF is leaving the application in the same state
with the following update in syslog on SU hosted payload.
Sep 7 19:38:39 SCALE_SLOT-94 osafamfnd[5104]: CR SU-SI record addition failed,
SU= safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_TwoN :
thanks, I will check
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** accepted
**Milestone:** 5.1.FC
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon Chau
**Last Updated:** Thu May 05, 2016 05:28 AM UTC
**Owner:** Minh Hon Chau
This ticket is
Hi Minh,
In this patch, 1620_12_amfd_adjust_ongoing_susi.diff, I am seeing some code
related to ticket #1752. The patch for #1752 was floated the same day this
ticket was updated. Please check if this updated patch is the correct one.
Thanks,
Praveen
---
** [tickets:#1725] AMF: Recover
I have attached a prototype patch that applies the idea of resuming the SG FSM state;
it needs the patch of #1723.
The patch can work for the case where the ongoing CSI callback returns after
recovery, and no failover happens during headless, in most of
lock/unlock/shutdown su/si/sg. There are still some
- **Milestone**: 5.0.RC2 --> 5.1.FC
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** accepted
**Milestone:** 5.1.FC
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon Chau
**Last Updated:** Wed Apr 20, 2016 10:48 AM UTC
**Owner:** Minh Hon Chau
- **Type**: defect --> enhancement
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** unassigned
**Milestone:** 5.0.RC2
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon Chau
**Last Updated:** Mon Apr 11, 2016 08:55 PM UTC
**Owner:** nobody
This
- **Milestone**: 5.0.RC1 --> 5.0.RC2
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** unassigned
**Milestone:** 5.0.RC2
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon Chau
**Last Updated:** Thu Apr 07, 2016 01:54 PM UTC
**Owner:** nobody
This
- **Version**: 5.0.RC1 --> 5.0 FC
- **Milestone**: 5.0.FC --> 5.0.RC1
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** unassigned
**Milestone:** 5.0.RC1
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon Chau
**Last Updated:** Wed Apr 06, 2016
---
** [tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** unassigned
**Milestone:** 5.0.FC
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon Chau
**Last Updated:** Wed Apr 06, 2016 07:16 AM UTC
**Owner:** nobody
This ticket is more likely an enhancement