- **status**: unassigned --> accepted
- **assigned_to**: Praveen
- **Component**: unknown --> amf
---
**[tickets:#1671] amfd: validations in ccb_complete_cb for CSI at standby may crash it.**
**Status:** accepted
**Milestone:** 4.6.2
**Created:** Wed Jan 27, 2016 12:24 PM UTC by Praveen
**Last Updated:** Wed Jan 27, 2016 12:27 PM UTC
**Owner:** Praveen
AMFD may crash at standby when a CSI and an SU are deleted in a single CCB in
the following cases:
1) Standby amfd deletes the CSI during amfd checkpointing. When it then gets the
CCB completed callback, it rejects it, saying "csi does not exist".
2) The SG protecting the SI becomes stable just before the delete callback is
received. In this case standby amfd may process the CCB completed callback
before decoding the sg_fsm state, so it also rejects the CCB.
In both cases, active amfd accepts the CCB and standby amfd rejects it. Standby
amfd therefore does not store the CSI and SU pointers that CCB apply needs for
deletion. When the CCB apply for deletion arrives, the apply callback for the
CSI handles this case, but amfd crashes when it accesses illegal memory through
the SU pointer.
There have been such issues earlier as well, because of the race between mbcsv
checkpointing and CCBs.
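The crash path described above can be illustrated with a minimal sketch of a
defensive apply handler. This is only one possible approach under stated
assumptions, not necessarily the fix for this ticket: `Su`, `CcbOpData`, `SuDb`
and `apply_su_delete` are hypothetical stand-ins, not the real amfd types or
functions. The idea is that the apply handler never dereferences the stored
pointer blindly; if the completed callback did not record it, the handler falls
back to a lookup by name.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for amfd types; not the real OpenSAF definitions.
struct Su {
  std::string name;
};

struct CcbOpData {
  std::string object_name;    // DN of the object being deleted
  void *user_data = nullptr;  // set by the completed callback on success
};

using SuDb = std::map<std::string, Su *>;

enum class ApplyResult { kDeleted, kDeletedByLookup, kAlreadyGone };

// Apply-time delete that tolerates a missing pointer from the completed
// callback instead of dereferencing a null user_data.
ApplyResult apply_su_delete(SuDb &db, const CcbOpData &op) {
  Su *su = static_cast<Su *>(op.user_data);
  ApplyResult result = ApplyResult::kDeleted;
  if (su == nullptr) {
    // The completed callback rejected the CCB on this node, so nothing was
    // remembered. Fall back to a lookup by name.
    auto it = db.find(op.object_name);
    if (it == db.end()) return ApplyResult::kAlreadyGone;
    su = it->second;
    result = ApplyResult::kDeletedByLookup;
  }
  assert(su != nullptr);
  db.erase(su->name);
  delete su;
  return result;
}
```

With a handler of this shape, a standby that rejected the CCB in the completed
callback would still remove the SU in apply, or do nothing if the SU is already
gone, rather than crash.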
Such issues are not always reproducible. Here are the steps to reproduce case 1)
with a hack:
1) On the standby controller, move the if block containing the mbcsv dispatch
below the immdispatch.
2) Bring the attached amf application up.
3) Run the following command:
immcfg -d safCsi=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1 safSu=SU3,safSg=AmfDemo,safApp=AmfDemo1
Note: Standby amfd is not allowed to execute the completed callback for CCB
creation and CCB modification, because IMM honours only the implementer (active
amfd) for CCB deletion, modification and creation of objects. On standby amfd
the CCB completed callback is allowed to execute in order to remember the
opdata, i.e. the object to be deleted.
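The note above suggests that the standby's completed callback exists mainly to
remember the object for the later apply. A minimal sketch of a role-aware
completed handler along those lines follows. It is an illustration under
assumptions, not the amfd implementation: `Role`, `Rc`, `Csi`, `CsiDb`,
`CcbOpData` and `csi_completed_delete` are hypothetical names introduced here.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins; not the real OpenSAF definitions.
struct Csi {
  std::string name;
};

struct CcbOpData {
  std::string object_name;
  void *user_data = nullptr;
};

enum class Role { kActive, kStandby };
enum class Rc { kOk, kBadOperation };

using CsiDb = std::map<std::string, Csi>;

// Completed-callback handler for a CSI delete. On standby it skips the
// validations and only remembers the object (if it still exists), so a race
// with checkpointing cannot make standby reject a CCB that active accepted.
Rc csi_completed_delete(Role role, CsiDb &db, CcbOpData &op) {
  auto it = db.find(op.object_name);
  Csi *csi = (it == db.end()) ? nullptr : &it->second;

  if (role == Role::kStandby) {
    // May be nullptr if checkpointing already removed the CSI; the apply
    // handler must tolerate that.
    op.user_data = csi;
    return Rc::kOk;
  }

  // Active-only validations would go here; existence is the only one shown.
  if (csi == nullptr) return Rc::kBadOperation;
  op.user_data = csi;
  return Rc::kOk;
}
```

The design choice in this sketch is that only the active node decides whether
the CCB is valid, while the standby records what it can and leaves the rest to
a null-tolerant apply handler.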
amfd traces at standby:
1) Deletion of csi in checkpointing:
Jan 27 17:15:14.773217 osafamfd [32722:ckpt_updt.cc:0400] >> avd_ckpt_siass:
'safSi=AmfDemo1,safApp=AmfDemo1' 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Jan 27 17:15:14.773234 osafamfd [32722:ckpt_updt.cc:0466] TR compcsi remove for
'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
'safCsi=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1'
Jan 27 17:15:14.773245 osafamfd [32722:csi.cc:1255] >>
avd_compcsi_from_csi_and_susi_delete:
Csi'safCsi=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1', compcsi_cnt'1'
Jan 27 17:15:14.773255 osafamfd [32722:csi.cc:1170] TR Deleting
safCSIComp=safComp=AmfDemo\,safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safCsi=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
Jan 27 17:15:14.773262 osafamfd [32722:imm.cc:1651] >>
avd_saImmOiRtObjectDelete:
safCSIComp=safComp=AmfDemo\,safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safCsi=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
Jan 27 17:15:14.773270 osafamfd [32722:imm.cc:1580] TR Class
Type:SaAmfSIAssignment
Jan 27 17:15:14.773278 osafamfd [32722:imm.cc:1662] << avd_saImmOiRtObjectDelete
Jan 27 17:15:14.773284 osafamfd [32722:csi.cc:1300] <<
avd_compcsi_from_csi_and_susi_delete
Jan 27 17:15:14.773291 osafamfd [32722:pg.cc:0270] >> avd_pg_csi_node_del_all
Jan 27 17:15:14.773298 osafamfd [32722:pg.cc:0275] << avd_pg_csi_node_del_all
Jan 27 17:15:14.773304 osafamfd [32722:csi.cc:0064] >> avd_csi_delete:
safCsi=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
Jan 27 17:15:14.773323 osafamfd [32722:csi.cc:0091] << avd_csi_delete:
2) CCB request for deletion of csi and su:
Jan 27 17:15:18.989459 osafamfd [32722:imma_proc.c:2524] TR Ccb-object-delete
op callback
Jan 27 17:15:18.989465 osafamfd [32722:imm.cc:0877] >> ccb_object_delete_cb:
CCB ID 3, safCsi=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
Jan 27 17:15:18.989485 osafamfd [32722:imm.cc:0897] << ccb_object_delete_cb: 1
Jan 27 17:15:18.989490 osafamfd [32722:imma_proc.c:2587] TR ccb-object-delete
callback returned RC:1
Jan 27 17:15:18.990063 osafamfd [32722:imma_proc.c:2524] TR Ccb-object-delete
op callback
Jan 27 17:15:18.990067 osafamfd [32722:imm.cc:0877] >> ccb_object_delete_cb:
CCB ID 3, safSu=SU3,safSg=AmfDemo,safApp=AmfDemo1
Jan 27 17:15:18.990073 osafamfd [32722:imm.cc:0897] << ccb_object_delete_cb: 1
3) CCB completed callback:
Jan 27 17:15:18.992037 osafamfd [32722:csi.cc:0752] >> csi_ccb_completed_cb:
CCB ID 3, 'safCsi=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1'
Jan 27 17:15:18.992045 osafamfd [32722:csi.cc:0686] >>
csi_ccb_completed_delete_hdlr: CCB ID 3,
'safCsi=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1'
Jan 27 17:15:18.992056 osafamfd [32722:imm.cc:1902] TR CSI delete completed
(STDBY): 'safCsi=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1' does not exist
Jan 27 17:15:18.992066 osafamfd [32722:lga_api.c:0764] >> saLogWriteLogAsync
Jan 27 17:15:18.992074 osafamfd [32722:lga_mds.c:1168] >> lga_mds_msg_async_send
Jan 27 17:15:18.992090 osafamfd [32722:lga_mds.c:0577] >> lga_mds_enc
Jan 27 17:15:18.992099 osafamfd [32722:lga_mds.c:0608] T2 msgtype: 0
Jan 27 17:15:18.992105 osafamfd [32722:lga_mds.c:0621] T2 api_info.type: 4
Jan 27 17:15:18.992112 osafamfd [32722:lga_mds.c:0649] << lga_mds_enc
Jan 27 17:15:18.992214 osafamfd [32722:lga_mds.c:1190] << lga_mds_msg_async_send
Jan 27 17:15:18.992233 osafamfd [32722:lga_api.c:0931] << saLogWriteLogAsync
Jan 27 17:15:18.992246 osafamfd [32722:imma_db.c:0402] >>
imma_oi_ccb_record_set_error
Jan 27 17:15:18.992253 osafamfd [32722:imma_db.c:0187] >>
imma_oi_ccb_record_find
Jan 27 17:15:18.992259 osafamfd [32722:imma_db.c:0194] TR Record for ccbid:0x3
handle:90002020f client:0xc63af0 found
Jan 27 17:15:18.992266 osafamfd [32722:imma_db.c:0198] <<
imma_oi_ccb_record_find
Jan 27 17:15:18.992279 osafamfd [32722:imma_db.c:0417] <<
imma_oi_ccb_record_set_error
Jan 27 17:15:18.992287 osafamfd [32722:imma_oi_api.c:3521] T2
ERR_BAD_OPERATION: Ccb 3, is not in a state that accepts an errorString
Jan 27 17:15:18.992294 osafamfd [32722:csi.cc:0744] <<
csi_ccb_completed_delete_hdlr: 20
4) Standby amfd crashes:
Jan 27 17:15:18.993598 osafamfd [32722:imm.cc:1164] >> ccb_apply_cb: CCB ID 3
Jan 27 17:15:18.993618 osafamfd [32722:compcstype.cc:0379] >>
compcstype_ccb_apply_cb: CCB ID 3,
'safSupportedCsType=safVersion=1\,safCSType=AmfDemo,safComp=AmfDemo,safSu=SU3,safSg=AmfDemo,safApp=AmfDemo1'
Jan 27 17:15:18.993635 osafamfd [32722:compcstype.cc:0379] >>
compcstype_ccb_apply_cb: CCB ID 3,
'safSupportedCsType=safVersion=1\,safCSType=AmfDemo1,safComp=AmfDemo1,safSu=SU3,safSg=AmfDemo,safApp=AmfDemo1'
Jan 27 17:15:18.993648 osafamfd [32722:comp.cc:1673] >> comp_ccb_apply_cb: CCB
ID 3, 'safComp=AmfDemo,safSu=SU3,safSg=AmfDemo,safApp=AmfDemo1'
Jan 27 17:15:18.993656 osafamfd [32722:comp.cc:1640] >>
comp_ccb_apply_delete_hdlr
Jan 27 17:15:18.994296 osafamfd [32722:comp.cc:1666] <<
comp_ccb_apply_delete_hdlr
Jan 27 17:15:18.994312 osafamfd [32722:comp.cc:1692] << comp_ccb_apply_cb
Jan 27 17:15:18.994316 osafamfd [32722:comp.cc:1673] >> comp_ccb_apply_cb: CCB
ID 3, 'safComp=AmfDemo1,safSu=SU3,safSg=AmfDemo,safApp=AmfDemo1'
Jan 27 17:15:18.994321 osafamfd [32722:comp.cc:1640] >>
comp_ccb_apply_delete_hdlr
Jan 27 17:15:18.994328 osafamfd [32722:imm.cc:1596] >>
avd_saImmOiRtObjectUpdate: 'safSu=SU3,safSg=AmfDemo,safApp=AmfDemo1'
saAmfSUPreInstantiable
Jan 27 17:15:18.994334 osafamfd [32722:imm.cc:1580] TR Class Type:SaAmfSU
Jan 27 17:15:18.994348 osafamfd [32722:imm.cc:1616] << avd_saImmOiRtObjectUpdate
Jan 27 17:15:18.994353 osafamfd [32722:su.cc:2107] TR
safSu=SU3,safSg=AmfDemo,safApp=AmfDemo1 saAmfSUPreInstantiable 0
Jan 27 17:15:18.994358 osafamfd [32722:su.cc:1968] TR Modified saAmfSUFailover
to '1' for 'safSu=SU3,safSg=AmfDemo,safApp=AmfDemo1'
Jan 27 17:15:18.994365 osafamfd [32722:su.cc:1927] >> send_attribute_update
Jan 27 17:15:18.994372 osafamfd [32722:su.cc:1930] << send_attribute_update:
avd is not in active state
Jan 27 17:15:18.994382 osafamfd [32722:comp.cc:1666] <<
comp_ccb_apply_delete_hdlr
Jan 27 17:15:18.994389 osafamfd [32722:comp.cc:1692] << comp_ccb_apply_cb
Jan 27 17:15:18.994396 osafamfd [32722:su.cc:1856] >> su_ccb_apply_cb: CCB ID
3, 'safSu=SU3,safSg=AmfDemo,safApp=AmfDemo1'
bt
\#0 su_ccb_apply_delete_hdlr (opdata=0xcc13e4) at su.cc:1801
\#1 0x0000000000482319 in su_ccb_apply_cb (opdata=0xcc13e4) at su.cc:1868
\#2 0x0000000000430ff6 in ccb_apply_cb (immoi_handle=<optimized out>,
ccb_id=3) at imm.cc:1202
\#3 0x00007f7219997d19 in imma_process_callback_info
(cb=cb@entry=0x7f7219bb73a0 <imma_cb>, cl_node=0xc63af0,
callback=callback@entry=0x7f721001bd90, immHandle=38654837263) at
imma_proc.c:2196
\#4 0x00007f721999a806 in imma_hdl_callbk_dispatch_all (cb=0x7f7219bb73a0
<imma_cb>, immHandle=38654837263)
at imma_proc.c:1715
\#5 0x00007f721998eb31 in saImmOiDispatch (immOiHandle=38654837263,
dispatchFlags=SA_DISPATCH_ALL)
at imma_oi_api.c:609
\#6 0x0000000000405978 in main_loop () at main.cc:731
\#7 main (argc=<optimized out>, argv=<optimized out>) at main.cc:851
(gdb) fr 0
\#0 su_ccb_apply_delete_hdlr (opdata=0xcc13e4) at su.cc:1801
1801 AVD_SG *sg = su->sg_of_su;
(gdb) p opdata->userData
$1 = (void *) 0x0
(gdb)