- **status**: review --> fixed
- **Comment**:

commit 0e1a6847c264ad5e34ca8413307b118066ae03eb (HEAD -> develop, origin/develop)
Author: thuan.tran <[email protected]>
Date:   Wed Sep 9 12:43:56 2020 +0700

    mbc: fix agent crash if mds sendto() error [#3217]
    
    - The fix for #3208 (MBC memleak) causes an agent crash if
    MDS sendto() returns an error.
    - Update part of the #3208 fix: check whether the MDS encode
    callback is done; if so, do not free the memory, as MDS has
    already freed it.





---

**[tickets:#3217] mbc: agent crash if mds sendto() error**

**Status:** fixed
**Milestone:** 5.20.11
**Created:** Wed Sep 09, 2020 06:18 AM UTC by Thuan Tran
**Last Updated:** Thu Sep 10, 2020 01:43 PM UTC
**Owner:** Thuan Tran


With the #3208 fix, ntfd sometimes crashes during cluster shutdown.
The backtrace is as follows:
~~~
Thread 1 (Thread 0x7fc0a9b4a100 (LWP 276)):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007fc0a80bd8b1 in __GI_abort () at abort.c:79
#2  0x00007fc0a8106907 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7fc0a8233dfa "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007fc0a810d97a in malloc_printerr (str=str@entry=0x7fc0a823206e "malloc(): memory corruption") at malloc.c:5350
#4  0x00007fc0a8111a04 in _int_malloc (av=av@entry=0x7fc0a8468c40 <main_arena>, bytes=bytes@entry=59) at malloc.c:3738
#5  0x00007fc0a8117121 in __libc_calloc (n=n@entry=1, elem_size=elem_size@entry=59) at malloc.c:3436
#6  0x00007fc0a8c9b40c in mds_mdtm_send_tipc (req=0x7ffc9f16ec60) at src/mds/mds_dt_tipc.c:2736
#7  0x00007fc0a8c88f07 in mcm_msg_encode_full_or_flat_and_send (to=to@entry=2 '\002', to_msg=to_msg@entry=0x7ffc9f16ef50, to_svc_id=to_svc_id@entry=29, svc_cb=svc_cb@entry=0x5568f4afcea0, adest=adest@entry=564114769357041, dest_vdest_id=dest_vdest_id@entry=65535, snd_type=4, xch_id=116, pri=MDS_SEND_PRIORITY_HIGH) at src/mds/mds_c_sndrcv.c:1774
#8  0x00007fc0a8c8a5b7 in mds_mcm_send_msg_enc (to=<optimized out>, svc_cb=svc_cb@entry=0x5568f4afcea0, to_msg=to_msg@entry=0x7ffc9f16ef50, to_svc_id=to_svc_id@entry=29, dest_vdest_id=dest_vdest_id@entry=65535, req=req@entry=0x7ffc9f16eff0, xch_id=116, dest=564114769357041, pri=MDS_SEND_PRIORITY_HIGH) at src/mds/mds_c_sndrcv.c:1255
#9  0x00007fc0a8c8ac30 in mcm_pvt_red_snd_process_common (env_hdl=env_hdl@entry=65550, fr_svc_id=fr_svc_id@entry=28, to_msg=..., to_dest=to_dest@entry=564114769357041, to_svc_id=to_svc_id@entry=29, req=req@entry=0x7ffc9f16eff0, pri=pri@entry=MDS_SEND_PRIORITY_HIGH, xch_id=116, anchor=<optimized out>) at src/mds/mds_c_sndrcv.c:2664
#10 0x00007fc0a8c8dba3 in mcm_pvt_normal_svc_snd_rsp (pri=MDS_SEND_PRIORITY_HIGH, req=0x7ffc9f16eff0, to_svc_id=29, to_dest=564114769357041, msg=<optimized out>, fr_svc_id=28, env_hdl=65550) at src/mds/mds_c_sndrcv.c:3699
#11 mds_mcm_send (info=0x1d) at src/mds/mds_c_sndrcv.c:835
#12 mds_send (info=info@entry=0x7ffc9f16f0a0) at src/mds/mds_c_sndrcv.c:458
#13 0x00007fc0a8c9636c in ncsmds_api (svc_to_mds_info=svc_to_mds_info@entry=0x7ffc9f16f0a0) at src/mds/mds_papi.c:165
#14 0x00005568f2e7598f in ntfs_mds_msg_send (cb=<optimized out>, msg=msg@entry=0x7ffc9f16f130, dest=dest@entry=0x7ffc9f16f128, mds_ctxt=mds_ctxt@entry=0x7fc09c01278c, prio=prio@entry=MDS_SEND_PRIORITY_HIGH) at src/ntf/ntfd/ntfs_mds.c:1310
#15 0x00005568f2e75f68 in notfication_result_lib (error=error@entry=SA_AIS_OK, notificationId=182, mdsCtxt=0x7fc09c01278c, frDest=<optimized out>) at src/ntf/ntfd/ntfs_com.c:181
#16 0x00005568f2e809da in NtfClient::confirmNtfNotification (this=this@entry=0x5568f4afc440, notificationId=<optimized out>, mdsCtxt=mdsCtxt@entry=0x7fc09c01278c, mdsDest=mdsDest@entry=564114769357041) at src/ntf/ntfd/NtfClient.cc:341
#17 0x00005568f2e80c47 in NtfClient::notificationReceived (this=0x5568f4afc440, clientId=clientId@entry=2, notification=std::tr1::shared_ptr<NtfNotification> (use count 2, weak count 0) = {...}, mdsCtxt=mdsCtxt@entry=0x7fc09c01278c) at src/ntf/ntfd/NtfClient.cc:146
#18 0x00005568f2e86c32 in NtfAdmin::processNotification (this=this@entry=0x5568f4afb6a0, clientId=clientId@entry=2, notificationType=notificationType@entry=SA_NTF_TYPE_STATE_CHANGE, sendNotInfo=sendNotInfo@entry=0x7fc09c010bb0, mdsCtxt=mdsCtxt@entry=0x7fc09c01278c, notificationId=<optimized out>) at src/ntf/ntfd/NtfAdmin.cc:211
#19 0x00005568f2e86ec1 in NtfAdmin::notificationReceived (this=0x5568f4afb6a0, clientId=2, notificationType=SA_NTF_TYPE_STATE_CHANGE, sendNotInfo=sendNotInfo@entry=0x7fc09c010bb0, mdsCtxt=0x7fc09c01278c) at src/ntf/ntfd/NtfAdmin.cc:262
#20 0x00005568f2e86f52 in notificationReceived (clientId=<optimized out>, notificationType=<optimized out>, sendNotInfo=sendNotInfo@entry=0x7fc09c010bb0, mdsCtxt=mdsCtxt@entry=0x7fc09c01278c) at src/ntf/ntfd/NtfAdmin.cc:1127
#21 0x00005568f2e7086a in proc_send_not_msg (cb=<optimized out>, evt=0x7fc09c012780) at src/ntf/ntfd/ntfs_evt.c:474
#22 0x00005568f2e7033e in process_api_evt (evt=0x7fc09c012780) at src/ntf/ntfd/ntfs_evt.c:673
#23 0x00005568f2e70f19 in ntfs_process_mbx (mbx=<optimized out>) at src/ntf/ntfd/ntfs_evt.c:708
#24 0x00005568f2e6ebad in main (argc=<optimized out>, argv=<optimized out>) at src/ntf/ntfd/ntfs_main.c:400
~~~

The problem is that the #3208 fix makes MBC free a buffer that MDS has already freed when the send fails:
~~~
<139>1 2020-09-08T16:16:48.284822+02:00 SC-1 osafntfd 276 mds.log [meta sequenceId="80"] MDTM: Failed to send message err :No route to host
<139>1 2020-09-08T16:16:48.284842+02:00 SC-1 osafntfd 276 mds.log [meta sequenceId="81"] MDTM: Unable to send the msg thru TIPC
<139>1 2020-09-08T16:16:48.284866+02:00 SC-1 osafntfd 276 mds.log [meta sequenceId="82"] MDS_SND_RCV: RED sndrsp message SEND Failed from svc_id = MBCSV(19), to svc_id = MBCSV(19)
~~~
Part of the #3208 solution needs to be updated to fix this issue.
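
The ownership rule behind this kind of fix can be illustrated with a minimal, self-contained C sketch. It is not the OpenSAF code: `msg_ctx`, `release_buf`, `transport_send`, `checked_send` and `free_count` are hypothetical names invented for illustration, and the assumption that the transport frees the buffer once its encode step has run (even when the send itself then fails) is taken from the description above. The caller frees the buffer only when the send failed before the encode step, so the buffer is released exactly once on every path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from OpenSAF. */
struct msg_ctx {
    char *buf;         /* payload buffer allocated by the caller */
    bool encode_done;  /* set once the transport's encode step has run */
};

static int free_count; /* counts releases, for illustration only */

static void release_buf(struct msg_ctx *ctx)
{
    assert(ctx->buf != NULL); /* trips on a second release of the same buffer */
    free(ctx->buf);
    ctx->buf = NULL;
    free_count++;
}

/* Assumed transport behaviour: once the encode step has run, the
 * transport owns the buffer and frees it, even if the send then fails. */
static int transport_send(struct msg_ctx *ctx, bool encode_runs, bool send_fails)
{
    if (!encode_runs)
        return -1;            /* failed before taking ownership */
    ctx->encode_done = true;
    release_buf(ctx);         /* transport frees the buffer */
    return send_fails ? -1 : 0;
}

/* Caller: on error, free only if the transport never took ownership. */
static int checked_send(bool encode_runs, bool send_fails)
{
    struct msg_ctx ctx = { .buf = malloc(64), .encode_done = false };
    if (ctx.buf == NULL)
        return -1;

    int rc = transport_send(&ctx, encode_runs, send_fails);
    if (rc != 0 && !ctx.encode_done)
        release_buf(&ctx);    /* avoids the leak without a double free */
    return rc;
}
```

Dropping the `!ctx.encode_done` check in `checked_send` reproduces the pattern described in this ticket: a failed send after the encode step leads to a second release of the same buffer.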


