Hi,

I'm facing the same problem in the 4.4.0 release. I use the attached simple patch to 
avoid the core dump. Initial testing shows no problems after patching. 

Could you please confirm that the patch is valid? Is it possible that the patch 
will cause any side effects?

When are you going to determine the milestone for the fix?


Attachment: ckpt.patch (724 Bytes; text/x-patch) 


---

** [tickets:#242] cpsv : ckptnd crashed while running multi thread application 
during section iteration get next**

**Status:** assigned
**Milestone:** future
**Created:** Thu May 16, 2013 06:31 AM UTC by A V Mahesh (AVM)
**Last Updated:** Thu May 16, 2013 06:31 AM UTC
**Owner:** A V Mahesh (AVM)

from http://devel.opensaf.org/ticket/2864


The issue is seen on SLES 64bit VMs


There are two threads in the application, a writer thread and a reader thread.


The writer thread does the following:
1) Creates the checkpoint
2) In a loop, opens the same checkpoint in write mode, creates a section, writes 
into the section, and closes the checkpoint


The reader thread does the following:

1) In a loop, opens the checkpoint created by the writer thread, initializes a 
section iteration, reads the section identified by the section descriptor that 
iterationNext() returns, and closes the checkpoint


Backtrace observed:


(gdb) bt
#0 0x0000000000417606 in cpnd_proc_fill_sec_desc (pTmpSecPtr=0x0, sec_des=0x7fffa9c28530) at cpnd_proc.c:1637
#1 0x0000000000417b42 in cpnd_proc_getnext_section (cp_node=0x64a810, get_next=0x654bb0, sec_des=0x7fffa9c28530, n_secs_trav=0x7fffa9c2852c) at cpnd_proc.c:1756
#2 0x000000000040f680 in cpnd_evt_proc_ckpt_iter_getnext (cb=0x637f30, evt=0x654ba0, sinfo=0x6551f8) at cpnd_evt.c:4122
#3 0x00000000004059df in cpnd_process_evt (evt=0x654b90) at cpnd_evt.c:241
#4 0x0000000000411619 in cpnd_main_process (cb=0x637f30) at cpnd_init.c:544
#5 0x00000000004118e3 in main (argc=1, argv=0x7fffa9c28e68) at cpnd_main.c:72
(gdb) fr 2
#2 0x000000000040f680 in cpnd_evt_proc_ckpt_iter_getnext (cb=0x637f30, evt=0x654ba0, sinfo=0x6551f8) at cpnd_evt.c:4122
4122 cpnd_evt.c: No such file or directory.
     in cpnd_evt.c


(gdb) p *evt
$1 = {dont_free_me = false, error = 0, type = CPND_EVT_A2ND_CKPT_ITER_GETNEXT, 
info = {initReq = {version = {releaseCode = 51 '3', 


majorVersion = 0 '\0', minorVersion = 0 '\0'}}, finReq = {client_hdl = 51}, 
openReq = {client_hdl = 51, lcl_ckpt_hdl = 11, 


ckpt_name = {length = 61664, value = 
"d\000\000\000\000\000�\202a\000\000\000\000\000\005\000\000\000\t", '\0' 
<repeats 236 times>}, 
ckpt_attrib = {creationFlags = 0, checkpointSize = 0, retentionDuration = 0, 
maxSections = 0, maxSectionSize = 0, 


maxSectionIdSize = 0}, ckpt_flags = 0, invocation = 0, timeout = 0}, closeReq = 
{client_hdl = 51, ckpt_id = 11, 


ckpt_flags = 6615264}, ulinkReq = {ckpt_name = {length = 51, 


value = 
"\000\000\000\000\000\000\v\000\000\000\000\000\000\000��d\000\000\000\000\000�\202a\000\000\000\000\000\005\000\000\000\t",
 '\0' <repeats 220 times>}}, rdsetReq = {ckpt_id = 51, reten_time = 11}, 
arsetReq = {ckpt_id = 51}, statReq = {ckpt_id = 51}, 


refCntsetReq = {no_of_nodes = 51, ref_cnt_array = {{ckpt_id = 11, ckpt_ref_cnt 
= 6615264}, {ckpt_id = 6390432, ckpt_ref_cnt = 5}, {


ckpt_id = 0, ckpt_ref_cnt = 0} <repeats 98 times>}}, sec_creatReq = {ckpt_id = 
51, lcl_ckpt_id = 11, agent_mdest = 6615264, 


sec_attri = {sectionId = 0x6182a0, expirationTime = 38654705669}, init_data = 
0x0, init_size = 0}, sec_delReq = {ckpt_id = 51, 
sec_id = {idLen = 11, id = 0x64f0e0 "section_4_1"}, lcl_ckpt_id = 6390432, 
agent_mdest = 38654705669}, sec_expset = {ckpt_id = 51, 
sec_id = {idLen = 11, id = 0x64f0e0 "section_4_1"}, exp_time = 6390432}, 
iter_getnext = {ckpt_id = 51, section_id = {idLen = 11, 


id = 0x64f0e0 "section_4_1"}, iter_id = 6390432, filter = SA_CKPT_SECTIONS_ANY, 
n_secs_trav = 9, exp_tmr = 0}, arr_ntfy = {


client_hdl = 51}, ckpt_write = {type = 51, ckpt_id = 11, lcl_ckpt_id = 6615264, 
agent_mdest = 6390432, num_of_elmts = 5, 
all_repl_evt_flag = 9, data = 0x0, seqno = 0, last_seq = 0 '\0', ckpt_sync = 
{ckpt_id = 0, lcl_ckpt_hdl = 0, client_hdl = 0, 


invocation = 0, cpa_sinfo = {to_svc = 0, dest = 0, stype = MDS_SENDTYPE_SND, 
ctxt = {length = 0 '\0', 


data = '\0' <repeats 11 times>}}, is_ckpt_open = false}}, ckpt_read = {type = 
51, ckpt_id = 11, lcl_ckpt_id = 6615264, 


agent_mdest = 6390432, num_of_elmts = 5, all_repl_evt_flag = 9, data = 0x0, 
seqno = 0, last_seq = 0 '\0', ckpt_sync = {ckpt_id = 0, 


lcl_ckpt_hdl = 0, client_hdl = 0, invocation = 0, cpa_sinfo = {to_svc = 0, dest 
= 0, stype = MDS_SENDTYPE_SND, ctxt = {


length = 0 '\0', data = '\0' <repeats 11 times>}}, is_ckpt_open = false}}, 
ckpt_sync = {ckpt_id = 51, lcl_ckpt_hdl = 11, 


client_hdl = 6615264, invocation = 6390432, cpa_sinfo = {to_svc = 5, dest = 0, 
stype = MDS_SENDTYPE_SND, ctxt = {length = 0 '\0', 


data = '\0' <repeats 11 times>}}, is_ckpt_open = false}, ckpt_read_ack = 
{ckpt_id = 51, mds_dest = 11}, ckpt_info = {error = 51, 


ckpt_id = 11, is_active_exists = 224, active_dest = 6390432, dest_cnt = 5, 
dest_list = 0x0, attributes = {creationFlags = 0, 


checkpointSize = 0, retentionDuration = 0, maxSections = 0, maxSectionSize = 0, 
maxSectionIdSize = 0}, ckpt_rep_create = false}, 


ckpt_mem_size = {ckpt_id = 51, ckpt_used_size = 11, error = 0}, ckpt_sections = 
{ckpt_id = 51, ckpt_num_sections = 11, error = 0}, 
ckpt_add = {ckpt_id = 51, mds_dest = 11, active_dest = 6615264, attributes = 
{creationFlags = 6390432, checkpointSize = 38654705669, 


retentionDuration = 0, maxSections = 0, maxSectionSize = 0, maxSectionIdSize = 
0}, ckpt_flags = 0, is_cpnd_restart = false, 


dest_cnt = 0, dest_list = 0x0}, ckpt_del = {ckpt_id = 51, mds_dest = 11}, 
ckpt_create = {ckpt_name = {length = 51, 


value = 
"\000\000\000\000\000\000\v\000\000\000\000\000\000\000��d\000\000\000\000\000�\202a\000\000\000\000\000\005\000\000\000\t",
 '\0' <repeats 220 times>}, ckpt_info = {error = 0, ckpt_id = 0, 
is_active_exists = false, active_dest = 0, dest_cnt = 0, dest_list = 0x0, 
attributes = {creationFlags = 0, checkpointSize = 0, retentionDuration = 0, 
maxSections = 0, maxSectionSize = 0, 


maxSectionIdSize = 0}, ckpt_rep_create = false}}, ckpt_destroy = {ckpt_id = 
51}, ckpt_ulink = {ckpt_id = 51}, rdset = {


ckpt_id = 51, reten_time = 11, type = 6615264}, active_set = {ckpt_id = 51, 
mds_dest = 11}, cl_ack = {error = 51}, ulink_ack = {
error = 51}, rdset_ack = {error = 51}, crset_ack = {error = 51}, arep_ack = 
{error = 51}, destroy_ack = {error = 51}, 


cpnd_restart = {ckpt_id = 51}, cpnd_restart_done = {ckpt_id = 51, mds_dest = 
11, active_dest = 6615264, attributes = {


creationFlags = 6390432, checkpointSize = 38654705669, retentionDuration = 0, 
maxSections = 0, maxSectionSize = 0, 


maxSectionIdSize = 0}, ckpt_flags = 0, is_cpnd_restart = false, dest_cnt = 0, 
dest_list = 0x0}, stat_get = {ckpt_id = 51}, 


status = {error = 51, ckpt_id = 11, status = {checkpointCreationAttributes = 
{creationFlags = 6615264, checkpointSize = 6390432, 


retentionDuration = 38654705669, maxSections = 0, maxSectionSize = 0, 
maxSectionIdSize = 0}, numberOfSections = 0, 


memoryUsed = 0}}, active_sec_creat = {ckpt_id = 51, lcl_ckpt_id = 11, 
agent_mdest = 6615264, sec_attri = {sectionId = 0x6182a0, 
expirationTime = 38654705669}, init_data = 0x0, init_size = 0}, sec_creat_rsp = 
{error = 51}, active_sec_creat_rsp = {


ckpt_id = 51, sec_id = {idLen = 11, id = 0x64f0e0 "section_4_1"}, error = 
6390432, lcl_ckpt_id = 38654705669, agent_mdest = 0}, 


sec_delete_req = {ckpt_id = 51, sec_id = {idLen = 11, id = 0x64f0e0 
"section_4_1"}, error = 6390432, lcl_ckpt_id = 38654705669, 


agent_mdest = 0}, sec_delete_rsp = {error = 51}, sec_iter_req = {ckpt_id = 51}, 
sec_exp_set = {ckpt_id = 51, sec_id = {idLen = 11, 


id = 0x64f0e0 "section_4_1"}, exp_time = 6390432}, sec_exp_rsp = {error = 51}, 
sync_req = {ckpt_id = 51, lcl_ckpt_hdl = 11, 


client_hdl = 6615264, invocation = 6390432, cpa_sinfo = {to_svc = 5, dest = 0, 
stype = MDS_SENDTYPE_SND, ctxt = {length = 0 '\0', 


data = '\0' <repeats 11 times>}}, is_ckpt_open = false}, ckpt_nd2nd_sync = 
{type = 51, ckpt_id = 11, lcl_ckpt_id = 6615264, 


agent_mdest = 6390432, num_of_elmts = 5, all_repl_evt_flag = 9, data = 0x0, 
seqno = 0, last_seq = 0 '\0', ckpt_sync = {ckpt_id = 0, 


lcl_ckpt_hdl = 0, client_hdl = 0, invocation = 0, cpa_sinfo = {to_svc = 0, dest 
= 0, stype = MDS_SENDTYPE_SND, ctxt = {


length = 0 '\0', data = '\0' <repeats 11 times>}}, is_ckpt_open = false}}, 
active_sync_rsp = {error = 51}, ckpt_nd2nd_data = {


type = 51, ckpt_id = 11, lcl_ckpt_id = 6615264, agent_mdest = 6390432, 
num_of_elmts = 5, all_repl_evt_flag = 9, data = 0x0, 
seqno = 0, last_seq = 0 '\0', ckpt_sync = {ckpt_id = 0, lcl_ckpt_hdl = 0, 
client_hdl = 0, invocation = 0, cpa_sinfo = {to_svc = 0, 


dest = 0, stype = MDS_SENDTYPE_SND, ctxt = {length = 0 '\0', data = '\0' 
<repeats 11 times>}}, is_ckpt_open = false}}, 


ckpt_nd2nd_data_rsp = {type = 51, num_of_elmts = 0, size = 11, error = 0, 
ckpt_id = 6615264, error_index = 6390432, 


from_svc = 38654705669, info = {write_err_index = 0x0, read_mapping = 0x0, 
read_data = 0x0, ovwrite_error = {error = 0}}}, 


getnext_req = {ckpt_id = 51, section_id = {idLen = 11, id = 0x64f0e0 
"section_4_1"}, iter_id = 6390432, filter = SA_CKPT_SECTIONS_ANY, 


n_secs_trav = 9, exp_tmr = 0}, ckpt_nd2nd_getnext_rsp = {ckpt_id = 51, iter_id 
= 11, error = 6615264, sect_desc = {sectionId = {


idLen = 33440, id = 0x900000005 <Address 0x900000005 out of bounds>}, 
expirationTime = 0, sectionSize = 0, sectionState = 0, 


lastUpdate = 0}, n_secs_trav = 0}, mds_info = {change = 51, dest = 11, svc_id = 
6615264, node_id = 0, role = 6390432}, tmr_info = {


type = 51, ckpt_id = 11, lcl_sec_id = 6615264, agent_dest = 6390432, write_type 
= 5, sinfo = {to_svc = 0, dest = 0, 


stype = MDS_SENDTYPE_SND, ctxt = {length = 0 '\0', data = '\0' <repeats 11 
times>}}, invocation = 0, lcl_ckpt_hdl = 0, 


cpnd_tmr = 0x0}, ckptListUpdate = {client_hdl = 51, ckpt_name = {length = 11, 


value = 
"\000\000\000\000\000\000��d\000\000\000\000\000�\202a\000\000\000\000\000\005\000\000\000\t",
 '\0' <repeats 228 times>}}}}


(gdb) p *sinfo
$2 = {to_svc = 18, dest = 566314965155865, stype = MDS_SENDTYPE_SNDRSP, ctxt = {length = 12 '\f', data = "\000\000\001\n\000\002\003\017zT@\031"}}


The issue is reproducible with the attached application.
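
Frame #0 of the backtrace shows cpnd_proc_fill_sec_desc being entered with pTmpSecPtr=0x0, which suggests the crash is a NULL section pointer being dereferenced while the descriptor is filled. The sketch below only illustrates the general shape of a guard against that. It is not the attached ckpt.patch and not the actual OpenSAF code: the types, the function demo_fill_sec_desc, and the error codes are hypothetical, simplified stand-ins.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins. These are NOT the real cpnd structures. */
typedef struct {
    unsigned short id_len;    /* length of the section id */
    const char *id;           /* section id bytes */
    unsigned long long size;  /* section payload size */
} demo_section_t;

typedef struct {
    unsigned short id_len;
    const char *id;
    unsigned long long size;
} demo_sec_desc_t;

/* Hypothetical return codes used only by this sketch. */
enum { DEMO_OK = 0, DEMO_ERR_NO_SECTION = 1 };

/*
 * Copy section metadata into the descriptor. If the section lookup
 * returned no section (NULL), report an error to the caller instead
 * of dereferencing the pointer.
 */
static int demo_fill_sec_desc(const demo_section_t *sec, demo_sec_desc_t *desc)
{
    if (sec == NULL || desc == NULL) {
        return DEMO_ERR_NO_SECTION;
    }
    desc->id_len = sec->id_len;
    desc->id = sec->id;
    desc->size = sec->size;
    return DEMO_OK;
}
```

With a guard of this kind, the caller has to map the error to a sensible reply for the iterating client. Whether such a check alone is a complete fix depends on why the section pointer is NULL in the first place (for example, a section removed or replaced between two iteration requests), which the dump above does not show.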




