[slurm-dev] slurmctld crash

Sam Lang Fri, 02 Sep 2011 13:31:18 -0700


Hello,

I'm trying to debug a problem I've seen where the slurmctld process crashed. The usage scenario was a user trying to cancel and resubmit a bunch (100 or so) of jobs, but there's no simple use case that makes this easily reproducible; we've only seen it happen once so far. The stack trace where the crash occurred is:


(gdb) bt
#0  slurm_xfree (item=0x8, file=0x5408a5 "pack.c", line=127, func=0x53d483 "") at xmalloc.c:264
#1  0x0000000000494195 in free_buf (my_buf=0x0) at pack.c:127
#2  0x00000000004e9a61 in _handle_mult_rc_ret (x=<value optimized out>) at slurmdbd_defs.c:1664
#3  _agent (x=<value optimized out>) at slurmdbd_defs.c:2030
#4  0x00007f75aaa84971 in start_thread () from /lib/libpthread.so.0
#5  0x00007f75aa7e092d in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

The segfault looks to be caused by my_buf being null. From looking through the source in slurmdbd_defs.c, it looks as though slurmdbd is returning multiple response codes to slurmctld in a single message, but somehow the request queue holds fewer requests than there are response codes, so slurmctld ends up trying to dequeue from an empty queue.

I'm not sure how the queue can be empty here, but the following patch should prevent the crash by simply checking that the dequeue returns a non-null buffer to be freed.


Thanks,
-sam


diff --git a/src/common/slurmdbd_defs.c b/src/common/slurmdbd_defs.c
index a29c5b9..d0285e9 100644
--- a/src/common/slurmdbd_defs.c
+++ b/src/common/slurmdbd_defs.c

@@ -1672,7 +1672,8 @@ static int _handle_mult_rc_ret(uint16_t rpc_version, int read_timeout)
                                    != SLURM_SUCCESS)
                                        break;

-                               free_buf(list_dequeue(agent_list));
+                               Buf b = list_dequeue(agent_list);
+                               if(b) free_buf(b);
                        }
                        list_iterator_destroy(itr);
                }
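
For anyone who wants to see the failure mode outside of slurmctld, here is a minimal standalone sketch of the same guard. It assumes only what the backtrace suggests: that dequeuing from an empty list yields NULL and that the free routine dereferences its argument. All of the demo_* names and types are hypothetical stand-ins, not the real SLURM Buf, List, free_buf or list_dequeue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for SLURM's Buf and List; not the real types. */
typedef struct demo_buf {
    char *head;
} demo_buf_t;

typedef struct demo_node {
    demo_buf_t *buf;
    struct demo_node *next;
} demo_node_t;

typedef struct demo_queue {
    demo_node_t *first;
    demo_node_t *last;
} demo_queue_t;

/* Append a freshly allocated buffer; returns 0 on success, -1 on failure. */
static int demo_enqueue(demo_queue_t *q)
{
    assert(q != NULL);
    demo_node_t *n = malloc(sizeof(*n));
    demo_buf_t *b = malloc(sizeof(*b));
    if (!n || !b) {
        free(n);
        free(b);
        return -1;
    }
    b->head = malloc(16);
    if (!b->head) {
        free(n);
        free(b);
        return -1;
    }
    n->buf = b;
    n->next = NULL;
    if (q->last)
        q->last->next = n;
    else
        q->first = n;
    q->last = n;
    return 0;
}

/* Remove and return the oldest buffer, or NULL if the queue is empty. */
static demo_buf_t *demo_dequeue(demo_queue_t *q)
{
    assert(q != NULL);
    demo_node_t *n = q->first;
    if (!n)
        return NULL;
    demo_buf_t *b = n->buf;
    q->first = n->next;
    if (!q->first)
        q->last = NULL;
    free(n);
    return b;
}

/* Like the crashing path, this dereferences b and must not be given NULL. */
static void demo_free_buf(demo_buf_t *b)
{
    free(b->head);
    free(b);
}

/*
 * Process `count` return codes, dequeuing one request per code.
 * The NULL check mirrors the patch: an empty queue is skipped
 * instead of being passed to demo_free_buf().
 * Returns the number of buffers actually freed.
 */
static int demo_ack(demo_queue_t *q, int count)
{
    int freed = 0;
    for (int i = 0; i < count; i++) {
        demo_buf_t *b = demo_dequeue(q);
        if (b) {
            demo_free_buf(b);
            freed++;
        }
    }
    return freed;
}
```

With two queued buffers and five return codes, demo_ack() frees the two buffers and skips the three empty dequeues. Like the patch, this only avoids the crash; it does not explain why the counts disagree in the first place.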
