Hello Vicente,

Thank you for your reply. You'll find my answers inline below.

On Thu, Nov 28, 2013 at 12:03 AM, Vicente Hernando <[email protected]> wrote:

> Hello,
>
> also full steps to crash kamailio and reproduce the error would be good.

Here is the architecture:

A <--> Asterisk <--> Kamailio 1 <---> Kamailio 2 <--- ISP ---> mobile

Kamailio 1 & 2 are connected to a local redis server.

1/ I restarted the redis server.
2/ From the mobile, I made a call to A and then cancelled it.

In the script of Kamailio 1, if a call is missed or fails, a message is sent to redis. In this case, Kamailio crashes.

> On 11/27/2013 11:35 PM, Daniel-Constantin Mierla wrote:
>
> Hello,
>
> can you give the full output for 'bt full' with gdb on the core file? You
> gave only partial list of the frames, not being enough to see the execution
> trace.
>
> Cheers,
> Daniel
>
> On 11/27/13 6:52 PM, Tuan Viet Nguyen wrote:
>
> Hello,
>
> I'll try to shut down the redis server to test the behavior of kamailio
> and it has crashed if a call is received and then cancelled.
>
> *1/The kamailio version is 4.0.4*
>
> *2/ Kamailio log *
> /usr/local/sbin/kamailio[25333]: ERROR: ndb_redis [redis_client.c:364]:
> redisc_exec(): Redis error: Server closed the connection
> /usr/local/sbin/kamailio[25361]: : <core> [pass_fd.c:293]: receive_fd():
> ERROR: receive_fd: EOF on 13
> /usr/local/sbin/kamailio[25328]: ALERT: <core> [main.c:788]:
> handle_sigs(): child process 25333 exited by a signal 11
> /usr/local/sbin/kamailio[25328]: ALERT: <core> [main.c:791]:
> handle_sigs(): core was generated
>
> I assume you disconnect redis server and don't reconnect it. It is
> that correct?
>
> Then this line is an error but it should recover from that. I probably
> should set this as a warning instead an error.
>
> /usr/local/sbin/kamailio[25333]: ERROR: ndb_redis [redis_client.c:364]:
> redisc_exec(): Redis error: Server closed the connection

Yes, it has been restarted.

> *3/ Interesting information in the core*
> #3 0x00007fc79412893d in redisvCommand (c=0x64657461, format=0x9 <Address
> 0x9 out of bounds>, ap=0x30, ap@entry=0x7fff0ff56aa8) at hiredis.c:1304
> No locals.
> #4 0x00007fc794341713 in redisc_exec (srv=srv@entry=0x7fff0ff56be0,
> res=res@entry=0x7fff0ff56c00, cmd=cmd@entry=0x7fff0ff56bf0) at
> redis_client.c:368
>     rsrv = 0x7fc794565150
>     rpl = 0x7fc7946fab70
>     c = 0 '\000'
>     ap = {{gp_offset = 48, fp_offset = 48, overflow_arg_area =
> 0x7fff0ff56bb0, reg_save_area = 0x7fff0ff56ac0}}
>     __FUNCTION__ = "redisc_exec"
> #5 0x00007fc79433b781 in w_redis_cmd5 (msg=<optimized out>,
> ssrv=<optimized out>, scmd=<optimized out>, sargv1=<optimized out>,
> sargv2=0x7fc7946f7bf0 "p\243_\224\307\177", sres=0x7fc7946f7c50 "
> \253_\224\307\177") at ndb_redis_mod.c:250
>     s = {{s = 0x7fc7945fb300 "kamailio_redis", len = 14}, {s =
> 0x7fc7945f5f50 "PUBLISH %s %s", len = 13}, {s = 0x7fc7945fab20 "r", len =
> 1}}
>     arg1 = {s = 0x7fc7945f5f80 "notification", len = 12}
>     arg2 = {
>       s = 0x7fc794551c60 "info XXX"...,
>       len = 212}
>     c1 = 0 '\000'
>     c2 = 0 '\000'
>     __FUNCTION__ = "w_redis_cmd5"
>
> In the source code:
>
> rpl->rplRedis = redisvCommand(rsrv->ctxRedis, cmd->s, ap );
> if(rpl->rplRedis == NULL)
> {
>     /* null reply, reconnect and try again */
>     if(rsrv->ctxRedis->err)
>     {
>         LM_ERR("Redis error: %s\n", rsrv->ctxRedis->errstr);
>     }
>     if(redisc_reconnect_server(rsrv)==0)
>     {
>         rpl->rplRedis = redisvCommand(rsrv->ctxRedis, cmd->s, ap);
>     }
> }
>
> First redisvCommand executes but returns nothing. Then it shows a redis
> error.
>
> It tries to reconnect and it manages to connect ?? because it shows no
> more errors.
>
> And then executes redisvCommand again and crashes.
>
> If server is down it should not be able to connect and so not to execute
> redisvCommand again.
>

According to the core, we MUST be in this case:
*if(redisc_reconnect_server(rsrv)==0)*

But I am wondering how the first redisvCommand can succeed before the reconnection (the connection kamailio1 <-> redis has already been taken down). Is the whole redis context always present when we first call redisvCommand?

>
> May be I would get more clues with more information.
>
> Regards,
> Vicente.

Thank you
Regards,

> I've found one of post that this issue has been fixed but it seems
> that it's always the case ..
>
> http://www.mail-archive.com/[email protected]&q=subject:%22Re%3A+%5BSR-Users%5D+ndb_redis+module+fails+after+a+while%22
>
> Do you have any idea?
> Thank you
>
> _______________________________________________
> sr-dev mailing
> [email protected]
> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>
> --
> Daniel-Constantin Mierla - http://www.asipto.com
> http://twitter.com/#!/miconda
> - http://www.linkedin.com/in/miconda
>
> _______________________________________________
> sr-dev mailing
> [email protected]
> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
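
One detail in the quoted snippet may be worth checking, although nothing in this thread confirms it as the cause of the crash: the same va_list `ap` is passed to redisvCommand twice. In C, once a callee has consumed a va_list with va_arg, its value in the caller is indeterminate, so handing it to a second call is undefined behavior. Below is a minimal, self-contained sketch of the usual way to retry a variadic call, taking a copy with va_copy before the first attempt. The names `fake_vcommand`, `fake_reconnect`, `exec_with_retry` and the `connected` flag are hypothetical stand-ins made up for illustration. They are not hiredis or ndb_redis functions.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical connection state: 0 = down, 1 = up. */
static int connected = 0;

/* Hypothetical stand-in for a va_list based command call.
 * It formats the command into out (consuming ap) and returns
 * -1 while the fake connection is down, 0 otherwise. */
static int fake_vcommand(char *out, size_t n, const char *fmt, va_list ap)
{
    vsnprintf(out, n, fmt, ap);   /* ap is consumed here */
    return connected ? 0 : -1;
}

/* Hypothetical reconnect: always succeeds in this sketch. */
static int fake_reconnect(void)
{
    connected = 1;
    return 0;
}

/* Run the command once; on failure, reconnect and retry with a
 * fresh copy of the argument list instead of the consumed one. */
static int exec_with_retry(char *out, size_t n, const char *fmt, ...)
{
    va_list ap, ap_retry;
    int rc;

    va_start(ap, fmt);
    va_copy(ap_retry, ap);        /* copy BEFORE the first use of ap */

    rc = fake_vcommand(out, n, fmt, ap);
    if (rc != 0 && fake_reconnect() == 0) {
        /* ap is indeterminate now; only the copy is safe to use */
        rc = fake_vcommand(out, n, fmt, ap_retry);
    }

    va_end(ap_retry);
    va_end(ap);
    return rc;
}
```

With `connected` set to 0, a call such as `exec_with_retry(buf, sizeof(buf), "PUBLISH %s %s", "notification", "info")` fails on the first attempt, reconnects, and formats the command again from the copied list. Whether the real redisc_exec needs the same treatment is an assumption that would have to be checked against the ndb_redis source and the full 'bt full' output.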
_______________________________________________ sr-dev mailing list [email protected] http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
