OK, thank you for your help. I've updated the source.

Regards,
On Thu, Nov 28, 2013 at 12:36 PM, Vicente Hernando <[email protected]> wrote:

> Hello Nguyen,
>
> I have uploaded the patch in devel, 4.0, and 4.1 versions.
>
> Regards,
> Vicente.
>
> On 11/28/2013 12:07 PM, Tuan Viet Nguyen wrote:
>
> Hi Vicente,
>
> It works now. Thank you for the patch. In which version will we have this
> one integrated ?
>
> Regards,
>
> On Thu, Nov 28, 2013 at 11:36 AM, Vicente Hernando <[email protected]> wrote:
>
>> Hello,
>>
>> could you test this patch and confirm the bug has disappeared?
>>
>> Thanks,
>> Vicente.
>>
>> On 11/28/2013 11:10 AM, Tuan Viet Nguyen wrote:
>>
>> Hi Vicente,
>>
>> Thank you for your quick reply.
>>
>> I'm ready to retest the patch.
>>
>> Regards,
>>
>> On Thu, Nov 28, 2013 at 11:07 AM, Vicente Hernando <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> I think you have discovered a bug I made using variadic functions.
>>>
>>> Very soon I gonna send a patch to correct it.
>>>
>>> Thanks,
>>> Vicente.
>>>
>>> On 11/28/2013 10:14 AM, Tuan Viet Nguyen wrote:
>>>
>>> Hello Vicente,
>>>
>>> Thank you for your reply, you'll find my answer below
>>>
>>> On Thu, Nov 28, 2013 at 12:03 AM, Vicente Hernando <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> also full steps to crash kamailio and reproduce the error would be good.
>>>
>>> Here is the architecture
>>>
>>> A <--> Asterisk <--> Kamailio 1 <---> kamailio2 <--- ISP---> mobile
>>>
>>> Kamailio 1 & 2 are connected to a local redis server
>>> 1/ I restarted the redis server
>>> 2/ From the mobile I made a call to A then cancelled it. In the script
>>> of kamailio1, if a call has missed or failed, it sends a message to the
>>> redis. And in this case, it crashes
>>>
>>>> On 11/27/2013 11:35 PM, Daniel-Constantin Mierla wrote:
>>>>
>>>> Hello,
>>>>
>>>> can you give the full output for 'bt full' with gdb on the core file?
>>>> You gave only partial list of the frames, not being enough to see the
>>>> execution trace.
>>>>
>>>> Cheers,
>>>> Daniel
>>>>
>>>> On 11/27/13 6:52 PM, Tuan Viet Nguyen wrote:
>>>>
>>>> Hello,
>>>>
>>>> I'll try to shut down the redis server to test the behavior of
>>>> kamailio and it has crashed if a call is received and then cancelled.
>>>>
>>>> *1/The kamailio version is 4.0.4*
>>>>
>>>> *2/ Kamailio log *
>>>> /usr/local/sbin/kamailio[25333]: ERROR: ndb_redis [redis_client.c:364]: redisc_exec(): Redis error: Server closed the connection
>>>> /usr/local/sbin/kamailio[25361]: : <core> [pass_fd.c:293]: receive_fd(): ERROR: receive_fd: EOF on 13
>>>> /usr/local/sbin/kamailio[25328]: ALERT: <core> [main.c:788]: handle_sigs(): child process 25333 exited by a signal 11
>>>> /usr/local/sbin/kamailio[25328]: ALERT: <core> [main.c:791]: handle_sigs(): core was generated
>>>>
>>>> I assume you disconnect redis server and don't reconnect it. It is
>>>> that correct?
>>>>
>>>> Then this line is an error but it should recover from that. I probably
>>>> should set this as a warning instead an error.
>>>>
>>>> /usr/local/sbin/kamailio[25333]: ERROR: ndb_redis [redis_client.c:364]: redisc_exec(): Redis error: Server closed the connection
>>>
>>> Yes, it has been restarted
>>>
>>>> *3/ Interesting information in the core*
>>>> #3 0x00007fc79412893d in redisvCommand (c=0x64657461, format=0x9 <Address 0x9 out of bounds>, ap=0x30, ap@entry=0x7fff0ff56aa8) at hiredis.c:1304
>>>> No locals.
>>>> #4 0x00007fc794341713 in redisc_exec (srv=srv@entry=0x7fff0ff56be0, res=res@entry=0x7fff0ff56c00, cmd=cmd@entry=0x7fff0ff56bf0) at redis_client.c:368
>>>>         rsrv = 0x7fc794565150
>>>>         rpl = 0x7fc7946fab70
>>>>         c = 0 '\000'
>>>>         ap = {{gp_offset = 48, fp_offset = 48, overflow_arg_area = 0x7fff0ff56bb0, reg_save_area = 0x7fff0ff56ac0}}
>>>>         __FUNCTION__ = "redisc_exec"
>>>> #5 0x00007fc79433b781 in w_redis_cmd5 (msg=<optimized out>, ssrv=<optimized out>, scmd=<optimized out>, sargv1=<optimized out>, sargv2=0x7fc7946f7bf0 "p\243_\224\307\177", sres=0x7fc7946f7c50 " \253_\224\307\177") at ndb_redis_mod.c:250
>>>>         s = {{s = 0x7fc7945fb300 "kamailio_redis", len = 14}, {s = 0x7fc7945f5f50 "PUBLISH %s %s", len = 13}, {s = 0x7fc7945fab20 "r", len = 1}}
>>>>         arg1 = {s = 0x7fc7945f5f80 "notification", len = 12}
>>>>         arg2 = {
>>>>           s = 0x7fc794551c60 "info XXX"...,
>>>>           len = 212}
>>>>         c1 = 0 '\000'
>>>>         c2 = 0 '\000'
>>>>         __FUNCTION__ = "w_redis_cmd5"
>>>>
>>>> In the source code:
>>>>
>>>>     rpl->rplRedis = redisvCommand(rsrv->ctxRedis, cmd->s, ap );
>>>>     if(rpl->rplRedis == NULL)
>>>>     {
>>>>         /* null reply, reconnect and try again */
>>>>         if(rsrv->ctxRedis->err)
>>>>         {
>>>>             LM_ERR("Redis error: %s\n", rsrv->ctxRedis->errstr);
>>>>         }
>>>>         if(redisc_reconnect_server(rsrv)==0)
>>>>         {
>>>>             rpl->rplRedis = redisvCommand(rsrv->ctxRedis, cmd->s, ap);
>>>>         }
>>>>     }
>>>>
>>>> First redisvCommand executes but returns nothing. Then it shows a redis
>>>> error.
>>>>
>>>> It tries to reconnect and it manages to connect ?? because it shows no
>>>> more errors.
>>>>
>>>> And then executes redisvCommand again and crashes.
>>>>
>>>> If server is down it should not be able to connect and so not to
>>>> execute redisvCommand again.
>>>
>>> According to the core, we MUST be in this case
>>> *if(redisc_reconnect_server(rsrv)==0) *
>>> But I am wondering how the first redisvCommand can succeed before the
>>> reconnection ? (the connection kamailio1 <-> redis has already been taken
>>> down). Does all the redis context always there when we first call
>>> redisvCommand?
>>>
>>>> May be I would get more clues with more information.
>>>>
>>>> Regards,
>>>> Vicente.
>>>
>>> Thank you
>>> Regards,
>>>
>>>> I've found one of post that this issue has been fixed but it seems
>>>> that it's always the case ..
>>>>
>>>> http://www.mail-archive.com/[email protected]&q=subject:%22Re%3A+%5BSR-Users%5D+ndb_redis+module+fails+after+a+while%22
>>>>
>>>> Do you have any idea?
>>>> Thank you
>>>>
>>>> _______________________________________________
>>>> sr-dev mailing
>>>> [email protected]
>>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>>>>
>>>> --
>>>> Daniel-Constantin Mierla - http://www.asipto.com
>>>> http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
>>>>
>>>> _______________________________________________
>>>> sr-dev mailing
>>>> [email protected]
>>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
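
For anyone reading this thread later: the retry quoted above passes the same `ap` to redisvCommand a second time. In C, once a `va_list` has been handed to a function that reads arguments from it, its value in the caller is indeterminate, so reusing it is undefined behaviour. The usual remedy is `va_copy`. Below is a minimal sketch of that pattern only; it is not the committed patch, and `fake_vcommand` and `exec_with_retry` are hypothetical names that stand in for the hiredis call and the module wrapper.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a v*-style call such as redisvCommand():
 * it consumes `ap` while formatting and reports failure when `fail` is set. */
int fake_vcommand(int fail, char *out, size_t outlen,
                  const char *fmt, va_list ap)
{
    int n = vsnprintf(out, outlen, fmt, ap);   /* reads arguments from ap */
    if (fail || n < 0 || (size_t)n >= outlen)
        return -1;
    return 0;
}

/* Hypothetical wrapper: one attempt, then one retry after a failure. */
int exec_with_retry(char *out, size_t outlen, const char *fmt, ...)
{
    va_list ap, ap_retry;
    int rc;

    va_start(ap, fmt);
    va_copy(ap_retry, ap);   /* take the copy before ap is consumed */

    /* First attempt: failure is simulated, as if the server had closed
     * the connection. */
    rc = fake_vcommand(1, out, outlen, fmt, ap);
    if (rc != 0) {
        /* A real client would reconnect here, then retry with the
         * untouched copy instead of the already-consumed ap. */
        rc = fake_vcommand(0, out, outlen, fmt, ap_retry);
    }

    va_end(ap_retry);
    va_end(ap);
    return rc;
}
```

Calling `exec_with_retry(buf, sizeof buf, "PUBLISH %s %s", "notification", "info")` formats the command correctly on the second attempt because the retry reads from its own copy of the argument list.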
_______________________________________________ sr-dev mailing list [email protected] http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
