Hello,

could you test this patch and confirm the bug has disappeared?

Thanks,
Vicente.

On 11/28/2013 11:10 AM, Tuan Viet Nguyen wrote:
Hi Vicente,

Thank you for your quick reply.

I'm ready to retest the patch.

Regards,


On Thu, Nov 28, 2013 at 11:07 AM, Vicente Hernando <[email protected]> wrote:

    Hello,

    I think you have discovered a bug I introduced when using variadic
    functions.

    I am going to send a patch to correct it very soon.


    Thanks,
    Vicente.


    On 11/28/2013 10:14 AM, Tuan Viet Nguyen wrote:
    Hello Vicente,

    Thank you for your reply. You'll find my answers below.

    On Thu, Nov 28, 2013 at 12:03 AM, Vicente Hernando
    <[email protected]> wrote:

        Hello,

        the full steps to crash kamailio and reproduce the error
        would also be helpful.


    Here is the architecture:

    A <--> Asterisk <--> Kamailio 1 <---> kamailio2 <--- ISP---> mobile

    Kamailio 1 and 2 are connected to a local redis server.
    1/ I restarted the redis server.
    2/ From the mobile I made a call to A, then cancelled it. In the
    script of kamailio1, if a call is missed or fails, it sends a
    message to redis. In this case, it crashes.





        On 11/27/2013 11:35 PM, Daniel-Constantin Mierla wrote:
        Hello,

        can you give the full output of 'bt full' with gdb on the
        core file? You gave only a partial list of the frames, which
        is not enough to see the execution trace.

        Cheers,
        Daniel

        On 11/27/13 6:52 PM, Tuan Viet Nguyen wrote:
        Hello,

        I tried shutting down the redis server to test the behavior
        of kamailio, and it crashed when a call was received and
        then cancelled.

        *1/The kamailio version is 4.0.4*

        *2/ Kamailio log *
        /usr/local/sbin/kamailio[25333]: ERROR: ndb_redis
        [redis_client.c:364]: redisc_exec(): Redis error: Server
        closed the connection
        /usr/local/sbin/kamailio[25361]: : <core> [pass_fd.c:293]:
        receive_fd(): ERROR: receive_fd: EOF on 13
        /usr/local/sbin/kamailio[25328]: ALERT: <core>
        [main.c:788]: handle_sigs(): child process 25333 exited by
        a signal 11
        /usr/local/sbin/kamailio[25328]: ALERT: <core>
        [main.c:791]: handle_sigs(): core was generated

        I assume you disconnected the redis server and didn't
        reconnect it. Is that correct?

        Then this line reports an error, but kamailio should recover
        from it. I should probably log this as a warning instead of
        an error.

        /usr/local/sbin/kamailio[25333]: ERROR: ndb_redis
        [redis_client.c:364]: redisc_exec(): Redis error: Server
        closed the connection


    Yes, it has been restarted


        _*3/ Interesting information in the core*_
        #3  0x00007fc79412893d in redisvCommand (c=0x64657461,
        format=0x9 <Address 0x9 out of bounds>, ap=0x30,
        ap@entry=0x7fff0ff56aa8) at hiredis.c:1304
        No locals.
        #4  0x00007fc794341713 in redisc_exec
        (srv=srv@entry=0x7fff0ff56be0,
        res=res@entry=0x7fff0ff56c00, cmd=cmd@entry=0x7fff0ff56bf0)
        at redis_client.c:368
                rsrv = 0x7fc794565150
                rpl = 0x7fc7946fab70
                c = 0 '\000'
                ap = {{gp_offset = 48, fp_offset = 48,
        overflow_arg_area = 0x7fff0ff56bb0, reg_save_area =
        0x7fff0ff56ac0}}
                __FUNCTION__ = "redisc_exec"
        #5  0x00007fc79433b781 in w_redis_cmd5 (msg=<optimized
        out>, ssrv=<optimized out>, scmd=<optimized out>,
        sargv1=<optimized out>, sargv2=0x7fc7946f7bf0
        "p\243_\224\307\177", sres=0x7fc7946f7c50 "
        \253_\224\307\177") at ndb_redis_mod.c:250
                s = {{s = 0x7fc7945fb300 "kamailio_redis", len =
        14}, {s = 0x7fc7945f5f50 "PUBLISH %s %s", len = 13}, {s =
        0x7fc7945fab20 "r", len = 1}}
                arg1 = {s = 0x7fc7945f5f80 "notification", len = 12}
                arg2 = {
                  s = 0x7fc794551c60 "info XXX"...,
                  len = 212}
                c1 = 0 '\000'
                c2 = 0 '\000'
                __FUNCTION__ = "w_redis_cmd5"


        In the source code:

            rpl->rplRedis = redisvCommand(rsrv->ctxRedis, cmd->s, ap );
            if(rpl->rplRedis == NULL)
            {
                /* null reply, reconnect and try again */
                if(rsrv->ctxRedis->err)
                {
                    LM_ERR("Redis error: %s\n", rsrv->ctxRedis->errstr);
                }
                if(redisc_reconnect_server(rsrv)==0)
                {
                    rpl->rplRedis = redisvCommand(rsrv->ctxRedis, cmd->s, ap);
                }
            }

        The first redisvCommand call executes but returns NULL, and
        the redis error is logged.

        It then tries to reconnect, and it apparently succeeds (??),
        because no further errors are shown.

        Then it executes redisvCommand again and crashes.

        If the server is down, it should not be able to reconnect,
        and so it should not execute redisvCommand again.


    According to the core, we MUST be in this case:
    *if(redisc_reconnect_server(rsrv)==0)*

    But I am wondering how the first redisvCommand can succeed before
    the reconnection? (The connection kamailio1 <-> redis has
    already been taken down.) Is the whole redis context always there
    when we first call redisvCommand?



        Maybe I would get more clues with more information.

        Regards,
        Vicente.


    Thank you
    Regards,



        I've found a post saying that this issue has been fixed, but
        it seems that it still occurs:
        
http://www.mail-archive.com/[email protected]&q=subject:%22Re%3A+%5BSR-Users%5D+ndb_redis+module+fails+after+a+while%22

        Do you have any idea?
        Thank you


        _______________________________________________
        sr-dev mailing list
        [email protected]  <mailto:[email protected]>
        http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev

        --
        Daniel-Constantin Mierla - http://www.asipto.com
        http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda



diff --git a/modules/ndb_redis/redis_client.c b/modules/ndb_redis/redis_client.c
index 0ba083a..6cf7f6f 100644
--- a/modules/ndb_redis/redis_client.c
+++ b/modules/ndb_redis/redis_client.c
@@ -316,9 +316,10 @@ int redisc_exec(str *srv, str *res, str *cmd, ...)
 	redisc_server_t *rsrv=NULL;
 	redisc_reply_t *rpl;
 	char c;
-	va_list ap;
+	va_list ap, ap2;
 
 	va_start(ap, cmd);
+	va_copy(ap2, ap);
 
 	rsrv = redisc_get_server(srv);
 	if(srv==NULL || cmd==NULL || res==NULL)
@@ -365,7 +366,7 @@ int redisc_exec(str *srv, str *res, str *cmd, ...)
 		}
 		if(redisc_reconnect_server(rsrv)==0)
 		{
-			rpl->rplRedis = redisvCommand(rsrv->ctxRedis, cmd->s, ap);
+			rpl->rplRedis = redisvCommand(rsrv->ctxRedis, cmd->s, ap2);
 		} else {
 			LM_ERR("unable to reconnect to redis server: %.*s\n", srv->len, srv->s);
 			cmd->s[cmd->len] = c;
@@ -374,10 +375,12 @@ int redisc_exec(str *srv, str *res, str *cmd, ...)
 	}
 	cmd->s[cmd->len] = c;
 	va_end(ap);
+	va_end(ap2);
 	return 0;
 
 error_exec:
 	va_end(ap);
+	va_end(ap2);
 	return -1;
 
 }
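
For readers following the patch: the C standard says that once a va_list has been handed to a function that consumes it, its value in the caller is indeterminate, so it cannot be used for a second pass. A minimal standalone sketch of the va_copy pattern used above follows. The names format_once and format_with_retry are hypothetical and exist only for this illustration; vsnprintf stands in for redisvCommand, and nothing here is taken from the ndb_redis sources.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for redisvCommand(): it consumes the va_list
 * it is given while formatting into buf. */
static int format_once(char *buf, size_t len, const char *fmt, va_list ap)
{
    return vsnprintf(buf, len, fmt, ap);
}

/* Formats the same variadic arguments twice, as a "first try" and a
 * "retry". Returns the formatted length if both passes produced the
 * same text, or -1 otherwise. */
int format_with_retry(char *first, char *second, size_t len,
                      const char *fmt, ...)
{
    va_list ap, ap2;
    int n1, n2;

    assert(fmt != NULL);

    va_start(ap, fmt);
    va_copy(ap2, ap);   /* take the copy before ap is consumed */

    /* After this call the value of ap is indeterminate. */
    n1 = format_once(first, len, fmt, ap);

    /* The retry reads from the untouched copy, not from ap. */
    n2 = format_once(second, len, fmt, ap2);

    va_end(ap2);
    va_end(ap);

    if (n1 != n2 || strcmp(first, second) != 0)
        return -1;
    return n1;
}
```

Calling format_with_retry(a, b, sizeof(a), "PUBLISH %s %s", "notification", "info") with two 64-byte buffers should leave the same string in both. Passing ap to the second call instead of ap2 is undefined behaviour, which matches the kind of crash seen in the backtrace.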
