Hi Ben,

Thank you for the update. Have you tried compiling in the memory debugging support? It might speed up detection of the error.


Regards,

Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
http://www.opensips-solutions.com

On 05.08.2016 23:00, Newlin, Ben wrote:

Bogdan,

I tried to reproduce this with some simple instructions, but it didn’t reproduce. There must be some dependency on other functions or configurations we are using. It would be too hard to try to figure out exactly what that is, so I will have to capture the data for you. It will just take a while for me to revive an older configuration that allows me to enable the memory debugger.

Ben Newlin

*From: *Bogdan-Andrei Iancu <bog...@opensips.org>
*Date: *Tuesday, August 2, 2016 at 3:47 AM
*To: *"Newlin, Ben" <ben.new...@inin.com>, OpenSIPS users mailling list <users@lists.opensips.org>
*Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog crashes

Ben,

To make it easier, please send me the instructions on how to reproduce the crash.

Thanks and Regards,


Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
http://www.opensips-solutions.com

On 01.08.2016 20:17, Newlin, Ben wrote:

    Bogdan,

    I am not familiar with gdb, so I cannot double-check what you’ve
    assessed. If there are some other steps with gdb you would like me
    to perform, just let me know what to do.

    Is there a way to compile the memory debugger without using the
    interactive `make menuconfig` command? Our build system is
    completely automated, so it is impossible for me to do it this way.
    Can I pass the options as build parameters or alter the makefile
    in some way?

    I can provide a SIPp scenario which should reproduce the issue on
    any basic script that uses Dialog with topology hiding, if that
    would be easier.

    Ben Newlin

    *From: *Bogdan-Andrei Iancu <bog...@opensips.org>
    *Date: *Monday, August 1, 2016 at 10:57 AM
    *To: *OpenSIPS users mailling list <users@lists.opensips.org>, "Newlin, Ben" <ben.new...@inin.com>
    *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog crashes

    Hi Ben,

    According to the BT, the crash is in a pkg_malloc() call:
                        route = pkg_malloc(size);
    Please double-check this with gdb info.

    If so, this indicates memory corruption, and we have 2 options here:
        - you compile with the memory debugger (see my previous emails)
        - you provide step-by-step instructions on how to reproduce this crash.

    Thanks and Regards,


    Bogdan-Andrei Iancu

    OpenSIPS Founder and Developer

    http://www.opensips-solutions.com

    On 29.07.2016 15:54, Newlin, Ben wrote:

        This is 1.11.6, running on CentOS 7.

        Ben Newlin

        *From: *<users-boun...@lists.opensips.org> on behalf of Bogdan-Andrei Iancu <bog...@opensips.org>
        *Reply-To: *OpenSIPS users mailling list <users@lists.opensips.org>
        *Date: *Friday, July 29, 2016 at 8:50 AM
        *To: *"Newlin, Ben" <ben.new...@inin.com>, OpenSIPS users mailling list <users@lists.opensips.org>
        *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog crashes

        Ben,

        What OpenSIPS version is this (the crashing one)? 1.11 or 2.1?

        Regards,



        Bogdan-Andrei Iancu

        OpenSIPS Founder and Developer

        http://www.opensips-solutions.com

        On 27.07.2016 19:02, Newlin, Ben wrote:

            I have identified that these crashes are occurring when
            the far end system is not returning the Record-Route
            headers in the 200 OK response. The headers are present in
            the 180 response, but not the 200 OK. I have reproduced
            the scenario using SIPp and captured a SIP trace:
            http://pastebin.com/ckKk3EhY

            The crash occurs on receipt of the ACK request and attempt
            to match the dialog.
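
The condition described above (Record-Route present in the 180 but missing from the 200 OK) can be checked in a captured trace with a minimal Python sketch like the one below. It assumes each SIP message is available as raw text and ignores header folding and compact forms; the function names are hypothetical helpers, not part of OpenSIPS or SIPp.

```python
import re
from typing import List, Optional


def record_routes(message: str) -> List[str]:
    """Return the Record-Route header values found in a raw SIP message."""
    # Only look at the header section, i.e. everything before the first blank line.
    headers = message.replace("\r\n", "\n").split("\n\n", 1)[0]
    values = []
    for line in headers.split("\n")[1:]:  # skip the start line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "record-route":
            values.append(value.strip())
    return values


def status_code(message: str) -> Optional[int]:
    """Return the status code of a SIP response, or None for a request."""
    match = re.match(r"SIP/2\.0 (\d{3})", message)
    return int(match.group(1)) if match else None


def final_reply_dropped_rr(provisional: str, final: str) -> bool:
    """True when the provisional reply had Record-Route but the 2xx reply did not."""
    code = status_code(final)
    return (
        code is not None
        and 200 <= code < 300
        and bool(record_routes(provisional))
        and not record_routes(final)
    )
```

Feeding it the 180 and 200 OK of one call from the trace would flag the calls that are likely to hit this scenario.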

            I also captured a BT for this scenario, in case
            anything specific in the trace makes the issue easier to
            find: http://pastebin.com/cM3FhPiw

            I am working with the other system to try to fix their
            behavior.

            Ideally the Record-Route headers from previous replies
            could be used in this case to allow the call to succeed,
            but I don’t know if that is possible.

            Thanks,

            Ben Newlin

            *From: *"Newlin, Ben" <ben.new...@inin.com>
            *Date: *Wednesday, July 27, 2016 at 9:44 AM
            *To: *Bogdan-Andrei Iancu <bog...@opensips.org>, OpenSIPS users mailling list <users@lists.opensips.org>
            *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog crashes

            Bogdan,

            This is a different scenario than the other you responded
            to. As I said, we have two types of servers that work
            together. One is a load-balancer and runs as a proxy. It
            uses double Record-Route because it sends messages between
            public and private networks. Then we have our other
            servers using TH which receive those requests. We are not
            using TH and RR on the same server (although I would like to).

            If validate_dialog() and fix_route_dialog() (and possibly
            loose_route()) should not be called when using TH, I
            believe the documentation should reference that. It states
            that match_dialog() must be used with TH, but does not
            indicate that the other functions should not be used or
            that the functionality won’t work. There is also no
            documentation of the incompatibility between RR and TH.

            Either way, I ran a test where I removed all calls to
            loose_route(), validate_dialog(), and fix_route_dialog()
            from my script. The crash still occurred and the BT still
            pointed to the fix_route_dialog() function, so it must be
            getting called from somewhere within the Dialog module. That
            BT is here: http://pastebin.com/wu2X2Hxh

            I collected this BT with loose_route() being called from
            my script, but not validate_dialog() or
            fix_route_dialog(): http://pastebin.com/6V7yPaHF

            This BT was collected with all three functions being
            called from my script: http://pastebin.com/fZYYdndn

            Ben Newlin

            *From: *Bogdan-Andrei Iancu <bog...@opensips.org>
            *Date: *Wednesday, July 27, 2016 at 3:57 AM
            *To: *OpenSIPS users mailling list <users@lists.opensips.org>, "Newlin, Ben" <ben.new...@inin.com>
            *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog crashes

            Hi Ben,

            First, if you use TH, it makes no sense to do Record-Routing:
            these are two overlapping SIP concepts. You act either as an
            end-point (by doing TH) or as a proxy (by doing RR).

            If you do TH, it makes no sense to use validate + fix, as these
            functions check and repair the routing information in the
            request (like the Route and Contact headers). With TH, this
            routing info is actually hidden and added by OpenSIPS, so
            there is nothing to fix or repair.

            Nevertheless, this should not crash or corrupt OpenSIPS.
            Have you managed to get a corefile?

            Also, if you suspect memory corruption, you can compile in
            the memory debugger, as described at
            http://www.opensips.org/Documentation/TroubleShooting-OutOfMem

            Regards,





            Bogdan-Andrei Iancu

            OpenSIPS Founder and Developer

            http://www.opensips-solutions.com

            On 26.07.2016 23:20, Newlin, Ben wrote:

                I have had 3 OpenSIPS server crashes in the last week.
                All were due to segmentation faults. I was not able to
                capture core dumps; I am configuring that now to catch
                the next crash.

                My logs leading up to the crash are full of errors
                from fix_route_dialog() complaining about invalid URIs
                for sequential requests:

                Jul 26 19:34:02 [220] ERROR:dialog:fix_route_dialog: Failed to parse SIP uri
                Jul 26 19:34:02 [220] ERROR:core:parse_uri: bad uri, state 0 parsed: <ip:1> (4) / <ip:10.18.8.18:5060;ftag=gK0448f137;lr;r2=on>> (44)

                Jul 26 19:11:19 [218] ERROR:dialog:fix_route_dialog: Failed to parse SIP uri
                Jul 26 19:11:19 [218] ERROR:core:parse_uri: bad uri, state 0 parsed: <b0i2> (4) / <b0i2yjor;transport=udp<sip:10.18.8.17:5060;ftag=7207ce89;lr;r2=on> (65)

                Jul 26 17:43:19 [220] ERROR:dialog:fix_route_dialog: Failed to parse SIP uri
                Jul 26 17:43:19 [220] ERROR:core:parse_uri: bad uri, state 0 parsed: <ervi> (4) / <ervice_id6fdbc70f-2438-4726-807c-0d081df4d87> (44)

                Many times the “URI” displayed in the error message is
                actually internal OpenSIPS variables, as in the last
                error above. When they are from the SIP message, I
                have verified that the messages themselves are
                correctly formatted. This leads me to believe there is
                memory corruption occurring.
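
To get a feel for how often the logged "URI" is not a URI at all, the parse_uri errors can be filtered with a small script. The following is only a rough Python sketch: the regular expression is modelled on the log lines quoted above (one entry per line, unwrapped), and `suspicious_uris` is a hypothetical helper name, not an OpenSIPS tool.

```python
import re
from typing import Iterable, List

# Modelled on the "bad uri" entries quoted above:
#   ... parse_uri: bad uri, state N parsed: <prefix> (len) / <full text> (len)
BAD_URI = re.compile(
    r"ERROR:core:parse_uri: bad uri, state \d+ parsed: "
    r"<[^>]*> \(\d+\) / <(?P<full>.*)> \(\d+\)"
)


def looks_like_sip_uri(text: str) -> bool:
    """Very loose check: does the text at least start with a SIP URI scheme?"""
    return text.lower().startswith(("sip:", "sips:"))


def suspicious_uris(lines: Iterable[str]) -> List[str]:
    """Return the logged 'URI' strings that do not even start like a SIP URI."""
    found = []
    for line in lines:
        match = BAD_URI.search(line)
        if match and not looks_like_sip_uri(match.group("full")):
            found.append(match.group("full"))
    return found
```

Entries returned by this filter are the ones where the buffer handed to the parser held something other than a URI from the message, which is what one would expect to see if memory is being overwritten.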

                This all started when I updated my load-balancer
                servers to use Record-Routing, specifically the
                “double_rr” mechanism for when multiple interfaces
                exist. The Record-Routing is occurring on different
                servers which have not crashed. Only the servers
                receiving the Record-Routed messages are experiencing
                the errors.

                Here is a piece of the code processing sequential
                requests. I am using the topology_hiding()
                functionality of the Dialog module. Are
                validate_dialog() and fix_route_dialog() still valid
                in a topology_hiding scenario?

                  if (t_check_trans())
                    setflag(SEQ_REQUEST);

                  if (has_totag())
                  {
                    loose_route();

                    if (match_dialog())
                    {
                      if (!validate_dialog())
                        fix_route_dialog();

                      if (is_method("BYE"))
                        setflag(ACC_FLAG);

                      setflag(SEQ_REQUEST);
                    }
                    else if (!isflagset(SEQ_REQUEST))
                    {
                      if (!is_method("ACK")) {
                        route(rlog, LV_ERROR, "check_sequential", "Sequential request not matched");
                        route(reply_error, "481", "Call Does Not Exist");
                      }

                      return(EXIT);
                    }
                  }

                I will attempt to get core dumps of future crashes.

                Thanks,

                Ben Newlin








                _______________________________________________

                Users mailing list

                Users@lists.opensips.org <mailto:Users@lists.opensips.org>

                http://lists.opensips.org/cgi-bin/mailman/listinfo/users




