Hi Yury,

You need to check the TCP settings and make sure your OpenSIPS will (1) not attempt a TCP connect to destinations known to be unable to accept one (such as TCP/WS endpoints behind NAT) - see tcp_no_new_conn_bflag [1] - and (2) not block for a long time while attempting a connect - see tcp_connect_timeout [2], or consider enabling async [3].

[1] https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_no_new_conn_bflag
[2] https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_connect_timeout
[3] https://opensips.org/html/docs/modules/3.2.x/proto_tcp.html#idp168992
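
For illustration, the three settings could look roughly like this in opensips.cfg. This is only a sketch: the flag name TCP_NO_CONNECT and the timeout value are illustrative assumptions, not recommendations, and it assumes that [3] refers to the tcp_async parameter of proto_tcp. Please check the linked docs for your version before using any of it.

```
# Global (core) parameters.

# (1) Name a branch flag meaning "do not open a new TCP connection for
#     this branch". The flag name is only an example; the routing script
#     still has to set this branch flag on destinations known to be
#     behind NAT.
tcp_no_new_conn_bflag = TCP_NO_CONNECT

# (2) Upper bound on a blocking TCP connect attempt. The unit is
#     believed to be milliseconds (confirm against [2]); the value
#     here is only illustrative.
tcp_connect_timeout = 100

# (3) Async TCP connect/write in proto_tcp (assumed to be what [3]
#     points at).
loadmodule "proto_tcp.so"
modparam("proto_tcp", "tcp_async", 1)
```

With (1), a request towards a flagged destination fails quickly when no TCP connection already exists, instead of tying up a worker in a connect that cannot succeed.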


Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
OpenSIPS Summit 27-30 Sept 2022, Athens

On 9/13/22 12:01 PM, Yury Kirsanov wrote:
Hi Bogdan,
Thanks for this update, but it looks like I can't check the autoscaler because of this first issue with the blocking TCP connect. Is there a way to resolve it? Am I doing something wrong, or is it something in the OpenSIPS code? And yes, you're right: as soon as I restart OpenSIPS with a lot of SIP devices trying to connect to it, it goes crazy, starts consuming memory and stops forwarding packets, sitting at 100% load until it runs out of memory and segfaults. Sometimes I can't even restart it to get back to a normal state; it just loops into the same crash whatever I try.

I've compiled OpenSIPS 3.3.1 with your patch and was able to start it, but I'm not sure; maybe I was just lucky this time.

What should I do? Thanks!

Best regards,

On Tue, 13 Sept 2022, 18:56 Bogdan-Andrei Iancu, <bog...@opensips.org> wrote:

    Hi Yury,

    It looks like you have multiple overlapping issues here. The
    traps you sent have nothing to do with auto-scaling, but
    with a blocking TCP connect for SIP - most of the procs are
    blocked in a sync TCP connect.


    Bogdan-Andrei Iancu

    OpenSIPS Founder and Developer
    https://www.opensips-solutions.com
    OpenSIPS Summit 27-30 Sept 2022, Athens

    On 9/12/22 4:39 PM, Yury Kirsanov wrote:
    Hi Bogdan,
    I've applied the patch (I had to work out where to apply it manually
    for 3.2.8 downloaded from the web page: line 1568 instead of 1652)
    and restarted the server with only about 300-350 SIP devices, and
    immediately ran into the same issue. I'm attaching two GDB dumps made
    within several minutes of each other. Autoscale was OFF this time;
    please see my previous message, as for some reason I'm currently
    experiencing lockups even when it's off :(

    Best regards,

    On Mon, Sep 12, 2022 at 7:48 PM Bogdan-Andrei Iancu
    <bog...@opensips.org> wrote:

        Hi Yury,

        Could you give this patch a try? It should fix the blocking
        you are experiencing (it should apply to 3.2 too).

        Best regards,

        Bogdan-Andrei Iancu

        OpenSIPS Founder and Developer
        OpenSIPS Summit 27-30 Sept 2022, Athens

        On 9/7/22 2:54 PM, Bogdan-Andrei Iancu wrote:
        Hi Yury,

        Thanks for the detailed info here - let me review
        some code and run some tests, as at this point I have a good
        idea of the direction to dig into.

        I will update here.

        Best regards,
        Bogdan-Andrei Iancu

        OpenSIPS Founder and Developer
        OpenSIPS Summit 27-30 Sept 2022, Athens
        On 9/6/22 11:24 AM, Yury Kirsanov wrote:
        Hi Bogdan,
        Yes, I'm listening on all types of sockets, including UDP,
        TCP and TLS, on the outside public interface and then
        forward traffic into the internal LAN via UDP only.

        Previously it was getting stuck quite easily, now I had to
        wait for a while before this actually happened. I've routed
        part of my customers to this server to obtain this result
        so I will have to do that again.

        As soon as I see one of the processes stuck I'll do the
        trap command and send you all the details, including
        process load, ps output and so on.

        For now I had to switch autoscaling off and just create
        many listeners. Do I understand correctly that I need to
        restart OpenSIPS in order to apply autoscaling profiles and
        reload-routes is not sufficient?

        Also, do I need separate UDP profiles for public and
        private interfaces? And do I need to apply the autoscaling
        profile just to a socket, or do I need to specify udp_workers or
        tcp_workers with the autoscaler too?

        Thanks and best regards,

        On Tue, 6 Sept 2022, 18:18 Bogdan-Andrei Iancu,
        <bog...@opensips.org> wrote:

            Hi Yury,

            Thanks for the info. I see that the stuck process (24)
            is an auto-scaled one (based on its id). Do you have
            SIP traffic going from UDP to TCP, or are you doing some
            HEP capturing for SIP? I saw a recent similar report where
            a UDP auto-scaled worker got stuck when trying to
            communicate with the TCP main/manager process (in
            order to handle a TCP operation).

            BTW, any chance you could do an "opensips-cli -x trap" when
            you have that stuck process, just to see where it is stuck?
            And is it hard to reproduce? I may ask you to
            extract some information from the running process.


            Bogdan-Andrei Iancu

            OpenSIPS Founder and Developer
            OpenSIPS Summit 27-30 Sept 2022, Athens

            On 9/3/22 6:54 PM, Yury Kirsanov wrote:

        Users mailing list
        Users@lists.opensips.org
