### Description

We now run Kamailio at the edge as a presence proxy in order to scale presence.
Three internal Kamailio instances handle the actual subscriptions and
publishing. This works well; however, roughly once every 24-36 hours the border
proxy instance crashes. We did not see this when the border proxy itself acted
as the presence server (when no SUBSCRIBE requests were proxied).

### Troubleshooting

#### Reproduction

I believe this happens when a customer's network intermittently has issues such
that it does not receive our replies to its SUBSCRIBE requests (and so keeps
resending the SUBSCRIBE). Logs are below.
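
One way to test this hypothesis in a lab is to emulate a client that never sees the replies and therefore keeps retransmitting the same SUBSCRIBE. The sketch below is only an assumption about what such a client does: it follows the default RFC 3261 non-INVITE retransmission schedule over UDP (T1 = 500 ms, T2 = 4 s, giving up before Timer F = 64*T1). The function names, the `Event: presence` package, the `Expires` value, and all header contents are hypothetical placeholders, not values taken from the affected customer.

```python
import socket
import time


def retransmit_intervals(t1=0.5, t2=4.0):
    """Waits between sends of a non-INVITE request over UDP (RFC 3261 Timer E).

    The wait starts at T1 and doubles up to T2; retransmission stops
    before Timer F (64*T1) would expire.
    """
    timer_f = 64 * t1
    intervals, elapsed, wait = [], 0.0, t1
    while elapsed + wait < timer_f:
        intervals.append(wait)
        elapsed += wait
        wait = min(2 * wait, t2)
    return intervals


def build_subscribe(from_user, to_user, domain, local_ip, local_port,
                    branch, call_id, from_tag):
    """Build a minimal initial SUBSCRIBE; all header values are illustrative."""
    lines = [
        f"SUBSCRIBE sip:{to_user}@{domain} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch={branch};rport",
        "Max-Forwards: 70",
        f"From: <sip:{from_user}@{domain}>;tag={from_tag}",
        f"To: <sip:{to_user}@{domain}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 SUBSCRIBE",
        f"Contact: <sip:{from_user}@{local_ip}:{local_port}>",
        "Event: presence",   # assumption: the real event package may differ
        "Expires: 3600",     # assumption
        "Content-Length: 0",
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_ignoring_replies(target, payload, local_port=0):
    """Send payload, then retransmit it unchanged without ever reading replies."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", local_port))
        sock.sendto(payload, target)
        for wait in retransmit_intervals():
            time.sleep(wait)
            sock.sendto(payload, target)
```

Calling `send_ignoring_replies(("192.0.2.10", 5060), build_subscribe(...))` against a test proxy (the address here is a documentation placeholder) sends the request eleven times over about 31.5 seconds. Whether this alone triggers the crash is untested.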

#### Debugging Data

Here is the gdb output, with sensitive info in the payload redacted.
```
GNU gdb (Debian 13.1-3) 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/kamailio...
Reading symbols from 
/usr/lib/debug/.build-id/ce/c19d0ed1a928e4e25a84ae20d77e9c904a4892.debug...

warning: Can't open file /dev/zero (deleted) during file-backed mapping note 
processing
[New LWP 26]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `kamailio -DD -E -m 8000 -M 512 -f 
/etc/kamailio/kamailio.cfg -P /var/run/kamail'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  via_builder (len=len@entry=0x7ffed7e3ecb8, msg=msg@entry=0x7ff9529548b0, 
send_info=send_info@entry=0x7ff764d20bb0, branch=branch@entry=0x7ffed7e3ecc0, 
extra_params=0x0, 
    hp=hp@entry=0x7ffed7e3ebb0) at core/msg_translator.c:2910
2910    core/msg_translator.c: No such file or directory.
(gdb) bt full
#0  via_builder (len=len@entry=0x7ffed7e3ecb8, msg=msg@entry=0x7ff9529548b0, 
send_info=send_info@entry=0x7ff764d20bb0, branch=branch@entry=0x7ffed7e3ecc0, 
extra_params=0x0, 
    hp=hp@entry=0x7ffed7e3ebb0) at core/msg_translator.c:2910
        via_len = <optimized out>
        extra_len = <optimized out>
        line_buf = <optimized out>
        max_len = <optimized out>
        via_prefix_len = <optimized out>
        address_str = <optimized out>
        port_str = 0x0
        send_sock = 0x0
        comp_len = <optimized out>
        comp_name_len = <optimized out>
        port = <optimized out>
        proto = <optimized out>
        ip = {af = 536870912, len = 0, u = {addrl = {534820896, 1586192}, 
addr32 = {534820896, 0, 1586192, 0}, addr16 = {47136, 8160, 0, 0, 13328, 24, 0, 
0}, 
            addr = " \270\340\037\000\000\000\000\0204\030\000\000\000\000"}}
        from = 0x0
        local_addr = {s = {sa_family = 16, sa_data = 
"#\000\000\000\000\000\020\000\000\000\000\000\000"}, sin = {sin_family = 16, 
sin_port = 35, sin_addr = {s_addr = 0}, 
            sin_zero = "\020\000\000\000\000\000\000"}, sin6 = {sin6_family = 
16, sin6_port = 35, sin6_flowinfo = 0, sin6_addr = {__in6_u = {
                __u6_addr8 = 
"\020\000\000\000\000\000\000\000\003\000\000\000\000\000\000", __u6_addr16 = 
{16, 0, 0, 0, 3, 0, 0, 0}, __u6_addr32 = {16, 0, 3, 0}}}, 
            sin6_scope_id = 3000082688}, sas = {ss_family = 16, 
            __ss_padding = 
"#\000\000\000\000\000\020\000\000\000\000\000\000\000\003\000\000\000\000\000\000\000\000\241\321\262cg\357`@\000\000\000\000\000\000\000X\f-\177!V\000\000\020\016\217R\371\177\000\000\001\000\000\000!V\000\000*\000\000\000\000\000\000\000\001\000\000\000\376\177\000\000\004\360;\177!V\000\000\250,A\177!V\000\000\000\355\343\327\376\177\000\000\000\241\321\262cg\357`h\256\006R\371\177\000",
 __ss_align = 94701866020241}}
        con = 0x0
        rxavp = 0x0
        xname = {s = 0x0, len = -2141140176}
        __func__ = "via_builder"
#1  0x000056217f155030 in create_via_hf (len=len@entry=0x7ffed7e3ecb8, 
msg=msg@entry=0x7ff9529548b0, send_info=send_info@entry=0x7ff764d20bb0, 
branch=branch@entry=0x7ffed7e3ecc0)
    at core/msg_translator.c:3232
        via = <optimized out>
        extra_params = {s = 0x0, len = 0}
        hp = {host = 0x56217f4fb380 <default_global_address>, port = 
0x56217f4fb370 <default_global_port>}
        sbuf = "\331\023\000\000\001", '\000' <repeats 18 times>
--Type <RET> for more, q to quit, c to continue without paging--info locals
        slen = <optimized out>
        xparams = <optimized out>
        id_buf = <optimized out>
        id_len = 0
        __func__ = "create_via_hf"
        __llevel = <optimized out>
        __kld = <optimized out>
#2  0x000056217f1592f7 in build_req_buf_from_sip_req 
(msg=msg@entry=0x7ff9529548b0, returned_len=returned_len@entry=0x7ffed7e3eeac, 
send_info=send_info@entry=0x7ff764d20bb0, 
    mode=mode@entry=128) at core/msg_translator.c:2096
        len = 581
        new_len = <optimized out>
        received_len = 0
        rport_len = 0
        uri_len = 0
        via_len = 0
        body_delta = 0
        line_buf = 0x0
        received_buf = 0x0
        rport_buf = 0x0
        new_buf = 0x0
        buf = 0x56217f5866c0 <buf> "SUBSCRIBE sip:7...@sip.domain.co 
SIP/2.0\r\nVia: SIP/2.0/UDP 
12.171.207.82:25744;branch=z9hG4bK1805670803;rport\r\nFrom: \"Heather Myers\" 
<sip:1086...@sip.domain.co>;tag=1805518213\r\nTo: 
<sip:7...@sip.domain.co>\r\nCa"...
        path_buf = {s = 0x0, len = 0}
        offset = 0
        s_offset = 0
        size = <optimized out>
        via_anchor = 0x7ff952951300
        via_lump = <optimized out>
        via_rm = <optimized out>
        via_insert_param = 0x0
        path_anchor = <optimized out>
        path_lump = <optimized out>
        branch = {s = 0x7ff952954ed8 
"z9hG4bKc7ed.5819b4295fe83f981a2fc19b7f739d7e.0", len = 46}
        flags = 262273
--Type <RET> for more, q to quit, c to continue without paging--list
        udp_mtu = <optimized out>
        di = {send_sock = 0x0, to = {s = {sa_family = 18608, sa_data = 
"\225R\371\177\000\000\002\000\023\331\n4\001\r"}, sin = {sin_family = 18608, 
sin_port = 21141, sin_addr = {
                s_addr = 32761}, sin_zero = "\002\000\023\331\n4\001\r"}, sin6 
= {sin6_family = 18608, sin6_port = 21141, sin6_flowinfo = 32761, sin6_addr = 
{__in6_u = {
                  __u6_addr8 = 
"\002\000\023\331\n4\001\r\000\000\000\000\000\000\000", __u6_addr16 = {2, 
55571, 13322, 3329, 0, 0, 0, 0}, __u6_addr32 = {3641901058, 218182666, 0, 0}}}, 
              sin6_scope_id = 0}, sas = {ss_family = 18608, __ss_padding = 
"\225R\371\177\000\000\002\000\023\331\n4\001\r", '\000' <repeats 103 times>, 
__ss_align = 0}}, id = 0, 
          send_flags = {f = 0, blst_imask = 0}, proto = 0 '\000', proto_pad0 = 
0 '\000', proto_pad1 = 0}
        ret = <optimized out>
        __func__ = "build_req_buf_from_sip_req"
        error00 = <optimized out>
#3  0x00007ff951fbe90d in prepare_new_uac (t=t@entry=0x7ff764d20880, 
i_req=i_req@entry=0x7ff9529548b0, branch=branch@entry=0, uri=<optimized out>, 
uri@entry=0x7ff9529548e8, 
    path=<optimized out>, path@entry=0x7ff952954f60, next_hop=<optimized out>, 
fsocket=0x7ff9528f0e10, snd_flags=..., fproto=<optimized out>, flags=<optimized 
out>, 
    instance=<optimized out>, ruid=<optimized out>, location_ua=<optimized 
out>) at ./src/modules/tm/t_fwd.c:482
        shbuf = 0x0
        add_rm_backup = <optimized out>
        body_lumps_backup = <optimized out>
        parsed_uri_bak = {user = {
            s = 0x56217f5866ce <buf+14> "7...@sip.domain.co SIP/2.0\r\nVia: 
SIP/2.0/UDP 12.171.207.82:25744;branch=z9hG4bK1805670803;rport\r\nFrom: 
\"Heather Myers\" <sip:1086...@sip.domain.co>;tag=1805518213\r\nTo: 
<sip:7...@sip.domain.co>\r\nCall-ID: 0_18056"..., len = 2}, passwd = {s = 0x0, 
len = 0}, host = {
            s = 0x56217f5866d1 <buf+17> "sip.domain.co SIP/2.0\r\nVia: 
SIP/2.0/UDP 12.171.207.82:25744;branch=z9hG4bK1805670803;rport\r\nFrom: 
\"Heather Myers\" <sip:1086...@sip.domain.co>;tag=1805518213\r\nTo: 
<sip:7...@sip.domain.co>\r\nCall-ID: 0_18056078"..., len = 11}, port = {s = 
0x0, len = 0}, params = {s = 0x0, len = 0}, sip_params = {s = 0x0, len = 0}, 
headers = {
            s = 0x0, len = 0}, port_no = 0, proto = 0, type = SIP_URI_T, flags 
= 0, transport = {s = 0x0, len = 0}, ttl = {s = 0x0, len = 0}, user_param = {s 
= 0x0, len = 0}, maddr = {
            s = 0x0, len = 0}, method = {s = 0x0, len = 0}, lr = {s = 0x0, len 
= 0}, r2 = {s = 0x0, len = 0}, gr = {s = 0x0, len = 0}, transport_val = {s = 
0x0, len = 0}, ttl_val = {
            s = 0x0, len = 0}, user_param_val = {s = 0x0, len = 0}, maddr_val = 
{s = 0x0, len = 0}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x0, len = 
0}, r2_val = {s = 0x0, 
            len = 0}, gr_val = {s = 0x0, len = 0}}
        ret = -1
        len = 32761
        parsed_uri_ok_bak = <optimized out>
        free_new_uri = 1
        msg_uri_bak = {s = 0x0, len = 0}
        dst_uri_bak = {s = 0x7ff9529525f0 "sip:10.52.1.13:5081", len = 19}
        dst_uri_backed_up = 1
        path_bak = {s = 0x0, len = 0}
        free_path = 1
        instance_bak = {s = 0x0, len = 0}
```

#### Log Messages

```
ERROR 2025-01-10T06:00:31.847348563Z [resource.labels.containerName: kamailio] 8(28) CRITICAL: {1 1 SUBSCRIBE 0_1907143137@192.168.100.28} tm [timer.h:188]: _set_fr_retr(): already added: 0x7ff763e62250 , tl=0x7ff763e62270!!!
ERROR 2025-01-10T06:00:31.847388483Z [resource.labels.containerName: kamailio] 8(28) CRITICAL: {1 1 SUBSCRIBE 0_1907143137@192.168.100.28} tm [t_fwd.c:1613]: t_send_branch(): BUG: retransmission already started for: 0x7ff763e62250
ERROR 2025-01-10T06:00:31.847394809Z [resource.labels.containerName: kamailio] 8(28) ERROR: {1 1 SUBSCRIBE 0_1907143137@192.168.100.28} sl [sl_funcs.c:428]: sl_reply_error(): stateless error reply used: No error (0/SL)
ERROR 2025-01-10T06:00:31.847400738Z [resource.labels.containerName: kamailio] 8(28) BUG: {1 1 SUBSCRIBE 0_1907143137@192.168.100.28} tm [t_lookup.c:2052]: t_unref(): REQ_ERR DELAYED should have been caught much earlier for 0x7ff763e61f70: 24 (hex 18)
ERROR 2025-01-10T06:00:47.346744773Z [resource.labels.containerName: kamailio] 45(65) ERROR: <core> [core/tcp_main.c:4818]: tcpconn_main_timeout(): connect 64.63.142.90:12190 failed (timeout)
ERROR 2025-01-10T06:00:58.274915089Z [resource.labels.containerName: kamailio] 45(65) CRITICAL: <core> [core/pass_fd.c:281]: receive_fd(): EOF on 20
ERROR 2025-01-10T06:01:02.715789076Z [resource.labels.containerName: kamailio] 0(1) ALERT: <core> [main.c:805]: handle_sigs(): child process 26 exited by a signal 11
ERROR 2025-01-10T06:01:02.715827333Z [resource.labels.containerName: kamailio] 0(1) ALERT: <core> [main.c:809]: handle_sigs(): core was generated
ERROR 2025-01-10T06:01:02.718815287Z [resource.labels.containerName: kamailio] 45(65) CRITICAL: <core> [core/pass_fd.c:281]: receive_fd(): EOF on 25
ERROR 2025-01-10T06:01:02.720985493Z [resource.labels.containerName: kamailio] 45(65) CRITICAL: <core> [core/pass_fd.c:281]: receive_fd(): EOF on 23
ERROR 2025-01-10T06:01:02.725450552Z [resource.labels.containerName: kamailio] 45(65) CRITICAL: <core> [core/pass_fd.c:281]: receive_fd(): EOF on 30
```

### Possible Solutions

I can at least say that whenever this segfault happens, these log messages are
always present right before the crash. However, I also see the same messages at
other times without a crash immediately following.
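
To put numbers on that observation, a longer log export could be scanned for the tm warning and the crash line, counting how many warnings are followed by a crash within some window. The script below is a rough sketch, not part of Kamailio: the marker strings are copied from the log excerpt in this report, while the 60-second window, the function names, and the assumption that every entry starts with a level word and an RFC 3339 timestamp are illustrative choices.

```python
import re
from datetime import datetime, timedelta

# An entry starts with a level word and a timestamp with fractional seconds.
ENTRY_START = re.compile(
    r"^[A-Z]+ (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)Z ")
WARNING_MARK = "retransmission already started"  # logged by t_send_branch()
CRASH_MARK = "exited by a signal 11"             # logged by handle_sigs()


def parse_entries(text):
    """Return (timestamp, message) pairs, re-joining hard-wrapped lines."""
    entries = []
    for line in text.splitlines():
        m = ENTRY_START.match(line)
        if m:
            # strptime's %f takes at most 6 digits, so drop the nanoseconds.
            stamp = m.group(1) + "." + m.group(2)[:6]
            ts = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f")
            entries.append([ts, line[m.end():].strip()])
        elif entries and line.strip():
            entries[-1][1] += " " + line.strip()
    return [(ts, msg) for ts, msg in entries]


def classify_warnings(text, window=timedelta(seconds=60)):
    """Split warning timestamps into those followed by a crash and the rest."""
    entries = parse_entries(text)
    crashes = [ts for ts, msg in entries if CRASH_MARK in msg]
    followed, isolated = [], []
    for ts, msg in entries:
        if WARNING_MARK not in msg:
            continue
        hit = any(timedelta(0) <= crash - ts <= window for crash in crashes)
        (followed if hit else isolated).append(ts)
    return followed, isolated
```

A large share of isolated warnings would suggest the warning is necessary but not sufficient for the crash, which matches what is described above.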

On the proxy, we do this in the withindlg route to send SUBSCRIBE requests to
the presence route:
```
if KSR.is_SUBSCRIBE() and KSR.is_myself_ruri() then
  -- in-dialog subscribe requests
  ksr_route_presence();
  KSR.x.exit();
end
```

And this is our ksr_route_presence():
```
function ksr_route_presence()
  if not KSR.is_SUBSCRIBE() then return 1; end

  local fuser = KSR.kx.get_fuser();

  if fuser=='' then
    KSR.xlog.xinfo('Subscription attempt without username from IP ' .. KSR.kx.get_srcip() .. ' - Rejecting\n');
    KSR.sl.sl_send_reply(404, 'Not Found');
    KSR.x.exit();
  end

  KSR.pv.sets('$avp(s:tenant)', string.match(fuser, '-(.*)'));
  local setId = '6' .. KSR.kx.get_def('DEFAULT_DISPATCH_SET');
  local modeId = '7';

  if KSR.dispatcher.ds_select_dst(setId, modeId)<0 then
    KSR.sl.send_reply(404, 'No destination');
    KSR.x.exit();
  end

  KSR.corex.set_send_socket(KSR.kx.get_def('PRIVATE_LISTEN_IP') .. ':' .. KSR.kx.get_def('INTERNAL_PORT'));

  KSR.tm.t_on_failure('ksr_route_rtf_dispatch');

  ksr_route_relay();
end
```

### Additional Information

  * **Kamailio Version** - output of `kamailio -v`

```
version: kamailio 5.8.4 (x86_64/linux) 
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, 
USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, MEM_JOIN_FREE, Q_MALLOC, 
F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, 
USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, 
TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_SEND_BUFFER_SIZE 
262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown 
compiled with gcc 12.2.0
```

* **Operating System**:

```
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
```
```
Linux gke-us-south1-external-sip-800ca69e-qyb3 5.15.0-1048-gke #53-Ubuntu SMP 
Tue Nov 28 00:39:01 UTC 2023 x86_64 GNU/Linux
```


-- 
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/4102