## Summary

Fix a race condition in `link_dlg_profile()` that can cause a SIGSEGV or an
infinite loop in `get_profile_size()` under concurrent load with dialog
expirations.

- Move `link_profile()` inside the dialog entry lock so the linker is 
atomically visible to both the dialog's profile list and the profile hash 
table
- Prevents `destroy_linkers()` from freeing a linker before `link_profile()` 
inserts it into the hash table
- Lock ordering (dialog lock → profile lock) is preserved
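
The reordering in the bullets above can be sketched as follows. This is a minimal illustration, not the module's code: every `sketch_*` name, the spinlock built on `atomic_flag`, and the struct layouts are assumptions made for the example. It only shows the shape of the change: the linker is added to the dialog's list and to the profile bucket before the dialog entry lock is released, with the profile lock taken inside the dialog lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the module's lock type: a minimal spinlock. */
typedef struct { atomic_flag held; } sketch_lock_t;

static void sketch_lock_init(sketch_lock_t *l) { atomic_flag_clear(&l->held); }

static void sketch_lock_get(sketch_lock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* spin until the current holder releases the lock */
    }
}

static void sketch_lock_release(sketch_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Simplified linker: one link for the dialog's list, two for the bucket. */
struct sketch_linker {
    struct sketch_linker *dlg_next;
    struct sketch_linker *hash_next;
    struct sketch_linker *hash_prev;
};

struct sketch_dialog {
    sketch_lock_t entry_lock;
    struct sketch_linker *profile_links;
};

struct sketch_bucket {
    sketch_lock_t lock;
    struct sketch_linker *first; /* circular doubly linked list */
    int content;
};

/* Insert the linker into the bucket while holding the profile lock. */
static void sketch_link_profile(struct sketch_bucket *b, struct sketch_linker *l)
{
    assert(b != NULL && l != NULL);
    sketch_lock_get(&b->lock);
    if (b->first != NULL) {
        l->hash_prev = b->first->hash_prev;
        l->hash_next = b->first;
        b->first->hash_prev->hash_next = l;
        b->first->hash_prev = l;
    } else {
        b->first = l->hash_next = l->hash_prev = l;
    }
    b->content++;
    sketch_lock_release(&b->lock);
}

/* Fixed ordering: both insertions happen before the dialog lock is dropped. */
static void sketch_link_dlg_profile(struct sketch_dialog *d,
        struct sketch_bucket *b, struct sketch_linker *l)
{
    assert(d != NULL);
    sketch_lock_get(&d->entry_lock);
    l->dlg_next = d->profile_links;
    d->profile_links = l;
    sketch_link_profile(b, l); /* nesting: dialog lock, then profile lock */
    sketch_lock_release(&d->entry_lock);
}
```

In this sketch, a cleanup path that also takes `entry_lock` before walking `profile_links` can only ever observe a linker that is on both lists or on neither.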

## Root Cause

In `link_dlg_profile()`, the dialog entry lock is released at line 521 before 
`link_profile()` is called at line 529. Between these two points, the linker is 
on the dialog's profile list (visible to `destroy_linkers()`) but not yet 
in the profile hash table (`hash_linker.next` is still NULL).

When `destroy_linkers()` runs concurrently (e.g., dialog timeout expiration),
it checks `l->hash_linker.next` to decide whether to unlink from the profile
hash table. Since `link_profile()` hasn't run yet, `next` is NULL, so
`destroy_linkers()` skips the unlink and calls `shm_free()`. The original
worker then calls `link_profile()` on the freed linker (a use-after-free),
which inserts a dangling node into the hash table. Once that memory is reused,
a later walk of the bucket in `get_profile_size()` can follow a stale pointer.
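
The interleaving can be replayed deterministically in a single thread. The sketch below is a simplified model under stated assumptions: the `sim_*` names and struct layouts are hypothetical, and a `freed` flag stands in for `shm_free()` so that no memory is actually released. It only models the "unlink if `next` is non-NULL" check described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a profile linker. */
struct sim_linker {
    struct sim_linker *next; /* plays the role of hash_linker.next */
    struct sim_linker *prev;
    bool freed;              /* marks where shm_free() would have run */
};

struct sim_bucket {
    struct sim_linker *first; /* circular doubly linked list */
};

/* Insert the linker into the bucket's circular list. */
static void sim_link_profile(struct sim_bucket *b, struct sim_linker *l)
{
    assert(b != NULL && l != NULL);
    if (b->first != NULL) {
        l->prev = b->first->prev;
        l->next = b->first;
        b->first->prev->next = l;
        b->first->prev = l;
    } else {
        b->first = l->next = l->prev = l;
    }
}

/* Unlink only when next is non-NULL, then mark the linker as freed. */
static void sim_destroy_linker(struct sim_bucket *b, struct sim_linker *l)
{
    assert(b != NULL && l != NULL);
    if (l->next != NULL) {
        if (l->next == l) {
            b->first = NULL;
        } else {
            if (b->first == l)
                b->first = l->next;
            l->next->prev = l->prev;
            l->prev->next = l->next;
        }
        l->next = l->prev = NULL;
    }
    l->freed = true;
}

/* Replay one ordering; true means the bucket points at a freed linker. */
static bool sim_bucket_dangles(bool link_before_destroy)
{
    struct sim_bucket bucket = { NULL };
    struct sim_linker linker = { NULL, NULL, false };

    if (link_before_destroy) {
        /* fixed order: the linker is hashed before expiry can see it */
        sim_link_profile(&bucket, &linker);
        sim_destroy_linker(&bucket, &linker);
    } else {
        /* buggy order: expiry runs in the window, the worker links after */
        sim_destroy_linker(&bucket, &linker);
        sim_link_profile(&bucket, &linker);
    }
    return bucket.first != NULL && bucket.first->freed;
}
```

`sim_bucket_dangles(false)` replays the window between the unlock and `link_profile()` and leaves the bucket referencing a freed linker; `sim_bucket_dangles(true)` replays the order enforced by the fix and leaves the bucket empty.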

## Reproduction

Tested on unmodified master at commit 6ca2cf8, built with `gcc 14.2.0`, 
`CFLAGS="-g -O0"`, on x86_64 Linux.

- **Without fix:** SIGSEGV in `get_profile_size()` within seconds at 1000 cps
(calls per second).
- **With fix:** Survived 30,000 calls at 1000 cps and 46,000+ calls at 2000 cps
with zero crashes.

Two servers: Kamailio on 10.0.0.40:5060, SIPp UAS on 10.0.0.41:5080. SIPp UAC 
sends INVITE → ACK → 4 rapid INFO requests near the 1-second dialog timeout 
boundary.

<details>
<summary>Kamailio configuration</summary>

```
##
## Kamailio configuration to reproduce dialog profile race condition
## https://github.com/kamailio/kamailio/issues/2923
##

#!define LISTEN_IP "10.0.0.40"
#!define LISTEN_PORT 5060
#!define UAS_IP "10.0.0.41"
#!define UAS_PORT 5080

debug=0
log_stderror=no
log_facility=LOG_LOCAL0
fork=yes
children=16
auto_aliases=no

listen=udp:LISTEN_IP:LISTEN_PORT

mpath="/usr/local/kamailio-master/lib64/kamailio/modules/"

loadmodule "tm.so"
loadmodule "sl.so"
loadmodule "rr.so"
loadmodule "maxfwd.so"
loadmodule "textops.so"
loadmodule "siputils.so"
loadmodule "pv.so"
loadmodule "dialog.so"
loadmodule "xlog.so"
loadmodule "jsonrpcs.so"
loadmodule "kex.so"
loadmodule "rtimer.so"
loadmodule "htable.so"

modparam("jsonrpcs", "fifo_name", "/var/run/kamailio/kamailio_rpc.fifo")
modparam("jsonrpcs", "dgram_socket", "/var/run/kamailio/kamailio_rpc.sock")

modparam("dialog", "timeout_avp", "$avp(dlg_timeout)")
modparam("dialog", "default_timeout", 1)
modparam("dialog", "profiles_with_value", "carrier ; region ; service ; tier")
modparam("dialog", "profiles_no_value", "calls ; active ; premium ; standard")

modparam("tm", "fr_timer", 5000)
modparam("tm", "fr_inv_timer", 10000)

modparam("rr", "enable_full_lr", 1)
modparam("rr", "append_fromtag", 1)

modparam("rtimer", "timer", "name=prof_check;interval=50000;mode=1")
modparam("rtimer", "exec", "timer=prof_check;route=PROFILE_CHECK")

modparam("htable", "htable", "stats=>size=4;autoexpire=300")

request_route {
    if (!mf_process_maxfwd_header("10")) {
        sl_send_reply("483", "Too Many Hops");
        exit;
    }

    if (is_method("INVITE|UPDATE")) {
        record_route();
    }

    if (has_totag()) {
        if (is_method("UPDATE|BYE|PRACK|INFO")) {
            $var(idx) = $Ts mod 20;

            set_dlg_profile("carrier", "carrier-$var(idx)");
            set_dlg_profile("region", "region-$var(idx)");
            set_dlg_profile("service", "svc-$var(idx)");
            set_dlg_profile("tier", "tier-$var(idx)");
            set_dlg_profile("calls");
            set_dlg_profile("active");
            set_dlg_profile("premium");
            set_dlg_profile("standard");

            get_profile_size("carrier", "carrier-$var(idx)", "$var(cnt1)");
            get_profile_size("region", "region-$var(idx)", "$var(cnt2)");
            get_profile_size("service", "svc-$var(idx)", "$var(cnt3)");
            get_profile_size("calls", "$var(cnt4)");
            get_profile_size("active", "$var(cnt5)");
        }

        if (loose_route()) {
            route(RELAY);
            exit;
        }
        if (is_method("ACK")) {
            if (t_check_trans()) {
                route(RELAY);
                exit;
            }
            exit;
        }
        if (is_method("INFO|UPDATE")) {
            sl_send_reply("200", "OK");
            exit;
        }
        sl_send_reply("404", "Not Here");
        exit;
    }

    if (is_method("CANCEL")) {
        if (t_check_trans()) {
            t_relay();
        }
        exit;
    }

    if (is_method("INVITE")) {
        dlg_manage();

        $var(idx) = $Ts mod 20;
        set_dlg_profile("carrier", "carrier-$var(idx)");
        set_dlg_profile("region", "region-$var(idx)");
        set_dlg_profile("service", "svc-$var(idx)");
        set_dlg_profile("tier", "tier-$var(idx)");
        set_dlg_profile("calls");
        set_dlg_profile("active");
        set_dlg_profile("premium");
        set_dlg_profile("standard");

        get_profile_size("carrier", "carrier-$var(idx)", "$var(cnt)");
        get_profile_size("calls", "$var(total)");

        $avp(dlg_timeout) = 1;

        $ru = "sip:test@" + UAS_IP + ":" + UAS_PORT;
        route(RELAY);
        exit;
    }

    if (is_method("OPTIONS")) {
        sl_send_reply("200", "OK");
        exit;
    }

    sl_send_reply("405", "Method Not Allowed");
    exit;
}

route[RELAY] {
    if (!t_relay()) {
        sl_reply_error();
    }
}

onreply_route {
    if (is_method("INVITE") && status =~ "2[0-9][0-9]") {
        $var(idx) = $Ts mod 20;
        set_dlg_profile("carrier", "carrier-$var(idx)");
        set_dlg_profile("region", "region-$var(idx)");
        get_profile_size("carrier", "carrier-$var(idx)", "$var(cnt)");
        get_profile_size("calls", "$var(total)");
    }
}

route[PROFILE_CHECK] {
    $var(i) = 0;
    while ($var(i) < 20) {
        get_profile_size("carrier", "carrier-$var(i)", "$var(cnt)");
        get_profile_size("region", "region-$var(i)", "$var(cnt)");
        $var(i) = $var(i) + 1;
    }
    get_profile_size("calls", "$var(total)");
    get_profile_size("active", "$var(total2)");
}
```

</details>

<details>
<summary>SIPp UAC scenario</summary>

```xml
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE scenario SYSTEM "sipp.dtd">

<scenario name="UAC - aggressive race trigger">

  <send retrans="500">
    <![CDATA[
      INVITE sip:[service]@[remote_ip]:[remote_port] SIP/2.0
      Via: SIP/2.0/[transport] [local_ip]:[local_port];branch=[branch]
      From: sipp <sip:sipp@[local_ip]:[local_port]>;tag=[pid]SIPpTag[call_number]
      To: test <sip:[service]@[remote_ip]:[remote_port]>
      Call-ID: [call_id]
      CSeq: 1 INVITE
      Contact: sip:sipp@[local_ip]:[local_port]
      Max-Forwards: 70
      Content-Type: application/sdp
      Content-Length: [len]

      v=0
      o=user1 53655765 2353687637 IN IP[local_ip_type] [local_ip]
      s=-
      c=IN IP[media_ip_type] [media_ip]
      t=0 0
      m=audio [media_port] RTP/AVP 0
      a=rtpmap:0 PCMU/8000
    ]]>
  </send>

  <recv response="100" optional="true" />
  <recv response="180" optional="true" />
  <recv response="183" optional="true" />
  <recv response="200" rtd="true" />

  <send>
    <![CDATA[
      ACK sip:[service]@[remote_ip]:[remote_port] SIP/2.0
      Via: SIP/2.0/[transport] [local_ip]:[local_port];branch=[branch]
      From: sipp <sip:sipp@[local_ip]:[local_port]>;tag=[pid]SIPpTag[call_number]
      To: test <sip:[service]@[remote_ip]:[remote_port]>[peer_tag_param]
      Call-ID: [call_id]
      CSeq: 1 ACK
      Contact: sip:sipp@[local_ip]:[local_port]
      Max-Forwards: 70
      [routes]
      Content-Length: 0
    ]]>
  </send>

  <pause milliseconds="700" />

  <send retrans="500">
    <![CDATA[
      INFO sip:[service]@[remote_ip]:[remote_port] SIP/2.0
      Via: SIP/2.0/[transport] [local_ip]:[local_port];branch=[branch]
      From: sipp <sip:sipp@[local_ip]:[local_port]>;tag=[pid]SIPpTag[call_number]
      To: test <sip:[service]@[remote_ip]:[remote_port]>[peer_tag_param]
      Call-ID: [call_id]
      CSeq: 2 INFO
      Contact: sip:sipp@[local_ip]:[local_port]
      Max-Forwards: 70
      [routes]
      Content-Length: 0
    ]]>
  </send>

  <recv response="200" timeout="2000" />
  <pause milliseconds="50" />

  <send retrans="500">
    <![CDATA[
      INFO sip:[service]@[remote_ip]:[remote_port] SIP/2.0
      Via: SIP/2.0/[transport] [local_ip]:[local_port];branch=[branch]
      From: sipp <sip:sipp@[local_ip]:[local_port]>;tag=[pid]SIPpTag[call_number]
      To: test <sip:[service]@[remote_ip]:[remote_port]>[peer_tag_param]
      Call-ID: [call_id]
      CSeq: 3 INFO
      Contact: sip:sipp@[local_ip]:[local_port]
      Max-Forwards: 70
      [routes]
      Content-Length: 0
    ]]>
  </send>

  <recv response="200" timeout="2000" />
  <pause milliseconds="50" />

  <send retrans="500">
    <![CDATA[
      INFO sip:[service]@[remote_ip]:[remote_port] SIP/2.0
      Via: SIP/2.0/[transport] [local_ip]:[local_port];branch=[branch]
      From: sipp <sip:sipp@[local_ip]:[local_port]>;tag=[pid]SIPpTag[call_number]
      To: test <sip:[service]@[remote_ip]:[remote_port]>[peer_tag_param]
      Call-ID: [call_id]
      CSeq: 4 INFO
      Contact: sip:sipp@[local_ip]:[local_port]
      Max-Forwards: 70
      [routes]
      Content-Length: 0
    ]]>
  </send>

  <recv response="200" timeout="2000" />
  <pause milliseconds="200" />

  <send retrans="500">
    <![CDATA[
      INFO sip:[service]@[remote_ip]:[remote_port] SIP/2.0
      Via: SIP/2.0/[transport] [local_ip]:[local_port];branch=[branch]
      From: sipp <sip:sipp@[local_ip]:[local_port]>;tag=[pid]SIPpTag[call_number]
      To: test <sip:[service]@[remote_ip]:[remote_port]>[peer_tag_param]
      Call-ID: [call_id]
      CSeq: 5 INFO
      Contact: sip:sipp@[local_ip]:[local_port]
      Max-Forwards: 70
      [routes]
      Content-Length: 0
    ]]>
  </send>

  <recv response="200" timeout="2000" />

</scenario>
```

</details>

<details>
<summary>GDB backtrace (crash on unmodified master)</summary>

```
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fd1c88a1e73 in get_profile_size (profile=0x7fd1c4997b70, value=0x7fff8f522910) at dlg_profile.c:868
868                     if(value->len == ph->value.len

bt full (frame #0):
    n = 280
    i = 7
    ph = 0x0    <-- NULL pointer dereference

bt full (frame #2):
    val_s = {s = "svc-14", len = 6}

bt full (frame #10):
    buf = "INFO sip:[email protected]:5060 SIP/2.0..."
    len = 339

bt full (frame #12):
    si_desc = "udp receiver child=1 sock=10.0.0.40:5060"
    nrprocs = 16

dmesg:
    kamailio[2479212]: segfault at 8 ip 00007fd1c88a1e73 sp 00007fff8f522790
    error 4 in dialog.so[a5e73,7fd1c8805000+cc000]
```

</details>

<details>
<summary>Test results comparison</summary>

| | Before fix | After fix |
|---|---|---|
| Test rate | 1000 cps | 1000 cps AND 2000 cps |
| Crashed? | YES, in <10 seconds | NO |
| Core dumps | SIGSEGV in dialog.so | None |
| Total calls survived | ~few thousand | 30,000 at 1000 cps, 46,000+ at 2000 cps |
| dmesg segfaults | Yes | None |

</details>

GH #2923
You can view, comment on, or merge this pull request online at:

  https://github.com/kamailio/kamailio/pull/4591

-- Commit Summary --

  * dialog: fix race condition in link_dlg_profile

-- File Changes --

    M src/modules/dialog/dlg_profile.c (3)

-- Patch Links --

https://github.com/kamailio/kamailio/pull/4591.patch
https://github.com/kamailio/kamailio/pull/4591.diff
