Hi there,

This is a very detailed report. I appreciate it.

I've tried to replicate the crash, but have been unsuccessful so far. Things I've done to match your setup:
* Used the python script to send the config-set.
* Used the same config you posted below, with the exception of the interface and address in "interfaces-config" which had to be changed because the interface did not exist on my system and the configuration was being rejected (possibly before the crash could have happened).
* Sent the config-set command to the standby server, just as you did.

On top of that, as exploratory measures, I've tried:
* sending the config-set to the primary server instead
* having DHCP traffic sent to the HA nodes while sending the command
* configuring reclamation to delete leases as frequently as possible, and as many leases as possible, so that each run takes longer. I did this after noticing that the handling of the config-set command and the reclamation seem to be interleaved in the logs you shared, so I figured that replicating the crash might require the config-set to be received during a reclamation.
* sending config-set repeatedly at various rates
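
For anyone who wants to repeat the last experiment, here is a minimal sketch of how config-set could be sent repeatedly at different rates. It is not the exact script used for the tests above. The helper names (build_config_set, post_json, send_repeatedly, run) and the delay values are hypothetical; the URL and file name in the usage note are taken from the original report.

```python
import json
import time
import urllib.request


def build_config_set(config):
    """Wrap a parsed Kea configuration in a config-set command for dhcp4."""
    return {"command": "config-set", "service": ["dhcp4"], "arguments": config}


def post_json(url, payload, timeout=10):
    """POST the payload as JSON and return the decoded response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def send_repeatedly(send, payload, delays):
    """Call send(payload) once, then once more after each delay (in seconds)."""
    results = [send(payload)]
    for delay in delays:
        time.sleep(delay)
        results.append(send(payload))
    return results


def run(url, config_path, delays):
    """Load a configuration file and send it as config-set at the given pace."""
    with open(config_path) as f:
        payload = build_config_set(json.load(f))
    # The response is a list with one entry per targeted service.
    for answer in send_repeatedly(lambda p: post_json(url, p), payload, delays):
        print(answer[0].get("result"), answer[0].get("text"))
```

For example, run('http://192.168.89.12:8000/', 'bug-config.json', [0.0, 0.1, 0.5, 1.0, 5.0]) would send a quick burst followed by slower repeats; the delays are illustrative only.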

The result was always that Kea received the config-set and the config was accepted and Kea was reconfigured. Needless to say, there was no crash.

We would like to get to the bottom of it. Would you be willing to create an issue on Kea's GitLab[1] with the details you provided here so that we can track it more easily? We could also use a bit more information, if it can be easily provided:
* a stack trace for the signaled thread or for all threads would be most helpful. If using a debugger, this could be "thread apply all bt" or "thread backtrace all" or something else.
* the core dump if you still have it
* the starting configuration - do I understand correctly that it is the same configuration you sent via config-set, except it has some additional shared networks?

Thank you!
Andrei Pavel,
Software Engineer,
Internet Systems Consortium

[1] https://gitlab.isc.org/isc-projects/kea/-/issues/new


On 21/09/2022 23:51, Caciano dos Santos Machado wrote:
Hi,

I am trying to configure a pair of kea servers running in high
availability mode. However, the servers crash with segfault when I try
to use the HTTP REST API for some commands. I figured out how to
reproduce the problem in one situation described below.

The server runs Ubuntu 22.04 LTS, with Kea 2.2.0 installed from the repository
https://dl.cloudsmith.io/public/isc/kea-2-2/deb/ubuntu jammy main.

The server is a Xen VM with 2GB of RAM, kernel 5.15.0-43-generic
#46-Ubuntu SMP and glibc 2.35.

The following Python script loads a configuration file and calls
config-set using the HTTP API.
------------------------------------------------------------------------------
#!/usr/bin/python3
# coding: utf-8

import json, requests

# Load the configuration that will be sent as the config-set arguments.
with open('bug-config.json') as f:
    config = json.load(f)

url = 'http://192.168.89.12:8000/'
headers = {'content-type': 'application/json'}
payload = {}
payload['command'] = 'config-set'
payload['service'] = [ 'dhcp4' ]
payload['arguments'] = config
res = requests.post(url, headers=headers, data=json.dumps(payload))
# The response is a list with one entry per targeted service.
response = res.json()[0]
print('Set config: ' + response['text'])
------------------------------------------------------------------------------

The configuration file sent using the API is this one, without the
shared-networks section. The problem also happens when I send a complete
configuration, but this small one is enough to reproduce the problem.
The configuration file running in the server when I send the config-set
is the same, but has about 19000 host reservations identified by the
hw-address, distributed in 180 networks that are in 90 shared-networks.
------------------------------------------------------------------------------
{
    "Dhcp4": {
      "authoritative": true,
      "client-classes": [
        {
          "name": "718",
          "option-data": [
            {
              "data": "192.168.1.52,192.168.1.53",
              "name": "domain-name-servers"
            }
          ]
        },
        {
          "name": "708",
          "option-data": [
            {
              "data": "192.168.137.7,192.168.1.52,192.168.1.53",
              "name": "domain-name-servers"
            }
          ]
        },
        {
          "name": "533",
          "option-data": [
            {
              "data": "192.168.137.7,192.168.1.52,192.168.1.53",
              "name": "domain-name-servers"
            }
          ]
        },
        {
          "name": "791",
          "option-data": [
            {
              "data": "192.168.148.30,192.168.1.52,192.168.1.53",
              "name": "domain-name-servers"
            }
          ]
        },
        {
          "name": "792",
          "option-data": [
            {
              "data": "192.168.148.30,192.168.1.52,192.168.1.53",
              "name": "domain-name-servers"
            }
          ]
        },
        {
          "name": "793",
          "option-data": [
            {
              "data": "192.168.148.30,192.168.1.52,192.168.1.53",
              "name": "domain-name-servers"
            }
          ]
        },
        {
          "name": "794",
          "option-data": [
            {
              "data": "192.168.148.30,192.168.1.52,192.168.1.53",
              "name": "domain-name-servers"
            }
          ]
        },
        {
          "name": "795",
          "option-data": [
            {
              "data": "192.168.148.30,192.168.1.52,192.168.1.53",
              "name": "domain-name-servers"
            }
          ]
        },
        {
          "name": "532",
          "option-data": [
            {
              "data": "192.168.148.22,192.168.148.30,192.168.1.52,192.168.1.53",
              "name": "domain-name-servers"
            }
          ]
        },
        {
          "name": "561",
          "option-data": [
            {
              "data": "192.168.2.165,192.168.2.166",
              "name": "domain-name-servers"
            }
          ]
        },
        {
          "name": "568",
          "option-data": [
            {
              "data": "192.168.162.105, 192.168.1.52, 192.168.1.53",
              "name": "domain-name-servers"
            }
          ]
        },
        {
          "name": "118",
          "option-data": [
            {
              "data": "ifch.ufrgs.br",
              "name": "domain-name"
            }
          ]
        },
        {
          "name": "527",
          "option-data": [
            {
              "data": "192.168.150.20,192.168.150.23,192.168.1.52,192.168.1.53",
              "name": "domain-name-servers"
            }
          ]
        }
      ],
      "control-socket": {
        "socket-name": "/tmp/kea4-ctrl-socket",
        "socket-type": "unix"
      },
      "dhcp-ddns": {
        "enable-updates": false
      },
      "expired-leases-processing": {
        "hold-reclaimed-time": 401
      },
      "hooks-libraries": [
        {
          "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_lease_cmds.so"
        },
        {
          "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_bootp.so"
        },
        {
          "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_ha.so",
          "parameters": {
            "high-availability": [
              {
                "heartbeat-delay": 10000,
                "max-unacked-clients": 20,
                "min-ack-delay": 5000,
                "mode": "hot-standby",
                "peers": [
                  {
                    "auto-failover": true,
                    "name": "dhcp",
                    "role": "primary",
                    "url": "http://192.168.89.14:8000/"
                  },
                  {
                    "auto-failover": true,
                    "name": "dhcp-standby",
                    "role": "standby",
                    "url": "http://192.168.89.12:8000/"
                  }
                ],
                "send-lease-updates": false,
                "sync-leases": false,
                "this-server-name": "dhcp-standby"
              }
            ]
          }
        }
      ],
      "host-reservation-identifiers": [
        "hw-address"
      ],
      "interfaces-config": {
        "dhcp-socket-type": "udp",
        "interfaces": [
          "eth0/192.168.89.12"
        ]
      },
      "ip-reservations-unique": true,
      "lease-database": {
        "lfc-interval": 3600,
        "max-row-errors": 100,
        "name": "/var/lib/kea/kea-leases4.csv",
        "type": "memfile"
      },
      "loggers": [
        {
          "debuglevel": 15,
          "name": "kea-dhcp4",
          "output_options": [
            {
              "maxver": 32,
              "output": "/var/log/kea/kea-dhcp4-server.log"
            }
          ],
          "severity": "INFO"
        }
      ],
      "match-client-id": false,
      "max-valid-lifetime": 1800,
      "min-valid-lifetime": 360,
      "option-def": [
        {
          "code": 252,
          "name": "wpad",
          "type": "string"
        },
        {
          "code": 128,
          "name": "option-128",
          "type": "string"
        },
        {
          "code": 129,
          "name": "etherboot_options",
          "type": "string"
        }
      ],
      "reservations-global": false,
      "reservations-in-subnet": true,
      "reservations-lookup-first": true,
      "reservations-out-of-pool": true,
      "sanity-checks": {
        "lease-checks": "fix-del"
      },
      "store-extended-info": true,
      "valid-lifetime": 1800
    }
}
------------------------------------------------------------------------------

When the server crashes then "systemctl status" gives the following output:
------------------------------------------------------------------------------
root@keadhcp-dev-02:/etc/kea# systemctl status isc-kea-dhcp4-server.service
× isc-kea-dhcp4-server.service - Kea IPv4 DHCP daemon
       Loaded: loaded (/lib/systemd/system/isc-kea-dhcp4-server.service;
enabled; vendor preset: enabled)
       Active: failed (Result: core-dump) since Wed 2022-09-21 18:57:31
UTC; 16min ago
         Docs: man:kea-dhcp4(8)
      Process: 1231471 ExecStart=/usr/sbin/kea-dhcp4 -c
/etc/kea/kea-dhcp4.conf (code=dumped, signal=SEGV)
     Main PID: 1231471 (code=dumped, signal=SEGV)
          CPU: 7.494s

Sep 21 18:57:19 keadhcp-dev-02 systemd[1]: Started Kea IPv4 DHCP daemon.
Sep 21 18:57:31 keadhcp-dev-02 systemd[1]: isc-kea-dhcp4-server.service:
Main process exited, code=dumped, status=11/SEGV
Sep 21 18:57:31 keadhcp-dev-02 systemd[1]: isc-kea-dhcp4-server.service:
Failed with result 'core-dump'.
Sep 21 18:57:31 keadhcp-dev-02 systemd[1]: isc-kea-dhcp4-server.service:
Consumed 7.494s CPU time.
------------------------------------------------------------------------------

The core dump file has the following data in the first lines:
------------------------------------------------------------------------------
root@keadhcp-dev-02:/etc/kea# head -n 73
/var/crash/_usr_sbin_kea-dhcp4.113.crash
ProblemType: Crash
Architecture: amd64
CrashCounter: 1
Date: Mon Sep 19 18:19:50 2022
DistroRelease: Ubuntu 22.04
ExecutablePath: /usr/sbin/kea-dhcp4
ExecutableTimestamp: 1653069600
ProcCmdline: /usr/sbin/kea-dhcp4 -c /etc/kea/kea-dhcp4.conf
ProcEnviron: Error: [Errno 13] Permission denied: 'environ'
ProcMaps: Error: [Errno 13] Permission denied: 'maps'
ProcStatus:
   Name:    kea-dhcp4
   Umask:    0022
   State:    S (sleeping)
   Tgid:    1159118
   Ngid:    0
   Pid:    1159118
   PPid:    1
   TracerPid:    0
   Uid:    113    113    113    113
   Gid:    118    118    118    118
   FDSize:    64
   Groups:    118
   NStgid:    1159118
   NSpid:    1159118
   NSpgid:    1159118
   NSsid:    1159118
   VmPeak:      213420 kB
   VmSize:      184160 kB
   VmLck:           0 kB
   VmPin:           0 kB
   VmHWM:      155992 kB
   VmRSS:      124736 kB
   RssAnon:      110380 kB
   RssFile:       14356 kB
   RssShmem:           0 kB
   VmData:      149112 kB
   VmStk:         132 kB
   VmExe:         628 kB
   VmLib:       17112 kB
   VmPTE:         348 kB
   VmSwap:           0 kB
   HugetlbPages:           0 kB
   CoreDumping:    1
   THP_enabled:    1
   Threads:    5
   SigQ:    0/7428
   SigPnd:    0000000000000000
   ShdPnd:    0000000000000000
   SigBlk:    0000000000000000
   SigIgn:    0000000000001000
   SigCgt:    0000000100004003
   CapInh:    0000000000002400
   CapPrm:    0000000000002400
   CapEff:    0000000000002400
   CapBnd:    000001ffffffffff
   CapAmb:    0000000000002400
   NoNewPrivs:    0
   Seccomp:    0
   Seccomp_filters:    0
   Speculation_Store_Bypass:    vulnerable
   SpeculationIndirectBranch:    always enabled
   Cpus_allowed:    000f
   Cpus_allowed_list:    0-3
   Mems_allowed:
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
   Mems_allowed_list:    0
   voluntary_ctxt_switches:    136
   nonvoluntary_ctxt_switches:    304
Signal: 11
Uname: Linux 5.15.0-43-generic x86_64
UserGroups: N/A
CoreDump: base64
   H4sICAAAAAAC/0NvcmVEdW1wAA==
   ......
------------------------------------------------------------------------------

The last messages in the log file /var/log/kea/kea-dhcp4-server.log are
these:
------------------------------------------------------------------------------
2022-09-21 19:30:04.973 DEBUG
[kea-dhcp4.commands/1232318.139712599805568]
COMMAND_SOCKET_CONNECTION_OPENED Opened socket 26 for incoming command
connection
2022-09-21 19:30:04.973 DEBUG
[kea-dhcp4.dhcpsrv/1232318.139712599805568]
DHCPSRV_TIMERMGR_RUN_TIMER_OPERATION running operation for timer:
flush-reclaimed-leases
2022-09-21 19:30:04.973 DEBUG
[kea-dhcp4.alloc-engine/1232318.139712599805568]
ALLOC_ENGINE_V4_RECLAIMED_LEASES_DELETE begin deletion of reclaimed
leases expired more than 3600 seconds ago
2022-09-21 19:30:04.973 DEBUG
[kea-dhcp4.dhcpsrv/1232318.139712599805568]
DHCPSRV_MEMFILE_DELETE_EXPIRED_RECLAIMED4 deleting reclaimed IPv4 leases
that expired more than 3600 seconds ago
2022-09-21 19:30:04.973 DEBUG
[kea-dhcp4.alloc-engine/1232318.139712599805568]
ALLOC_ENGINE_V4_RECLAIMED_LEASES_DELETE_COMPLETE successfully deleted 0
expired-reclaimed leases
2022-09-21 19:30:04.973 DEBUG
[kea-dhcp4.dhcpsrv/1232318.139712599805568] DHCPSRV_TIMERMGR_START_TIMER
starting timer: flush-reclaimed-leases
2022-09-21 19:30:04.973 DEBUG
[kea-dhcp4.commands/1232318.139712599805568] COMMAND_SOCKET_READ
Received 3597 bytes over command socket 26
2022-09-21 19:30:04.976 INFO
[kea-dhcp4.commands/1232318.139712599805568] COMMAND_RECEIVED Received
command 'config-set'
2022-09-21 19:30:04.977 INFO [kea-dhcp4.hosts/1232318.139712599805568]
HOSTS_BACKENDS_REGISTERED the following host backend types are
available: mysql postgresql
2022-09-21 19:30:04.977 INFO [kea-dhcp4.dhcpsrv/1232318.139712599805568]
DHCPSRV_CFGMGR_SOCKET_TYPE_SELECT using socket type udp
2022-09-21 19:30:04.977 INFO [kea-dhcp4.dhcpsrv/1232318.139712599805568]
DHCPSRV_CFGMGR_USE_ADDRESS listening on address 192.168.89.12, on
interface eth0
2022-09-21 19:30:04.977 INFO [kea-dhcp4.hooks/1232318.139712599805568]
HOOKS_LIBRARY_CLOSED hooks library
/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_lease_cmds.so successfully
closed
2022-09-21 19:30:04.978 INFO [kea-dhcp4.hooks/1232318.139712599805568]
HOOKS_LIBRARY_CLOSED hooks library
/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_bootp.so successfully closed
2022-09-21 19:30:04.978 INFO [kea-dhcp4.hooks/1232318.139712599805568]
HOOKS_LIBRARY_CLOSED hooks library
/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_ha.so successfully closed
2022-09-21 19:30:04.979 INFO
[kea-dhcp4.ha-hooks/1232318.139712599805568] HA_DEINIT_OK unloading High
Availability hooks library successful
2022-09-21 19:30:04.979 INFO
[kea-dhcp4.bootp-hooks/1232318.139712599805568] BOOTP_UNLOAD Bootp hooks
library has been unloaded
2022-09-21 19:30:04.979 INFO
[kea-dhcp4.lease-cmds-hooks/1232318.139712599805568]
LEASE_CMDS_DEINIT_OK unloading Lease Commands hooks library successful
2022-09-21 19:30:04.980 INFO [kea-dhcp4.hooks/1232318.139712599805568]
HOOKS_LIBRARY_CLOSED hooks library
/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_ha.so successfully closed
2022-09-21 19:30:04.980 INFO [kea-dhcp4.hooks/1232318.139712599805568]
HOOKS_LIBRARY_CLOSED hooks library
/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_bootp.so successfully closed
2022-09-21 19:30:04.980 INFO [kea-dhcp4.hooks/1232318.139712599805568]
HOOKS_LIBRARY_CLOSED hooks library
/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_lease_cmds.so successfully
closed
2022-09-21 19:30:04.982 INFO
[kea-dhcp4.lease-cmds-hooks/1232318.139712599805568] LEASE_CMDS_INIT_OK
loading Lease Commands hooks library successful
2022-09-21 19:30:04.982 INFO [kea-dhcp4.hooks/1232318.139712599805568]
HOOKS_LIBRARY_LOADED hooks library
/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_lease_cmds.so successfully
loaded
2022-09-21 19:30:04.982 INFO
[kea-dhcp4.bootp-hooks/1232318.139712599805568] BOOTP_LOAD Bootp hooks
library has been loaded
2022-09-21 19:30:04.982 INFO [kea-dhcp4.hooks/1232318.139712599805568]
HOOKS_LIBRARY_LOADED hooks library
/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_bootp.so successfully loaded
2022-09-21 19:30:04.984 INFO
[kea-dhcp4.ha-hooks/1232318.139712599805568] HA_CONFIGURATION_SUCCESSFUL
HA hook library has been successfully configured
2022-09-21 19:30:04.984 WARN
[kea-dhcp4.ha-hooks/1232318.139712599805568]
HA_CONFIG_LEASE_UPDATES_DISABLED lease updates will not be generated
2022-09-21 19:30:04.984 WARN
[kea-dhcp4.ha-hooks/1232318.139712599805568]
HA_CONFIG_LEASE_SYNCING_DISABLED lease database synchronization between
HA servers is disabled
2022-09-21 19:30:04.984 INFO
[kea-dhcp4.ha-hooks/1232318.139712599805568] HA_INIT_OK loading High
Availability hooks library successful
2022-09-21 19:30:04.984 INFO [kea-dhcp4.hooks/1232318.139712599805568]
HOOKS_LIBRARY_LOADED hooks library
/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_ha.so successfully loaded
2022-09-21 19:30:04.985 INFO [kea-dhcp4.dhcp4/1232318.139712599805568]
DHCP4_CONFIG_COMPLETE DHCPv4 server has completed configuration: no IPv4
subnets!; DDNS: disabled
2022-09-21 19:30:04.985 INFO [kea-dhcp4.dhcpsrv/1232318.139712599805568]
DHCPSRV_MEMFILE_DB opening memory file lease database: lfc-interval=3600
max-row-errors=100 name=/var/lib/kea/kea-leases4.csv type=memfile universe=4
2022-09-21 19:30:04.985 INFO [kea-dhcp4.dhcpsrv/1232318.139712599805568]
DHCPSRV_MEMFILE_LEASE_FILE_LOAD loading leases from file
/var/lib/kea/kea-leases4.csv.2
2022-09-21 19:30:04.985 INFO [kea-dhcp4.dhcpsrv/1232318.139712599805568]
DHCPSRV_MEMFILE_LEASE_FILE_LOAD loading leases from file
/var/lib/kea/kea-leases4.csv
2022-09-21 19:30:04.985 INFO [kea-dhcp4.dhcpsrv/1232318.139712599805568]
DHCPSRV_MEMFILE_LFC_SETUP setting up the Lease File Cleanup interval to
3600 sec
2022-09-21 19:30:04.986 INFO
[kea-dhcp4.ha-hooks/1232318.139712599805568] HA_LOCAL_DHCP_DISABLE local
DHCP service is disabled while the dhcp-standby is in the WAITING state
2022-09-21 19:30:04.986 INFO
[kea-dhcp4.ha-hooks/1232318.139712599805568] HA_SERVICE_STARTED started
high availability service in hot-standby mode as standby server
------------------------------------------------------------------------------

The problem never occurs when I remove the ha hook library section from
the configuration.

Anyone else seen this problem?

Thanks,
Caciano Machado
PhD in Computer Science and Network Engineer
Universidade Federal do Rio Grande do Sul

--
ISC funds the development of this software with paid support subscriptions. 
Contact us at https://www.isc.org/contact/ for more information.

To unsubscribe visit https://lists.isc.org/mailman/listinfo/kea-users.

Kea-users mailing list
[email protected]
https://lists.isc.org/mailman/listinfo/kea-users
