Hi Nick,

I've done some experiments without the ulimit setting, and I can reproduce the crash. I'm turning on EC2 MATE instances and connecting to them through a single Guacamole host. I connected to 20 instances before I saw the error. Then I turned on debug logging, restarted the guacd container, and reproduced it after connecting to 8 instances. This time no connections would work after the error, even after I closed all the connections and saw the number of guacd processes shrink to 1.

The coredump info looks like this: https://pastebin.com/ukkLf1XC
Here is the zipped core dump: https://russ-public.s3.amazonaws.com/projects/core.zip

There isn't an obvious error in the guacd logs showing where it is going wrong. But I do see some errors trying to clean up client processes, presumably because they've failed and exited already (see below).

On the Guacamole side I see errors trying to connect to guacd:

    ERROR o.a.g.w.GuacamoleWebSocketTunnelEndpoint - Connection to guacd terminated abnormally: Connection to guacd timed out.

In guacd I see this:

    INFO: Connection "$7a6badfa-1cd9-4b21-ad00-c90e598ee2f9" removed.
    DEBUG: Unable to request termination of client process: No such process

Thanks,
Russ

On Sun, Dec 31, 2023 at 1:00 PM Nick Couchman <vn...@apache.org> wrote:

> On Sat, Dec 30, 2023 at 1:59 AM Russell Sayers <russell.say...@gmail.com>
> wrote:
>
>> Hello all,
>>
>> I've recently upgraded a guacd container from 1.2.0 to 1.5.4. I also
>> upgraded the underlying OS, and docker version.
>>
>> Now I'm seeing frequent core dumps from guacd and SIGSEGV messages in the
>> audit.log. I've tried increasing the "ulimit" for stack size to 65536kb
>> and I'm now not seeing the core dumps (yet). Has anyone else seen a
>> similar issue?
>>
>>
> Have you established any pattern to when the segfaults happen during the
> connection process? I'm not sure where the Docker containers stash the core
> dump files, but I would guess they are inside the Docker container and
> you'll have to retrieve them from there.
>
> -Nick
>