We have been hitting this bug quite often while running Tomcat 8.5 on Amazon Linux 2 (AWS) with kernel 4.14.268-205.500.amzn2.x86_64.
I wanted to see whether the bug could be reproduced on a newer kernel, so I tried to repro it using the server code and methodology provided by Mark Thomas. On Ubuntu Server 21.10 (running on a Raspberry Pi 4 with 4GB RAM, kernel 5.13.0-1008-raspi) I was NOT able to repro the bug. I then installed Ubuntu Server 20.04 LTS on the same machine (kernel 5.4.0-1052-raspi) and WAS able to repro it. The bug was fairly easy to repro and did not take multiple attempts.

Since then I have been able to repro the bug using the server code on AWS Linux 2 with the 4.14.268-205.500.amzn2.x86_64 kernel, but not on AWS Linux 2 with a 5.10.109-104.500.amzn2.x86_64 kernel. I was also not able to repro the bug on WSL (kernel 4.4.0-19041-Microsoft), but perhaps its underlying network drivers are different?

I think there is a slight problem with the server code used in the repro: it calls `pthread_create` with no thread attributes, which creates joinable threads instead of detached threads. The documentation for `pthread_create` says that "Only when a terminated joinable thread has been joined are the last of its resources released back to the system." Because the server code never joins the threads, I think this prevents the OS from releasing the thread resources. The server eventually runs out of memory and the server program returns "pthread_create: Cannot allocate memory", as mentioned by Brooke Hedrick in their comment.

I ran into this out-of-memory error myself when running the server code. I made a slight modification to the server code to set the pthread attribute so that new threads are created in a detached state. This seemed to solve the memory issue, and I was able to repro the bug with the modified server. I've attached the code.

Additionally, I found it useful to use `prlimit` to raise the maximum number of open files for the server process once it was running.
This made the server less likely to run into an EMFILE error when calling `accept`.

** Attachment added: "Updated server to demonstrate bug"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1924298/+attachment/5582247/+files/server.c

https://bugs.launchpad.net/bugs/1924298
Title: accept returns duplicate endpoints under load