Hi,

On Tue, Oct 02, 2018 at 08:26:12PM +0530, Soji Antony wrote:
> Hello,
>
> We are currently using haproxy 1.8.3 with single process multithreaded
> configuration.
> We have 1 process and 10 threads each mapped to a separate core [0-9]. We
> are running our haproxy instances on a c4.4xlarge aws ec2 instance. The
> only other CPU intensive process running on this server is a log shipper
> which is explicity mapped to cpu cores 13 - 16 explicitly using taskset
> command. Also we have given 'SCHED_RR' priority 99 for haproxy processes.
>
> OS: Ubuntu 14
> Kernel: 4.4.0-134-generic
>
> The issue we are seeing with Haproxy is all of a sudden CPU usage spikes to
> 100% on cores which haproxy is using & causing latency spikes and high load
> on the server. We are seeing the following error messages in system /
> kernel logs when this issue happens.
>
> haproxy[92558]: segfault at 8 ip 000055f04b1f5da2 sp 00007ffdab2bdd40 error
> 6 in haproxy[55f04b101000+170000]
>
> Sep 29 12:21:02 marathonlb-int21 kernel: [2223350.996059] sched: RT
> throttling activated
>
> We are using marathonlb for auto discovery and reloads are quite frequent
> on this server. Last time when this issue happened we had seen haproxy
> using 750% of CPU and it went into D state. Also the old process was also
> taking cpu.
>
> hard-stop-after was not set in our hap configuration and we were seeing
> multiple old pid's running on the server. After the last outage we had with
> CPU we set 'hard-stop-after' to 10s and now we are not seeing multiple hap
> instances running after reload. I would really appreciate if some one can
> explain us why the CPU usage spikes with the above segfault error & what
> this error exactly means.
>
> FYI: There was no traffic spike on this hap instance when the issue
> happened. We have even seen the same issue in a non-prod hap where no
> traffic was coming & system went down due to CPU usage & found the same
> segfault error in the logs.
>
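
For reference, the setup described above would correspond roughly to a global section like the one below. This is only a sketch reconstructed from the description (1 process, 10 threads on cores 0-9, hard-stop-after 10s), the actual configuration was not posted, so the exact directives and values are assumptions:

```
global
    # single process, 10 threads (as described in the report)
    nbproc 1
    nbthread 10
    # assumed mapping: threads 1-10 of process 1 bound to CPUs 0-9,
    # one thread per CPU
    cpu-map auto:1/1-10 0-9
    # force old processes to exit at most 10s after a reload
    hard-stop-after 10s
```
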

A good first step would probably be to upgrade to the latest version if possible. 1.8.3 is quite old, and a bunch of bugs have been fixed since then, especially when using multithreading.

Regards,

Olivier

