On Tue, Oct 02, 2018 at 08:26:12PM +0530, Soji Antony wrote:
> Hello,
> We are currently using haproxy 1.8.3 with single process multithreaded
> configuration.
> We have 1 process and 10 threads each mapped to a separate core [0-9]. We
> are running our haproxy instances on a c4.4xlarge aws ec2 instance. The
> only other CPU intensive process running on this server is a log shipper
> which is explicitly mapped to cpu cores 13 - 16 using the taskset
> command. Also we have given 'SCHED_RR' priority 99 for haproxy processes.
> OS: Ubuntu 14
> Kernel: 4.4.0-134-generic
> The issue we are seeing with Haproxy is all of a sudden CPU usage spikes to
> 100% on cores which haproxy is using & causing latency spikes and high load
> on the server. We are seeing the following error messages in system /
> kernel logs when this issue happens.
> haproxy[92558]: segfault at 8 ip 000055f04b1f5da2 sp 00007ffdab2bdd40 error 6 in haproxy[55f04b101000+170000]
> Sep 29 12:21:02 marathonlb-int21 kernel: [2223350.996059] sched: RT throttling activated
> We are using marathonlb for auto discovery and reloads are quite frequent
> on this server. Last time when this issue happened we had seen haproxy
> using 750% of CPU and it went into D state. Also the old process was also
> taking cpu.
> hard-stop-after was not set in our hap configuration and we were seeing
> multiple old pid's running on the server. After the last outage we had with
> CPU we set 'hard-stop-after' to 10s and now we are not seeing multiple hap
> instances running after reload. I would really appreciate it if someone could
> explain to us why the CPU usage spikes with the above segfault error & what
> this error exactly means.
> FYI: There was no traffic spike on this hap instance when the issue
> happened. We have even seen the same issue in a non-prod hap where no
> traffic was coming & system went down due to CPU usage & found the same
> segfault error in the logs.

A good first step would probably be to upgrade to the latest version if possible.
1.8.3 is quite old, and a bunch of bugs have been fixed since then,
especially when using multithreading.
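
As for what the segfault line itself means, here is a rough sketch of how
its fields can be decoded. It assumes the usual x86 kernel message layout
(fault address, instruction pointer, stack pointer, page-fault error code
in hex, then the mapping base and size), and the helper name
decode_segfault is made up for illustration. The computed offset is only
useful with addr2line or gdb if you still have the exact binary that
crashed, and only if the reported mapping starts at the load base of the
executable, which I am assuming here.

```python
import re

# The kernel log line quoted above, joined back onto a single line.
LINE = ("haproxy[92558]: segfault at 8 ip 000055f04b1f5da2 "
        "sp 00007ffdab2bdd40 error 6 in haproxy[55f04b101000+170000]")

PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<name>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def decode_segfault(line):
    """Decode an x86 'segfault at ...' kernel message (hypothetical helper)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    addr = int(m.group("addr"), 16)
    ip = int(m.group("ip"), 16)
    err = int(m.group("err"), 16)
    base = int(m.group("base"), 16)
    return {
        "fault_addr": addr,                # address the process tried to access
        "offset": ip - base,               # instruction offset inside the mapping
        "page_present": bool(err & 0x1),   # bit 0: 0 = page not mapped
        "write": bool(err & 0x2),          # bit 1: 1 = write access
        "user_mode": bool(err & 0x4),      # bit 2: 1 = fault came from user space
    }

if __name__ == "__main__":
    info = decode_segfault(LINE)
    print("fault address:", hex(info["fault_addr"]))
    print("offset in binary:", hex(info["offset"]))
    print("write:", info["write"], "user mode:", info["user_mode"],
          "page present:", info["page_present"])
```

For the line above, error 6 means a user-mode write to an unmapped page,
and a fault address of 8 usually points at a write through a NULL pointer
(a field at offset 8 of some structure). With a matching binary, the
offset can be passed to something like "addr2line -e haproxy <offset>" to
find the faulting function.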


