Just wanted to provide some information on what appears to be lock contention around ACL lookups.
We recently upgraded from haproxy-1.6 to haproxy-1.8.20 and switched from 'nbproc 8' to 'nbproc 1' with 'nbthread 12'. We have quite a few ACL files to sift through for domain matching -- about 19MB total.

After switching to 1.8.20 we saw elevated CPU under our normal peak load, to the point where in some cases all 12 threads were pegged at 100% CPU. The 'perf' numbers looked like this:

  90.71%  haproxy       [.] pat_match_str
   0.49%  libc-2.12.so  [.] __strncasecmp_l_sse42
   0.41%  haproxy       [.] pat_match_beg
   0.34%  haproxy       [.] lru64_get
   0.26%  [kernel]      [k] _spin_lock_bh
   0.17%  haproxy       [.] pat_match_sub
   0.16%  libc-2.12.so  [.] _int_malloc
   0.13%  [kernel]      [k] _spin_lock
   0.10%  haproxy       [.] process_runnable_tasks
   0.09%  libc-2.12.so  [.] malloc

Looking at the hot function, it appeared that all requests were going through a single spinlock [0]. The LRU tree is initialized conditionally based on 'global.tune.pattern_cache', which is tied to the 'tune.pattern.cache-size' config value [1] [2]. So we set 'tune.pattern.cache-size 0' to disable the cache code path and avoid the spinlock, and CPU usage dropped back down to where it had been previously (lru64_get also disappeared from the execution traces).

I realize our use case is probably an edge case (that's a lot of ACL data -- we plan to convert these ACLs to map files now that we're on haproxy-1.8). Just wanted to make the developers aware and help anyone else who may run into this.

Thank you,
-Brian

[0] https://github.com/haproxy/haproxy/blob/33ccf1cce04069f27ebb868f5617672ed4e21cf4/src/pattern.c#L487
[1] https://cbonte.github.io/haproxy-dconv/1.8/configuration.html#3.2-tune.pattern.cache-size
[2] https://github.com/haproxy/haproxy/blob/33ccf1cce04069f27ebb868f5617672ed4e21cf4/src/pattern.c#L2688
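P.S. For anyone who wants to apply the same workaround, the relevant global-section directives look roughly like this (a sketch based on our setup; the thread count is just what we run, not a recommendation):

  global
      nbproc 1
      nbthread 12
      # disable the pattern LRU cache to avoid the spinlock taken
      # in the lru64_get() path of the pat_match_* functions
      tune.pattern.cache-size 0

Keep in mind that disabling the cache means every match is recomputed rather than served from the LRU, so for configurations with few threads (or expensive regex matches) the cache may still be a net win; it was the lock contention across 12 threads that made it a loss for us.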