On Tue, Jun 12, 2018 at 04:00:25PM +0200, William Dauchy wrote:
> Hello William L,
> 
> On Fri, Jun 08, 2018 at 04:31:30PM +0200, William Lallemand wrote:
> > That's great news!
> >
> > Here are the new patches. They shouldn't change anything in the fix; they
> > only change the sigprocmask to pthread_sigmask.
> 
> In fact, I now have a different but similar issue.
> 

:(

> root     18547  3.2  1.3 986660 898844 ?       Ss   Jun08 182:12 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 2063 1903 1763 1445 14593 29663 4203 18290 -x /var/lib/haproxy/stats
> haproxy  14593  299  1.3 1251216 920480 ?      Rsl  Jun11 5882:01  \_ /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 14582 14463 -x /var/lib/haproxy/stats
> haproxy  18290  299  1.4 1265028 935288 ?      Ssl  Jun11 3425:51  \_ /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 18281 18271 18261 14593 -x /var/lib/haproxy/stats
> haproxy  29663 99.9  1.4 1258024 932796 ?      Ssl  Jun11 1063:08  \_ /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 29653 29644 18290 14593 -x /var/lib/haproxy/stats
> haproxy   4203 99.9  1.4 1258804 933216 ?      Ssl  Jun11 1009:27  \_ /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 4194 4182 18290 29663 14593 -x /var/lib/haproxy/stats
> haproxy   1445 25.9  1.4 1261680 929516 ?      Ssl  13:51   0:42  \_ /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 1436 29663 4203 18290 14593 -x /var/lib/haproxy/stats
> haproxy   1763 18.9  1.4 1260500 931516 ?      Ssl  13:52   0:15  \_ /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 1445 14593 29663 4203 18290 -x /var/lib/haproxy/stats
> haproxy   1903 25.0  1.4 1261472 931064 ?      Ssl  13:53   0:14  \_ /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 1763 1445 14593 29663 4203 18290 -x /var/lib/haproxy/stats
> haproxy   2063 52.5  1.4 1259568 927916 ?      Ssl  13:53   0:19  \_ /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 1903 1763 1445 14593 29663 4203 18290 -x /var/lib/haproxy/stats
> haproxy   2602 62.0  1.4 1262220 928776 ?      Rsl  13:54   0:02  \_ /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 2063 1903 1763 1445 14593 29663 4203 18290 -x /var/lib/haproxy/stats
> 
> 

Those processes are still using a lot of CPU...

> # cat /proc/14593/status | grep Sig
> SigQ:   0/257120
> SigPnd: 0000000000000000
> SigBlk: 0000000000000800
> SigIgn: 0000000000001800
> SigCgt: 0000000180300205
> 
> kill -USR1 14593 has no effect:
> 
> # strace -ffff -p 14593
> strace: Process 14593 attached with 3 threads

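As an aside, the Sig* masks quoted above are easier to read once decoded into
signal names. Here is a minimal sketch, assuming the Linux /proc convention
where bit N-1 of the mask stands for signal N; decode_sigmask is a
hypothetical helper name, not something shipped with HAProxy:

```python
import signal


def decode_sigmask(hex_mask: str) -> list[str]:
    """Decode a SigBlk/SigIgn/SigCgt value from /proc/<pid>/status.

    Assumption: bit N-1 set means signal N is in the set (Linux procfs).
    """
    mask = int(hex_mask, 16)
    names = []
    for signum in range(1, 65):
        if mask & (1 << (signum - 1)):
            try:
                names.append(signal.Signals(signum).name)
            except ValueError:
                # No symbolic name known to Python for this number
                # (e.g. some real-time signals): fall back to the number.
                names.append(f"SIG{signum}")
    return names


if __name__ == "__main__":
    # Values copied from the status output quoted above.
    for field, value in [
        ("SigBlk", "0000000000000800"),
        ("SigIgn", "0000000000001800"),
        ("SigCgt", "0000000180300205"),
    ]:
        print(field, decode_sigmask(value))
```

On Linux x86-64, where SIGUSR1 is 10 and SIGUSR2 is 12, SigBlk decodes to
SIGUSR2 only, and the SIGUSR1 bit is set in SigCgt, so USR1 itself is neither
blocked nor ignored in that process.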

> strace: [ Process PID=14595 runs in x32 mode. ]

This part is particularly interesting. I suppose you are not running in x32,
right?
I had this problem at some point but was never able to reproduce it...

We might find something interesting by looking further.

> [pid 14593] --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=18547, si_uid=0} ---
> [pid 14593] rt_sigaction(SIGUSR1, {0x558357660020, [USR1], SA_RESTORER|SA_RESTART, 0x7f0e87671270}, {0x558357660020, [USR1], SA_RESTORER|SA_RESTART, 0x7f0e87671270}, 8) = 0
> [pid 14593] rt_sigreturn({mask=[USR2]}) = 7


At least you managed to strace the process while it was seen as an x32 one;
I never managed to do that.

> 
> however, the unix socket is on the correct process:
> 
> # lsof | grep "haproxy/stats" ; ps auxwwf | grep haproxy
> haproxy    2602        haproxy    5u     unix 0xffff880f902e8000       0t0 3333061798 /var/lib/haproxy/stats.18547.tmp
> haproxy    2602  2603  haproxy    5u     unix 0xffff880f902e8000       0t0 3333061798 /var/lib/haproxy/stats.18547.tmp
> haproxy    2602  2604  haproxy    5u     unix 0xffff880f902e8000       0t0 3333061798 /var/lib/haproxy/stats.18547.tmp
> haproxy    2602  2605  haproxy    5u     unix 0xffff880f902e8000       0t0 3333061798 /var/lib/haproxy/stats.18547.tmp
> 
> So it does not cause any issue for the provisioner, which talks
> to the correct process; however, there are remaining processes.

Are they still delivering traffic?


> Should I start a different thread for that issue?
>

That's not necessary, thanks.
 
> it seems harder to reproduce, I got the issue ~2 days after pushing back.
> 
> Thanks,
> 

I'll try to reproduce this again...

-- 
William Lallemand
