Hi Willy,

It reproduced again:

root@WD-G0-SRP1:~# uname -a
Linux WD-G0-SRP1 3.2.0-95-generic #135-Ubuntu SMP Tue Nov 10 13:33:29 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

HA-Proxy version 1.6.2 2015/11/03
Copyright 2000-2015 Willy Tarreau <wi...@haproxy.org>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Wformat-security -Werror=format-security -D_FORTIFY_SOURCE=2
  OPTIONS = USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.3.4
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.1 14 Mar 2012
Running on OpenSSL version : OpenSSL 1.0.1 14 Mar 2012
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.12 2011-01-15
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with Lua version : Lua 5.3.1
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

I've captured the strace call log successfully this time. You are right, it
is full of epoll_wait and gettimeofday calls.

Here is the broken request:

Nov 22 01:06:32 WD-G0-SRP1 haproxy[1259]: 2.145.41.3:56014 [22/Nov/2015:00:50:01.683] https-in~ g0n2/n2 1263/0/0/3/991286 200 84517 - - CD-- 8/8/1/1/0 0/0 "GET /lib/ext/ext-all.js HTTP/1.1"

I have uploaded the full 40-second strace log and the complete haproxy.log to:
http://baiy.cn/tmp/log-1122.rar

Thanks :-)

--
Best Regards
BaiYang
baiy...@gmail.com
http://baiy.cn
**** < END OF EMAIL > ****

From: Willy Tarreau
Date: 2015-11-20 20:54
To: baiyang
CC: Lukas Tribus; haproxy
Subject: Re: Re: CPU 100% when waiting for the client timeout

On Fri, Nov 20, 2015 at 08:15:03PM +0800, baiyang wrote:
> > But the kernel's build date dates 2012, that's what troubles me. Are you
> > sure you're not running on a locally built kernel that is never updated
> > anymore or any such thing ?
> Yes, I'm sure we are using the official kernel, and did a reboot after the
> upgrade. But just like I said, Ubuntu seems to need an explicit command to
> do the kernel upgrade (apt-get dist-upgrade).

OK, I agree that it's not intuitive and may result in having many people
exposed to years of bugs and vulnerabilities.

> I did it a few minutes ago, and now:
> root@WD-G0-SRP1:~# uname -a
> Linux WD-G0-SRP1 3.2.0-95-generic #135-Ubuntu SMP Tue Nov 10 13:33:29 UTC
> 2015 x86_64 x86_64 x86_64 GNU/Linux

Ah much better :-)

> I've re-enabled the three options and done a "/etc/init.d/haproxy
> reload". Let's wait and see what happens. :-)

Great!

> > Later you decide to close the fd. It's supposed to be removed from the fd
> > list because you closed the last user, but....
> Ok, got it. I found that the fd is always deleted explicitly from the epoll
> queue before it is closed in our framework. So we never faced this issue.

OK.

> > And doing so comes with a cost as you have to resubscribe it everytime you
> > drain that event.
> Yeah, re-enabling it in the epoll queue every time or getting EAGAIN in the
> last read/write call both need some additional overhead.
> You can only avoid one of them sometimes. Things get more complex when you
> want to process the events concurrently in a thread pool from one single
> epoll queue.

With threads, you're certainly encouraged to use EPOLL_ET because the model
will limit the interactions between threads and the need to synchronize.

> Of course maybe we can do it slightly better under a single thread model?

Exactly :-)

> > Don't worry, we know all this quite well.
> I definitely do think so, and I believe you are the expert on it. Just a
> little quick reminder for you :-)

No problem :-)

> > do you remember if you rebooted after upgrading haproxy to 1.6 the first
> > time ?
> Not very clearly. I only remember that I rebooted the server every time
> "apt-get upgrade" finished. And I'm quite sure we have had many reboots of
> the server this week due to the troubleshooting needed.

OK. Let's wait for the problem to appear again, that's the only thing which
will tell us if the problem is fixed.

Cheers,
Willy
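
For readers following the epoll part of this thread, here is a minimal Python
sketch of the two habits discussed above: with edge-triggered notification the
reader keeps reading until EAGAIN (one extra call per event), and the fd is
removed from the epoll set explicitly before it is closed. This is only an
illustration under assumptions, not HAProxy's code and not the framework
mentioned in the thread. The function names (drain, close_registered, demo)
are hypothetical, the flag used is the kernel's EPOLLET, and select.epoll is
available on Linux only.

```python
import select
import socket


def drain(sock, chunk=4096):
    """Read until the kernel reports EAGAIN (BlockingIOError in Python).

    With EPOLLET a new event is only reported on a new edge, so the
    reader has to consume everything that is currently available.
    """
    data = bytearray()
    while True:
        try:
            part = sock.recv(chunk)
        except BlockingIOError:
            break  # EAGAIN: nothing left to read for now
        if not part:
            break  # peer closed the connection
        data += part
    return bytes(data)


def close_registered(ep, sock):
    """Remove the fd from the epoll set explicitly, then close it."""
    ep.unregister(sock.fileno())
    sock.close()


def demo():
    """Hypothetical end-to-end run on a local socket pair."""
    a, b = socket.socketpair()
    a.setblocking(False)

    ep = select.epoll()
    ep.register(a.fileno(), select.EPOLLIN | select.EPOLLET)

    b.sendall(b"x" * 10000)
    events = ep.poll(1.0)   # one readable edge for fd `a`
    payload = drain(a)      # read until EAGAIN
    again = ep.poll(0)      # no new data, so no new event

    close_registered(ep, a)
    b.close()
    ep.close()
    return len(events), len(payload), len(again)


if __name__ == "__main__":
    print(demo())
```

The explicit unregister is redundant when the closed fd is the last reference
to the socket, but it avoids surprises when the descriptor has been duplicated
or inherited elsewhere, since the kernel keeps the registration until the
underlying open file description goes away.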