Hi,

On 08.01.2013 18:41, Christian Becker wrote:
> Hello,
>
> today we've upgraded one of our loadbalancers to linux kernel 3.7.1 and
> haproxy 1.5 dev 17 - coming from kernel 3.0.1 and haproxy 1.5 dev 7.
>
> After the upgrade, the system is delivering traffic as usual and we don't see
> any traffic issues. But now there are constantly 4 CPUs 100% busy with about
> 30 % user and 70 % system load.
>
> We haven't seen this behaviour in the past. Additionally we get the following
> messages every couple of minutes:
>
> Jan 8 18:30:59 srv11 kernel: [ 3878.272003] ------------[ cut here ]------------
> Jan 8 18:30:59 srv11 kernel: [ 3878.295572] WARNING: at net/ipv4/tcp.c:1330 tcp_cleanup_rbuf+0x4d/0xfc()
> Jan 8 18:30:59 srv11 kernel: [ 3878.319107] Hardware name: System x3690 X5 -[7148Z68]-
> Jan 8 18:30:59 srv11 kernel: [ 3878.340686] cleanup rbuf bug: copied 7B02E4F6 seq 7B01F558 rcvnxt 7B02E4F6
> Jan 8 18:30:59 srv11 kernel: [ 3878.363160] Modules linked in: 8021q garp stp llc nls_utf8 nls_cp437 vfat fat acpi_cpufreq snd_pcm cdc_ether usbnet mii coretemp kvm_intel kvm snd_timer snd crc32c_intel evdev joydev hid_generic soundcore microcode snd_page_alloc serio_raw pcspkr mperf tpm_tis processor ioatdma lpc_ich i2c_i801 tpm shpchp mfd_core tpm_bios pci_hotplug i2c_core dca thermal_sys button ext4 mbcache jbd2 crc16 dm_mod sg sr_mod cdrom sd_mod crc_t10dif ata_generic usbhid hid uhci_hcd ata_piix libata megaraid_sas ehci_hcd bnx2 usbcore scsi_mod usb_common be2net
> Jan 8 18:30:59 srv11 kernel: [ 3878.513462] Pid: 30307, comm: haproxy Tainted: G W 3.7.1 #1
> Jan 8 18:30:59 srv11 kernel: [ 3878.540064] Call Trace:
> Jan 8 18:30:59 srv11 kernel: [ 3878.564947] [<ffffffff8103ef70>] ? warn_slowpath_common+0x78/0x8c
> Jan 8 18:30:59 srv11 kernel: [ 3878.591417] [<ffffffff8103f023>] ? warn_slowpath_fmt+0x45/0x4a
> Jan 8 18:30:59 srv11 kernel: [ 3878.617965] [<ffffffff812d3e02>] ? tcp_cleanup_rbuf+0x4d/0xfc
> Jan 8 18:30:59 srv11 kernel: [ 3878.645352] [<ffffffff812d4034>] ? tcp_read_sock+0x183/0x194
> Jan 8 18:30:59 srv11 kernel: [ 3878.670621] [<ffffffff812d487d>] ? tcp_sendpage+0x45b/0x45b
> Jan 8 18:30:59 srv11 kernel: [ 3878.696935] [<ffffffff812d4118>] ? tcp_splice_read+0xd3/0x223
> Jan 8 18:30:59 srv11 kernel: [ 3878.721845] [<ffffffff8112d9ae>] ? sys_splice+0x345/0x3bf
> Jan 8 18:30:59 srv11 kernel: [ 3878.746239] [<ffffffff813651a9>] ? system_call_fastpath+0x16/0x1b
> Jan 8 18:30:59 srv11 kernel: [ 3878.770749] ---[ end trace 91a60bafa2f9d85e ]---

This looks like your NIC is causing problems. What NIC type is it?

> This is our global configuration and one of the most busy threads (about 1k
> requests/s):
>
> global
>     daemon
>     maxconn 131072
>     spread-checks 2
>     stats socket /var/run/haproxy.sock
>     nbproc 34
>
> defaults
>     mode http
>     option splice-response
>     option splice-request
>     timeout connect 5000ms
>     timeout client 30000ms
>     timeout server 300000ms
>     timeout http-request 20000ms
>     # option forceclose
>
> frontend marketing-in
>     bind <ip>:80
>     default_backend marketing
>     maxconn 32768
>     option http-server-close
>     option forwardfor
>     reqidel ^X-Forwarded-For:.*
>     bind-process 17
> backend marketing
>     stats enable
>     stats uri <uri>
>     stats auth <user>
>     option httpchk GET /server_up.php
>     http-check expect rstring ^OK$
>     balance roundrobin
>     server web1 <ip1>:80 maxconn 4096 check port 80 inter 10000 fastinter 2000
>     server web2 <ip2>:80 maxconn 4096 check port 80 inter 10000 fastinter 2000
>     server web3 <ip3>:80 maxconn 4096 check port 80 inter 10000 fastinter 2000
>     server web4 <ip4>:80 maxconn 4096 check port 80 inter 10000 fastinter 2000
>     server sorry <sorry>:80 check backup
>
> Additionally these are the build options before and now:
>
> HA-Proxy version 1.5-dev7 2011/09/10
> Copyright 2000-2011 Willy Tarreau <[email protected]>
>
> Build options :
>   TARGET = linux26
>   CPU = generic
>   CC = gcc
>   CFLAGS = -O2 -g -fno-strict-aliasing -march=core2 -m64
>   OPTIONS = USE_LINUX_SPLICE=1 USE_LINUX_TPROXY=1 USE_STATIC_PCRE=1
>
> Default settings :
>   maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200
>
> Encrypted password support via crypt(3): yes
>
> Available polling systems :
>   sepoll : pref=400, test result OK
>   epoll : pref=300, test result OK
>   poll : pref=200, test result OK
>   select : pref=150, test result OK
> Total: 4 (4 usable), will use sepoll.
>
> Note: This is not dev17 anymore - this is today's snapshot

You could easily pass something like VERSION="1.5.dev17-patch..." to make; then you always know which snapshot a given binary was built from.
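
As a rough illustration of the VERSION suggestion above, here is a minimal Python sketch that derives a snapshot label and assembles the matching make command line without running it. It assumes the HAProxy Makefile honours a VERSION override on the command line, as suggested above. The label format, the helper names `snapshot_label` and `make_command`, and the chosen date are hypothetical; TARGET and the USE_* options are taken from the build options quoted in this thread.

```python
import shlex
from datetime import date
from typing import List, Sequence


def snapshot_label(base: str, snapshot_day: date) -> str:
    """Build a label like '<base>-snapshot-YYYYMMDD' (hypothetical naming scheme)."""
    return f"{base}-snapshot-{snapshot_day:%Y%m%d}"


def make_command(version: str, target: str, options: Sequence[str]) -> List[str]:
    """Assemble the argument list for a make call; nothing is executed here."""
    return ["make", f"TARGET={target}", *options, f"VERSION={version}"]


# Illustrative values: the date matches the day of the original report.
label = snapshot_label("1.5-dev17", date(2013, 1, 8))
cmd = make_command(
    label,
    target="linux2628",
    options=["USE_LINUX_SPLICE=1", "USE_OPENSSL=1", "USE_ZLIB=1"],
)

# Print a copy-pasteable shell command for review before running it by hand.
print(shlex.join(cmd))
```

Printing the command instead of executing it keeps the sketch free of side effects; the resulting label should then show up in the version banner of the built binary, assuming the override works as described.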
> HA-Proxy version 1.5-dev17 2012/12/28
> Copyright 2000-2012 Willy Tarreau <[email protected]>
>
> Build options :
>   TARGET = linux2628
>   CPU = generic
>   CC = gcc
>   CFLAGS = -O2 -g -fno-strict-aliasing -march=core2 -m64
>   OPTIONS = USE_LINUX_SPLICE=1 USE_LINUX_TPROXY=1 USE_LIBCRYPT=1 USE_ZLIB=1 USE_OPENSSL=1 USE_STATIC_PCRE=1
>
> Default settings :
>   maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200
>
> Encrypted password support via crypt(3): yes
> Built with zlib version : 1.2.3.4
> Compression algorithms supported : identity, deflate, gzip
> Built with OpenSSL version : OpenSSL 0.9.8o 01 Jun 2010
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports prefer-server-ciphers : yes
>
> Available polling systems :
>   epoll : pref=300, test result OK
>   poll : pref=200, test result OK
>   select : pref=150, test result OK
> Total: 3 (3 usable), will use epoll.
>
> Do you have any idea what's causing these issues?

An unstable kernel. You use the latest and greatest at your own risk.

cheers,
thomas

> Thank you very much in advance!
> Regards,
> Christian
> ____________________________
> Christian Becker
> System Administration
>
> Travian Games GmbH
> Wilhelm-Wagenfeld-Str. 22
> 80807 München
> Germany
>
> Tel.: +49 / (0)89 / 324 915 – 0
> Fax: +49 / (0)89 / 324 915 – 970
> [email protected]
> www.traviangames.de
>
> Registered office: München
> District Court of München, HRB: 173511
> Managing Director: Siegfried Müller
> VAT ID: DE246258085
>
> This email and its attachments are strictly confidential and are intended
> solely for the attention of the person to whom it is addressed. If you are
> not the intended recipient of this email, please delete it including its
> attachments immediately and inform us accordingly.
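
Coming back to the kernel warning quoted at the top of this message: the three hexadecimal values in the "cleanup rbuf bug" line can be compared with a few lines of Python. This is only a sketch for reading the log. The function names are hypothetical, and the interpretation of the fields (that "copied" is how far the application has read, "seq" refers to the segment at the head of the receive queue, and "rcvnxt" is the next byte expected from the peer) is an assumption based on the field names, not something confirmed in this thread.

```python
import re

# Warning line copied verbatim from the kernel log quoted in this thread.
LOG_LINE = "cleanup rbuf bug: copied 7B02E4F6 seq 7B01F558 rcvnxt 7B02E4F6"

PATTERN = re.compile(
    r"copied (?P<copied>[0-9A-Fa-f]+) "
    r"seq (?P<seq>[0-9A-Fa-f]+) "
    r"rcvnxt (?P<rcvnxt>[0-9A-Fa-f]+)"
)


def parse_rbuf_warning(line: str) -> dict:
    """Extract the three hexadecimal sequence numbers as integers."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a 'cleanup rbuf bug' line")
    return {name: int(value, 16) for name, value in match.groupdict().items()}


def seq_delta(a: int, b: int) -> int:
    """Distance a - b in 32-bit sequence space (TCP sequence numbers wrap at 2**32)."""
    return (a - b) & 0xFFFFFFFF


fields = parse_rbuf_warning(LOG_LINE)
print("copied - seq    =", seq_delta(fields["copied"], fields["seq"]), "bytes")
print("rcvnxt - copied =", seq_delta(fields["rcvnxt"], fields["copied"]), "bytes")
```

For the quoted line this yields 61342 bytes for copied minus seq and 0 bytes for rcvnxt minus copied, so under the assumptions above the read position had already caught up with everything received while a segment ending roughly 60 KB earlier was still queued.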

