Re: Segmentation faults (1.8.4-de425f6 2018/02/26)

2018-03-26 Thread Peter Lindegaard Hansen
Hi,

No, I'm not using threads.




Kind regards,


*Peter Lindegaard Hansen*

*Software Developer / Partner*

Phone: +45 96 500 300 | Direct: 69 14 97 04 | Email: p...@tigermedia.dk
Tiger Media A/S | Gl. Gugvej 17C | 9000 Aalborg | Web: www.tigermedia.dk

For support questions, please contact us at supp...@tigermedia.dk or by phone
at 96 500 300,
and your inquiry will be answered by the first available staff member.

2018-03-26 11:27 GMT+02:00 Willy Tarreau <w...@1wt.eu>:

> Hi,
>
> So what I found in Holger's trace is that the H2 connection has no
> more streams and was present in the wait queue. The state of the task
> being deleted doesn't make sense, indicating a memory corruption
> possibly caused by a use-after-free or a double free or by a wrong
> wakeup when a buffer is available, as in a few commits fixed in 1.8.4
> which we thought would have a minor impact only.
>
> After some thought I realized that there is a possible issue with the
> way H2 streams are detached from their connection: in h2_detach(), if
> they are the last one, they can destroy the connection and release it.
> But it's possible that the connection's task was already woken up and
> queued for being processed immediately afterwards (the multi-thread
> scheduler can trigger this), so I'll have to rethink the way detaching
> is performed and only delegate to an asynchronous task. Since your
> config doesn't use threads it's not this which causes the problem, but
> that caught my attention and fixing it might lead to a more robust
> design.
>
> Peter, were you using threads with H2 or not? Just trying to draw
> some statistics here.
>
> I'll keep you updated of any other discovery anyway.
>
> Cheers,
> Willy
>
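
The detach race described in the quoted message can be illustrated with a toy model. The Python sketch below is not HAProxy code and does not claim to reflect any fix that was committed; `Connection`, `Scheduler`, `detach_immediate` and `detach_deferred` are hypothetical names invented for this illustration. It only shows why releasing a connection from inside the detach path is unsafe when the connection's task is already queued, and how leaving the release to the task itself avoids touching released state.

```python
from collections import deque


class Connection:
    """Toy stand-in for an H2 connection that owns a scheduler task."""

    def __init__(self):
        self.streams = set()
        self.queued = False           # task is currently in the run queue
        self.release_pending = False  # release requested, not yet performed
        self.released = False         # models "memory has been freed"


class Scheduler:
    """Single run queue; a task can be queued at most once."""

    def __init__(self):
        self.run_queue = deque()

    def wake(self, conn):
        if not conn.queued:
            conn.queued = True
            self.run_queue.append(conn)

    def run(self):
        """Process all queued tasks; count accesses to released connections."""
        touched_after_release = 0
        while self.run_queue:
            conn = self.run_queue.popleft()
            conn.queued = False
            if conn.released:
                # In C this access would be a use-after-free.
                touched_after_release += 1
                continue
            if conn.release_pending and not conn.streams:
                # The task is the only place where the release happens.
                conn.released = True
        return touched_after_release


def detach_immediate(sched, conn, stream_id):
    """Unsafe variant: the last stream releases the connection right away."""
    conn.streams.discard(stream_id)
    if not conn.streams:
        conn.released = True


def detach_deferred(sched, conn, stream_id):
    """Safer variant: only request the release and let the task perform it."""
    conn.streams.discard(stream_id)
    if not conn.streams:
        conn.release_pending = True
        sched.wake(conn)


def demo(detach):
    sched = Scheduler()
    conn = Connection()
    conn.streams.add(1)
    sched.wake(conn)        # the task was already woken up before the detach
    detach(sched, conn, 1)  # the last stream goes away
    return sched.run(), conn.released
```

With these definitions, `demo(detach_immediate)` returns `(1, True)` (the queued task runs against a released connection), while `demo(detach_deferred)` returns `(0, True)` (the connection is still released, but only by its own task).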


Re: Segmentation faults (1.8.4-de425f6 2018/02/26)

2018-03-23 Thread Peter Lindegaard Hansen
We've been experiencing crashes too, with all 1.8 versions - currently
using 1.8.4 from PPA.
We noticed that disabling h2 prevents crashes.




2018-03-23 10:09 GMT+01:00 Holger Amann <hol...@fehu.org>:

> Hi,
>
> we had two crashes yesterday within about 2 hours.
>
> HA-Proxy version 1.8.4-de425f6 2018/02/26
> Copyright 2000-2018 Willy Tarreau <wi...@haproxy.org>
>
> Build options :
>   TARGET  = linux2628
>   CPU = generic
>   CC  = gcc
>   CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
> -fwrapv -Wno-null-dereference -Wno-unused-label
>   OPTIONS = USE_LINUX_SPLICE=1 USE_LIBCRYPT=1 USE_ZLIB=1 USE_OPENSSL=1
> USE_PCRE=1
>
> Default settings :
>   maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
>
> Built with OpenSSL version : OpenSSL 1.1.0f  25 May 2017
> Running on OpenSSL version : OpenSSL 1.1.0f  25 May 2017
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
> Built with transparent proxy support using: IP_TRANSPARENT
> IPV6_TRANSPARENT IP_FREEBIND
> Encrypted password support via crypt(3): yes
> Built with multi-threading support.
> Built with PCRE version : 8.39 2016-06-14
> Running on PCRE version : 8.39 2016-06-14
> PCRE library supports JIT : no (USE_PCRE_JIT not set)
> Built with zlib version : 1.2.8
> Running on zlib version : 1.2.8
> Compression algorithms supported : identity("identity"),
> deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
> Built with network namespace support.
>
> Available polling systems :
>   epoll : pref=300,  test result OK
>    poll : pref=200,  test result OK
>  select : pref=150,  test result OK
> Total: 3 (3 usable), will use epoll.
>
> Available filters :
> [SPOE] spoe
> [COMP] compression
> [TRACE] trace
>
>
>
> root@66b9ab4204d8:/code# gdb /usr/local/sbin/haproxy core
> GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
> Copyright (C) 2016 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-linux-gnu".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /usr/local/sbin/haproxy...done.
> [New LWP 10]
>
> warning: .dynamic section for "/lib64/ld-linux-x86-64.so.2" is not at the
> expected address (wrong library or version mismatch?)
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/usr/local/sbin/haproxy -f /etc/haproxy.cfg'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  __eb_delete (node=0x55dae9d8db30, node@entry=0x55dae8bdd230) at
> ebtree/ebtree.h:720
> 720 ebtree/ebtree.h: No such file or directory.
> (gdb) bt
> #0  __eb_delete (node=0x55dae9d8db30, node@entry=0x55dae8bdd230) at
> ebtree/ebtree.h:720
> #1  eb_delete (node=node@entry=0x55dae9d8db30) at ebtree/ebtree.c:25
> #2  0x55dae7bc36f5 in eb32_delete (eb32=0x55dae9d8db30) at
> ebtree/eb32tree.h:106
> #3  __task_unlink_wq (t=0x55dae9d8dad0) at include/proto/task.h:145
> #4  task_unlink_wq (t=<optimized out>) at include/proto/task.h:153
> #5  task_delete (t=<optimized out>) at include/proto/task.h:192
> #6  process_stream (t=t@entry=0x55dae9d8dad0) at src/stream.c:2514
> #7  0x55dae7c3f792 in process_runnable_tasks () at src/task.c:229
> #8  0x55dae7bf2674 in run_poll_loop () at src/haproxy.c:2399
> #9  run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:2461
> #10 0x55dae7b6cfea in main (argc=<optimized out>, argv=0x7ffcff36a218)
> at src/haproxy.c:3050
>
>
>
>
> global
>   log /dev/log local0 warning
>   maxconn 5
> 

Re: 1.8.3: Slow posts on H2 (IE only?)

2018-01-04 Thread Peter Lindegaard Hansen
Hi Willy & Lukas,

I tested on a test VM using the latest git version:
# haproxy -v
HA-Proxy version 1.8.3-646d23-1 2018/01/04
Copyright 2000-2017 Willy Tarreau <wi...@haproxy.org>

And can confirm that the issues related to post+redirects are fixed for us.

And the other issue that we found seems to be resolved too.

Perfect, thanks much! :)




2018-01-04 14:52 GMT+01:00 Willy Tarreau <w...@1wt.eu>:

> Hi Peter,
>
> On Wed, Jan 03, 2018 at 11:04:03PM +0100, Peter Lindegaard Hansen wrote:
> > I will try to apply the patch (I am no expert at this) and see the
> > results.
>
> Well, don't waste your time trying to figure out stuff you're not
> comfortable with. I've just committed the fix and backported it. If
> you're familiar with git, pulling the latest version will give you the
> fixed code. Otherwise just wait for tomorrow morning, the next nightly
> snapshot will automatically contain the fix.
>
> > I did find another H2-related issue, but was unable to pinpoint exactly
> > why, as it was deep in an AJAX application, but I will see with this
> > patch if it's the same issue.
>
> OK, let's wait for new tests with the patch first, and otherwise once
> you have more info, do not hesitate to open another report.
>
> Thanks!
> Willy
>


Re: 1.8.3: Slow posts on H2 (IE only?)

2018-01-03 Thread Peter Lindegaard Hansen
Hi,

What a quick response on this issue :)

I will try to apply the patch (I am no expert at this) and see the results.
I did find another H2-related issue, but was unable to pinpoint exactly
why, as it was deep in an AJAX application, but I will see with this patch
if it's the same issue.






2018-01-03 22:34 GMT+01:00 Lukas Tribus <lu...@ltri.eu>:

> Hello,
>
>
> On Wed, Jan 3, 2018 at 9:51 PM, Willy Tarreau <w...@1wt.eu> wrote:
> > On Wed, Jan 03, 2018 at 09:31:47PM +0100, Willy Tarreau wrote:
> >> Oh I think you've just put your finger on it. I remember taking care
> >> of handling 0-sized frames, and facing certain difficulties with them
> >> (eg: sometimes returning size 0 just means nothing was done). It sounds
> >> very likely that we can still have a bug around this. It would also
> >> explain why your patch could get rid of it.
> >>
> >> I'll have a look at the code in case I have an idea.
> >
> > Could you please try the attached patch? I'm pretty sure it is *very*
> > related to your observations. In fact till now we would not update the
> > parser's state on an empty data frame, which explains why you had to
> > move the stuff around.
>
> Yes, I can confirm this fixes the problem for me, and in a good
> looking way too: all streams are correctly handled within that single
> connection (no sudden connection teardown and subsequent new
> connection for follow-up streams).
>
> Now we won't even bother the ML subscribers with a 50+ messages
> mega-thread debugging an obscure H2 issue :)
>
>
> regards,
> lukas
>
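
The point about empty DATA frames in the quoted exchange can be sketched as follows. This is a minimal Python illustration, not the HAProxy patch: `parse_frames` and `make_frame` are hypothetical helpers, and only the 9-byte frame header layout, the DATA type (0x0) and the END_STREAM flag (0x1) are taken from RFC 7540. The idea it demonstrates is that parser state should be advanced from the frame header, so a frame with a zero-length payload is still treated as progress rather than as "nothing was done".

```python
import struct

FRAME_HEADER_LEN = 9    # RFC 7540 section 4.1: length(24) type(8) flags(8) R+stream id(32)
TYPE_DATA = 0x0
FLAG_END_STREAM = 0x1


def make_frame(ftype, flags, stream_id, payload=b""):
    """Build one HTTP/2 frame (hypothetical helper for the example)."""
    n = len(payload)
    header = struct.pack(">BHBBI", n >> 16, n & 0xFFFF, ftype, flags, stream_id)
    return header + payload


def parse_frames(buf):
    """Return (frames, closed_streams) for all complete frames in buf.

    Each frame is reported as (type, flags, stream_id, payload).
    """
    frames = []
    closed = set()
    pos = 0
    while len(buf) - pos >= FRAME_HEADER_LEN:
        hi, lo, ftype, flags, sid = struct.unpack_from(">BHBBI", buf, pos)
        length = (hi << 16) | lo
        sid &= 0x7FFFFFFF                    # drop the reserved bit
        end = pos + FRAME_HEADER_LEN + length
        if end > len(buf):
            break                            # incomplete frame, wait for more data
        payload = buf[pos + FRAME_HEADER_LEN:end]
        frames.append((ftype, flags, sid, payload))
        # State is updated from the header, whether or not the payload is empty.
        if ftype == TYPE_DATA and flags & FLAG_END_STREAM:
            closed.add(sid)
        pos = end                            # always advance past the whole frame
    return frames, closed
```

For example, a request body sent as one DATA frame followed by an empty DATA frame carrying END_STREAM yields two frames, and the stream is marked closed only by the second, zero-length one:

```python
buf = make_frame(TYPE_DATA, 0, 1, b"a=1") + make_frame(TYPE_DATA, FLAG_END_STREAM, 1)
frames, closed = parse_frames(buf)
```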


Re: 1.8.1 Segfault + slowdown

2017-12-20 Thread Peter Lindegaard Hansen
Update:

We've disabled H2 on 1.8, and everything is running as expected again.
HAProxy no longer degrades in performance, nor does it segfault.
So the issues seem to be related to H2.




2017-12-19 11:36 GMT+01:00 Peter Lindegaard Hansen <p...@tigermedia.dk>:

> Hi list,
>
> We upgraded from 1.5 to 1.8 recently - then to 1.8.1
>
> Now we're seeing segfaults and slowdowns with haproxy
>
> Repeating:
> Dec 19 11:14:26 haproxy02 kernel: [122635.295196] haproxy[29582]: segfault
> at 55d5152279b2 ip 7f9c2dcc5a28 sp 7fff07caf4b8 error 6 in
> libc-2.23.so[7f9c2dc26000+1c]
> Dec 19 11:14:26 haproxy02 systemd[1]: haproxy.service: Main process
> exited, code=exited, status=139/n/a
> Dec 19 11:14:26 haproxy02 systemd[1]: haproxy.service: Unit entered failed
> state.
> Dec 19 11:14:26 haproxy02 systemd[1]: haproxy.service: Failed with result
> 'exit-code'.
> Dec 19 11:14:26 haproxy02 systemd[1]: haproxy.service: Service hold-off
> time over, scheduling restart.
> Dec 19 11:14:26 haproxy02 systemd[1]: Stopped HAProxy Load Balancer.
> Dec 19 11:14:26 haproxy02 systemd[1]: Starting HAProxy Load Balancer...
> Dec 19 11:14:26 haproxy02 systemd[1]: Started HAProxy Load Balancer.
> Dec 19 11:14:27 haproxy02 kernel: [122636.578738] haproxy[31479]: segfault
> at 56409a8c1de2 ip 7fa5fa349a28 sp 7ffe66f4f688 error 6 in
> libc-2.23.so[7fa5fa2aa000+1c]
> Dec 19 11:14:27 haproxy02 systemd[1]: haproxy.service: Main process
> exited, code=exited, status=139/n/a
> Dec 19 11:14:27 haproxy02 systemd[1]: haproxy.service: Unit entered failed
> state.
> Dec 19 11:14:27 haproxy02 systemd[1]: haproxy.service: Failed with result
> 'exit-code'.
> Dec 19 11:14:27 haproxy02 systemd[1]: haproxy.service: Service hold-off
> time over, scheduling restart.
> Dec 19 11:14:27 haproxy02 systemd[1]: Stopped HAProxy Load Balancer.
> Dec 19 11:14:27 haproxy02 systemd[1]: Starting HAProxy Load Balancer...
> Dec 19 11:14:28 haproxy02 systemd[1]: Started HAProxy Load Balancer.
> Dec 19 11:14:28 haproxy02 kernel: [122637.569863] haproxy[31487]: segfault
> at 55cb4bd59857 ip 7f71e678aa28 sp 7fffb94427b8 error 6 in
> libc-2.23.so[7f71e66eb000+1c]
> Dec 19 11:14:28 haproxy02 systemd[1]: haproxy.service: Main process
> exited, code=exited, status=139/n/a
> Dec 19 11:14:28 haproxy02 systemd[1]: haproxy.service: Unit entered failed
> state.
> Dec 19 11:14:28 haproxy02 systemd[1]: haproxy.service: Failed with result
> 'exit-code'.
> Dec 19 11:14:28 haproxy02 systemd[1]: haproxy.service: Service hold-off
> time over, scheduling restart.
> Dec 19 11:14:28 haproxy02 systemd[1]: Stopped HAProxy Load Balancer.
> Dec 19 11:14:28 haproxy02 systemd[1]: Starting HAProxy Load Balancer...
> Dec 19 11:14:29 haproxy02 systemd[1]: Started HAProxy Load Balancer.
>
>
> At same time in haproxy.log
>
> (lots of ssl handshake failures...) then
> Dec 19 11:14:26 haproxy02 haproxy[29579]: [ALERT] 352/090058 (29579) :
> Current worker 29582 left with exit code 139
> Dec 19 11:14:26 haproxy02 haproxy[29579]: [ALERT] 352/090058 (29579) :
> exit-on-failure: killing every workers with SIGTERM
> Dec 19 11:14:26 haproxy02 haproxy[29579]: [WARNING] 352/090058 (29579) :
> All workers are left. Leaving... (139)
> Dec 19 11:14:27 haproxy02 haproxy[31476]: [ALERT] 352/111426 (31476) :
> Current worker 31479 left with exit code 139
> Dec 19 11:14:27 haproxy02 haproxy[31476]: [ALERT] 352/111426 (31476) :
> exit-on-failure: killing every workers with SIGTERM
> Dec 19 11:14:27 haproxy02 haproxy[31476]: [WARNING] 352/111426 (31476) :
> All workers are left. Leaving... (139)
> Dec 19 11:14:28 haproxy02 haproxy[31485]: [ALERT] 352/111428 (31485) :
> Current worker 31487 left with exit code 139
> Dec 19 11:14:28 haproxy02 haproxy[31485]: [ALERT] 352/111428 (31485) :
> exit-on-failure: killing every workers with SIGTERM
> Dec 19 11:14:28 haproxy02 haproxy[31485]: [WARNING] 352/111428 (31485) :
> All workers are left. Leaving... (139)
> Dec 19 11:14:29 haproxy02 haproxy[31493]: [ALERT] 352/111429 (31493) :
> Current worker 31496 left with exit code 139
> Dec 19 11:14:29 haproxy02 haproxy[31493]: [ALERT] 352/111429 (31493) :
> exit-on-failure: killing every workers with SIGTERM
> Dec 19 11:14:29 haproxy02 haproxy[31493]: [WARNING] 352/111429 (31493) :
> All workers are left. Leaving... (139)
> Dec 19 11:14:30 haproxy02 haproxy[31503]: [ALERT] 352/11142
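
Two numbers in the quoted logs can be decoded mechanically. The Python sketch below is illustrative only; `signal_from_exit_status` and `decode_x86_page_fault` are hypothetical helper names. It assumes the usual convention that an exit status above 128 means "128 + signal number", and it assumes the host is x86, where the "error N" field of a kernel segfault line is the page-fault error code (bit 0: page was present, bit 1: write access, bit 2: user mode).

```python
import signal


def signal_from_exit_status(status):
    """Map a shell-style exit status (128 + N) back to signal N, else None."""
    if status > 128:
        return signal.Signals(status - 128)
    return None


def decode_x86_page_fault(error_code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(error_code & 0x1),  # 0 means no page was mapped
        "write": bool(error_code & 0x2),         # 0 means the access was a read
        "user_mode": bool(error_code & 0x4),     # 0 means the fault was in kernel mode
    }
```

Under those assumptions, status 139 maps to signal 11 (SIGSEGV), which matches the kernel's "segfault" lines, and "error 6" decodes to a user-mode write to an address with no page mapped.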

1.8.1 Segfault + slowdown

2017-12-19 Thread Peter Lindegaard Hansen
 help appreciated.

Is it related to h2 or 1.8?

We've been running 1.5.9 (the Ubuntu repository default) for years with the
same requests and no problems.

Thanks
