Re: Crash report

2018-09-27 Thread Willy Tarreau
Hi guys,

On Wed, Sep 26, 2018 at 02:05:22PM +0200, Olivier Houchard wrote:
> > And unfortunately we were not able to reproduce the issue manually. Tried
> > slowhttptest, curl, for now no luck.
> > 
> > Could you please suggest next steps that we can do to fix that?
> > 
> 
> Any chance you can give us the binary that generated that core as well ? That
> would be really useful.

Just an update on this: yesterday evening Olivier explained the issue to
me, along with his first proposed fix. I think the issue can be reproduced
using the following sequence:

  - send a few pipelined HTTP requests
  - end the first block with an incomplete POST request (missing the
    ending CRLF should be OK)
  - wait for all responses to come, then send the CRLF followed by a
    large enough amount of data (16kB) to fill the buffer.
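
For what it's worth, the sequence above could be scripted roughly as
follows. This is only a sketch under assumptions: the request path "/",
the number of pipelined requests (3), the filler byte and the 2-second
quiet timeout are all made up for illustration, and it talks plain HTTP
whereas the reported frontend is HTTPS. It has not been confirmed to
trigger the crash.

```python
import socket

BODY_SIZE = 16 * 1024  # "large enough amount of data (16kB)" from the steps above


def build_first_block(host, pipelined=3, body_size=BODY_SIZE):
    """Pipelined GETs followed by a POST whose headers lack the final CRLF."""
    # Hypothetical requests: path "/" and the GET method are assumptions.
    get = f"GET / HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
    post_head = (
        f"POST / HTTP/1.1\r\nHost: {host}\r\n"
        f"Content-Length: {body_size}\r\n"  # no blank line yet: request is incomplete
    ).encode("ascii")
    return get * pipelined + post_head


def build_second_block(body_size=BODY_SIZE):
    """The missing CRLF that ends the POST headers, then the body bytes."""
    return b"\r\n" + b"0" * body_size


def drain(sock, quiet=2.0):
    """Read until the peer closes or stays silent for `quiet` seconds."""
    sock.settimeout(quiet)
    total = 0
    try:
        while True:
            chunk = sock.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    except socket.timeout:
        pass
    return total


def reproduce(host, port):
    """Send block one, wait for the responses, then send block two."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_first_block(host))
        drain(sock)  # "wait for all responses to come"
        sock.sendall(build_second_block())
        drain(sock)
```

Calling reproduce() with the address of a test instance (never a
production one) would run the three steps in order; the two build_*
helpers can be inspected on their own without any network access.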

In this case the request is not realigned, so there is wrapping data
at the end of the buffer, which triggers the unexpected condition in the
crashing function.
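
To illustrate why wrapping data matters here, below is a toy model. It is
not HAProxy's code and all names in it are made up. The idea: if some code
computes the number of bytes to shift as a plain difference between the end
of the data and the insert position, it assumes the data is contiguous.
Once the data wraps, the end offset is lower than the position, and the
difference, seen as an unsigned 64-bit size_t, becomes huge. Whether the
crashing function computes its length exactly this way is an assumption.

```python
SIZE_T_MODULUS = 2 ** 64  # size_t is 64 bits wide on x86_64


def end_offset(head, used, size):
    """Offset one past the last stored byte in a ring buffer of `size` bytes."""
    return (head + used) % size


def linear_tail_len(pos, head, used, size):
    """Bytes from `pos` to the end of data, computed as a plain difference.

    This mimics code that assumes contiguous data. The result is reduced
    modulo 2**64 to show what an unsigned size_t would hold.
    """
    return (end_offset(head, used, size) - pos) % SIZE_T_MODULUS


# Contiguous data: the difference is the real tail length.
ok = linear_tail_len(pos=1500, head=1000, used=2000, size=16384)

# Wrapping data: the end offset (3616) is below pos (13000), so the
# difference is negative and turns into an enormous unsigned length.
bad = linear_tail_len(pos=13000, head=12000, used=8000, size=16384)
```

In the first case the length is the 1500 bytes that really follow the
position; in the second it is far larger than the whole buffer, which is
the kind of value that would send a memmove() out of bounds.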

Back offline, listening to the conference at Kernel Recipes :-)

Willy



Re: Crash report

2018-09-26 Thread Olivier Houchard
Hi Anton,

On Wed, Sep 26, 2018 at 12:09:07PM +0300, prog 76 wrote:
> 
> Hi
> First of all, thank you for this great product. We have been very happy to
> use it for years.
> Unfortunately, since version 1.8.12 we have an issue: sometimes haproxy crashes.
> We tried to upgrade to 1.8.13 and it also crashes.
> Not often: we have an almost constant load and it crashes once per week. We
> have a number of instances on different machines, and enough of them are
> ready. We also use the Monit service to restart crashed instances.
> But it is still not good, because connections get broken.
> Our config haproxy.cfg attached. 
> Crash data haproxy.core.zip attached.
> 

Beware when sending the core to the mailing list: it may contain sensitive
information.

> It seems to always crash on the same URL. Last log lines from two crashes:
> 
> proxy1web-next haproxy[12763]: x.x.x.x:6733 [07/Sep/2018:10:16:56.924] https~ 
> oldpool/web1-next 2634/0/1/82/3649 500 1434 - -  2/1/0/1/0 0/0 "POST 
> /Mobile/UpdateTechnicianPositionBatched HTTP/1.1"
> proxy2web-next haproxy[88028]: x.x.x.x:6750 [12/Sep/2018:10:55:56.696] https~ 
> oldpool/web2-next 1597/0/1/78/3802 500 1434 - -  4/3/0/1/0 0/0 "POST 
> /Mobile/UpdateTechnicianPositionBatched HTTP/1.1"
> 
> Back trace looks like
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p 
> /var/run/haproxy.pid'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 0x7f5e6c27981d in __memmove_ssse3_back ()
> at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1881
> 1881 ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S: No such file or 
> directory.
> (gdb) bt
> #0 0x7f5e6c27981d in __memmove_ssse3_back ()
> at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1881
> #1 0x556549c98e9b in memmove (__len=<optimized out>, __src=0x7f5e480594c7,
> __dest=0x7f5e480594e1) at /usr/include/x86_64-linux-gnu/bits/string3.h:57
> #2 buffer_insert_line2 (b=0x7f5e48055f70,
> pos=0x7f5e480594c7 
> "\r\n{\"location\":[{\"coords\":{\"speed\":-1,\"longitude\":-122.31412169240122,\"latitude\":47.311481963215847,\"accuracy\":70.4",
>  '0' , 
> "6,\"altitude_accuracy\":10,\"altitude\":121.7,\"heading\":-1},\"extras\":{},\"is_m"...,
> str=str@entry=0x556549ff0340 "X-Forwarded-Proto: https", len=len@entry=24)
> at src/buffer.c:158
> #3 0x556549bd35b7 in http_header_add_tail (msg=msg@entry=0x7f5e5004a570,
> hdr_idx=hdr_idx@entry=0x7f5e5004a510, text=0x556549ff0340 "X-Forwarded-Proto: 
> https")
> at src/proto_http.c:538
> #4 0x556549bdc9d8 in http_process_req_common (s=s@entry=0x7f5e50027470,
> req=req@entry=0x7f5e50027480, an_bit=an_bit@entry=16, px=0x556549fed430)
> at src/proto_http.c:3516
> #5 0x556549c0dc72 in process_stream (t=<optimized out>) at 
> src/stream.c:1905
> #6 0x556549c8ba07 in process_runnable_tasks () at src/task.c:317
> #7 0x556549c3ec85 in run_poll_loop () at src/haproxy.c:2403
> #8 run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:2470
> #9 0x7f5e6cfa4184 in start_thread (arg=0x7f5e68412700) at 
> pthread_create.c:312
> #10 0x7f5e6c22237d in eventfd (count=26, flags=1208320820)
> at ../sysdeps/unix/sysv/linux/eventfd.c:55
> #11 0x in ?? ()
> (gdb)
>  
> And unfortunately we were not able to reproduce the issue manually. Tried 
> slowhttptest, curl, for now no luck.
> 
> Could you please suggest next steps that we can do to fix that?
> 

Any chance you can give us the binary that generated that core as well ? That
would be really useful.

Thanks !

Olivier