Re: possible crashing?

2015-10-31 Thread Willy Tarreau
Hi Donovan,

On Fri, Oct 30, 2015 at 06:39:10PM +, Donovan Meyers wrote:
> 
> Hey all,
> 
> I'm currently investigating some possible crashing in 1.5.14.
> 
> I really hate to call it a crash since I've never seen haproxy crash in my
> life until now, but the process is suddenly not running and I can't figure
> out why.

Well, crashes are indeed extremely rare but sometimes a bug surfaces and
causes this. Then we fix it and emit a new version.

> I've checked syslogs and puppet logs (and now disabled puppet) and anything
> else that could stop the process for any reason. There are no messages
> related to haproxy stopping at all.

Strange, because if it crashes (segfault, bus error, abort) the kernel should
emit a line saying the process died, provided kernel.print-fatal-signals is
set.
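
A quick way to look for such a line is to filter the kernel log for the
process name. This is only a rough sketch: find_fatal_lines is a hypothetical
helper, and the patterns are assumptions based on common x86 Linux messages,
so the wording may differ on your kernel.

```python
import re

# Assumed patterns: typical x86 Linux kernel wording for fatal signals.
# Exact messages vary between kernel versions and architectures.
FATAL_PATTERNS = re.compile(
    r"segfault at|fatal signal|general protection|trap invalid opcode",
    re.IGNORECASE,
)


def find_fatal_lines(log_lines, process="haproxy"):
    """Return lines that mention `process` and look like a fatal-signal report."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if process in line and FATAL_PATTERNS.search(line)
    ]
```

It can be fed the lines of the dmesg output or of /var/log/messages; an empty
result only means that nothing matched these patterns, not that no signal was
delivered.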

> I turned haproxy logging up to debug but there's nothing in the log when it
> happens. I keep a stats page reloading on my desktop, so the log has entries
> for those requests, and then suddenly nothing.

So that looks like a crash or a kill from another process.

> The issue doesn't seem related to load; under load testing the issue doesn't
> manifest (not right away anyway) and it does manifest under no load at all.
> 
> The config is extremely simple; I'm happy to post it if anyone's interested.
> But it's not doing anything interesting.

That can indeed help. Don't forget to anonymize it if required (e.g. the stats
password).

> I've now enabled core dumps and even have it running under valgrind memcheck
> but unfortunately I haven't gotten it to happen since I did that.

OK. Please test by hand that your core dumps work (just send SIGSEGV, i.e.
signal 11, to the running process). You generally need to disable chroot, and
enable fs.suid_dumpable if you have any of uid/user/gid/group in the config.
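
As a rough illustration of that manual test, the sketch below starts a
throwaway child, sends it SIGSEGV, and reports whether the kernel flagged a
core dump. The helper names (core_settings, crash_child) are made up for this
example, and it assumes Linux with /proc mounted. It only exercises the
environment it is run from, not haproxy's own (chroot, uid change), so a
positive result is necessary but not sufficient.

```python
import os
import resource
import signal
import sys


def core_settings():
    """Collect the settings that commonly decide whether a core file is written."""
    def read(path):
        try:
            with open(path) as f:
                return f.read().strip()
        except OSError:
            return None  # not Linux, or /proc not readable

    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    return {
        "rlimit_core_soft": soft,  # 0 means no core file for this process
        "rlimit_core_hard": hard,
        "fs.suid_dumpable": read("/proc/sys/fs/suid_dumpable"),
        "kernel.core_pattern": read("/proc/sys/kernel/core_pattern"),
    }


def crash_child():
    """Fork a sleeping child, kill it with SIGSEGV, and decode its exit status."""
    pid = os.fork()
    if pid == 0:
        try:
            # The child is a stand-in for the real daemon; it just sleeps.
            os.execv(sys.executable,
                     [sys.executable, "-c", "import time; time.sleep(30)"])
        finally:
            os._exit(127)  # only reached if exec failed
    os.kill(pid, signal.SIGSEGV)
    _, status = os.waitpid(pid, 0)
    signaled = os.WIFSIGNALED(status)
    return {
        "signaled": signaled,
        "signal": os.WTERMSIG(status) if signaled else None,
        "core_dumped": os.WCOREDUMP(status),
    }
```

If crash_child() reports core_dumped as False, the values from core_settings()
are the first place to look. Note that a core file may be left in the current
directory (or wherever kernel.core_pattern points) when dumps are enabled.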

> The binary is one I compiled from source in order to build an RPM, using the
> haproxy.spec from the CentOS 6.5 haproxy-1.4.24-2.el6.src.rpm, slightly
> tweaked. For valgrind I also added CFLAGS="-g -O0".

OK.

> I really hope I can figure out that it's my fault. Barring that, I hope I
> can ferret out an issue with haproxy and contribute that to the community
> that has been so helpful to me. (Although you won't see a patch from me; I'm
> no developer! :) )

We hope so as well. A dying process is not something expected at all and
we take such issues very seriously. Note, there are some fixes pending
since 1.5.14. I should have issued 1.5.15 already, but lack of time, the usual
excuse, etc. The only important fix pending there is related to option
http-send-name-header, so if you're not using it, I don't think you're
experiencing a known bug.

If the bug happens rarely, you may also want to capture traffic using
tcpdump in order to be able to isolate the request(s) that cause this
issue (if any). This can be cumbersome though, since you'll have to
rotate files to avoid filling the file system.
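
One possible way to handle the rotation is tcpdump's own ring buffer (-C for
the per-file size in millions of bytes, -W for the number of files to cycle
through). The sketch below only builds the command line; the function name,
the interface, the port and the output path are placeholders, not values from
this thread.

```python
def tcpdump_ring_command(interface, port, out_file, file_mb=100, file_count=20):
    """Build a tcpdump argv that cycles through a fixed set of capture files.

    Disk usage is bounded by roughly file_mb * file_count million bytes,
    because tcpdump overwrites the oldest file once file_count is reached.
    """
    return [
        "tcpdump",
        "-i", interface,         # interface to capture on
        "-s", "0",               # capture whole packets, not just headers
        "-w", out_file,          # base name of the capture files
        "-C", str(file_mb),      # start a new file after about this many million bytes
        "-W", str(file_count),   # keep at most this many files, then wrap around
        "port", str(port),       # capture filter: only the proxied port
    ]


if __name__ == "__main__":
    # Placeholder values; adjust to the actual frontend before running as root.
    print(" ".join(tcpdump_ring_command("eth0", 80, "/var/tmp/haproxy.pcap")))
```

With the defaults above the capture stays under about 2 GB, at the price of
losing older traffic, so the files need to be saved soon after the process
disappears.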

Willy




possible crashing?

2015-10-30 Thread Donovan Meyers

Hey all,

I'm currently investigating some possible crashing in 1.5.14.

I really hate to call it a crash since I've never seen haproxy crash in my
life until now, but the process is suddenly not running and I can't figure
out why.

I've checked syslogs and puppet logs (and now disabled puppet) and anything
else that could stop the process for any reason. There are no messages
related to haproxy stopping at all.

I turned haproxy logging up to debug but there's nothing in the log when it
happens. I keep a stats page reloading on my desktop, so the log has entries
for those requests, and then suddenly nothing.

The issue doesn't seem related to load; under load testing the issue doesn't
manifest (not right away anyway) and it does manifest under no load at all.

The config is extremely simple; I'm happy to post it if anyone's interested.
But it's not doing anything interesting.

I've now enabled core dumps and even have it running under valgrind memcheck
but unfortunately I haven't gotten it to happen since I did that.

The binary is one I compiled from source in order to build an RPM, using the
haproxy.spec from the CentOS 6.5 haproxy-1.4.24-2.el6.src.rpm, slightly
tweaked. For valgrind I also added CFLAGS="-g -O0".

I really hope I can figure out that it's my fault. Barring that, I hope I
can ferret out an issue with haproxy and contribute that to the community
that has been so helpful to me. (Although you won't see a patch from me; I'm
no developer! :) )

So this message may seem premature since I haven't found anything, but I'm
sending it:

1. To ask if there's a known bug that could cause this
2. To give a heads up that I may have found something
3. To ask for any recommendations on how to determine what's happening

Thanks in advance,
Donovan