Willy Tarreau wrote:

Hi,

There's something easy you can do to check if it's that : in
src/stream_sock.c, there's only one recv() call. Simply check
that the max value is within bounds :

+               if (max < 0 || max > b->size)
+                       abort();
                ret = recv(fd, b->r, max, 0);
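
For anyone who would rather not abort() a running proxy, the same bounds check can be sketched as a small helper. This is only an illustration: struct demo_buffer, recv_len_in_bounds() and checked_recv() are hypothetical names, not HAProxy code, and the buffer is assumed to expose just a write pointer and a size.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the real buffer: only the two fields used here. */
struct demo_buffer {
    char *r;      /* where received data would be written */
    size_t size;  /* total capacity of the buffer in bytes */
};

/* Returns 1 if max is a sane length to hand to recv(), 0 otherwise. */
static int recv_len_in_bounds(long max, const struct demo_buffer *b)
{
    return max >= 0 && (unsigned long)max <= b->size;
}

/* Refuses an out-of-range length instead of aborting; returns -1 then. */
static ssize_t checked_recv(int fd, struct demo_buffer *b, long max)
{
    assert(b != NULL);
    if (!recv_len_in_bounds(max, b))
        return -1;
    return recv(fd, b->r, (size_t)max, 0);
}
```

Keeping the range test in its own function makes it possible to exercise it without a socket.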

If you believe you can reproduce it, doing it under strace could
immensely help : "strace -tt -s 200 -o trace.log haproxy -[args]".

The odd-looking numbers got me thinking, and in the meantime I modified the Makefile and compiled HAProxy with CPU set to "custom" and ARCH set to "s390x" (it's a 64-bit system). I'm not sure how that's related, but z/Linux (the s390 one) can also be a 31-bit system http://www.zjournal.com/index.cfm?section=article&aid=1033 and maybe the default Makefile and gcc somehow got confused by that? Anyway, things look better now: it has been running for 2 hours and about 1M messages have been processed so far. I'll let it run over the weekend and we'll see how stable it is.
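
One way to see which ABI a given set of compiler flags actually targets is to compile a tiny probe with the same flags and look at the pointer and long widths. The sketch below is only an illustration; pointer_bits(), long_bits() and describe_abi() are names made up for it. A 64-bit build should report 64 for both, while a 31-bit s390 build would be expected to report 32, since pointers still occupy four bytes there.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Width of a data pointer in bits for the ABI this file was compiled for. */
static int pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Width of long in bits; differs from int on 64-bit Linux targets. */
static int long_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}

/* Writes a one-line summary into out; returns snprintf()'s result. */
static int describe_abi(char *out, size_t outsz)
{
    assert(out != NULL && outsz > 0);
    return snprintf(out, outsz, "pointer=%d bits, long=%d bits",
                    pointer_bits(), long_bits());
}
```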

Here's what the pools look like now:

Dumping pools usage.
  - Pool pipe (32 bytes) : 0 allocated (0 bytes), 0 used, 2 users [SHARED]
  - Pool capture (64 bytes) : 0 allocated (0 bytes), 0 used, 1 users [SHARED]
  - Pool task (144 bytes) : 7 allocated (1008 bytes), 5 used, 1 users [SHARED]
  - Pool hdr_idx (832 bytes) : 5 allocated (4160 bytes), 3 used, 2 users [SHARED]
  - Pool requri (1024 bytes) : 5 allocated (5120 bytes), 2 used, 1 users [SHARED]
  - Pool session (1344 bytes) : 5 allocated (6720 bytes), 3 used, 1 users [SHARED]
  - Pool buffer (16512 bytes) : 10 allocated (165120 bytes), 6 used, 1 users [SHARED]
Total: 7 pools, 182128 bytes allocated, 108368 used.

Too bad I didn't take a snapshot of those when everything was fine initially, but I really didn't expect any problems to arise.

Assuming there aren't any problems, would you still like me to strace it? It would have to wait until next week, as I'll need to ask the sysadmin to install strace for me.

What would you consider a good indicator of its reliability? Would running flawlessly for a week straight be enough testing?

Also, do you see any build warning ? It's possible that we have
one type wrong somewhere which is different on your platform. I
once got caught by unsigned chars on PPC for instance.

There are indeed some warnings during compilation:

gcc -Iinclude -Iebtree -Wall -g -DTPROXY -DCONFIG_HAP_CRYPT -DENABLE_POLL -DENABLE_EPOLL -DENABLE_SEPOLL -DNETFILTER -DUSE_GETSOCKNAME -DCONFIG_HAPROXY_VERSION=\"1.4.2\" -DCONFIG_HAPROXY_DATE=\"2010/03/17\" -c -o src/dumpstats.o src/dumpstats.c
src/dumpstats.c: In function ‘stats_dump_full_sess_to_buffer’:
src/dumpstats.c:2469: warning: format ‘%d’ expects type ‘int’, but argument 5 has type ‘long int’
src/dumpstats.c:2469: warning: format ‘%d’ expects type ‘int’, but argument 6 has type ‘long int’
src/dumpstats.c:2469: warning: format ‘%d’ expects type ‘int’, but argument 7 has type ‘long int’
src/dumpstats.c:2499: warning: format ‘%d’ expects type ‘int’, but argument 5 has type ‘long int’
src/dumpstats.c:2499: warning: format ‘%d’ expects type ‘int’, but argument 6 has type ‘long int’
src/dumpstats.c:2499: warning: format ‘%d’ expects type ‘int’, but argument 7 has type ‘long int’
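
This class of warning means a long is being passed where the format string promises an int, which is formally undefined behaviour when the two types differ in size, as they do on 64-bit Linux targets. The usual remedy is to use %ld (or to cast the argument to int if the value is known to fit). The sketch below only illustrates the idea; format_long_field() is a made-up helper, and I have not checked what the actual statements at lines 2469 and 2499 print.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Formats a long with the matching %ld conversion.
 * Using "%d" here would trigger the same warning gcc prints above.
 * Returns the number of characters snprintf() would have written.
 */
static int format_long_field(char *out, size_t outsz, long value)
{
    assert(out != NULL && outsz > 0);
    return snprintf(out, outsz, "%ld", value);
}
```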

Last, are you aware of any version that has worked reliably on
your platform ?

Not really, it's the first time we're using HAProxy on that platform.

Thanks!

--
Dariusz Suchojad
