On Fri, Mar 19, 2010 at 11:36:56PM +0100, Dariusz Suchojad wrote:
> The odd-looking numbers got me thinking and in the meantime I have
> modified the Makefile and compiled HAProxy with CPU set to "custom" and
> ARCH set to "s390x" (it's a 64bit system) - I'm not sure how's that
> related but z/Linux (s390 one) can also be a 31bit system

Yes, I've already seen in gcc's man page that a 31bit addressing mode is
supported. I don't see at all why this could be an issue, but maybe some
pointer comparisons slightly depend on it :-/

> http://www.zjournal.com/index.cfm?section=article&aid=1033 and maybe the
> default Makefile & gcc somehow got confused by that?
> Anyway, things look better now, it's been 2 hours and there have been
> about 1M of messages processed so far. I'll let it run over the weekend
> and we'll see how stable it is.

If you can make the 31-bit one die in 2 hours and that one remains OK for
the weekend, we'll be able to tell the problem is over (or probably hidden).

> Here's how the pools look like now:
(...)

They look better now.

> Too bad I didn't take a snapshot of those when everything was fine
> initially but I really didn't expect any problems would arise.
>
> Assuming there aren't any problems, would you still like me to strace
> it? It would have to wait till next week - I'll need to ask the sysadmin
> for installing strace for me.

No, not needed. Strace is useful only if you can reliably get it to crash.
It could help us spot something like "recv(fd, buffer, -1, 0)" returning
a big value a few lines before the crash.

> What would you consider a good indicator of its reliability? Would
> running flawlessly for a week straight be enough of testing?

The fact that it runs a lot longer than the previous run is a natural
indicator of reliability. However, it's not an indicator of correctness.
Whatever we spot, I'll keep in mind that we can get it to crash on your
machine in 31-bit mode. If I ever come across a vicious bug that could
explain that, I'd be happy to ask you to give it a try.

> >Also, do you see any build warning ? It's possible that we have
> >one type wrong somewhere which is different on your platform. I
> >once got caught by unsigned chars on PPC for instance.
>
> There are indeed some warnings during compilation:
>
> gcc -Iinclude -Iebtree -Wall -g -DTPROXY -DCONFIG_HAP_CRYPT
> -DENABLE_POLL -DENABLE_EPOLL -DENABLE_SEPOLL -DNETFILTER
> -DUSE_GETSOCKNAME -DCONFIG_HAPROXY_VERSION=\"1.4.2\"
> -DCONFIG_HAPROXY_DATE=\"2010/03/17\" -c -o src/dumpstats.o src/dumpstats.c
> src/dumpstats.c: In function ‘stats_dump_full_sess_to_buffer’:
> src/dumpstats.c:2469: warning: format ‘%d’ expects type ‘int’, but
> argument 5 has type ‘long int’

OK, this one was already reported and is harmless.

> >Last, are you aware of any version that has worked reliably on
> >your platform ?
>
> Not really, it's the first time we're using HAProxy on that platform.

OK, so I hope it works well for you this first time :-)

Regards,
Willy
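
For reference, on the dumpstats.c warning quoted above: gcc complains
because the format specifier and the argument type disagree. On a 64-bit
s390x build "long" is 64 bits wide while "int" is 32 bits, whereas on a
31-bit build both are 32 bits. Below is a minimal sketch of the two usual
ways to make such a call consistent. The helper names and the "count="
format are hypothetical illustrations, not HAProxy code, and this is not
the actual content of src/dumpstats.c line 2469.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from HAProxy): "%ld" matches a 'long int'
 * argument, so the specifier and the argument agree whatever the width
 * of long is on the target. */
int format_counter(char *out, size_t size, long value)
{
    return snprintf(out, size, "count=%ld", value);
}

/* Hypothetical alternative: keep "%d" and cast explicitly. Only
 * appropriate when the value is known to fit in an int. */
int format_small_counter(char *out, size_t size, long value)
{
    return snprintf(out, size, "count=%d", (int)value);
}
```

Either form should silence this class of -Wall format warning; which one
fits depends on the range the value can actually take.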