For a week or two now I am seeing the system tests hang regularly. It usually hangs in one of the python tests, today I got hangs in: test_example (tests_0-10.example.ExampleTest) ... test_ack (tests_0-10.message.MessageTests) ...
I have seen other python tests hang, and more rarely client_test. I suspect it coincides with the new epoll IO layer but I'm not sure of that. At the point of the hang the broker appears perfectly healthy - other clients can talk to it no problem. It seems to be the client that's hanging, but the fact that c++ and python clients are hanging makes me suspect the broker. Possibly there's a race condition where the broker drops an outgoing frame and clients are hanging waiting for it. So far I haven't tried to pin it down, I'm not sure exactly how to. It's frequent enough that make check hangs more often than not for me, but usually in a python test so I can't attach a debugger to figure out where it hung. Anyone have any thoughts? Cheers, Alan.
