Hi,

Apologies for not getting back to you earlier.

This [1] is the list of entries that were present in /tmp on the node
(cyclops-13) when I first looked into it after the failures. For your
information, there were also some leftover xvfb processes (similar to [2],
but more of them) owned by pbuilder running on the same node.

To take care of both of those leaks on every cyclops node, we are now
periodically running a clean-up job [3] that removes any leftover content
from /tmp when a node is not running any jobs, and kills any leftover
processes owned by pbuilder. This should take care of this particular issue
in the future.

Thanks

[1]: http://paste.ubuntu.com/14003898/
[2]: http://paste.ubuntu.com/14004905/
[3]: http://s-jenkins.ubuntu-ci:8080/job/cleanup-cyclops-nodes/

On Mon, Dec 14, 2015 at 2:27 PM, Evan Dandrea <[email protected]> wrote:

> Siva is looking into this and will update you with his progress. Thanks!
>
> On Mon, Dec 14, 2015 at 00:10, Michi Henning <[email protected]> wrote:
>
>> Could you fix node13 as a matter of urgency please? It’s been damn near
>> impossible for us to merge anything because we keep falling over the broken
>> builder.
>>
>> Thanks!
>>
>> Michi.
>>
>>
>> On 14 Dec 2015, at 11:41 , James Henstridge <[email protected]> wrote:
>>
>> On 14 December 2015 at 09:26, Michi Henning <[email protected]> wrote:
>>
>>
>> On 12 Dec 2015, at 18:17 , James Henstridge <[email protected]> wrote:
>>
>> So, looking at the xvfb-run man page, it sends the X server logs to
>> /dev/null by default:
>>
>>     -e file, --error-file=file
>>         Store output from xauth and Xvfb in file. The default is
>>         /dev/null.
>>
>> We should adjust the test to do something useful with those logs (e.g.
>> print them out if the test fails). Presumably the cause of the test
>> failure will be obvious once we can see that.
>>
>>
>> Did that.
>> Here is what comes out:
>>
>> 8: Test command:
>> /tmp/buildd/thumbnailer-2.3+16.04.20151102.2bzr314pkg0vivid283/tests/qml/run_test.sh
>> "/tmp/buildd/thumbnailer-2.3+16.04.20151102.2bzr314pkg0vivid283/obj-arm-linux-gnueabihf/stderr.log"
>> "/tmp/buildd/thumbnailer-2.3+16.04.20151102.2bzr314pkg0vivid283/obj-arm-linux-gnueabihf/plugins"
>> 8: Test timeout computed to be: 1500
>> 8: QXcbConnection: Could not connect to display :109
>> 8: Aborted
>> 8: _XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
>> 8: _XSERVTransMakeAllCOTSServerListeners: server already running
>> 8: (EE)
>> 8: Fatal server error:
>> 8: (EE) Cannot establish any listening sockets - Make sure an X server isn't already running(EE)
>> 8: _XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
>> 8: _XSERVTransMakeAllCOTSServerListeners: server already running
>> 8: (EE)
>>
>>
>> Here's the relevant code from xvfb-run script:
>>
>> SERVERNUM=99
>>
>> # Find a free server number by looking at .X*-lock files in /tmp.
>> find_free_servernum() {
>>     # Sadly, the "local" keyword is not POSIX. Leave the next line commented in
>>     # the hope Debian Policy eventually changes to allow it in /bin/sh scripts
>>     # anyway.
>>     #local i
>>
>>     i=$SERVERNUM
>>     while [ -f /tmp/.X$i-lock ]; do
>>         i=$(($i + 1))
>>     done
>>     echo $i
>> }
>>
>>
>> So to get these results, there must have been /tmp/.X99-lock to
>> /tmp/.X108-lock files on the system, and the /tmp/.X11-unix/X109
>> socket either existed or couldn't be created due to permission issues.
>>
>> The code looks like it could be prone to race conditions if other jobs
>> were trying to start of Xvfb servers at the same time, but that
>> wouldn't explain the repeated failures. It seems more likely that
>> some previous test run (or multiple runs) has left garbage behind
>> under /tmp.
>>
>> James.
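
The analysis above suggests that lock files such as /tmp/.X99-lock were left behind by earlier runs. As a rough illustration only, the sketch below lists lock files whose recorded PID no longer belongs to a live process. The function name list_stale_x_locks is made up for this example, and it assumes a Linux node with /proc mounted and that each lock file holds the server PID as text.

```shell
#!/bin/sh
# Hypothetical helper, not part of xvfb-run or of the clean-up job.
# Prints each X lock file in the given directory (default: /tmp) whose
# recorded PID does not correspond to a running process.
# Assumptions: the lock file contains the server PID as text, and /proc
# is available (Linux).
list_stale_x_locks() {
    lock_dir=${1:-/tmp}
    for lock in "$lock_dir"/.X*-lock; do
        [ -f "$lock" ] || continue            # the glob matched nothing
        pid=
        read -r pid < "$lock" || true         # tolerate a missing trailing newline
        case $pid in
            ''|*[!0-9]*)
                echo "$lock"                  # no usable PID: report as stale
                continue
                ;;
        esac
        [ -d "/proc/$pid" ] || echo "$lock"   # no such process: report as stale
    done
}
```

Running `list_stale_x_locks /tmp` on an affected node would be one way to check whether the .X99-lock to .X108-lock files really are leftovers rather than locks held by live servers.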
>>
>> --
>> Mailing list: https://launchpad.net/~canonical-ci-engineering
>> Post to     : [email protected]
>> Unsubscribe : https://launchpad.net/~canonical-ci-engineering
>> More help   : https://help.launchpad.net/ListHelp
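
For readers who want a picture of what the clean-up described at the top of this message involves, here is a minimal sketch. It is not the actual cleanup-cyclops-nodes job: the function names clean_dir and kill_leftovers are hypothetical, the check for whether a node is idle is left out because the thread does not say how it is done, and the find options used assume GNU find as shipped on Ubuntu.

```shell
#!/bin/sh
# Hypothetical clean-up sketch; not the real cleanup-cyclops-nodes job.

# Remove everything directly under the given directory, including dotfiles
# such as .X99-lock and the .X11-unix socket directory. The directory
# itself is kept.
clean_dir() {
    dir=$1
    [ -d "$dir" ] || return 1
    # -mindepth/-maxdepth are GNU find extensions (assumed available).
    find "$dir" -mindepth 1 -maxdepth 1 -exec rm -rf -- {} +
}

# Kill any processes still owned by the given user (for example pbuilder).
kill_leftovers() {
    user=$1
    # pkill exits with status 1 when no process matched; treat that as success.
    pkill -u "$user" || [ $? -eq 1 ]
}
```

On a node known to be idle this could be called as `kill_leftovers pbuilder; clean_dir /tmp`, killing the processes first so that they cannot recreate files while /tmp is being emptied.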

