Module: Mesa
Branch: main
Commit: e9aef19e2b937b6bdacbf3cb1184be17cfb09c37
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e9aef19e2b937b6bdacbf3cb1184be17cfb09c37

Author: Guilherme Gallo <[email protected]>
Date:   Thu Apr 7 16:01:27 2022 -0300

ci/lava: Trap init-stage2.sh background processes

Any daemon executed in init-stage2.sh may interfere with LAVA signals,
since any threaded output to the console may clutter the signals, which
are based on the log output.

E.g.: this job
https://gitlab.freedesktop.org/gallo/mesa/-/jobs/20779120#L2102
failed because capture-devcoredump.sh was alive and emitting kernel
messages to the console during the LAVA signal handling, mangling the
output and making LAVA fail to check the results of the job.

Another problem is that the CONFIG_DEBUG_STACK_USAGE Kconfig option is
enabled. This causes process exit to dump a
`RESULT=[ 246.756067] lava-test-case (156) used greatest stack depth: ... bytes left`
kernel message to the logs, corrupting the LAVA signal message.
Empirically, it happens once in every 280 jobs.
To solve that, compose the lava-test-case custom script with a short
sleep, to give the kernel time to dump the message clearly, and an exit
command to keep the return code from the init-stage2.sh script.

Signed-off-by: Guilherme Gallo <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15938>

---

 .gitlab-ci/common/init-stage2.sh | 30 +++++++++++++++++++++++++++++-
 .gitlab-ci/lava/lava-submit.sh   |  2 +-
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/.gitlab-ci/common/init-stage2.sh b/.gitlab-ci/common/init-stage2.sh
index e371228ea48..59e974ecce1 100755
--- a/.gitlab-ci/common/init-stage2.sh
+++ b/.gitlab-ci/common/init-stage2.sh
@@ -1,5 +1,31 @@
 #!/bin/sh
 
+# Make sure to kill itself and all the children process from this script on
+# exiting, since any console output may interfere with LAVA signals handling,
+# which based on the log console.
+cleanup() {
+  set +x
+  echo "Killing all child processes"
+  for pid in $BACKGROUND_PIDS
+  do
+    kill "$pid"
+  done
+
+  # Sleep just a little to give enough time for subprocesses to be gracefully
+  # killed. Then apply a SIGKILL if necessary.
+  sleep 5
+  for pid in $BACKGROUND_PIDS
+  do
+    kill -9 "$pid" 2>/dev/null || true
+  done
+}
+trap cleanup INT TERM EXIT
+
+# Space separated values with the PIDS of the processes started in the
+# background by this script
+BACKGROUND_PIDS=
+
+
 # Second-stage init, used to set up devices and our job environment before
 # running tests.
 
@@ -75,7 +101,8 @@ fi
 
 # Start a little daemon to capture the first devcoredump we encounter. (They
 # expire after 5 minutes, so we poll for them).
-./capture-devcoredump.sh &
+/capture-devcoredump.sh &
+BACKGROUND_PIDS="$! $BACKGROUND_PIDS"
 
 # If we want Xorg to be running for the test, then we start it up before the
 # HWCI_TEST_SCRIPT because we need to use xinit to start X (otherwise
@@ -85,6 +112,7 @@ if [ -n "$HWCI_START_XORG" ]; then
   echo "touch /xorg-started; sleep 100000" > /xorg-script
   env \
     xinit /bin/sh /xorg-script -- /usr/bin/Xorg -noreset -s 0 -dpms -logfile /Xorg.0.log &
+  BACKGROUND_PIDS="$! $BACKGROUND_PIDS"
 
   # Wait for xorg to be ready for connections.
   for i in 1 2 3 4 5; do
diff --git a/.gitlab-ci/lava/lava-submit.sh b/.gitlab-ci/lava/lava-submit.sh
index a4bd28a3993..ce919ac64fa 100755
--- a/.gitlab-ci/lava/lava-submit.sh
+++ b/.gitlab-ci/lava/lava-submit.sh
@@ -47,4 +47,4 @@ artifacts/lava/lava_job_submitter.py \
     --visibility-group ${VISIBILITY_GROUP} \
     --lava-tags "${LAVA_TAGS}" \
     --mesa-job-name "$CI_JOB_NAME" \
-    >> results/lava.log
\ No newline at end of file
+    >> results/lava.log
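
For readers who want to try the pattern outside of CI, the PID tracking and
two-phase kill from the init-stage2.sh hunk can be exercised in isolation.
The following is only a minimal sketch, not the script from the commit:
GRACE_SECONDS, DAEMON_PID, the early return on an empty list and the
`sleep 1000` stand-in daemon are assumptions made for the demo. Only
BACKGROUND_PIDS, cleanup and the trap line mirror the patch.

```shell
#!/bin/sh
# Sketch of the "track background PIDs, then reap them" pattern.
# Assumption: GRACE_SECONDS is a demo knob; the commit hard-codes "sleep 5".
GRACE_SECONDS=1

# Space-separated PIDs of processes started in the background.
BACKGROUND_PIDS=

cleanup() {
  # Nothing tracked (or already cleaned up): skip the grace period.
  [ -n "$BACKGROUND_PIDS" ] || return 0

  # First pass: ask each tracked process to terminate (SIGTERM).
  for pid in $BACKGROUND_PIDS; do
    kill "$pid" 2>/dev/null || true
  done

  # Give the processes a moment to exit on their own.
  sleep "$GRACE_SECONDS"

  # Second pass: SIGKILL anything that is still alive.
  for pid in $BACKGROUND_PIDS; do
    kill -9 "$pid" 2>/dev/null || true
  done
  BACKGROUND_PIDS=
}
trap cleanup INT TERM EXIT

# Hypothetical stand-in for a daemon such as capture-devcoredump.sh.
sleep 1000 &
BACKGROUND_PIDS="$! $BACKGROUND_PIDS"
DAEMON_PID=$!

# Call cleanup directly so the effect is visible before the script ends;
# in normal use the trap runs it on exit.
cleanup

# Reap the child so its PID does not linger as a zombie.
wait "$DAEMON_PID" 2>/dev/null || true

if kill -0 "$DAEMON_PID" 2>/dev/null; then
  echo "daemon still running"
else
  echo "daemon stopped"
fi
```

Clearing BACKGROUND_PIDS at the end of cleanup is a choice made for this
sketch so that the trap-triggered second call on exit does nothing. The
patch itself does not clear the list.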

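The commit message also describes pairing a short sleep with an exit command
so that the kernel's stack-depth message is printed before the result is
reported, while the original return code is kept. The diff above does not
show that change, so the following is only a hedged illustration of the idea.
run_and_settle, SETTLE_SECONDS and RESULT are hypothetical names, and the
one-second default is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: run a command, remember its exit status, wait
# briefly so late console output can drain, then return that status.
run_and_settle() {
  rc=0
  "$@" || rc=$?

  # Assumption: one second is enough for the demo; the commit message
  # only says "a short sleep".
  sleep "${SETTLE_SECONDS:-1}"

  return "$rc"
}

# Stand-in for the real test script: a command that exits with status 3.
RESULT=0
run_and_settle sh -c 'exit 3' || RESULT=$?
echo "wrapped command exited with $RESULT"
```

The status is captured before the sleep because `$?` would otherwise reflect
the sleep command rather than the wrapped script.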