osmith has uploaded this change for review. ( https://gerrit.osmocom.org/c/osmo-ci/+/34645?usp=email )
Change subject: scripts/docker-cleanup: remove containers > 24h
......................................................................

scripts/docker-cleanup: remove containers > 24h

Remove containers starting with jenkins- or having ttcn3 in the name, if they have been running for more than 24 hours.

This can happen with the ttcn3 testsuites, as they typically start multiple docker containers in the background (one per Osmocom program) before they start the testsuite docker container in the foreground. Usually the clean up trap makes sure that all containers get killed, but we have seen that a few containers have been running for a few months. One reason for this could be temporary loss of connection between the jenkins server and the node running the job.

Extend the clean script to remove the containers that were not properly removed by the clean up trap.

Historically we used to kill docker containers of the same name before starting a testsuite, but this had the downside that we could not start the same testsuite multiple times in parallel. This was refactored in docker-playground Ifcd384272c56d585e220e2588f2186dc110902ed.
Change-Id: I58c17b57c998eaba411658e83b7295d7cfcf9a23
---
M scripts/docker-cleanup.sh
1 file changed, 61 insertions(+), 0 deletions(-)

  git pull ssh://gerrit.osmocom.org:29418/osmo-ci refs/changes/45/34645/1

diff --git a/scripts/docker-cleanup.sh b/scripts/docker-cleanup.sh
index fe134d2..0d19d7d 100755
--- a/scripts/docker-cleanup.sh
+++ b/scripts/docker-cleanup.sh
@@ -1,6 +1,40 @@
 #!/bin/sh -x
 # https://osmocom.org/projects/osmocom-servers/wiki/Docker_cache_clean_up
 
+kill_docker_containers_running_longer_than_24h() {
+	docker ps
+	set +x
+
+	local date_24h_ago="$(date "+%s" -d"24 hours ago")"
+	docker ps --format "{{.ID}}|{{.Names}}|{{.CreatedAt}}" | while read -r line; do
+		local id="$(echo "$line" | cut -d '|' -f 1)"
+		local name="$(echo "$line" | cut -d '|' -f 2)"
+		local created_at="$(echo "$line" | cut -d '|' -f 3 | cut -d ' ' -f 1-3)"
+		local date_created_at="$(date "+%s" -d "$created_at")"
+
+		if [ "$date_created_at" -gt "$date_24h_ago" ]; then
+			echo "$name: not running for >24h"
+			continue
+		fi
+
+		case "$name" in
+			jenkins-*|*ttcn3*) ;;
+			*)
+				echo "$name: does not match name pattern"
+				continue
+				;;
+		esac
+
+		echo "$name ($id): has been running for >24h, killing"
+		docker kill "$id"
+	done
+
+	set -x
+	docker ps
+}
+
+kill_docker_containers_running_longer_than_24h
+
 # delete all containers where we forgot to use --rm with docker run,
 # older than 24 hours
 docker container prune --filter "until=24h" -f

--
To view, visit https://gerrit.osmocom.org/c/osmo-ci/+/34645?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings

Gerrit-Project: osmo-ci
Gerrit-Branch: master
Gerrit-Change-Id: I58c17b57c998eaba411658e83b7295d7cfcf9a23
Gerrit-Change-Number: 34645
Gerrit-PatchSet: 1
Gerrit-Owner: osmith <osm...@sysmocom.de>
Gerrit-MessageType: newchange
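
The selection logic in this change (age check first, then name pattern) can be tried out without a docker daemon. The following is only a sketch under some assumptions: would_kill is a hypothetical helper that is not part of the patch, the container names and timestamps are made up, the cutoff is a fixed time instead of "24 hours ago" so that the result is reproducible, and GNU date is required for -d, as in the script itself.

```sh
#!/bin/sh
# Hypothetical helper, not part of the change: decide whether one line in the
# "ID|Names|CreatedAt" format would be selected for killing.
# $1: line as printed by docker ps --format "{{.ID}}|{{.Names}}|{{.CreatedAt}}"
# $2: cutoff in seconds since the epoch
would_kill() {
	name="$(echo "$1" | cut -d '|' -f 2)"
	# Keep date, time and UTC offset, drop the trailing time zone name
	created_at="$(echo "$1" | cut -d '|' -f 3 | cut -d ' ' -f 1-3)"
	created_epoch="$(date "+%s" -d "$created_at")"

	# Created after the cutoff: too young, keep it
	if [ "$created_epoch" -gt "$2" ]; then
		return 1
	fi

	# Old enough: only select names matching the patterns from the change
	case "$name" in
	jenkins-*|*ttcn3*) return 0 ;;
	*) return 1 ;;
	esac
}

# Fixed reference time (assumption, for reproducibility)
cutoff="$(date "+%s" -d "2023-10-05 12:00:00 +0000")"

# Made-up sample lines: old and matching, young and matching, old and not matching
for line in \
	"aaa|jenkins-ttcn3-bsc-test-1|2023-10-01 08:00:00 +0200 CEST" \
	"bbb|jenkins-new-job|2023-10-05 13:00:00 +0000 UTC" \
	"ccc|registry|2023-01-01 00:00:00 +0000 UTC"; do
	if would_kill "$line" "$cutoff"; then
		echo "kill: $line"
	else
		echo "keep: $line"
	fi
done
```

With these sample lines, only the first container is selected: the second one was created after the cutoff, and the third one is old but matches neither jenkins-* nor *ttcn3*.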