osmith has uploaded this change for review. ( https://gerrit.osmocom.org/c/osmo-ci/+/34645?usp=email )


Change subject: scripts/docker-cleanup: remove containers > 24h
......................................................................

scripts/docker-cleanup: remove containers > 24h

Remove containers whose names start with jenkins- or contain ttcn3, if
they have been running for more than 24 hours. Such leftovers can occur
with the ttcn3 testsuites, as they typically start multiple docker
containers in the background (one per Osmocom program) before they start
the testsuite docker container in the foreground. Usually the clean up
trap makes sure that all containers get killed, but we have seen a few
containers that had been running for a few months. One possible reason
is a temporary loss of connection between the jenkins server and the
node running the job.

Extend the cleanup script to kill the containers that were not properly
removed by the clean up trap.

Historically we used to kill docker containers of the same name before
starting a testsuite, but this had the downside that we could not start
the same testsuite multiple times in parallel. This was refactored in
docker-playground Ifcd384272c56d585e220e2588f2186dc110902ed.

Change-Id: I58c17b57c998eaba411658e83b7295d7cfcf9a23
---
M scripts/docker-cleanup.sh
1 file changed, 61 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.osmocom.org:29418/osmo-ci refs/changes/45/34645/1

diff --git a/scripts/docker-cleanup.sh b/scripts/docker-cleanup.sh
index fe134d2..0d19d7d 100755
--- a/scripts/docker-cleanup.sh
+++ b/scripts/docker-cleanup.sh
@@ -1,6 +1,40 @@
 #!/bin/sh -x
 # https://osmocom.org/projects/osmocom-servers/wiki/Docker_cache_clean_up

+kill_docker_containers_running_longer_than_24h() {
+       docker ps
+       set +x
+
+       local date_24h_ago="$(date "+%s" -d"24 hours ago")"
+       docker ps --format "{{.ID}}|{{.Names}}|{{.CreatedAt}}" | while read -r line; do
+               local id="$(echo "$line" | cut -d '|' -f 1)"
+               local name="$(echo "$line" | cut -d '|' -f 2)"
+               local created_at="$(echo "$line" | cut -d '|' -f 3 | cut -d ' ' -f 1-3)"
+               local date_created_at="$(date "+%s" -d "$created_at")"
+
+               if [ "$date_created_at" -gt "$date_24h_ago" ]; then
+                       echo "$name: not running for >24h"
+                       continue
+               fi
+
+               case "$name" in
+               jenkins-*|*ttcn3*) ;;
+               *)
+                       echo "$name: does not match name pattern"
+                       continue
+                       ;;
+               esac
+
+               echo "$name ($id): has been running for >24h, killing"
+               docker kill "$id"
+       done
+
+       set -x
+       docker ps
+}
+
+kill_docker_containers_running_longer_than_24h
+
 # delete all containers where we forgot to use --rm with docker run,
 # older than 24 hours
 docker container prune --filter "until=24h" -f
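
The name and age checks from the patch above can be tried out without
killing anything. The following is only a rough sketch, not part of the
change: the helper names name_matches, created_over_24h_ago and
list_kill_candidates are made up for illustration, and it assumes GNU
date (for -d) just like the patch does. It reads lines in the same
"ID|Names|CreatedAt" format and prints the containers that would be
selected.

```shell
#!/bin/sh
# Dry-run sketch of the selection logic. All function names here are
# hypothetical and do not exist in scripts/docker-cleanup.sh.

# Succeeds if the container name is one the cleanup targets.
name_matches() {
	case "$1" in
	jenkins-*|*ttcn3*) return 0 ;;
	*) return 1 ;;
	esac
}

# Succeeds if a "docker ps" CreatedAt value lies 24 hours or more in the
# past. Only the first three fields (date, time, UTC offset) are passed
# to date; the trailing timezone abbreviation is cut off.
created_over_24h_ago() {
	ts="$(echo "$1" | cut -d ' ' -f 1-3)"
	ts_s="$(date "+%s" -d "$ts")" || return 1
	limit_s="$(date "+%s" -d "24 hours ago")"
	[ "$ts_s" -le "$limit_s" ]
}

# Reads "ID|Names|CreatedAt" lines from stdin and prints "name (id)"
# for every container that matches both checks. Nothing gets killed.
list_kill_candidates() {
	while IFS='|' read -r id name created_at; do
		name_matches "$name" || continue
		created_over_24h_ago "$created_at" || continue
		echo "$name ($id)"
	done
}
```

On a build node this could be fed from the same command the patch uses,
for example:
docker ps --format "{{.ID}}|{{.Names}}|{{.CreatedAt}}" | list_kill_candidates
Unlike the patch, an unparsable CreatedAt value makes the sketch skip
the container instead of carrying on with an empty timestamp.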

--
To view, visit https://gerrit.osmocom.org/c/osmo-ci/+/34645?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings

Gerrit-Project: osmo-ci
Gerrit-Branch: master
Gerrit-Change-Id: I58c17b57c998eaba411658e83b7295d7cfcf9a23
Gerrit-Change-Number: 34645
Gerrit-PatchSet: 1
Gerrit-Owner: osmith <osm...@sysmocom.de>
Gerrit-MessageType: newchange
