My team builds several amd64 and arm64 Linux container images daily, and lately
we've been having trouble with the CentOS 7 arm64 build hanging. Our build
machine is an amd64 Ubuntu OpenStack machine running Docker, and we use QEMU to
run arm64 containers. We recently upgraded our tooling to:
* Ubuntu 22.04.2
* Docker 24.0.5
* We're installing these packages:
  binfmt-support/jammy,jammy,now 2.2.1-2 amd64 [installed]
  qemu-guest-agent/jammy-updates,jammy-updates,now 1:6.2+dfsg-2ubuntu6.12 amd64 [installed]
  qemu-user-static/jammy-updates,jammy-updates,now 1:6.2+dfsg-2ubuntu6.12 amd64 [installed]
  qemu/jammy-updates,jammy-updates,now 1:6.2+dfsg-2ubuntu6.12 amd64 [installed]
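For anyone comparing setups, here is a minimal sketch of how the emulation handler on the build host could be checked. It assumes the binfmt_misc entry is named qemu-aarch64, which may differ on other distributions; BINFMT_ENTRY and binfmt_status are names made up for this illustration.

```shell
#!/bin/sh
# Sketch: report whether an aarch64 binfmt_misc handler is registered and enabled.
# ASSUMPTION: the entry is called "qemu-aarch64"; override BINFMT_ENTRY if not.
BINFMT_ENTRY="${BINFMT_ENTRY:-/proc/sys/fs/binfmt_misc/qemu-aarch64}"

# binfmt_status PATH
# Prints "missing", "enabled", or "disabled" for the given entry file.
binfmt_status() {
    if [ ! -r "$1" ]; then
        echo "missing"
    elif head -n 1 "$1" | grep -q '^enabled'; then
        # The first line of a binfmt_misc entry is "enabled" or "disabled".
        echo "enabled"
    else
        echo "disabled"
    fi
}

echo "aarch64 handler: $(binfmt_status "$BINFMT_ENTRY")"

# Show which interpreter and flags the kernel will use, if the entry exists.
if [ -r "$BINFMT_ENTRY" ]; then
    grep -E '^(interpreter|flags)' "$BINFMT_ENTRY" || true
fi
```

The interpreter line should match the binary that shows up in the container's process list, which makes it easy to confirm which QEMU build is actually doing the emulation.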
We start the container with the centos:7 image which looks like it's 18 months
old. The problem first manifested when doing yum upgrade -y in a CentOS 7
arm64 container, and I've tracked it down to this command:
/lib64/ld-2.17.so --verify /usr/bin/true
The command never returns and spends nearly all of its elapsed time on the CPU
(see PID 35):
[root@83d610f0f031 /]# ps -e -o pid,ppid,etime,time,state,args
PID PPID    ELAPSED     TIME S COMMAND
  1    0      40:35 00:00:00 S /usr/libexec/qemu-binfmt/aarch64-binfmt-P /bin/bash /bin/bash
 35    1      38:50 00:38:28 R /usr/libexec/qemu-binfmt/aarch64-binfmt-P /lib64/ld-2.17.so /lib64/ld-2.17.so --verify /usr/bin/true
140    1 1-00:03:13 00:00:00 R ps -e -o pid,ppid,etime,time,state,args
[root@83d610f0f031 /]#
The hang does not occur on our previous build system, which uses Ubuntu 20
(qemu 4.2-3ubuntu6.27 and Docker 24.0.5).
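To compare hosts, the failing step can be run in isolation with a time limit. The following is only a sketch: it assumes Docker's --platform option and the timeout utility from coreutils inside the image, and IMAGE, LIMIT, RUN, and repro_cmd are hypothetical names chosen for this example. By default it only prints the command; set RUN=1 to execute it.

```shell
#!/bin/sh
# Sketch: run the suspect loader check in an emulated arm64 container,
# bounded by a time limit so a hang cannot block the shell indefinitely.
IMAGE="${IMAGE:-centos:7}"   # image under test
LIMIT="${LIMIT:-120}"        # seconds before timeout gives up

# repro_cmd
# Prints the docker command line used for the reproduction.
repro_cmd() {
    printf '%s\n' "docker run --rm --platform linux/arm64 $IMAGE timeout $LIMIT /lib64/ld-2.17.so --verify /usr/bin/true"
}

if [ "${RUN:-0}" = "1" ]; then
    # Execute the command and report its exit status.
    sh -c "$(repro_cmd)"
    echo "exit status: $?"
else
    # Dry run: just show what would be executed.
    repro_cmd
fi
```

With GNU timeout, an exit status of 124 would indicate that the command was still running when the limit expired, which gives a quick yes/no signal when comparing the Ubuntu 20 and Ubuntu 22 hosts.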
I also tried the following on native arm64 hardware:
1. Started an AWS Ubuntu 22 arm64 instance
2. Installed Docker
3. Started a CentOS 7 container (native arm64 architecture)
4. Observed the command did not hang
I don't know for sure that this is a QEMU issue, but it's a candidate. Can anyone
suggest further paths of investigation? Should I open a QEMU bug?