I did some additional testing yesterday using gdb and strace.
gdb didn’t return any useful information, but that might be due to my lack
of gdb experience. Running strace resulted in the following:
...
clock_gettime(CLOCK_MONOTONIC, {207116, 975055264}) = 0
clock_gettime(CLOCK_MONOTONIC, {207116, 975185579}) = 0
mmap(0xc420200000, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xc420200000
mmap(0xc41ffe8000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xc41ffe8000
clock_gettime(CLOCK_MONOTONIC, {207116, 975325872}) = 0
clock_gettime(CLOCK_MONOTONIC, {207116, 975713954}) = 0
mmap(0xc420300000, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xc420300000
mmap(0xc41ffe0000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xc41ffe0000
clock_gettime(CLOCK_MONOTONIC, {207116, 977430825}) = 0
getrandom(
It doesn’t matter whether I use docker 1.13 or docker-ce 18: on the Clefos75
guest the daemon process hangs on the getrandom system call.
When I do the same on an Ubuntu 16.04 guest (where the daemon starts and runs
without any issues), strace shows the same getrandom call, which executes
successfully:
...
futex(0xc420098948, FUTEX_WAKE, 1) = 1
futex(0xc42005e948, FUTEX_WAKE, 1) = 1
futex(0xc420098948, FUTEX_WAKE, 1) = 1
futex(0xc420098948, FUTEX_WAKE, 1) = 1
getrandom("h\"\277\352\376\262(\344", 8, 0) = 8
clock_gettime(CLOCK_REALTIME, {1569355392, 476405161}) = 0
clock_gettime(CLOCK_MONOTONIC, {2337380, 47849415}) = 0
clock_gettime(CLOCK_REALTIME, {1569355392, 477070655}) = 0
clock_gettime(CLOCK_MONOTONIC, {2337380, 48519415}) = 0
clock_gettime(CLOCK_REALTIME, {1569355392, 477339454}) = 0
clock_gettime(CLOCK_MONOTONIC, {2337380, 48781970}) = 0
...
Both the Clefos and Ubuntu guests are running in KVM.
On the Clefos guest I have running in zVM, the getrandom call executes
successfully.
Reading up on the issue, I found this: "When the entropy pool is empty,
reads from /dev/random will block until additional environmental noise is
gathered."
Running “cat /dev/random” on the Clefos guest in KVM actually freezes. Doing the
same on the Ubuntu guest does return data, as does running the command on the
Clefos guest in zVM. So I feel that that’s where the problem is.
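The entropy theory can be checked without hanging a terminal. The following is a minimal sketch, assuming Python 3.6 or later is available on the guest (the stock Python on a 7.5 guest may be older); the function names are hypothetical. It reads the kernel's entropy estimate from /proc/sys/kernel/random/entropy_avail and then asks getrandom for 8 bytes with GRND_NONBLOCK, which reports an error instead of blocking the way the flags=0 call in the strace does.

```python
import os

# Kernel's current entropy estimate, in bits.
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"


def read_entropy_avail(path=ENTROPY_PATH):
    """Return the entropy estimate as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def getrandom_would_block(nbytes=8):
    """Return True if getrandom() would block, False if it returns data,
    None if the call is unavailable or fails for another reason."""
    if not hasattr(os, "getrandom"):
        return None
    try:
        # GRND_NONBLOCK makes the call fail with EAGAIN instead of waiting.
        os.getrandom(nbytes, os.GRND_NONBLOCK)
        return False
    except BlockingIOError:
        return True
    except OSError:
        return None


if __name__ == "__main__":
    print("entropy_avail:", read_entropy_avail())
    print("getrandom would block:", getrandom_would_block())
```

If the second line reports True on the Clefos guest in KVM and False on the other guests, that would be consistent with the daemon waiting for the random pool to be initialized.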
Any other insights?
Regards
Johan
On 24 Sep 2019, at 20:25, Johan Schelling <[email protected]> wrote:
Hi Christian,
I’m running Ubuntu 16.04 in LPAR as the KVM host.
No problems running Docker in an Ubuntu 16.04 guest. The docker daemon will
not start in a Clefos75 guest.
When using a similar Clefos75 guest on zVM 6.4 the docker daemon has no
problem starting up.
I’ll get some trace and debug info tomorrow.
Johan
On 24 Sep 2019, at 11:26, Christian Borntraeger <[email protected]> wrote:
On 24.09.19 11:13, Johan Schelling wrote:
Good morning all,
I have been playing around for a while now with Openshift Origin on our
LinuxONE system, following the great “Getting_Started_with_OpenShift_v3.10”
guide.
When using zVM as a hypervisor I can deploy an Openshift cluster with multiple
nodes with no problems at all (using zVM 6.4 and Clefos 7.5). Everything works
like a charm.
But when I try to deploy an Openshift cluster with one or multiple nodes using
KVM (using Ubuntu 16.04 KVM and Clefos 7.5) I run into problems starting the
docker daemon.
The docker daemon freezes, after which I have to cancel the ansible
playbook. When I start the docker daemon by hand (systemctl start docker), the
same thing happens:
Can you say exactly which distro version runs as host and which distro version
runs as guest (the good and the bad variants)?
This was not 100% clear.
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: activating (start) since di 2019-09-24 11:00:01 CEST; 8min ago
Docs: http://docs.docker.com
Main PID: 27563 (dockerd-current)
CGroup: /system.slice/docker.service
└─27563 /usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --exec-...
sep 24 11:00:01 lnxicu20 systemd[1]: Starting Docker Application Container Engine...
Any chance of attaching gdb to that hanging process and doing a "thread apply all
bt"?
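The gdb request above can be scripted so that it runs non-interactively. This is a minimal sketch, assuming gdb is installed on the guest and Python 3.5 or later is available; the helper names are hypothetical. It uses gdb's -p, -batch and -ex options to attach to the process, print a backtrace for every thread, and exit.

```python
import shutil
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints a backtrace
    for every thread, and exits."""
    return ["gdb", "-p", str(int(pid)), "-batch", "-ex", "thread apply all bt"]


def dump_backtraces(pid, timeout=60):
    """Run the command above and return gdb's combined output as text.
    Usually needs root, since gdb has to ptrace the target process."""
    if shutil.which("gdb") is None:
        raise RuntimeError("gdb not found in PATH")
    result = subprocess.run(
        gdb_backtrace_command(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=timeout,
    )
    return result.stdout


if __name__ == "__main__":
    # 27563 is the Main PID from the systemctl status output in this thread;
    # substitute the PID of the hanging process on your own system.
    print(" ".join(gdb_backtrace_command(27563)))
```

Since dockerd is a Go program, the backtraces may consist largely of Go runtime frames, so the output can be harder to read than for a C program.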
The service will remain in status “activating (start)” until I kill the
service. No messages (other than “Starting Docker Application Container
Engine”) appear in the logs. Every docker command that you try to give (e.g. docker
version
Has anyone seen this behaviour before, or does anyone have ideas on what is going wrong?
I have a couple of other linux guests (both Ubuntu and SLES) running Docker in
both zVM and KVM environments without any problems.
So that means that you have other guests under KVM where docker starts up just
fine.
Some additional information: yum list installed | grep docker
cockpit-docker.s390x 176-4.el7.centos @extras
docker.s390x 2:1.13.1-96.gitb2f74b2.el7.centos
docker-client.s390x 2:1.13.1-96.gitb2f74b2.el7.centos
docker-common.s390x 2:1.13.1-96.gitb2f74b2.el7.centos
origin-docker-excluder.noarch 3.11.0-1.el7.git.0.62803d0 @centos-openshift-origin
python-docker-py.noarch 1:1.10.6-9.el7_6 @extras
python-docker-pycreds.noarch 1:0.3.0-9.el7_6 @extras
Or asked differently, is it just the clefos guest that does not work?
Same situation on linux guests running in zVM and KVM, but only on zVM the
docker service will start.
On the zVM guest the “docker version” command returns:
Client:
Version: 1.13.1
API version: 1.26
Package version: docker-1.13.1-96.gitb2f74b2.el7.centos.s390x
Go version: go1.10.3
Git commit: b2f74b2/1.13.1
Built: Thu Jun 20 12:57:27 2019
OS/Arch: linux/s390x
Server:
Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Package version: docker-1.13.1-96.gitb2f74b2.el7.centos.s390x
Go version: go1.10.3
Git commit: b2f74b2/1.13.1
Built: Thu Jun 20 12:57:27 2019
OS/Arch: linux/s390x
Experimental: false
Running the same “docker version” on the guest in the KVM environment results
in a freeze.
On an Ubuntu guest running in KVM an “apt list --installed | grep docker”
returns:
docker/xenial,now 1.5-1 s390x [installed]
docker.io/xenial-updates,xenial-security,now 18.09.7-0ubuntu1~16.04.5 s390x [installed]
and the “docker version” command shows:
Client:
Version: 18.09.7
API version: 1.39
Go version: go1.10.4
Git commit: 2d0083d
Built: Fri Aug 16 14:19:34 2019
OS/Arch: linux/s390x
Experimental: false
Server:
Engine:
Version: 18.09.7
API version: 1.39 (minimum version 1.12)
Go version: go1.10.4
Git commit: 2d0083d
Built: Thu Aug 15 15:12:41 2019
OS/Arch: linux/s390x
Experimental: false
The Ubuntu environment on KVM has been running perfectly OK for quite some time
now.
Sorry for the long post, but this is driving me nuts.
Regards
Johan Schelling
Infrastructure Solution Architect
ICU IT Services BV
Transistorstraat 55b I 1322 CK ALMERE
M 06 – 21 245 992 I E [email protected]
T 088 – 5 234 123 I
www.icu-it.nl I KvK 32135776
ICU Proclaimer: http://www.icu-it.nl/proclaimer/
----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [email protected] with the
message: INFO LINUX-390 or visit
http://www2.marist.edu/htbin/wlvindex?LINUX-390