Hello Dimitri,
I tried to reproduce the same behaviour using default LXC containers on
real hardware (ARMv8 host, armhf containers) and wasn't able to.
Nevertheless, I was able to cause corosync not to start due to failed
mlock() calls:
main.log:Jul 15 18:27:57 [2386] hasid01 corosync warning [MAIN ]
main.c:corosync_mlockall:481 Could not lock memory of service to avoid page
faults: Operation not permitted (1)
main.log:Jul 15 18:27:57 [2386] hasid01 corosync error [MAIN ]
main.c:corosync_flock:1087 Corosync Executive couldn't create lock file.
This happened when I set the memlock soft and hard limits to 0 for the
"hacluster" user and "haclient" group, as you suggested.
hacluster@hasid01:~$ strace -f /usr/sbin/corosync -f 2>&1 | grep -i mlock
prlimit64(0, RLIMIT_MEMLOCK, {rlim_cur=RLIM64_INFINITY,
rlim_max=RLIM64_INFINITY}, NULL) = -1 EPERM (Operation not permitted)
mlockall(MCL_CURRENT|MCL_FUTURE) = -1 EPERM (Operation not permitted)
Both calls, prlimit64() and mlockall(), failed with EPERM.
When testing with 1MB soft/hard limit:
hacluster@hasid01:~$ strace -f /usr/sbin/corosync -f 2>&1 | grep -i mlock
prlimit64(0, RLIMIT_MEMLOCK, {rlim_cur=RLIM64_INFINITY,
rlim_max=RLIM64_INFINITY}, NULL) = -1 EPERM (Operation not permitted)
mlockall(MCL_CURRENT|MCL_FUTURE) = 0
Here only prlimit64() fails with EPERM; mlockall() succeeds.
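To compare the two runs above it helps to confirm which memlock limits the
process actually starts with. The following is a minimal sketch using Python's
standard resource module; the helper names fmt_limit and memlock_limits are
mine, not anything from corosync. It reports values in KiB, like "ulimit -l".

```python
import resource


def fmt_limit(value):
    """Render an rlimit value like `ulimit -l`: KiB, or 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return str(value // 1024)


def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK limits of this process as text."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return fmt_limit(soft), fmt_limit(hard)


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print(f"memlock soft={soft} hard={hard}")
```

Run as the hacluster user, this should show the same soft/hard pair that the
limits configuration applied to the session.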
Corosync tries to set both the soft and hard RLIMIT_MEMLOCK limits to
RLIM64_INFINITY, which is defined as:
#define RLIM64_INFINITY (~0ULL)
That all-ones 64-bit value is the "unlimited" value.
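For reference, the numeric value behind that define can be worked out by hand.
This small sketch just reproduces the C expression ~0ULL with 64-bit masking
(the Python variable name mirrors the C macro for readability only):

```python
# ~0ULL in C flips every bit of a 64-bit unsigned zero.
# Python integers are unbounded, so mask the result down to 64 bits.
RLIM64_INFINITY = ~0 & 0xFFFFFFFFFFFFFFFF

print(RLIM64_INFINITY)       # 18446744073709551615
print(hex(RLIM64_INFINITY))  # 0xffffffffffffffff
```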
Since the call failed with EPERM, I checked what that return value means
in the man page: an unprivileged process tried to raise the hard limit;
the CAP_SYS_RESOURCE capability is required to do this.
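The rule quoted above can be written down as a small model. This is only a
sketch of the documented behaviour, not kernel code; prlimit_result and its
parameters are hypothetical names, and limits are plain byte counts with
UNLIMITED standing in for RLIM64_INFINITY.

```python
import errno

UNLIMITED = (1 << 64) - 1  # same bit pattern as RLIM64_INFINITY
MIB = 1024 * 1024


def prlimit_result(old_hard, new_soft, new_hard, has_cap_sys_resource=False):
    """Return 0 on success or an errno value, following the man page rules."""
    if new_soft > new_hard:
        # The soft limit may never exceed the hard limit.
        return errno.EINVAL
    if new_hard > old_hard and not has_cap_sys_resource:
        # Raising the hard limit needs CAP_SYS_RESOURCE.
        return errno.EPERM
    return 0


if __name__ == "__main__":
    # The three starting hard limits tested in this report.
    for label, old_hard in [("0", 0), ("1 MiB", MIB), ("unlimited", UNLIMITED)]:
        rc = prlimit_result(old_hard, UNLIMITED, UNLIMITED)
        print(label, errno.errorcode.get(rc, "OK"))
```

Under this model, a request for unlimited is refused from a hard limit of 0 or
1 MiB, and accepted when the hard limit is already unlimited, because nothing
is being raised.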
It looks like the prlimit64() call will fail unless your container has
"sys_resource" in its lxc.cap.keep= value AND corosync itself is given
CAP_SYS_RESOURCE:
sudo setcap 'CAP_SYS_RESOURCE=+ep' /usr/sbin/corosync
The exception is when the memlock hard limit is already unlimited. In
that case the call does not raise anything, so it works without the
capability:
(c)inaddy@hasid01:~$ sudo su - hacluster
hacluster@hasid01:~$ ulimit -H -l
unlimited
hacluster@hasid01:~$ strace -f /usr/sbin/corosync -f 2>&1 | grep -i mlock
prlimit64(0, RLIMIT_MEMLOCK, {rlim_cur=RLIM64_INFINITY,
rlim_max=RLIM64_INFINITY}, NULL) = 0
mlockall(MCL_CURRENT|MCL_FUTURE) = 0
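The same request can be tried from any unprivileged process without starting
corosync. A minimal sketch, assuming Python's resource.setrlimit() ends up
issuing an equivalent limit change to the prlimit64() call in the trace; the
function name try_unlimited_memlock is hypothetical:

```python
import resource


def try_unlimited_memlock():
    """Try to set RLIMIT_MEMLOCK soft/hard to unlimited; True on success."""
    inf = resource.RLIM_INFINITY
    try:
        resource.setrlimit(resource.RLIMIT_MEMLOCK, (inf, inf))
    except (ValueError, OSError) as exc:
        # Python reports a refused hard-limit raise as ValueError and
        # other failures of the system call as OSError.
        print(f"setrlimit failed: {exc}")
        return False
    return True


if __name__ == "__main__":
    print("unlimited memlock allowed:", try_unlimited_memlock())
```

Run as hacluster, this should print False with the 0 and 1MB limits and True
once "ulimit -H -l" reports unlimited.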
And, even though other calls still fail:
sched_setscheduler(0, SCHED_RR, [99]) = -1 EPERM (Operation not permitted)
setpriority(PRIO_PGRP, 0, -2147483648) = -1 EACCES (Permission denied)
It works:
(c)inaddy@hasid02:~$ sudo crm status
Stack: corosync
Current DC: hasid02 (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Mon Jul 15 19:40:25 2019
Last change: Mon Jul 15 17:51:53 2019 by root via cibadmin on hasid01
3 nodes configured
0 resources configured
Node hasid01: pending
Online: [ hasid02 hasid03 ]
And
(c)inaddy@hasid02:~$ sudo corosync-quorumtool
Quorum information
------------------
Date: Mon Jul 15 19:40:41 2019
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 2
Ring ID: 1/136
Quorate: Yes
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
1 1 hasid01
2 1 hasid02 (local)
3 1 hasid03
--
https://bugs.launchpad.net/bugs/1828228
Title:
corosync fails to start in container (armhf) bump some limits