Package: pacemaker
Version: 1.1.16-1+deb9u1
Severity: grave
X-Debbugs-CC: a...@debian.org

Hi,
I am running corosync 2.4.2-3+deb9u1 with pacemaker, and the last run of
unattended-upgrades broke the cluster (downgrading pacemaker to 1.1.16-1
fixed it immediately).
The logs contain many warnings that point to a permission problem, such
as "Rejecting IPC request 'lrmd_rsc_info' from unprivileged client
crmd". I am not using ACLs, so the patch should not affect my system.

Here is an excerpt from the logs after the upgrade:
Nov 12 06:26:05 cluster-1 crmd[20868]:   notice: State transition S_PENDING -> S_NOT_DC
Nov 12 06:26:05 cluster-1 crmd[20868]:   notice: State transition S_NOT_DC -> S_PENDING
Nov 12 06:26:05 cluster-1 attrd[20866]:   notice: Defaulting to uname -n for the local corosync node name
Nov 12 06:26:05 cluster-1 crmd[20868]:   notice: State transition S_PENDING -> S_NOT_DC
Nov 12 06:26:06 cluster-1 lrmd[20865]:  warning: Rejecting IPC request 'lrmd_rsc_info' from unprivileged client crmd
Nov 12 06:26:06 cluster-1 lrmd[20865]:  warning: Rejecting IPC request 'lrmd_rsc_info' from unprivileged client crmd
Nov 12 06:26:06 cluster-1 lrmd[20865]:  warning: Rejecting IPC request 'lrmd_rsc_register' from unprivileged client crmd
Nov 12 06:26:06 cluster-1 lrmd[20865]:  warning: Rejecting IPC request 'lrmd_rsc_info' from unprivileged client crmd
Nov 12 06:26:06 cluster-1 crmd[20868]:    error: Could not add resource service to LRM cluster-1
Nov 12 06:26:06 cluster-1 crmd[20868]:    error: Invalid resource definition for service
Nov 12 06:26:06 cluster-1 crmd[20868]:  warning: bad input <create_request_adv origin="te_rsc_command" t="crmd" version="3.0.11" subt="request" reference="lrm_invoke-tengine-xxx-29" crm_task="lrm_invoke" crm_sys_to="lrmd" crm_sys_from="tengine" crm_host_to="cluster-1" src="cluster-2" acl_target="hacluster" crm_user="hacluster">
Nov 12 06:26:06 cluster-1 crmd[20868]:  warning: bad input     <crm_xml>
Nov 12 06:26:06 cluster-1 crmd[20868]:  warning: bad input       <rsc_op id="5" operation="monitor" operation_key="service:1_monitor_0" on_node="cluster-1" on_node_uuid="xxx" transition-key="xxx">
Nov 12 06:26:06 cluster-1 crmd[20868]:  warning: bad input <primitive id="service" long-id="service:1" class="systemd" type="service"/>
Nov 12 06:26:06 cluster-1 crmd[20868]:  warning: bad input <attributes CRM_meta_clone="1" CRM_meta_clone_max="2" CRM_meta_clone_node_max="1" CRM_meta_globally_unique="false" CRM_meta_notify="false" CRM_meta_op_target_rc="7" CRM_meta_timeout="15000" crm_feature_set="3.0.11"/>
Nov 12 06:26:06 cluster-1 crmd[20868]:  warning: bad input       </rsc_op>
Nov 12 06:26:06 cluster-1 crmd[20868]:  warning: bad input     </crm_xml>
Nov 12 06:26:06 cluster-1 crmd[20868]:  warning: bad input </create_request_adv>
Nov 12 06:26:06 cluster-1 lrmd[20865]:  warning: Rejecting IPC request 'lrmd_rsc_info' from unprivileged client crmd
Nov 12 06:26:06 cluster-1 crmd[20868]:  warning: Resource service no longer exists in the lrmd
Nov 12 06:26:06 cluster-1 crmd[20868]:    error: Result of probe operation for service on cluster-1: Error
Nov 12 06:26:06 cluster-1 crmd[20868]:  warning: Input I_FAIL received in state S_NOT_DC from get_lrm_resource
Nov 12 06:26:06 cluster-1 crmd[20868]:   notice: State transition S_NOT_DC -> S_RECOVERY
Nov 12 06:26:06 cluster-1 crmd[20868]:  warning: Fast-tracking shutdown in response to errors
Nov 12 06:26:06 cluster-1 crmd[20868]:    error: Input I_TERMINATE received in state S_RECOVERY from do_recover
Nov 12 06:26:06 cluster-1 crmd[20868]:   notice: Disconnected from the LRM
Nov 12 06:26:06 cluster-1 crmd[20868]:   notice: Disconnected from Corosync
Nov 12 06:26:06 cluster-1 crmd[20868]:    error: Could not recover from internal error
Nov 12 06:26:06 cluster-1 pacemakerd[20857]:    error: The crmd process (20868) exited: Generic Pacemaker error (201)
Nov 12 06:26:06 cluster-1 pacemakerd[20857]:   notice: Respawning failed child process: crmd
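
The "unprivileged client" part is what looks wrong to me: crmd still runs
as the hacluster user and lrmd as root, which as far as I know is the
normal layout. This is roughly how I checked the daemon users and the
haclient group (plain ps/getent, nothing pacemaker-specific):

    ps -eo user,group,pid,comm | grep -E 'pacemakerd|crmd|lrmd'   # daemon ownership
    getent passwd hacluster                                       # cluster user exists
    getent group haclient                                         # cluster group exists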

My corosync.conf is quite standard:
totem {
        version: 2
        cluster_name: debian
        token: 0
        token_retransmits_before_loss_const: 10
        clear_node_high_bit: yes
        crypto_cipher: aes256
        crypto_hash: sha256
        interface {
                ringnumber: 0
                bindnetaddr: xxx
                mcastaddr: yyy
                mcastport: 5405
                ttl: 1
        }
}
logging {
        fileline: off
        to_stderr: yes
        to_logfile: yes
        logfile: /var/log/corosync/corosync.log
        to_syslog: yes
        syslog_facility: daemon
        debug: off
        timestamp: on
        logger_subsys {
                subsys: QUORUM
                debug: off
        }
}
quorum {
        provider: corosync_votequorum
        expected_votes: 2
}
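
Corosync membership and quorum can be double-checked with the standard
tools; for reference, something like:

    corosync-cfgtool -s      # ring status on this node
    corosync-quorumtool -s   # quorum state and member votes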

So is my crm configuration:
node xxx: cluster-1 \
        attributes standby=off
node xxx: cluster-2 \
        attributes standby=off
primitive service systemd:service \
        meta failure-timeout=30 \
        op monitor interval=5 on-fail=restart timeout=15s
primitive vip-1 IPaddr2 \
        params ip=xxx cidr_netmask=32 \
        op monitor interval=10s
primitive vip-2 IPaddr2 \
        params ip=xxx cidr_netmask=32 \
        op monitor interval=10s
clone clone_service service
colocation service_vip-1 inf: vip-1 clone_service
colocation service_vip-2 inf: vip-2 clone_service
order kot_before_vip-1 inf: clone_service vip-1
order kot_before_vip-2 inf: clone_service vip-2
location prefer-cluster1-vip-1 vip-1 1: cluster-1
location prefer-cluster2-vip-2 vip-2 1: cluster-2
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.16-94ff4df \
        cluster-infrastructure=corosync \
        cluster-name=debian \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        cluster-recheck-interval=1m \
        last-lrm-refresh=1605159600
rsc_defaults rsc-options: \
        failure-timeout=5m \
        migration-threshold=1
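
The configuration can be sanity-checked and the cluster status inspected
with the usual pacemaker tools; for reference, something like:

    crm_verify -L -V   # validate the live CIB, verbose
    crm_mon -1         # one-shot view of the cluster status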
