On 04/07/2016 03:22 PM, Ferenc Wágner wrote:
> Hi,
>
> On a freshly rebooted cluster node (after crm_mon reports it as
> 'online'), I get the following:
>
> wferi@vhbl08:~$ sudo crm_resource -r vm-cedar --cleanup
> Cleaning up vm-cedar on vhbl03, removing fail-count-vm-cedar
> Cleaning up vm-cedar on vhbl04, removing fail-count-vm-cedar
> Cleaning up vm-cedar on vhbl05, removing fail-count-vm-cedar
> Cleaning up vm-cedar on vhbl06, removing fail-count-vm-cedar
> Cleaning up vm-cedar on vhbl07, removing fail-count-vm-cedar
> Cleaning up vm-cedar on vhbl08, removing fail-count-vm-cedar
> Waiting for 6 replies from the CRMd..No messages received in 60 seconds..
> aborting
>
> Meanwhile, this is written into syslog (I can also provide info level
> logs if necessary):
>
> 22:03:02 vhbl08 crmd[8990]: error: Cannot route message to unknown node vhbl03
> 22:03:02 vhbl08 crmd[8990]: error: Cannot route message to unknown node vhbl04
> 22:03:02 vhbl08 crmd[8990]: error: Cannot route message to unknown node vhbl06
> 22:03:02 vhbl08 crmd[8990]: error: Cannot route message to unknown node vhbl07
This message can only occur when the node name is not present in this
node's peer cache. I'm guessing that since you don't have node names in
corosync, the cache entries only have node IDs at this point. I don't
know offhand when pacemaker would figure out the association, but I bet
it would be possible to ensure it by running some command beforehand,
maybe crm_node -l?

> 22:03:04 vhbl08 crmd[8990]: notice: Operation vm-cedar_monitor_0: not
> running (node=vhbl08, call=626, rc=7, cib-update=169, confirmed=true)
>
> For background:
>
> wferi@vhbl08:~$ sudo cibadmin --scope=nodes -Q
> <nodes>
>   <node id="167773707" uname="vhbl05">
>     <utilization id="nodes-167773707-utilization">
>       <nvpair id="nodes-167773707-utilization-memoryMiB" name="memoryMiB" value="124928"/>
>     </utilization>
>     <instance_attributes id="nodes-167773707"/>
>   </node>
>   <node id="167773708" uname="vhbl06">
>     <utilization id="nodes-167773708-utilization">
>       <nvpair id="nodes-167773708-utilization-memoryMiB" name="memoryMiB" value="124928"/>
>     </utilization>
>     <instance_attributes id="nodes-167773708"/>
>   </node>
>   <node id="167773706" uname="vhbl04">
>     <utilization id="nodes-167773706-utilization">
>       <nvpair id="nodes-167773706-utilization-memoryMiB" name="memoryMiB" value="124928"/>
>     </utilization>
>     <instance_attributes id="nodes-167773706"/>
>   </node>
>   <node id="167773705" uname="vhbl03">
>     <utilization id="nodes-167773705-utilization">
>       <nvpair id="nodes-167773705-utilization-memoryMiB" name="memoryMiB" value="124928"/>
>     </utilization>
>     <instance_attributes id="nodes-167773705"/>
>   </node>
>   <node id="167773709" uname="vhbl07">
>     <utilization id="nodes-167773709-utilization">
>       <nvpair id="nodes-167773709-utilization-memoryMiB" name="memoryMiB" value="124928"/>
>     </utilization>
>     <instance_attributes id="nodes-167773709"/>
>   </node>
>   <node id="167773710" uname="vhbl08">
>     <utilization id="nodes-167773710-utilization">
>       <nvpair id="nodes-167773710-utilization-memoryMiB" name="memoryMiB" value="124928"/>
>     </utilization>
>   </node>
> </nodes>
>
> Why does this happen? I've got no node names in corosync.conf, but
> Pacemaker defaults to uname -n all right.
> _______________________________________________
> Users mailing list: [email protected]
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
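P.S. If the missing name/ID association is indeed the cause, one untested way to make it explicit would be a nodelist in corosync.conf, so the peer cache never has to guess. A sketch below; the nodeids are the ones from your CIB, but the ring0_addr values are placeholders I made up, so substitute your real ring addresses (the remaining nodes would follow the same pattern):

```
nodelist {
    node {
        # ring0_addr is a placeholder; use the node's actual address or
        # a resolvable hostname for ring 0
        ring0_addr: vhbl03
        nodeid: 167773705
        name: vhbl03
    }
    node {
        ring0_addr: vhbl04
        nodeid: 167773706
        name: vhbl04
    }
    # ... and likewise for vhbl05 through vhbl08
}
```

With explicit name: entries, pacemaker can map corosync node IDs to names as soon as it reads the membership, instead of learning the names later.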
