Re: [Pacemaker] errors in corosync.log
cibadmin 1.0.5 for OpenAIS and Heartbeat (Build: 9e9faaab40f3f97e3c0d623e4a4c47ed83fa1601)

-Shravan

On Tue, Jan 19, 2010 at 8:29 AM, Andrew Beekhof and...@beekhof.net wrote:
> On Sat, Jan 16, 2010 at 9:20 PM, Shravan Mishra shravan.mis...@gmail.com wrote:
>> Hi Guys,
>>
>> I'm running the following version of pacemaker and corosync
>> corosync=1.1.1-1-2
>> pacemaker=1.0.9-2-1
>
> That pacemaker version doesn't exist. What does cibadmin --version say?
> And are you sure about the corosync version? It doesn't look right either.
>
>> Everything had been running fine for quite some time now but then I
>> started seeing the following errors in the corosync logs,
>>
>> =
>> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
>> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
>> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
>> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
>> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
>> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
>>
>> I can perform all the crm shell commands and what not but it's troubling
>> that the above is happening.
>>
>> My crm_mon output looks good.
>>
>> I also checked the authkey and ran md5sum on both; it's the same.
>>
>> Then I stopped corosync and regenerated the authkey with corosync-keygen
>> and copied it to the other machine but I still get the above message in
>> the corosync log.
>>
>> Is there anything other than the authkey that I should look into?
>>
>> corosync.conf
>>
>> # Please read the corosync.conf.5 manual page
>> compatibility: whitetank
>>
>> totem {
>>     version: 2
>>     token: 3000
>>     token_retransmits_before_loss_const: 10
>>     join: 60
>>     consensus: 1500
>>     vsftype: none
>>     max_messages: 20
>>     clear_node_high_bit: yes
>>     secauth: on
>>     threads: 0
>>     rrp_mode: passive
>>
>>     interface {
>>         ringnumber: 0
>>         bindnetaddr: 192.168.2.0
>>         #mcastaddr: 226.94.1.1
>>         broadcast: yes
>>         mcastport: 5405
>>     }
>>     interface {
>>         ringnumber: 1
>>         bindnetaddr: 172.20.20.0
>>         #mcastaddr: 226.94.1.1
>>         broadcast: yes
>>         mcastport: 5405
>>     }
>> }
>>
>> logging {
>>     fileline: off
>>     to_stderr: yes
>>     to_logfile: yes
>>     to_syslog: yes
>>     logfile: /tmp/corosync.log
>>     debug: off
>>     timestamp: on
>>     logger_subsys {
>>         subsys: AMF
>>         debug: off
>>     }
>> }
>>
>> service {
>>     name: pacemaker
>>     ver: 0
>> }
>>
>> aisexec {
>>     user: root
>>     group: root
>> }
>>
>> amf {
>>     mode: disabled
>> }
>> ===
>>
>> Thanks
>> Shravan

___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
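For reference, the authkey comparison described in the quoted message (running a checksum on the key on both nodes) could be scripted roughly as below. This is only a sketch: the function names are made up for illustration, it uses SHA-256 where the message used md5sum, and it assumes both copies of the key have already been fetched to paths readable on one machine. A matching digest only rules out a key mismatch between the listed files; it says nothing about traffic arriving from other hosts.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256"):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def keys_match(paths, algorithm="sha256"):
    """Return True when every listed key file has the same digest."""
    digests = {file_digest(p, algorithm) for p in paths}
    return len(digests) == 1
```

Called as keys_match(["node1-authkey", "node2-authkey"]), where both file names are hypothetical local copies, it returns True only if the copies are byte-for-byte identical.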
Re: [Pacemaker] errors in corosync.log
Corosync Cluster Engine, version '1.1.1' SVN revision '2534'
Copyright (c) 2006-2009 Red Hat, Inc.

Shravan

On Tue, Jan 19, 2010 at 10:59 AM, Shravan Mishra shravan.mis...@gmail.com wrote:
> cibadmin 1.0.5 for OpenAIS and Heartbeat (Build: 9e9faaab40f3f97e3c0d623e4a4c47ed83fa1601)
>
> -Shravan
>
> On Tue, Jan 19, 2010 at 8:29 AM, Andrew Beekhof and...@beekhof.net wrote:
>> On Sat, Jan 16, 2010 at 9:20 PM, Shravan Mishra shravan.mis...@gmail.com wrote:
>>> Hi Guys,
>>>
>>> I'm running the following version of pacemaker and corosync
>>> corosync=1.1.1-1-2
>>> pacemaker=1.0.9-2-1
>>
>> That pacemaker version doesn't exist. What does cibadmin --version say?
>> And are you sure about the corosync version? It doesn't look right either.
>>
>>> Everything had been running fine for quite some time now but then I
>>> started seeing the following errors in the corosync logs,
>>>
>>> =
>>> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
>>> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
>>> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
>>> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
>>> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
>>> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
>>>
>>> I can perform all the crm shell commands and what not but it's troubling
>>> that the above is happening.
>>>
>>> My crm_mon output looks good.
>>>
>>> I also checked the authkey and ran md5sum on both; it's the same.
>>>
>>> Then I stopped corosync and regenerated the authkey with corosync-keygen
>>> and copied it to the other machine but I still get the above message in
>>> the corosync log.
>>>
>>> Is there anything other than the authkey that I should look into?
Re: [Pacemaker] errors in corosync.log
On Sat, Jan 16, 2010 at 9:20 PM, Shravan Mishra shravan.mis...@gmail.com wrote:
> Hi Guys,
>
> I'm running the following version of pacemaker and corosync
> corosync=1.1.1-1-2
> pacemaker=1.0.9-2-1
>
> Everything had been running fine for quite some time now but then I
> started seeing the following errors in the corosync logs,
>
> =
> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
>
> I can perform all the crm shell commands and what not but it's troubling
> that the above is happening.
>
> My crm_mon output looks good.
>
> I also checked the authkey and ran md5sum on both; it's the same.
>
> Then I stopped corosync and regenerated the authkey with corosync-keygen
> and copied it to the other machine but I still get the above message in
> the corosync log.

Are you sure there's not a third node somewhere broadcasting on that
mcast and port combination?

> Is there anything other than the authkey that I should look into?
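One way to follow up on the "third node" question is to compare the source addresses actually seen on the cluster port with the addresses of the known members. The sketch below assumes the source addresses have already been collected by some other means (for example, from a packet capture on UDP port 5405); the function name and the node addresses in the usage note are hypothetical, and it assumes IPv4 addresses throughout.

```python
import ipaddress


def unexpected_senders(observed_sources, cluster_members):
    """Return observed source addresses that are not known cluster members.

    Both arguments are iterables of IPv4 address strings. The result is
    de-duplicated and sorted numerically.
    """
    members = {ipaddress.ip_address(a) for a in cluster_members}
    seen = {ipaddress.ip_address(a) for a in observed_sources}
    return [str(a) for a in sorted(seen - members)]
```

With hypothetical members 192.168.2.1 and 192.168.2.2 and observed sources 192.168.2.1, 192.168.2.7, 192.168.2.2, 192.168.2.7, the function returns ['192.168.2.7'], which would point at a sender outside the cluster.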
Re: [Pacemaker] errors in corosync.log
Hi,

Since the interfaces on the two nodes are connected via a crossover cable, there is no chance of that happening. And since I'm using rrp_mode: passive, the other ring, i.e. ring 1, will come into play only when ring 0 fails, I assume. I say this because the ring 1 interface is on the network.

One interesting thing that I observed was that libtomcrypt is being used for crypto because I have secauth: on. But I couldn't find that library on my machine. I'm wondering if it's because of that.

Basically we are using 3 interfaces: eth0, eth1 and eth2. eth0 and eth2 are for ring 0 and ring 1 respectively. eth1 is the primary interface.

This is what my drbd.conf looks like:

==
# please have a a look at the example configuration file in
# /usr/share/doc/drbd82/drbd.conf
#
global {
    usage-count no;
}
common {
    protocol C;
    startup {
        wfc-timeout 120;
        degr-wfc-timeout 120;
    }
}
resource var_nsm {
    syncer {
        rate 333M;
    }
    handlers {
        fence-peer /usr/lib/drbd/crm-fence-peer.sh;
        after-resync-target /usr/lib/drbd/crm-unfence-peer.sh;
    }
    net {
        after-sb-1pri discard-secondary;
    }
    on node1.itactics.com {
        device /dev/drbd1;
        disk /dev/sdb3;
        address 172.20.20.1:7791;
        meta-disk internal;
    }
    on node2.itactics.com {
        device /dev/drbd1;
        disk /dev/sdb3;
        address 172.20.20.2:7791;
        meta-disk internal;
    }
}
=

The eth0 interfaces of the two nodes are connected via crossover cable as I mentioned, and eth1 and eth2 are on the network.

I'm not a networking expert, but is it possible that a broadcast sent by, let's say, a node not in my cluster could still reach my nodes through the other interfaces, which are attached to the network?

We in dev and the QA guys are testing this in parallel.

Let's say there is a QA cluster of two nodes and a dev cluster of two nodes, the interfaces for both are hooked up as I mentioned above, and corosync.conf for both clusters has bindnetaddr: 192.168.2.0.

Is there a possibility of bad messages for one cluster being caused by the other?

We are in the final leg of the testing and this came up.

Thanks for the help.

Shravan

On Mon, Jan 18, 2010 at 2:58 AM, Andrew Beekhof and...@beekhof.net wrote:
> On Sat, Jan 16, 2010 at 9:20 PM, Shravan Mishra shravan.mis...@gmail.com wrote:
>> Hi Guys,
>>
>> I'm running the following version of pacemaker and corosync
>> corosync=1.1.1-1-2
>> pacemaker=1.0.9-2-1
>>
>> Everything had been running fine for quite some time now but then I
>> started seeing the following errors in the corosync logs,
>>
>> =
>> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
>> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
>> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
>> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
>> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
>> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
>>
>> I can perform all the crm shell commands and what not but it's troubling
>> that the above is happening.
>>
>> My crm_mon output looks good.
>>
>> I also checked the authkey and ran md5sum on both; it's the same.
>>
>> Then I stopped corosync and regenerated the authkey with corosync-keygen
>> and copied it to the other machine but I still get the above message in
>> the corosync log.
>
> Are you sure there's not a third node somewhere broadcasting on that
> mcast and port combination?
>
>> Is there anything other than the authkey that I should look into?
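The dev/QA scenario raised above can be checked on paper with a small sketch like the one below. It is not a corosync tool: the Ring class, the function name, and the /24 prefix length are all assumptions (the thread gives no netmask). It also relies on the note in the corosync.conf(5) man page that corosync uses both mcastport and mcastport - 1, so ports less than 2 apart are treated as overlapping; whether that applies identically to the version in this thread is an assumption. A True result only means the two configurations could hear each other if their interfaces share a broadcast domain.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Ring:
    bindnetaddr: str      # network address as written in corosync.conf
    mcastport: int        # mcastport as written in corosync.conf
    prefix_len: int = 24  # ASSUMPTION: netmask is not stated in the thread


def rings_may_collide(a, b):
    """Return True if two ring definitions share a network and UDP ports."""
    net_a = ipaddress.ip_network(f"{a.bindnetaddr}/{a.prefix_len}")
    net_b = ipaddress.ip_network(f"{b.bindnetaddr}/{b.prefix_len}")
    same_network = net_a.overlaps(net_b)
    # Each cluster is assumed to use mcastport and mcastport - 1.
    ports_overlap = abs(a.mcastport - b.mcastport) < 2
    return same_network and ports_overlap
```

For example, rings_may_collide(Ring("192.168.2.0", 5405), Ring("192.168.2.0", 5405)) returns True, while giving the second cluster port 5407 makes it return False. With secauth on and different authkeys, traffic from the other cluster would be rejected, which would be consistent with the "invalid digest" messages, though the thread does not confirm that this is the cause.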
Re: [Pacemaker] errors in corosync.log
One possibility is you have a different cluster in your network on the same multicast address and port.

Regards
-steve

On Sat, 2010-01-16 at 15:20 -0500, Shravan Mishra wrote:
> Hi Guys,
>
> I'm running the following version of pacemaker and corosync
> corosync=1.1.1-1-2
> pacemaker=1.0.9-2-1
>
> Everything had been running fine for quite some time now but then I
> started seeing the following errors in the corosync logs,
>
> =
> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
> Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
> Jan 16 15:08:39 corosync [TOTEM ] Invalid packet data
>
> I can perform all the crm shell commands and what not but it's troubling
> that the above is happening.
>
> My crm_mon output looks good.
>
> I also checked the authkey and ran md5sum on both; it's the same.
>
> Then I stopped corosync and regenerated the authkey with corosync-keygen
> and copied it to the other machine but I still get the above message in
> the corosync log.
>
> Is there anything other than the authkey that I should look into?
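If a second cluster is the source, the errors would be expected to come and go with that cluster's activity. A rough way to look for such a pattern is to tally the "invalid digest" lines per minute. The sketch below assumes log lines in exactly the format quoted in this thread; the function name is made up for illustration, and the log path in the usage note is the one from the quoted corosync.conf.

```python
import re
from collections import Counter

# Matches lines such as:
# Jan 16 15:08:39 corosync [TOTEM ] Received message has invalid digest... ignoring.
LINE_RE = re.compile(
    r"^(?P<minute>\w{3}\s+\d{1,2} \d{2}:\d{2}):\d{2} corosync \[TOTEM\s*\] "
    r"Received message has invalid digest"
)


def digest_errors_per_minute(lines):
    """Count 'invalid digest' log lines, keyed by 'Mon DD HH:MM'."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group("minute")] += 1
    return counts
```

It could be fed an open log file, for example digest_errors_per_minute(open("/tmp/corosync.log")), and the resulting counts compared against the times the other cluster was running.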