Hi, 

This sounds like something that someone on the openais list would know.
I've CC'd the openais list.

-- Lon

On Fri, 2008-06-27 at 16:03 +1000, Bevan Broun wrote:
> Hi All
> 
> I have a 2 node RHEL-5.1 cluster. A quorum disk is configured.
> The hosts have 4 NICs. These are bonded:
> (eth0+eth2) -> bond0
> (eth1+eth3) -> bond1
> Unfortunately I was not able to use a dedicated interface for cluster 
> communications, so bond1 is being used. This is where I think I'm in trouble.
> 
> The cluster has been configured using IP addresses. I did have to use 
> http://archives.free.net.ph/message/20080130.074958.5c7a211c.en.html
> as the hostname is related to the bond0 IP.
> 
> I have not defined the interface to be used by the cluster, just relying on 
> the IP address configured.
> The cluster's purpose is 2 GFS file systems.
> 
> The cluster was configured and working for 4 days before there were problems.
> 
> I now have almost constant "token was lost" messages in /var/log/messages. 
> They are almost exactly 5 minutes apart. A typical section of the messages 
> file is shown below my sig.
> 
> Just before the problem started, a samba message shows nmbd becoming local 
> master browser for a workgroup on the interface used for cluster 
> communications.
> 
> Jun 20 13:39:27 HOST1 nmbd[24506]: [2008/06/20 13:39:27, 0] nmbd/nmbd_become_lmb.c:become_local_master_stage2(396)
> Jun 20 13:39:27 HOST1 nmbd[24506]:   *****
> Jun 20 13:39:27 HOST1 nmbd[24506]:
> Jun 20 13:39:27 HOST1 nmbd[24506]:   Samba name server NBM1 is now a local master browser for workgroup SMS_DOMAIN on subnet 162.16.96.229
> Jun 20 13:39:27 HOST1 nmbd[24506]:
> Jun 20 13:39:27 HOST1 nmbd[24506]:   *****
> Jun 20 13:43:27 HOST1 openais[15265]: [TOTEM] The token was lost in the OPERATIONAL state.
> 
> "cman_tool status" shows both nodes and looks normal. It looks like clvmd 
> is not happy, and df commands are hanging.
> 
> Could nmbd be causing this token loss? Any ideas on how to proceed?
> 
> (names and IPs have been changed).
> 
> Thanks
> 
> Bevan Broun
> Solutions Architect
> Ardec International
> http://www.ardec.com.au
> http://www.lisasoft.com
> http://www.terrapages.com
> Sydney
> -----------------------
> Suite 112,The Lower Deck
> 19-21 Jones Bay Wharf
> Pirrama Road, Pyrmont 2009
> Ph:  +61 2 8570 5000
> Fax: +61 2 8570 5099
> 
> 
> 
> Jun 20 13:48:31 HOST1 openais[15265]: [TOTEM] The token was lost in the OPERATIONAL state.
> Jun 20 13:48:31 HOST1 openais[15265]: [TOTEM] Receive multicast socket recv buffer size (288000 bytes).
> Jun 20 13:48:31 HOST1 openais[15265]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
> Jun 20 13:48:31 HOST1 openais[15265]: [TOTEM] entering GATHER state from 2.
> Jun 20 13:48:31 HOST1 openais[15265]: [TOTEM] Creating commit token because I am the rep.
> Jun 20 13:48:31 HOST1 openais[15265]: [TOTEM] Saving state aru 16 high seq received 16
> Jun 20 13:48:31 HOST1 openais[15265]: [TOTEM] Storing new sequence id for ring 20ce34
> Jun 20 13:48:31 HOST1 openais[15265]: [TOTEM] entering COMMIT state.
> Jun 20 13:48:41 HOST1 openais[15265]: [TOTEM] The token was lost in the COMMIT state.
> Jun 20 13:48:41 HOST1 openais[15265]: [TOTEM] entering GATHER state from 4.
> Jun 20 13:48:41 HOST1 openais[15265]: [TOTEM] Creating commit token because I am the rep.
> Jun 20 13:48:41 HOST1 openais[15265]: [TOTEM] Storing new sequence id for ring 20ce38
> Jun 20 13:48:41 HOST1 openais[15265]: [TOTEM] entering COMMIT state.
> Jun 20 13:48:51 HOST1 openais[15265]: [TOTEM] The token was lost in the COMMIT state.
> Jun 20 13:48:51 HOST1 openais[15265]: [TOTEM] entering GATHER state from 4.
> Jun 20 13:48:51 HOST1 openais[15265]: [TOTEM] Creating commit token because I am the rep.
> Jun 20 13:48:51 HOST1 openais[15265]: [TOTEM] Storing new sequence id for ring 20ce3c
> Jun 20 13:48:51 HOST1 openais[15265]: [TOTEM] entering COMMIT state.
> Jun 20 13:49:01 HOST1 openais[15265]: [TOTEM] The token was lost in the COMMIT state.
> Jun 20 13:49:01 HOST1 openais[15265]: [TOTEM] entering GATHER state from 4.
> Jun 20 13:49:01 HOST1 openais[15265]: [TOTEM] Creating commit token because I am the rep.
> Jun 20 13:49:01 HOST1 openais[15265]: [TOTEM] Storing new sequence id for ring 20ce40
> Jun 20 13:49:01 HOST1 openais[15265]: [TOTEM] entering COMMIT state.
> Jun 20 13:49:06 HOST1 openais[15265]: [TOTEM] entering RECOVERY state.
> Jun 20 13:49:06 HOST1 openais[15265]: [TOTEM] position [0] member 162.16.96.229:
> Jun 20 13:49:06 HOST1 openais[15265]: [TOTEM] previous ring seq 2149936 rep 162.16.96.229
> Jun 20 13:49:06 HOST1 openais[15265]: [TOTEM] aru 16 high delivered 16 received flag 1
> Jun 20 13:49:06 HOST1 openais[15265]: [TOTEM] position [1] member 162.16.96.230:
> Jun 20 13:49:06 HOST1 openais[15265]: [TOTEM] previous ring seq 2149936 rep 162.16.96.229
> Jun 20 13:49:06 HOST1 openais[15265]: [TOTEM] aru 16 high delivered 16 received flag 1
> Jun 20 13:49:06 HOST1 openais[15265]: [TOTEM] Did not need to originate any messages in recovery.
> Jun 20 13:49:06 HOST1 openais[15265]: [TOTEM] Sending initial ORF token
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ] CLM CONFIGURATION CHANGE
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ] New Configuration:
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ]    r(0) ip(162.16.96.229)
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ]    r(0) ip(162.16.96.230)
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ] Members Left:
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ] Members Joined:
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ] CLM CONFIGURATION CHANGE
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ] New Configuration:
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ]    r(0) ip(162.16.96.229)
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ]    r(0) ip(162.16.96.230)
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ] Members Left:
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ] Members Joined:
> Jun 20 13:49:06 HOST1 openais[15265]: [SYNC ] This node is within the primary component and will provide service.
> Jun 20 13:49:06 HOST1 openais[15265]: [TOTEM] entering OPERATIONAL state.
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ] got nodejoin message 162.16.96.229
> Jun 20 13:49:06 HOST1 openais[15265]: [CLM  ] got nodejoin message 162.16.96.230
> Jun 20 13:49:06 HOST1 openais[15265]: [CPG  ] got joinlist message from node 2
> Jun 20 13:49:06 HOST1 openais[15265]: [CPG  ] got joinlist message from node 1
> Jun 20 13:53:38 HOST1 openais[15265]: [TOTEM] The token was lost in the OPERATIONAL state.
> 
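
To check the "almost exactly 5 minutes apart" observation against the full
log, something like the following might help. It is only a rough sketch, not
part of any cluster tooling: the function name token_loss_intervals and the
year default are made up for illustration, and it assumes each syslog entry
sits on a single line, as it would in /var/log/messages itself (not wrapped
by a mail client). It prints the number of seconds between consecutive
token losses in the OPERATIONAL state, using three timestamps taken from the
log excerpts in the quoted message.

```python
import re
from datetime import datetime

# Matches syslog lines reporting a token loss in the OPERATIONAL state.
TOKEN_LOSS = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+openais\[\d+\]:\s+"
    r"\[TOTEM\] The token was lost in the OPERATIONAL state"
)


def token_loss_intervals(lines, year=2008):
    """Return the seconds between consecutive OPERATIONAL-state token losses."""
    times = []
    for line in lines:
        # Drop mail quoting ("> ") and surrounding whitespace before matching.
        match = TOKEN_LOSS.match(line.lstrip("> ").strip())
        if match:
            # syslog timestamps carry no year, so one is supplied (assumption).
            stamp = f"{year} {match.group('stamp')}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


# Sample entries copied from the quoted log excerpts (joined onto one line each).
sample = [
    "Jun 20 13:43:27 HOST1 openais[15265]: [TOTEM] The token was lost in the OPERATIONAL state.",
    "Jun 20 13:48:31 HOST1 openais[15265]: [TOTEM] The token was lost in the OPERATIONAL state.",
    "Jun 20 13:48:41 HOST1 openais[15265]: [TOTEM] The token was lost in the COMMIT state.",
    "Jun 20 13:53:38 HOST1 openais[15265]: [TOTEM] The token was lost in the OPERATIONAL state.",
]

print(token_loss_intervals(sample))  # [304, 307]
```

On the real file you could pass open("/var/log/messages") instead of the
sample list. A steady interval close to 300 seconds would suggest looking for
something periodic on that network rather than a random fault, though that is
a guess and not a diagnosis.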

