Emmanuel:

I increased the timeout in the sequencer.xml file, and
things seem to be working great now!  Thanks for the help.

Is there a programmatic way to see if a controller has
left the group?  I would like to set up automated
monitoring if I can.

Thanks,
        Neil 


--
Neil Aggarwal, (214)986-3533, www.JAMMConsulting.com
FREE! Eliminate junk email and reclaim your inbox.
Visit http://www.spammilter.com for details.

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Emmanuel
Cecchet
Sent: Monday, November 06, 2006 11:15 AM
To: Sequoia general mailing list
Subject: Re: [Sequoia] Controllers think they are not connected

Hi Neil,

> But, shortly after that, I see these messages (This is the output
> on controller 1):
>
> 2006-11-03 11:22:44,136 WARN  protocols.pbcast.GMS I am the coord and I'm
> being am suspected -- will probably leave shortly
> 2006-11-03 11:22:44,144 WARN  protocols.pbcast.GMS I (206.123.70.59:7800) am
> not a member of view [38.100.86.36:7800|2] [38.100.86.36:7800], shunning
> myself and leaving the group (prev_members are [206.123.70.59:7800
> 38.100.86.36:7800 ], current view is [206.123.70.59:7800|1]
> [206.123.70.59:7800, 38.100.86.36:7800])
> 2006-11-03 11:22:44,389 ERROR continuent.hedera.channel Unhandled JGroups
> message type (class org.jgroups.ExitEvent): ExitEvent.
>   
For some reason, JGroups decides that there is a failure in the total 
order sequencer and leaves the group on its own.
Your failure detector timeouts might be too short, so you may have to 
tune them.
Your best option is to get help from the JGroups mailing list (I saw 
that you have already posted there).

> When I try to issue a command against the database, I see these messages
> (on controller 1):
>
> 2006-11-03 11:25:22,030 WARN  controller.RequestManager.cbsweb An error
> occured while executing distributed request 1
> org.continuent.hedera.channel.NotConnectedException: ChannelClosedException
>
> It looks like they are not liking the tunneled connections.
>
> Any ideas how to fix this?
>   
The problem is that once JGroups has exited the group, it does not 
reconnect (or try to rejoin). This is an issue because your controller 
becomes isolated from the rest of the world. The fix is to get a working 
JGroups configuration (or to try Appia with the help of Nuno).
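
One way to detect this condition from the outside is to scan the
controller log for the messages quoted above. This is only a minimal
sketch in Python, not a Sequoia or JGroups API: the function names and
the log path are hypothetical, the patterns are copied from the log
excerpts in this thread, and it assumes the controller writes each
message on a single line of a plain-text log file.

```python
import re

# Patterns copied from the log excerpts quoted in this thread (assumption:
# the controller log contains them unwrapped, one message per line).
GROUP_EXIT_PATTERNS = [
    re.compile(r"shunning myself and leaving the group"),
    re.compile(r"org\.jgroups\.ExitEvent"),
    re.compile(r"NotConnectedException"),
]


def find_group_exit_lines(lines):
    """Return the lines that suggest the controller has left the group."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern.search(line) for pattern in GROUP_EXIT_PATTERNS)
    ]


def check_log(path):
    """Scan a log file (hypothetical path) and return the suspicious lines."""
    with open(path, encoding="utf-8", errors="replace") as log_file:
        return find_group_exit_lines(log_file)
```

A scheduled job could call check_log("controller.log") (a placeholder
path) and raise an alert whenever the result is non-empty. Because the
controller does not try to rejoin, any hit likely means it will stay
isolated until someone intervenes.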

Keep us posted with your findings,
Emmanuel

-- 
Emmanuel Cecchet
Chief Scientific Officer, Continuent

Blog: http://emanux.blogspot.com/
Open source: http://www.continuent.org
Corporate: http://www.continuent.com
Skype: emmanuel_cecchet
Cell: +33 687 342 685


_______________________________________________
Sequoia mailing list
[email protected]
https://forge.continuent.org/mailman/listinfo/sequoia

