Hi,

On Jun 1, 2006, at 10:08 PM, Dylan Hansen wrote:


Ok, so I did as you said and I did get the gossip server running on 192.168._.11:

[HostUtils.getLocalAddress()] :LOCAL ADDRESS -> /192.168._.11
FIFO: Debugging started
[org.continuent.appia.protocols.udpsimple.UdpSimpleSession]UDP: Debugging started192.168._.11
appia:gossipServer:GossipServerSession: handleTimer
clients={}

I then start the controller on 192.168._.11 and I see that it can talk to the gossip server:

appia:gossipServer:GossipServerSession: Not sending to group because there isn't one.
appia:gossipServer:GossipServerSession: handleTimer
clients={[(/192.168._.11:27755),1] , }
appia:gossipServer:GossipServerSession: handleTimer
clients={[(/192.168._.11:27755),2] , }
appia:gossipServer:GossipServerSession: handleTimer
clients={[(/192.168._.11:27755),3] , }
appia:gossipServer:GossipServerSession: Not sending to group because there isn't one.

One thing I see is the message "Not sending to group because there isn't one".  Maybe because there is only one member talking to the gossip server?

No, this is normal. If you are (for instance) in a WAN and you cannot use IP multicast, you need some mechanism that has a well-known address and forwards messages from one member to the others. This is the gossip service: a proxy that forwards messages from one member to the others until the members know each other. From that moment on, the service is no longer needed.

Ok, but this could be a point of failure: if this service is not alive, the members in a WAN will never merge their initial views. That is why this service can be replicated: if you do not pass the "-solo" option, the service starts as a member of a group of gossip servers and uses group communication to distribute the gossip service. That is the group the gossip server's message was talking about, not the group of Sequoia controllers.

But this service is very simple: it is only needed for the members to get to know each other, nothing more.
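
To make the idea concrete, here is a rough sketch of what such a relay does. It is a toy model in Python, not Appia's code: the class name, the method names and the member addresses are all made up for illustration, and the replicated (non-solo) case is not modelled.

```python
class GossipRelay:
    """Toy model of a gossip service with a well-known address (hypothetical)."""

    def __init__(self, solo=True):
        self.solo = solo      # True models running with the "-solo" option
        self.clients = []     # member addresses seen so far, in arrival order
        self.log = []         # messages the server would print

    def receive(self, sender, payload):
        """Remember the sender and forward its payload to every other member."""
        if sender not in self.clients:
            self.clients.append(sender)
        deliveries = [
            (dest, sender, payload) for dest in self.clients if dest != sender
        ]
        if self.solo:
            # No group of gossip servers exists, so nothing is replicated.
            self.log.append("Not sending to group because there isn't one.")
        return deliveries


relay = GossipRelay(solo=True)
first = relay.receive("member-a", "hello")    # nobody else is known yet
second = relay.receive("member-b", "hello")   # forwarded to member-a
```

The first call has nobody to forward to; the second is forwarded to the first member, which is how the two members get to know each other.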


I then start the second controller:

appia:gossipServer:GossipServerSession: sending to /192.168._.11:27755
appia:gossipServer:GossipServerSession:          from /192.168._.13:27755
appia:gossipServer:GossipServerSession: Not sending to group because there isn't one.
appia:gossipServer:GossipServerSession: handleTimer
clients={[(/192.168._.11:27755),7] , [(/192.168._.13:27755),1] , }

As you can see, the second member is seen by the group!  Hooray!  However, further down the line, the member that just joined gets lost:


This is normal. The gossip service purges its list periodically and keeps only the group coordinators and the newly arrived members; after some time, the members that are not coordinators are purged.
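
Roughly, the purge rule looks like the sketch below. Again this is only an illustration in Python under my own assumptions: the function name, the "coordinator" and "joined_at" fields and the grace period are hypothetical, and the real service may decide who is a coordinator differently.

```python
def purge(clients, now, grace_period):
    """Keep coordinators, plus members that joined less than grace_period ago.

    clients maps a member address to a dict with two hypothetical fields:
    "coordinator" (bool) and "joined_at" (time the member was first seen).
    """
    return {
        addr: info
        for addr, info in clients.items()
        if info["coordinator"] or now - info["joined_at"] < grace_period
    }


clients = {
    "member-a": {"coordinator": True, "joined_at": 0},
    "member-b": {"coordinator": False, "joined_at": 50},
}
soon_after = purge(clients, now=55, grace_period=10)   # member-b is still new
later = purge(clients, now=70, grace_period=10)        # only the coordinator stays
```

So a member that joins and then disappears from the clients list has not necessarily left the group; it may simply no longer be needed by the gossip service.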

appia:gossipServer:GossipServerSession: handleTimer
clients={[(/192.168._.11:27755),5] , }
appia:gossipServer:GossipServerSession: handleTimer
clients={[(/192.168._.11:27755),6] , }

I'm not sure why the joining member would join and get lost like that.  Here are some of the Appia/Hedera/Sequoia messages I see in my controller log:
 
14:01:54,295 DEBUG continuent.hedera.channel Received Message
14:01:54,300 DEBUG continuent.hedera.channel received message NOT fragmented
14:01:54,306 DEBUG continuent.hedera.channel LocalMembership: Member(address=192.168._.11/192.168._.11:27755:27755, uid=192.168._.11:27755) : members list from Message: [Member(address=192.168._.11/192.168._.11:27755:27755, uid=192.168._.11:27755)]
14:01:54,306 DEBUG continuent.hedera.channel delivering message


Do you see any view change from the message above to the messages below?

14:01:54,317 DEBUG continuent.hedera.adapters Replying to Member(address=192.168._.13/192.168._.13:27755:27755, uid=192.168._.13:27755) for message 1
14:01:54,320 DEBUG continuent.hedera.channel Received Message
14:01:54,321 DEBUG continuent.hedera.channel received message NOT fragmented
14:01:54,327 DEBUG continuent.hedera.channel LocalMembership: Member(address=192.168._.11/192.168._.11:27755:27755, uid=192.168._.13:27755) : members list from Message: [Member(address=192.168._.13/192.168._.13:27755:27755, uid=192.168._.13:27755)]
14:01:54,327 DEBUG continuent.hedera.channel NOT delivering the message

What's strange here is that I see "NOT delivering the message".  I wonder why this is.


It looks like an issue in the Appia binding of Hedera, not in Appia itself. This could be verified by starting several instances of the gossip service without the -solo option. If you see that they form a group, Appia group communication is working. In this case, the server will have its own addresses as clients :)

I'm going to take a look at the Hedera-Appia binding code.

Thanks for your feedback,
--
Nuno Carvalho
University of Lisbon, Portugal


