Re: [Openais] Corosync 1.2.8 totem membership behaviour

Steven Dake Tue, 05 Oct 2010 11:22:03 -0700

On 10/05/2010 02:29 AM, Ranjith wrote:
> Hi Steve,
>
> Please comment on the below.
>
>
> Regards,
> Ranjith
>
>
> On Fri, Oct 1, 2010 at 12:04 AM, Ranjith <ranjith.nath...@gmail.com> wrote:
>
>     Hi steve,
>
>     Network is like this:
>     A (block all packets from src C)
>     B
>     C (block all packets from src A)
>
>
>
>     Nodes
>     A,B,C
>     A sends join (multicast)
>     Only B receives. (C drops it because of ACL)
>     B sends join (multicast) (with A,B)
>
>     A,C receive join
>     C sends join (with A,B,C)
>     Only B receives the above
>
>     B sends join (with A,B,C)
>     A, C sends join (with A,B,C)
>     B gets consensus but suppose A is the smallest Id
>
>     But A never gets consensus as A cannot get join from C
>
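
For reference, the delivery pattern in the quoted scenario can be sketched as below. This is only an illustration, assuming each ACL simply drops inbound packets by source address; the function and variable names are made up for the example and are not part of corosync.

```python
# Illustrative only: which nodes receive a multicast when some nodes
# drop inbound packets from particular sources (names are hypothetical).

def receivers(sender, nodes, blocked_sources):
    """Return the nodes that receive a multicast sent by `sender`."""
    return sorted(
        n for n in nodes
        if n != sender and sender not in blocked_sources.get(n, set())
    )

nodes = ["A", "B", "C"]
# A blocks all packets from src C, C blocks all packets from src A.
acl = {"A": {"C"}, "C": {"A"}}

for s in nodes:
    print(s, "->", receivers(s, nodes, acl))
```

With these rules only B hears everyone: a join from A reaches B alone, and a join from C reaches B alone.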


This is not exactly how the algorithm works.  I recommend reading the 
totem specification if you want the details.  After you have read the 
specification, we can go through an example of the proc and fail lists 
in this scenario.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.37.767&rep=rep1&type=pdf

The algorithm for handling a join message is described on page 16, Figure 
3 "Join message from processor q received", and page 17, Figure 4 "Join 
message from processor q received".
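
As a rough illustration only, the join handling can be sketched along the following lines. This is a simplified reading of the specification's pseudocode, not corosync's implementation: the class, method and return-value names are invented for the example, and timeouts, ring ids and token handling are left out.

```python
from dataclasses import dataclass, field

@dataclass
class Join:
    sender: str
    proc_set: frozenset
    fail_set: frozenset

@dataclass
class Node:
    my_id: str
    proc_set: set = field(default_factory=set)
    fail_set: set = field(default_factory=set)
    consensus: dict = field(default_factory=dict)

    def __post_init__(self):
        self.proc_set.add(self.my_id)
        self.consensus[self.my_id] = True

    def has_consensus(self):
        # Every processor not on the fail list must have agreed.
        live = self.proc_set - self.fail_set
        return all(self.consensus.get(p, False) for p in live)

    def on_join(self, msg):
        # Same view of proc and fail lists: record agreement from the sender.
        if msg.proc_set == self.proc_set and msg.fail_set == self.fail_set:
            self.consensus[msg.sender] = True
            if self.has_consensus():
                rep = min(self.proc_set - self.fail_set)
                return "form-ring" if rep == self.my_id else "wait-for-rep"
            return "gather"
        # Nothing new in the message, or the sender is already failed: ignore.
        if msg.proc_set <= self.proc_set and msg.fail_set <= self.fail_set:
            return "ignore"
        if msg.sender in self.fail_set:
            return "ignore"
        # Otherwise merge the lists and restart the agreement round.
        self.proc_set |= msg.proc_set
        if self.my_id in msg.fail_set:
            self.fail_set.add(msg.sender)
        else:
            self.fail_set |= msg.fail_set
        self.consensus = {self.my_id: True}
        return "gather"  # a real node would now multicast its updated join

# A hears B's join but never C's, so A cannot reach consensus on {A,B,C}.
a = Node("A", proc_set={"A", "B", "C"})
a.on_join(Join("B", frozenset("ABC"), frozenset()))
print(a.has_consensus())  # False: C's join never reaches A
```

In this sketch a node only stops gathering once every processor that is not on its fail list has sent a join with identical proc and fail lists.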

Regards
-steve


>     Am I correct till this point?
>
>     Regards,
>     Ranjith
>
>
>
>
>     On Thu, Sep 30, 2010 at 11:49 PM, Steven Dake <sd...@redhat.com> wrote:
>
>         On 09/30/2010 10:40 AM, Ranjith wrote:
>
>             Hi Steve,
>
>             I believe you mean that the same ACL rules should also be
>             applied on the outgoing side. But since the nodes here do not
>             receive any packets (multicast or unicast) from each other, I
>             believe they will also not send to each other. Is that right?
>
>
>
>         That assumption is incorrect.  Example:
>
>         Nodes
>         A,B,C
>         A sends join (multicast)
>         B,C receive join
>         B sends join (multicast)
>         A,C receive join
>         C sends join (with A,B,C)
>         now A rejects that message.
>
>         As a result, the nodes can never come to consensus.
>
>         Regards
>         -steve
>
>             Regards,
>             Ranjith
>
>             On Thu, Sep 30, 2010 at 10:41 PM, Steven Dake <sd...@redhat.com> wrote:
>
>                 On 09/30/2010 03:47 AM, Ranjith wrote:
>
>                     Hi all,
>
>                     Kindly let me know whether corosync considers the network
>                     below a byzantine failure, i.e. the case where N1 and N3
>                     do not have connectivity.
>                     I am testing such scenarios because I believe such
>                     behaviour can happen due to some misbehaviour in a switch
>                     (stale ARP entries).
>
>
>
>                 What makes the fault byzantine is that only incoming
>                 packets are blocked.  If you block both incoming and
>                 outgoing packets on the nodes, the fault is not byzantine
>                 and totem will behave properly.
>
>                 Regards
>                 -steve
>
>                     Regards,
>                     Ranjith
>
>
>                     Untitled.png
>                     On Sat, Sep 25, 2010 at 9:47 AM, Ranjith <ranjith.nath...@gmail.com> wrote:
>
>                         Hi Steve,
>                         Just to make it clear: do you mean that in the above
>                         case, if N3 is part of the network, it should have
>                         connectivity to both N2 and N1, and that if N3 has
>                         connectivity to N2 only, corosync does not handle
>                         that case?
>                         Regards,
>                         Ranjith
>                         On Sat, Sep 25, 2010 at 9:39 AM, Steven Dake <sd...@redhat.com> wrote:
>
>                             On 09/24/2010 08:20 PM, Ranjith wrote:
>
>                                 Hi ,
>                                 It is hard to tell what is happening without
>                                 logs from all 3 nodes. Does this only happen
>                                 at system start, or can you duplicate 5
>                                 minutes after systems have started?
>
>                                 >> The cluster is never stabilizing. It keeps
>                                 switching between the membership and
>                                 operational states.
>                                 Below is the test network which I am using:
>
>                                 Untitled.png
>
>                                 >> N1 and N3 do not receive any packets from
>                                 each other. What I expected here was that
>                                 either (N1,N2) or (N2,N3) forms a two-node
>                                 cluster and stabilizes. But the cluster never
>                                 stabilizes: even though 2-node clusters are
>                                 forming, it keeps going back to membership.
>                                 [I checked the logs, and this seems to happen
>                                 because of the steps I mentioned in the
>                                 previous mail.]
>
>
>
>                             ......  Where did you say you were testing a
>                             byzantine fault in your original bug report?
>                             Please be more forthcoming in the future.
>                             Corosync does not protect against byzantine
>                             faults.  Allowing one way connectivity in network
>                             connection = this fault scenario.  You can try
>                             coro-netctl (the attached script) which will
>                             atomically block a network ip in the network to
>                             test split brain scenarios without actually
>                             pulling network cables.
>
>                             Regards
>                             -steve
>
>
>                                 Regards,
>                                 Ranjith
>                                 On Fri, Sep 24, 2010 at 11:36 PM, Steven Dake <sd...@redhat.com> wrote:
>
>                                     It is hard to tell what is happening
>                                     without logs from all 3 nodes.  Does this
>                                     only happen at system start, or can you
>                                     duplicate 5 minutes after systems have
>                                     started?
>
>                                     If it is at system start, you may need to
>                                     enable "fast STP" on your switch.  It looks
>                                     to me like node 3 gets some messages
>                                     through but then is blocked.  STP will do
>                                     this in its default state on most switches.
>
>                                     Another option if you can't enable STP is
>                                     to use broadcast mode (man openais.conf for
>                                     details).
>
>                                     Also verify firewalls are properly
>                                     configured on all nodes.  You can join us
>                                     on the irc server freenode on
>                                     #linux-cluster for real-time assistance.
>
>                                     Regards
>                                     -steve
>
>
>                                     On 09/22/2010 11:33 PM, Ranjith wrote:
>
>                                         Hi Steve,
>                                         I am running corosync 1.2.8.
>                                         I didn't get what you meant by
>                                         blackbox. I suppose it is logs/debugs.
>                                         I just checked the logs/debugs and I am
>                                         able to understand the below:
>
>                                         1--------------2--------------3
>                                         1) Node1 and Node2 are already in a
>                                            2-node cluster.
>                                         2) Now Node3 sends join with ({1}, {})
>                                            (proc_list/fail_list).
>                                         3) Node2 sends join ({1,2,3}, {}) and
>                                            Node 1/3 update to ({1,2,3}, {}).
>                                         4) Now Node2 gets consensus after some
>                                            messages [but 1 is the rep].
>                                         5) Consensus timeout fires at Node1 for
>                                            Node3; Node1 sends join as
>                                            ({1,2}, {3}).
>                                         6) Node2 updates to ({1,2}, {3})
>                                            because of the above message and
>                                            sends out join. This join, received
>                                            by Node3, causes it to update to
>                                            ({1,3}, {2}).
>                                         7) Node1 and Node2 enter operational
>                                            (fail list cleared by Node2), but
>                                            Node3's join timeout fires and it
>                                            is again in membership state.
>                                         8) This continues to happen until
>                                            consensus fires at Node3 for Node1
>                                            and it moves to ({3}, {1,2}).
>                                         9) Now Node1 and Node2 form a 2-node
>                                            cluster and Node3 forms a
>                                            single-node cluster.
>                                         10) Now Node2 broadcasts a Normal
>                                             message.
>                                         11) This message is received by Node3
>                                             as a foreign message, which forces
>                                             it to go to gather state.
>                                         12) Again the above steps ....
>                                         The cluster is never stabilizing.
>                                         I have attached the debugs for Node2:
>                                         (1 - 10.102.33.115, 2 - 10.102.33.150,
>                                         3 - 10.102.33.180)
>                                         Regards,
>                                         Ranjith
>
>                                         On Wed, Sep 22, 2010 at 10:53 PM, Steven Dake <sd...@redhat.com> wrote:
>
>                                             On 09/21/2010 11:15 PM, Ranjith wrote:
>
>                                                 Hi all,
>                                                 Kindly comment on the
>             above behaviour
>                                                 Regards,
>                                                 Ranjith
>
>                                                 On Tue, Sep 21, 2010 at 9:52 PM, Ranjith <ranjith.nath...@gmail.com> wrote:
>
>                                                     Hi all,
>                                                     I was testing the corosync cluster engine by using the
>                                                     testcpg exec provided along with the release. I am
>                                                     getting the below behaviour while testing some specific
>                                                     scenarios. Kindly comment on the expected behaviour.
>                                                     1)   3 Node cluster
>
>                                                          1---------2---------3
>                                                          a) suppose I bring the nodes 1&2 up, it will form
>                                                             a ring (1,2)
>                                                          b) now bring up 3
>                                                          c) 3 sends join which restarts the membership
>                                                             process
>                                                          d) (1,2) again forms the ring, 3 forms self
>                                                             cluster
>                                                          e) now 3 sends a join (due to join or other
>                                                             timeout)
>                                                          f) again membership protocol is started as 2
>                                                             responds to this by going to gather state (I
>                                                             believe 2 should not accept this as 2 would
>                                                             have earlier decided that 3 is failed)
>                                                          I am seeing a continuous loop of the above
>                                                          behaviour (operational -> membership ->
>                                                          operational -> ) due to which the cluster is not
>                                                          becoming stabilized
>                                                     2)   3 Node Cluster
>
>                                                          1---------2-----------3
>                                                          a) bring up all the three nodes at the same time
>                                                             (none of the nodes have seen each other before
>                                                             this)
>                                                          b) Now each node forms a cluster by itself ..
>                                                             (Here I think it should form either a (1,2) or
>                                                             (2,3) ring)
>                                                     Regards,
>                                                     Ranjith
>
>
>
>
>                                             Ranjith,
>
>                                             Which version of corosync are you running?
>
>                                             Can you run corosync-blackbox and attach the output?
>
>                                             Thanks
>                                             -steve
>
>
>
>                                                 _______________________________________________
>                                                 Openais mailing list
>                                                 Openais@lists.linux-foundation.org
>
>             https://lists.linux-foundation.org/mailman/listinfo/openais

_______________________________________________
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais
