What would you suggest when running on Amazon EC2? No multicast, no GRE...

There's no guarantee that the cluster members will be anywhere near each
other network-wise.

--
Robert Borkowski

On Mon, Dec 14, 2009 at 8:35 AM, Fabio M. Di Nitto <[email protected]> wrote:

> openais does support broadcast too, but not point to point.
>
> All I am saying is that while using tunnel devices is a valid use
> case, it might not operate as expected and it has never been
> tested before.
>
> With the vtun case, I am very familiar with that piece of software and I
> know that it probably has more glitches than other tunnelling
> implementations :)
>
> Fabio
>
> Robert Borkowski wrote:
> > Unless openais has some way to run without multicast, that's my only
> > alternative.
> >
> > Well the other-other alternative is to run the app without clustering
> > and devise some sort of duct tape and hot glue HA system :-)
> >
> > --
> > Robert Borkowski
> >
> > On Mon, Dec 14, 2009 at 2:53 AM, Fabio M. Di Nitto <[email protected]> wrote:
> >
> >     Binding over tun devices might be useful, but be aware of several
> >     different gotchas:
> >
> >     - MTU is not ethernet size (and it's not constant. vtun uses 50 bytes
> >     for its own header - irrelevant to corosync - others might use a
> >     different size. This could affect certain operations)
> >     - tun implementation. vtun, for example, adds latency that could be
> >     relevant for cluster operations (the amount depends on the plugins
> >     loaded - crypto, compression and so on).
> >     - queue handling. vtun, for example, in certain conditions, will block
> >     the application when writing to the network socket. I don't believe
> >     this is desirable vs dropping packets (expected behaviour?).
> >
> >     So is it really worth the trouble to be able to bind to tunnels?
> >
> >     Just 2c...
> >
> >     Fabio
> >
> >     Steven Dake wrote:
> >     > The binding code may not support binding to tuns without
> >     > modification.
> >     >
> >     > I'll have a look this week.
> >     >
> >     > Regards
> >     > -steve
> >     >
> >     > On Sun, 2009-12-13 at 12:08 -0500, Robert Borkowski wrote:
> >     >> Hello,
> >     >>
> >     >>
> >     >> Is there any way to get openais/corosync working on Amazon EC2?
> >     >> Multicast is not permitted there...
> >     >> What I'd like to set up is a two node cluster.
> >     >>
> >     >>
> >     >> My current attempt to get this working is to set up vtun tunnels
> >     >> between the two nodes. vtun is supposed to be able to tunnel
> >     >> multicast.
> >     >> The two nodes have 192.168.1.1 and 192.168.1.2 on their tun0
> >     >> interfaces respectively, and I'm able to pass traffic through the
> >     >> tunnel.
> >     >>
> >     >>
> >     >> This is failing right now because totem won't bind to the tun0
> >     >> address.
> >     >> On the first node I tried setting bindnetaddr to 192.168.1.0 and
> >     >> 192.168.1.1. In both cases debugging indicates 'network interface
> >     >> is down' and totem binding to 127.0.0.1.
> >     >> Strangely enough when I configure it to bind on 192.168.1.2 it
> >     >> does bind, but obviously that's wrong and doesn't work.
> >     >>
> >     >>
> >     >> The OS is Ubuntu hardy heron. I tried the openais out of the heron
> >     >> repo (0.82-3ubuntu2), and built corosync from the karmic source
> >     >> repo (1.0.0-5ubuntu1).
> >     >> Both behave the same way.
> >     >>
> >     >>
> >     >> Any pointers?
> >     >>
> >     >>
> >     >>
> >     >>
> >     >> # ifconfig tun0
> >     >> tun0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> >     >>           inet addr:192.168.1.1  P-t-P:192.168.1.2  Mask:255.255.255.255
> >     >>           UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1450  Metric:1
> >     >>           RX packets:11 errors:0 dropped:0 overruns:0 frame:0
> >     >>           TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
> >     >>           collisions:0 txqueuelen:500
> >     >>           RX bytes:924 (924.0 B)  TX bytes:924 (924.0 B)
> >     >>
> >     >>
> >     >> # egrep -v '#|^$' /etc/corosync/corosync.conf
> >     >> totem {
> >     >>         version: 2
> >     >>         token: 3000
> >     >>         token_retransmits_before_loss_const: 10
> >     >>         join: 60
> >     >>         consensus: 1500
> >     >>         vsftype: none
> >     >>         max_messages: 20
> >     >>         clear_node_high_bit: yes
> >     >>         secauth: off
> >     >>         threads: 0
> >     >>         rrp_mode: none
> >     >>         interface {
> >     >>                 ringnumber: 0
> >     >>                 bindnetaddr: 192.168.1.0
> >     >>                 mcastaddr: 226.94.1.1
> >     >>                 mcastport: 5405
> >     >>         }
> >     >> }
> >     >> amf {
> >     >>         mode: disabled
> >     >> }
> >     >> service {
> >     >>         ver:       0
> >     >>         name:      pacemaker
> >     >> }
> >     >> aisexec {
> >     >>         user:   root
> >     >>         group:  root
> >     >> }
> >     >> logging {
> >     >>         fileline: off
> >     >>         to_stderr: yes
> >     >>         to_logfile: no
> >     >>         to_syslog: yes
> >     >>         syslog_facility: daemon
> >     >>         debug: on
> >     >>         timestamp: on
> >     >>         logger_subsys {
> >     >>                 subsys: AMF
> >     >>                 debug: on
> >     >>                 tags: enter|leave|trace1|trace2|trace3|trace4|trace6
> >     >>         }
> >     >> }
> >     >>
> >     >>
> >     >>
> >     >>
> >     >>
> >     >>
> >     >> # corosync -f
> >     >> Dec 13 12:00:06 corosync [MAIN  ] Corosync Cluster Engine ('trunk'): started and ready to provide service.
> >     >> Dec 13 12:00:06 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
> >     >> Dec 13 12:00:06 corosync [TOTEM ] Token Timeout (3000 ms) retransmit timeout (294 ms)
> >     >> Dec 13 12:00:06 corosync [TOTEM ] token hold (225 ms) retransmits before loss (10 retrans)
> >     >> Dec 13 12:00:06 corosync [TOTEM ] join (60 ms) send_join (0 ms) consensus (1500 ms) merge (200 ms)
> >     >> Dec 13 12:00:06 corosync [TOTEM ] downcheck (1000 ms) fail to recv const (50 msgs)
> >     >> Dec 13 12:00:06 corosync [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1500
> >     >> Dec 13 12:00:06 corosync [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (20 messages)
> >     >> Dec 13 12:00:06 corosync [TOTEM ] send threads (0 threads)
> >     >> Dec 13 12:00:06 corosync [TOTEM ] RRP token expired timeout (294 ms)
> >     >> Dec 13 12:00:06 corosync [TOTEM ] RRP token problem counter (2000 ms)
> >     >> Dec 13 12:00:06 corosync [TOTEM ] RRP threshold (10 problem count)
> >     >> Dec 13 12:00:06 corosync [TOTEM ] RRP mode set to none.
> >     >> Dec 13 12:00:06 corosync [TOTEM ] heartbeat_failures_allowed (0)
> >     >> Dec 13 12:00:06 corosync [TOTEM ] max_network_delay (50 ms)
> >     >> Dec 13 12:00:06 corosync [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
> >     >> Dec 13 12:00:06 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> >     >> Dec 13 12:00:06 corosync [TOTEM ] Receive multicast socket recv buffer size (288000 bytes).
> >     >> Dec 13 12:00:06 corosync [TOTEM ] Transmit multicast socket send buffer size (262142 bytes).
> >     >> Dec 13 12:00:06 corosync [TOTEM ] The network interface is down.
> >     >> Dec 13 12:00:06 corosync [TOTEM ] Created or loaded sequence id 20.127.0.0.1 for this ring.
> >     >> Dec 13 12:00:06 corosync [TOTEM ] entering GATHER state from 15.
> >     >> Dec 13 12:00:06 corosync [SERV  ] Service failed to load 'pacemaker'.
> >     >> Dec 13 12:00:06 corosync [SERV  ] Service initialized 'corosync extended virtual synchrony service'
> >     >> Dec 13 12:00:06 corosync [SERV  ] Service initialized 'corosync configuration service'
> >     >> Dec 13 12:00:06 corosync [SERV  ] Service initialized 'corosync cluster closed process group service v1.01'
> >     >> Dec 13 12:00:06 corosync [SERV  ] Service initialized 'corosync cluster config database access v1.01'
> >     >> Dec 13 12:00:06 corosync [SERV  ] Service initialized 'corosync profile loading service'
> >     >> Dec 13 12:00:06 corosync [MAIN  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
> >     >> Dec 13 12:00:06 corosync [TOTEM ] Creating commit token because I am the rep.
> >     >> Dec 13 12:00:06 corosync [TOTEM ] Saving state aru 0 high seq received 0
> >     >> Dec 13 12:00:06 corosync [TOTEM ] Storing new sequence id for ring 18
> >     >> Dec 13 12:00:06 corosync [TOTEM ] entering COMMIT state.
> >     >> Dec 13 12:00:06 corosync [TOTEM ] got commit token
> >     >> Dec 13 12:00:06 corosync [TOTEM ] entering RECOVERY state.
> >     >> Dec 13 12:00:06 corosync [TOTEM ] position [0] member 127.0.0.1:
> >     >> Dec 13 12:00:06 corosync [TOTEM ] previous ring seq 20 rep 127.0.0.1
> >     >> Dec 13 12:00:06 corosync [TOTEM ] aru 0 high delivered 0 received flag 1
> >     >> Dec 13 12:00:06 corosync [TOTEM ] Did not need to originate any messages in recovery.
> >     >> Dec 13 12:00:06 corosync [TOTEM ] got commit token
> >     >> Dec 13 12:00:06 corosync [TOTEM ] Sending initial ORF token
> >     >> Dec 13 12:00:06 corosync [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
> >     >>
> >     >>
> >     >>
> >     >>
> >     >>
> >     >>
> >     >>
> >     >>
> >     >> --
> >     >> Robert Borkowski
> >     >>
> >     >> _______________________________________________
> >     >> Openais mailing list
> >     >> [email protected]
> >     >> https://lists.linux-foundation.org/mailman/listinfo/openais
> >     >
> >
> >
>
>
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
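
A note on the bindnetaddr values tried in the original message: the corosync.conf documentation describes bindnetaddr as the network address of the interface, that is, the interface address masked with its netmask. The sketch below only works through that arithmetic for the tun0 values quoted in the thread. The function names are hypothetical, and this is a simplified model, not corosync's actual binding code. It shows why 192.168.1.0 is not the network address of a /32 point-to-point interface, but it does not explain why 192.168.1.1 was also rejected or why 192.168.1.2 was accepted.

```python
import ipaddress


def network_address(addr: str, netmask: str) -> ipaddress.IPv4Address:
    """Return the interface address masked with its netmask."""
    return ipaddress.ip_interface(f"{addr}/{netmask}").network.network_address


def matches_bindnetaddr(bindnetaddr: str, addr: str, netmask: str) -> bool:
    """Simplified model (an assumption, not corosync's code):
    bindnetaddr matches when it equals the interface's network address."""
    return ipaddress.ip_address(bindnetaddr) == network_address(addr, netmask)


if __name__ == "__main__":
    # Values taken from the ifconfig output in the thread.
    tun_addr, tun_mask = "192.168.1.1", "255.255.255.255"
    for candidate in ("192.168.1.0", "192.168.1.1", "192.168.1.2"):
        print(candidate, matches_bindnetaddr(candidate, tun_addr, tun_mask))
```

With a 255.255.255.255 mask the network address is the interface address itself, so under this model only 192.168.1.1 would match tun0. On an ordinary ethernet interface such as 192.168.5.92 with netmask 255.255.255.0, the matching value would be 192.168.5.0.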
