What would you suggest when running on Amazon EC2? No multicast, no GRE... There's no guarantee that the cluster members will be anywhere near each other network-wise.
--
Robert Borkowski

On Mon, Dec 14, 2009 at 8:35 AM, Fabio M. Di Nitto <[email protected]> wrote:

> openais does support broadcast too, but not point to point.
>
> All I am saying is that while using tunnel devices is a valid use
> case, it might not operate as expected and it has never been
> tested before.
>
> With the vtun case, I am very familiar with that piece of software and I
> know that it probably has more glitches than other tunnelling
> implementations :)
>
> Fabio
>
> Robert Borkowski wrote:
> > Unless openais has some way to run without multicast, that's my only
> > alternative.
> >
> > Well the other-other alternative is to run the app without clustering
> > and devise some sort of duct tape and hot glue HA system :-)
> >
> > --
> > Robert Borkowski
> >
> > On Mon, Dec 14, 2009 at 2:53 AM, Fabio M. Di Nitto <[email protected]> wrote:
> >
> >     Binding over tun devices might be useful, but be aware of several
> >     different gotchas:
> >
> >     - MTU is not ethernet size (and it's not constant. vtun uses 50 bytes
> >       for its own header - irrelevant to corosync - others might use a
> >       different size. This could affect certain operations)
> >     - tun implementation. vtun, for example, adds latency that could be
> >       relevant for cluster operations (the amount depends on the plugins
> >       loaded - crypto, compression and so on).
> >     - queues handling. vtun, for example, in certain conditions will block
> >       the application when writing to the network socket. I don't believe
> >       this is desirable vs dropping packets (expected behaviour?).
> >
> >     so is it really worth the trouble to be able to bind to tunnels?
> >
> >     Just 2c...
> >
> >     Fabio
> >
> >     Steven Dake wrote:
> > > The binding code may not support binding to tuns without modification.
> > >
> > > I'll have a look this week.
> > >
> > > Regards
> > > -steve
> > >
> > > On Sun, 2009-12-13 at 12:08 -0500, Robert Borkowski wrote:
> > >> Hello,
> > >>
> > >> Is there any way to get openais/corosync working on Amazon EC2?
> > >> Multicast is not permitted there...
> > >> What I'd like to set up is a two node cluster.
> > >>
> > >> My current attempt to get this working is to set up vtun tunnels
> > >> between the two nodes. vtun is supposed to be able to tunnel
> > >> multicast.
> > >> The two nodes have 192.168.1.1 and 192.168.1.2 on their tun0
> > >> interfaces respectively, and I'm able to pass traffic through the
> > >> tunnel.
> > >>
> > >> This is failing right now because totem won't bind to the tun0
> > >> address.
> > >> On the first node I tried setting bindnetaddr to 192.168.1.0 and
> > >> 192.168.1.1. In both cases debugging indicates 'network interface is
> > >> down' and totem binding to 127.0.0.1.
> > >> Strangely enough when I configure it to bind on 192.168.1.2 it does
> > >> bind, but obviously that's wrong and doesn't work.
> > >>
> > >> The OS is Ubuntu hardy heron. I tried the openais out of the heron
> > >> repo (0.82-3ubuntu2), and built corosync from the karmic source repo
> > >> (1.0.0-5ubuntu1).
> > >> Both behave the same way.
> > >>
> > >> Any pointers?
> > >>
> > >> # ifconfig tun0
> > >> tun0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> > >>           inet addr:192.168.1.1  P-t-P:192.168.1.2  Mask:255.255.255.255
> > >>           UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1450  Metric:1
> > >>           RX packets:11 errors:0 dropped:0 overruns:0 frame:0
> > >>           TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
> > >>           collisions:0 txqueuelen:500
> > >>           RX bytes:924 (924.0 B)  TX bytes:924 (924.0 B)
> > >>
> > >> # egrep -v '#|^$' /etc/corosync/corosync.conf
> > >> totem {
> > >>         version: 2
> > >>         token: 3000
> > >>         token_retransmits_before_loss_const: 10
> > >>         join: 60
> > >>         consensus: 1500
> > >>         vsftype: none
> > >>         max_messages: 20
> > >>         clear_node_high_bit: yes
> > >>         secauth: off
> > >>         threads: 0
> > >>         rrp_mode: none
> > >>         interface {
> > >>                 ringnumber: 0
> > >>                 bindnetaddr: 192.168.1.0
> > >>                 mcastaddr: 226.94.1.1
> > >>                 mcastport: 5405
> > >>         }
> > >> }
> > >> amf {
> > >>         mode: disabled
> > >> }
> > >> service {
> > >>         ver: 0
> > >>         name: pacemaker
> > >> }
> > >> aisexec {
> > >>         user: root
> > >>         group: root
> > >> }
> > >> logging {
> > >>         fileline: off
> > >>         to_stderr: yes
> > >>         to_logfile: no
> > >>         to_syslog: yes
> > >>         syslog_facility: daemon
> > >>         debug: on
> > >>         timestamp: on
> > >>         logger_subsys {
> > >>                 subsys: AMF
> > >>                 debug: on
> > >>                 tags: enter|leave|trace1|trace2|trace3|trace4|trace6
> > >>         }
> > >> }
> > >>
> > >> # corosync -f
> > >> Dec 13 12:00:06 corosync [MAIN ] Corosync Cluster Engine ('trunk'): started and ready to provide service.
> > >> Dec 13 12:00:06 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
> > >> Dec 13 12:00:06 corosync [TOTEM ] Token Timeout (3000 ms) retransmit timeout (294 ms)
> > >> Dec 13 12:00:06 corosync [TOTEM ] token hold (225 ms) retransmits before loss (10 retrans)
> > >> Dec 13 12:00:06 corosync [TOTEM ] join (60 ms) send_join (0 ms) consensus (1500 ms) merge (200 ms)
> > >> Dec 13 12:00:06 corosync [TOTEM ] downcheck (1000 ms) fail to recv const (50 msgs)
> > >> Dec 13 12:00:06 corosync [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1500
> > >> Dec 13 12:00:06 corosync [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (20 messages)
> > >> Dec 13 12:00:06 corosync [TOTEM ] send threads (0 threads)
> > >> Dec 13 12:00:06 corosync [TOTEM ] RRP token expired timeout (294 ms)
> > >> Dec 13 12:00:06 corosync [TOTEM ] RRP token problem counter (2000 ms)
> > >> Dec 13 12:00:06 corosync [TOTEM ] RRP threshold (10 problem count)
> > >> Dec 13 12:00:06 corosync [TOTEM ] RRP mode set to none.
> > >> Dec 13 12:00:06 corosync [TOTEM ] heartbeat_failures_allowed (0)
> > >> Dec 13 12:00:06 corosync [TOTEM ] max_network_delay (50 ms)
> > >> Dec 13 12:00:06 corosync [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
> > >> Dec 13 12:00:06 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> > >> Dec 13 12:00:06 corosync [TOTEM ] Receive multicast socket recv buffer size (288000 bytes).
> > >> Dec 13 12:00:06 corosync [TOTEM ] Transmit multicast socket send buffer size (262142 bytes).
> > >> Dec 13 12:00:06 corosync [TOTEM ] The network interface is down.
> > >> Dec 13 12:00:06 corosync [TOTEM ] Created or loaded sequence id 20.127.0.0.1 for this ring.
> > >> Dec 13 12:00:06 corosync [TOTEM ] entering GATHER state from 15.
> > >> Dec 13 12:00:06 corosync [SERV ] Service failed to load 'pacemaker'.
> > >> Dec 13 12:00:06 corosync [SERV ] Service initialized 'corosync extended virtual synchrony service'
> > >> Dec 13 12:00:06 corosync [SERV ] Service initialized 'corosync configuration service'
> > >> Dec 13 12:00:06 corosync [SERV ] Service initialized 'corosync cluster closed process group service v1.01'
> > >> Dec 13 12:00:06 corosync [SERV ] Service initialized 'corosync cluster config database access v1.01'
> > >> Dec 13 12:00:06 corosync [SERV ] Service initialized 'corosync profile loading service'
> > >> Dec 13 12:00:06 corosync [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
> > >> Dec 13 12:00:06 corosync [TOTEM ] Creating commit token because I am the rep.
> > >> Dec 13 12:00:06 corosync [TOTEM ] Saving state aru 0 high seq received 0
> > >> Dec 13 12:00:06 corosync [TOTEM ] Storing new sequence id for ring 18
> > >> Dec 13 12:00:06 corosync [TOTEM ] entering COMMIT state.
> > >> Dec 13 12:00:06 corosync [TOTEM ] got commit token
> > >> Dec 13 12:00:06 corosync [TOTEM ] entering RECOVERY state.
> > >> Dec 13 12:00:06 corosync [TOTEM ] position [0] member 127.0.0.1:
> > >> Dec 13 12:00:06 corosync [TOTEM ] previous ring seq 20 rep 127.0.0.1
> > >> Dec 13 12:00:06 corosync [TOTEM ] aru 0 high delivered 0 received flag 1
> > >> Dec 13 12:00:06 corosync [TOTEM ] Did not need to originate any messages in recovery.
> > >> Dec 13 12:00:06 corosync [TOTEM ] got commit token
> > >> Dec 13 12:00:06 corosync [TOTEM ] Sending initial ORF token
> > >> Dec 13 12:00:06 corosync [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
> > >>
> > >> --
> > >> Robert Borkowski
> > >>
> > >> _______________________________________________
> > >> Openais mailing list
> > >> [email protected]
> > >> https://lists.linux-foundation.org/mailman/listinfo/openais
> > >
> > > _______________________________________________
> > > Openais mailing list
> > > [email protected]
> > > https://lists.linux-foundation.org/mailman/listinfo/openais
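For reference, the ifconfig output quoted above shows tun0 with Mask:255.255.255.255, so the network address derived from the interface is 192.168.1.1 itself, not 192.168.1.0. The sketch below only illustrates that arithmetic with Python's standard ipaddress module. The helper names are hypothetical, and it is an assumption that corosync matches bindnetaddr this way. The thread reports that 192.168.1.1 also failed to bind while the peer address 192.168.1.2 did bind, so the netmask alone does not explain the behaviour.

```python
import ipaddress


def interface_network(addr: str, netmask: str) -> ipaddress.IPv4Network:
    """Return the network an interface belongs to, given its address and mask."""
    return ipaddress.ip_interface(f"{addr}/{netmask}").network


def bindnetaddr_matches(bindnetaddr: str, addr: str, netmask: str) -> bool:
    """Hypothetical check: does bindnetaddr equal the interface's network address?

    This is NOT taken from the corosync source; it only shows the arithmetic.
    """
    network = interface_network(addr, netmask)
    return ipaddress.ip_address(bindnetaddr) == network.network_address


if __name__ == "__main__":
    # Values from the ifconfig output in the thread (point-to-point, /32 mask).
    print(interface_network("192.168.1.1", "255.255.255.255"))
    print(bindnetaddr_matches("192.168.1.0", "192.168.1.1", "255.255.255.255"))
    # For comparison, the same address on an ordinary /24 ethernet segment.
    print(bindnetaddr_matches("192.168.1.0", "192.168.1.1", "255.255.255.0"))
```

With the /32 mask, 192.168.1.0 is not the interface's network address, whereas on a /24 it would be.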
