On Mon, 2009-12-14 at 08:39 -0500, Robert Borkowski wrote:
> What would you suggest when running on Amazon EC2? No multicast, no
> GRE...
>
> There's no guarantee that the cluster members will be anywhere near
> each other network-wise.
vtun may work. I don't have a really clear solution for how to
integrate with EC2.

Regards
-steve

> --
> Robert Borkowski
>
> On Mon, Dec 14, 2009 at 8:35 AM, Fabio M. Di Nitto
> <[email protected]> wrote:
>
> openais does support broadcast too, but not point to point.
>
> All I am saying is that while using tunnel devices is a valid use
> case, it might not operate as expected, and it has never been tested
> before.
>
> With the vtun case, I am very familiar with that piece of software,
> and I know that it probably has more glitches than other tunnelling
> implementations :)
>
> Fabio
>
> Robert Borkowski wrote:
> > Unless openais has some way to run without multicast, that's my
> > only alternative.
> >
> > Well, the other-other alternative is to run the app without
> > clustering and devise some sort of duct tape and hot glue HA
> > system :-)
> >
> > --
> > Robert Borkowski
> >
> > On Mon, Dec 14, 2009 at 2:53 AM, Fabio M. Di Nitto
> > <[email protected]> wrote:
> >
> > Binding over tun devices might be useful, but be aware of several
> > different gotchas:
> >
> > - The MTU is not ethernet-sized, and it's not constant. vtun uses
> >   50 bytes for its own header (irrelevant to corosync); others
> >   might use a different size. This could affect certain operations.
> > - The tun implementation. vtun, for example, adds latency that
> >   could be relevant for cluster operations (the amount depends on
> >   the plugins loaded: crypto, compression and so on).
> > - Queue handling. vtun, for example, will in certain conditions
> >   block the application when writing to the network socket. I don't
> >   believe this is desirable vs. dropping packets (expected
> >   behaviour?).
> >
> > So is it really worth the trouble to be able to bind to tunnels?
> >
> > Just 2c...
> >
> > Fabio
> >
> > Steven Dake wrote:
> > > The binding code may not support binding to tuns without
> > > modification.
> > >
> > > I'll have a look this week.
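For anyone trying to reproduce the setup Robert describes, a vtund session along the following lines is the kind of thing involved. This is a sketch, not a tested configuration: the session name `ec2tun`, the port, and the password are placeholders I have introduced, and every option should be checked against vtund.conf(5). Per Fabio's latency point, compression and encryption are left off:

```conf
# /etc/vtund.conf -- sketch only; "ec2tun", the port and the
# password are placeholders, not values from this thread.
options {
    port 5000;          # control/data port between the two nodes
}

ec2tun {
    passwd  CHANGEME;   # shared secret (placeholder)
    type    tun;        # IP-level tunnel device (tun0 in Robert's ifconfig below)
    proto   udp;        # UDP keeps added latency lower than TCP
    compress no;        # extra plugins add latency (see gotchas above)
    encrypt no;
    up {
        # %% expands to the tunnel device name; mirror the addresses,
        # MTU and multicast flag shown in the ifconfig output below.
        ifconfig "%% 192.168.1.1 pointopoint 192.168.1.2 mtu 1450 multicast up";
    };
}
```

Separately, the tunnel MTU here is 1450 while the corosync log below reports "Maximum network MTU 1500", so lowering totem's `netmtu` in corosync.conf below the tunnel MTU seems worth trying as well; that `netmtu` is the right knob for this symptom is my assumption, not something confirmed in the thread.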
> > >
> > > Regards
> > > -steve
> > >
> > > On Sun, 2009-12-13 at 12:08 -0500, Robert Borkowski wrote:
> > >> Hello,
> > >>
> > >> Is there any way to get openais/corosync working on Amazon EC2?
> > >> Multicast is not permitted there...
> > >> What I'd like to set up is a two-node cluster.
> > >>
> > >> My current attempt to get this working is to set up vtun tunnels
> > >> between the two nodes. vtun is supposed to be able to tunnel
> > >> multicast.
> > >> The two nodes have 192.168.1.1 and 192.168.1.2 on their tun0
> > >> interfaces respectively, and I'm able to pass traffic through
> > >> the tunnel.
> > >>
> > >> This is failing right now because totem won't bind to the tun0
> > >> address.
> > >> On the first node I tried setting bindnetaddr to 192.168.1.0 and
> > >> 192.168.1.1. In both cases debugging indicates 'network
> > >> interface is down' and totem binds to 127.0.0.1.
> > >> Strangely enough, when I configure it to bind on 192.168.1.2 it
> > >> does bind, but obviously that's wrong and doesn't work.
> > >>
> > >> The OS is Ubuntu Hardy Heron. I tried the openais out of the
> > >> heron repo (0.82-3ubuntu2), and built corosync from the karmic
> > >> source repo (1.0.0-5ubuntu1).
> > >> Both behave the same way.
> > >>
> > >> Any pointers?
> > >>
> > >> # ifconfig tun0
> > >> tun0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> > >>           inet addr:192.168.1.1  P-t-P:192.168.1.2  Mask:255.255.255.255
> > >>           UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1450  Metric:1
> > >>           RX packets:11 errors:0 dropped:0 overruns:0 frame:0
> > >>           TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
> > >>           collisions:0 txqueuelen:500
> > >>           RX bytes:924 (924.0 B)  TX bytes:924 (924.0 B)
> > >>
> > >> # egrep -v '#|^$' /etc/corosync/corosync.conf
> > >> totem {
> > >>         version: 2
> > >>         token: 3000
> > >>         token_retransmits_before_loss_const: 10
> > >>         join: 60
> > >>         consensus: 1500
> > >>         vsftype: none
> > >>         max_messages: 20
> > >>         clear_node_high_bit: yes
> > >>         secauth: off
> > >>         threads: 0
> > >>         rrp_mode: none
> > >>         interface {
> > >>                 ringnumber: 0
> > >>                 bindnetaddr: 192.168.1.0
> > >>                 mcastaddr: 226.94.1.1
> > >>                 mcastport: 5405
> > >>         }
> > >> }
> > >> amf {
> > >>         mode: disabled
> > >> }
> > >> service {
> > >>         ver: 0
> > >>         name: pacemaker
> > >> }
> > >> aisexec {
> > >>         user: root
> > >>         group: root
> > >> }
> > >> logging {
> > >>         fileline: off
> > >>         to_stderr: yes
> > >>         to_logfile: no
> > >>         to_syslog: yes
> > >>         syslog_facility: daemon
> > >>         debug: on
> > >>         timestamp: on
> > >>         logger_subsys {
> > >>                 subsys: AMF
> > >>                 debug: on
> > >>                 tags: enter|leave|trace1|trace2|trace3|trace4|trace6
> > >>         }
> > >> }
> > >>
> > >> # corosync -f
> > >> Dec 13 12:00:06 corosync [MAIN ] Corosync Cluster Engine ('trunk'): started and ready to provide service.
> > >> Dec 13 12:00:06 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
> > >> Dec 13 12:00:06 corosync [TOTEM ] Token Timeout (3000 ms) retransmit timeout (294 ms)
> > >> Dec 13 12:00:06 corosync [TOTEM ] token hold (225 ms) retransmits before loss (10 retrans)
> > >> Dec 13 12:00:06 corosync [TOTEM ] join (60 ms) send_join (0 ms) consensus (1500 ms) merge (200 ms)
> > >> Dec 13 12:00:06 corosync [TOTEM ] downcheck (1000 ms) fail to recv const (50 msgs)
> > >> Dec 13 12:00:06 corosync [TOTEM ] seqno unchanged const (30 rotations)
> > >> Maximum network MTU 1500
> > >> Dec 13 12:00:06 corosync [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (20 messages)
> > >> Dec 13 12:00:06 corosync [TOTEM ] send threads (0 threads)
> > >> Dec 13 12:00:06 corosync [TOTEM ] RRP token expired timeout (294 ms)
> > >> Dec 13 12:00:06 corosync [TOTEM ] RRP token problem counter (2000 ms)
> > >> Dec 13 12:00:06 corosync [TOTEM ] RRP threshold (10 problem count)
> > >> Dec 13 12:00:06 corosync [TOTEM ] RRP mode set to none.
> > >> Dec 13 12:00:06 corosync [TOTEM ] heartbeat_failures_allowed (0)
> > >> Dec 13 12:00:06 corosync [TOTEM ] max_network_delay (50 ms)
> > >> Dec 13 12:00:06 corosync [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
> > >> Dec 13 12:00:06 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> > >> Dec 13 12:00:06 corosync [TOTEM ] Receive multicast socket recv buffer size (288000 bytes).
> > >> Dec 13 12:00:06 corosync [TOTEM ] Transmit multicast socket send buffer size (262142 bytes).
> > >> Dec 13 12:00:06 corosync [TOTEM ] The network interface is down.
> > >> Dec 13 12:00:06 corosync [TOTEM ] Created or loaded sequence id 20.127.0.0.1 for this ring.
> > >> Dec 13 12:00:06 corosync [TOTEM ] entering GATHER state from 15.
> > >> Dec 13 12:00:06 corosync [SERV ] Service failed to load 'pacemaker'.
> > >> Dec 13 12:00:06 corosync [SERV ] Service initialized 'corosync extended virtual synchrony service'
> > >> Dec 13 12:00:06 corosync [SERV ] Service initialized 'corosync configuration service'
> > >> Dec 13 12:00:06 corosync [SERV ] Service initialized 'corosync cluster closed process group service v1.01'
> > >> Dec 13 12:00:06 corosync [SERV ] Service initialized 'corosync cluster config database access v1.01'
> > >> Dec 13 12:00:06 corosync [SERV ] Service initialized 'corosync profile loading service'
> > >> Dec 13 12:00:06 corosync [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
> > >> Dec 13 12:00:06 corosync [TOTEM ] Creating commit token because I am the rep.
> > >> Dec 13 12:00:06 corosync [TOTEM ] Saving state aru 0 high seq received 0
> > >> Dec 13 12:00:06 corosync [TOTEM ] Storing new sequence id for ring 18
> > >> Dec 13 12:00:06 corosync [TOTEM ] entering COMMIT state.
> > >> Dec 13 12:00:06 corosync [TOTEM ] got commit token
> > >> Dec 13 12:00:06 corosync [TOTEM ] entering RECOVERY state.
> > >> Dec 13 12:00:06 corosync [TOTEM ] position [0] member 127.0.0.1:
> > >> Dec 13 12:00:06 corosync [TOTEM ] previous ring seq 20 rep 127.0.0.1
> > >> Dec 13 12:00:06 corosync [TOTEM ] aru 0 high delivered 0 received flag 1
> > >> Dec 13 12:00:06 corosync [TOTEM ] Did not need to originate any messages in recovery.
> > >> Dec 13 12:00:06 corosync [TOTEM ] got commit token
> > >> Dec 13 12:00:06 corosync [TOTEM ] Sending initial ORF token
> > >> Dec 13 12:00:06 corosync [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
> > >>
> > >> --
> > >> Robert Borkowski

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
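As background on the "network interface is down" symptom in Robert's log: that message typically means totem could not find a local interface whose network address matches `bindnetaddr`. A minimal sketch of that netmask comparison, under the assumption that totem matches interface-address AND netmask against `bindnetaddr` (the helper name is mine; the real logic lives in corosync's totemip code):

```python
import ipaddress

def matches_bindnetaddr(if_addr: str, if_mask: str, bindnetaddr: str) -> bool:
    """Sketch of the assumed totem interface match: the interface
    address ANDed with its netmask must equal bindnetaddr.
    Hypothetical helper, not corosync's actual function."""
    network = ipaddress.ip_network(f"{if_addr}/{if_mask}", strict=False)
    return network.network_address == ipaddress.ip_address(bindnetaddr)

# tun0 above carries a /32 point-to-point mask, so the address ANDed
# with the mask is 192.168.1.1 itself; 192.168.1.0 can never match:
print(matches_bindnetaddr("192.168.1.1", "255.255.255.255", "192.168.1.0"))  # False
print(matches_bindnetaddr("192.168.1.1", "255.255.255.255", "192.168.1.1"))  # True
# On an ordinary /24 ethernet interface the usual ".0" form works:
print(matches_bindnetaddr("192.168.1.1", "255.255.255.0", "192.168.1.0"))    # True
```

This would explain the `bindnetaddr: 192.168.1.0` failure directly. Why `192.168.1.1` also fails while the peer address `192.168.1.2` binds is harder to say; one speculative reading is that for POINTOPOINT interfaces the address corosync picks up is the destination (P-t-P) address rather than the local one, which would be worth verifying against the totemip source before relying on it.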
