On 02/22/2011 04:47 AM, NAKAHIRA Kazutomo wrote:
> Hi, Steven
>
> Thank you for your speedy response.
>
> I use iptables, but it has no DROP/REJECT rules
> for the INPUT and OUTPUT chains.
>
> My iptables settings are below:
>
> [root@test1 ~]# iptables -L
> Chain INPUT (policy ACCEPT)
> target     prot opt source               destination
> ACCEPT     udp  --  anywhere             anywhere            udp dpt:domain
> ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:domain
> ACCEPT     udp  --  anywhere             anywhere            udp dpt:bootps
> ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:bootps
>
> Chain FORWARD (policy ACCEPT)
> target     prot opt source               destination
> ACCEPT     all  --  anywhere             192.168.xxx.0/24    state RELATED,ESTABLISHED
> ACCEPT     all  --  192.168.xxx.0/24     anywhere
> ACCEPT     all  --  anywhere             anywhere
> REJECT     all  --  anywhere             anywhere            reject-with icmp-port-unreachable
> REJECT     all  --  anywhere             anywhere            reject-with icmp-port-unreachable
> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-is-bridged
>
> Chain OUTPUT (policy ACCEPT)
> target     prot opt source               destination
>
> Are there any problems?
>
> BTW, I use Corosync-1.3 on RHEL6 with a 12-node cluster.
> Does anyone have a good record with Corosync-1.3 + RHEL6 in a
> large-scale cluster?
>
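When checking firewall rules on any machine in the path, it may help to know exactly which UDP ports corosync needs. According to corosync.conf(5), corosync uses mcastport for multicast receives and mcastport - 1 for sends. The Python sketch below is only an illustration: it reads a trimmed copy of the totem section quoted further down in this thread, derives the ports, and prints (without running) an iptables rule for review. The helper names `corosync_udp_ports` and `iptables_rule` are made up for this example, and the INPUT chain and rule position are assumptions that may not match your setup.

```python
import re

# Trimmed copy of the totem section quoted later in this thread.
COROSYNC_CONF = """
totem {
    version: 2
    interface {
        ringnumber: 0
        bindnetaddr: AAA.BBB.xxx.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: AAA.BBB.yyy.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}
"""


def corosync_udp_ports(conf_text):
    """Return the sorted UDP ports implied by every mcastport line."""
    ports = set()
    for match in re.finditer(r"^\s*mcastport:\s*(\d+)\s*$", conf_text, re.MULTILINE):
        port = int(match.group(1))
        # corosync.conf(5): mcastport is used for receives, mcastport - 1 for sends.
        ports.update({port - 1, port})
    return sorted(ports)


def iptables_rule(ports):
    """Build (but do not run) an iptables command accepting those UDP ports."""
    port_list = ",".join(str(p) for p in ports)
    # Assumption: the rule is inserted at the top of the INPUT chain.
    return f"iptables -I INPUT -p udp -m multiport --dports {port_list} -j ACCEPT"


if __name__ == "__main__":
    ports = corosync_udp_ports(COROSYNC_CONF)
    print("UDP ports:", ports)
    print(iptables_rule(ports))
```

The script only prints the command, so it is safe to run as a normal user; review the rule before applying it as root on each node (and on the VM host, if there is one).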
Are you running on bare metal equipment or in a VM? If you are running
in a virtual machine, iptables should also be configured properly on the
VM host.

There are significant deployments of corosync-1.2.3, which ships in
RHEL6 and is similar to 1.3.0.

I am a little stumped. A few things you can try:

turn off selinux
turn off iptables (service iptables stop)

And see if either of those solves the problem.

Since it sounds like you're building corosync on your own, try the
pre-built binaries and see if those work.

Regards
-steve

> Best Regards,
>
> (2011/02/22 3:08), Steven Dake wrote:
>> Your firewall may be enabled for the ports corosync uses to communicate.
>> Newer versions of corosync have a diag that tells the user this may be
>> a problem for them.
>>
>> The firewall needs to be configured properly if it is enabled (which it
>> is by default in RHEL/FEDORA). In a RHEL environment, this can be done
>> via the system->preferences->firewall GUI or by adding your own iptables
>> rules.
>>
>> Regards
>> -steve
>>
>> On 02/21/2011 12:56 AM, NAKAHIRA Kazutomo wrote:
>>> Hi, all
>>>
>>> # This problem is related to the following previous thread, and we
>>> use the same test environment.
>>> https://lists.linux-foundation.org/pipermail/openais/2011-February/015673.html
>>>
>>>
>>> The start process of corosync fell into an infinite loop
>>> in my test environment.
>>>
>>> The corosync process wrote many of the following log lines to the
>>> debug logfile and the start-up process stalled.
>>>
>>> -- ha-debug --
>>> Feb 21 15:39:46 node1 corosync[19268]: [TOTEM ] totemsrp.c:1852 entering GATHER state from 11.
>>> -- ha-debug --
>>>
>>> It seems that all nodes are sending a lot of "join messages" and
>>> they have no way out of the GATHER state.
>>>
>>> Is this loop expected operation?
>>>
>>> The backtrace of the corosync process is:
>>> (gdb) bt
>>> #0  0x00000031bdca6a8d in nanosleep () from /lib64/libc.so.6
>>> #1  0x00000031bdcda904 in usleep () from /lib64/libc.so.6
>>> #2  0x000000351ae11245 in memb_join_message_send (instance=0x7f81483aa010)
>>>     at totemsrp.c:2959
>>> #3  0x000000351ae13aeb in memb_state_gather_enter (instance=0x7f81483aa010,
>>>     gather_from=11) at totemsrp.c:1815
>>> #4  0x000000351ae16e22 in memb_join_process (instance=0x7f81483aa010,
>>>     memb_join=0x232e6c8) at totemsrp.c:3997
>>> #5  0x000000351ae175a9 in message_handler_memb_join (instance=0x7f81483aa010,
>>>     msg=<value optimized out>, msg_len=<value optimized out>,
>>>     endian_conversion_needed=<value optimized out>) at totemsrp.c:4161
>>> #6  0x000000351ae0e9a4 in rrp_deliver_fn (context=0x23022e0, msg=0x232e6c8,
>>>     msg_len=596) at totemrrp.c:1511
>>> #7  0x000000351ae0b4d6 in net_deliver_fn (handle=<value optimized out>,
>>>     fd=<value optimized out>, revents=<value optimized out>, data=0x232e020)
>>>     at totemudp.c:1244
>>> #8  0x000000351ae07202 in poll_run (handle=1265737887312248832)
>>>     at coropoll.c:510
>>> #9  0x0000000000406cfd in main (argc=<value optimized out>,
>>>     argv=<value optimized out>, envp=<value optimized out>) at main.c:1813
>>>
>>>
>>> Our test environment is:
>>> RHEL6(kernel 2.6.32-71.14.1.el6.x86_64)
>>> Corosync-1.3.0-1
>>> Pacemaker-1.0.10-1
>>> cluster-glue-1.0.6-1
>>> resource-agents-1.0.3-1
>>>
>>>
>>> corosync.conf is:
>>> -- corosync.conf --
>>> compatibility: whitetank
>>>
>>> aisexec {
>>>     user: root
>>>     group: root
>>> }
>>>
>>> service {
>>>     name: pacemaker
>>>     ver: 0
>>> }
>>>
>>> totem {
>>>     version: 2
>>>     secauth: off
>>>     rrp_mode: active
>>>     token: 16000
>>>     consensus: 20000
>>>     clear_node_high_bit: yes
>>>     rrp_problem_count_timeout: 30000
>>>     fail_recv_const: 50
>>>     send_join: 10
>>>     interface {
>>>         ringnumber: 0
>>>         bindnetaddr: AAA.BBB.xxx.0
>>>         mcastaddr: 226.94.1.1
>>>         mcastport: 5405
>>>     }
>>>     interface {
>>>         ringnumber: 1
>>>         bindnetaddr: AAA.BBB.yyy.0
>>>         mcastaddr: 226.94.1.1
>>>         mcastport: 5405
>>>     }
>>> }
>>>
>>> logging {
>>>     fileline: on
>>>     to_syslog: yes
>>>     syslog_facility: local1
>>>     syslog_priority: info
>>>     debug: on
>>>     timestamp: on
>>> }
>>> -- corosync.conf --
>>>
>>> We tried "fail_recv_const: 5000" and it reduced the incidence of the
>>> problem, but the corosync start-up problem still occurs.
>>>
>>> If "send_join: 10" is not set, the large number of multicast packets
>>> congests the network and other network communication is blocked.
>>>
>>>
>>> Best Regards,
>>>
>>
>
_______________________________________________
Openais mailing list
https://lists.linux-foundation.org/mailman/listinfo/openais
