Add one for --sport (the source port) as well and I think you'll be OK. Actually, since my cluster traffic all goes over a separate switch, I usually just allow all traffic in and out of eth1.
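A minimal sketch of what those two options might look like as rules, offered as an illustration rather than a tested ruleset. Assumptions: `--sport` is the iptables spelling of the source-port match, 7777 is the ip_port from the cluster.conf quoted below, and eth1 stands in for whichever interface carries the cluster traffic (the tcpdump quoted below was taken on eth0). Using `-I` instead of `-A` is also an assumption: on a stock RHEL 5 ruleset, INPUT typically jumps to an RH-Firewall-1-INPUT chain that ends in a REJECT, so a rule appended to INPUT may never be reached. The `run` helper is hypothetical and only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Dry run by default: rules are printed, not applied.
# Set APPLY=1 (as root) to actually run iptables.
PORT=7777     # ip_port from cluster.conf
IFACE=eth1    # assumed dedicated cluster interface; adjust to your setup

# Hypothetical helper: execute the command or just echo it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Option 1: accept o2net traffic to and from the cluster port.
port_rules() {
    run iptables -I INPUT -p tcp --dport "$PORT" -j ACCEPT
    run iptables -I INPUT -p tcp --sport "$PORT" -j ACCEPT
}

# Option 2: trust everything on the private interconnect.
iface_rules() {
    run iptables -I INPUT  -i "$IFACE" -j ACCEPT
    run iptables -I OUTPUT -o "$IFACE" -j ACCEPT
}

port_rules
```

If the printed rules look right for your setup, run it with APPLY=1 on both nodes and follow with `service iptables save` to keep the rules across restarts.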
Brian

Bret Palsson <b...@getjive.com> 2009-01-15 08:12:
> So it looks like iptables is what is stopping it from working. After
> disabling iptables completely for 1 minute then trying to mount on node 1
> it worked fine.
>
> So my new question is why did
> `iptables -A INPUT -ptcp --dport 7777 -j ACCEPT ; service iptables save`
> not allow ocfs2 to talk? What do people add to their iptables?
>
> -Bret
>
> On Jan 14, 2009, at 4:50 PM, Sunil Mushran wrote:
> > It's part and parcel of the fs. If you want mainline linux,
> > go to [1]http://kernel.org.
> >
> > Bret Palsson wrote:
> > Can I get the source for DLM 1.5.0 and build it on my other machines?
> > If so where do I grab it?
> > Thanks,
> > Bret
> >
> > On Jan 14, 2009, at 4:28 PM, Sunil Mushran wrote:
> > I hate cut-paste's because I have no idea whether I can trust it
> > or not. A misspelled 0 and 1 makes a whole world of difference.
> > But the following seems to indicate that the configuration is bad.
> >
> > (3130,1):o2net_connect_expired:1659 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.
> > (4670,1):dlm_request_join:1033 ERROR: status = -107
> > (4670,1):dlm_try_to_join_domain:1207 ERROR: status = -107
> > (4670,1):dlm_join_domain:1485 ERROR: status = -107
> > (4670,1):dlm_register_domain:1732 ERROR: status = -107
> > (4670,1):o2cb_cluster_connect:302 ERROR: status = -107
> > (4670,1):ocfs2_dlm_init:2753 ERROR: status = -107
> > (4670,1):ocfs2_mount_volume:1274 ERROR: status = -107
> > ocfs2: Unmounting device (253,2) on (node 0)
> >
> > Why is the mount failing on node 0? I thought it was mounted on
> > node 0?
> >
> > Maybe best if you file a bugzilla and attach the /var/log/messages
> > of both nodes. Indicate the time you did the mount.
> > Sunil
> >
> > Bret Palsson wrote:
> > Output of Node 0 {
> > OCFS2 Node Manager 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 0f78045c75c0174e50e4cf0934bf9eae)
> > OCFS2 DLM 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 4ce8fae327880c466761f40fb7619490)
> > OCFS2 DLMFS 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 4ce8fae327880c466761f40fb7619490)
> > OCFS2 User DLM kernel interface loaded
> > SELinux: initialized (dev ocfs2_dlmfs, type ocfs2_dlmfs), not configured for labeling
> > eth3: no IPv6 routers present
> > OCFS2 1.4.1 Tue Dec 16 19:18:02 PST 2008 (build 3fc82af4b5669945497b322b6aabd031)
> > ocfs2_dlm: Nodes in domain ("8B2CCF82F1BA4A70B587580B23D9D7F7"): 0
> > kjournald starting. Commit interval 5 seconds
> > ocfs2: Mounting device (253,3) on (node 0, slot 0) with ordered data mode.
> > SELinux: initialized (dev dm-3, type ocfs2), not configured for labeling
> > ocfs2_dlm: Nodes in domain ("222B65A090D6477481AD30DE9FCE7961"): 0
> > kjournald starting. Commit interval 5 seconds
> > ocfs2: Mounting device (253,2) on (node 0, slot 0) with ordered data mode.
> > SELinux: initialized (dev dm-2, type ocfs2), not configured for labeling
> > ocfs2_dlm: Nodes in domain ("0425C0367AF547E989864A46F3DBD6E6"): 0
> > kjournald starting. Commit interval 5 seconds
> > ocfs2: Mounting device (253,4) on (node 0, slot 0) with ordered data mode.
> > SELinux: initialized (dev dm-4, type ocfs2), not configured for labeling
> > }
> > Output of Node 1 {
> > OCFS2 Node Manager 1.5.0
> > OCFS2 DLM 1.5.0
> > ocfs2: Registered cluster interface o2cb
> > OCFS2 DLMFS 1.5.0
> > OCFS2 User DLM kernel interface loaded
> > device eth0 entered promiscuous mode
> > OCFS2 1.5.0
> > }
> >
> > On Jan 14, 2009, at 3:58 PM, Sunil Mushran wrote:
> > What about the dmesg on node 1?
> >
> > Now ideally we want the fs versions to be the same on all nodes.
> > However as we have not changed the protocol since 1.4.1, this
> > should still work.
> >
> > Bret Palsson wrote:
> > node 0 (and FS) OCFS2 1.4.1 2.6.18-92.1.22.el5xen
> > node 1 OCFS2 1.5 2.6.28-vs2.3.0.36.4
> >
> > Output of Node 1 {
> > OCFS2 Node Manager 1.5.0
> > OCFS2 DLM 1.5.0
> > ocfs2: Registered cluster interface o2cb
> > OCFS2 DLMFS 1.5.0
> > OCFS2 User DLM kernel interface loaded
> > device eth0 entered promiscuous mode
> > OCFS2 1.5.0
> > }
> >
> > On Jan 14, 2009, at 1:41 PM, Sunil Mushran wrote:
> > versions? kernel and fs.
> >
> > Bret Palsson wrote:
> > Does anyone have any idea what to try next? Here are the steps I have
> > taken and the problem: (I wanted to post my question on the first
> > line before I explained the problem and what I have tried)
> > ----------
> > Node 0 has the file system mounted just fine and works great.
> >
> > When trying to mount on Node 1:
> > `mount.ocfs2 /dev/mapper/data /cluster/data`
> > I get this error after about 30 seconds:
> >
> > mount.ocfs2: Transport endpoint is not connected while mounting
> > /dev/mapper/data on /cluster/data. Check 'dmesg' for more information
> > on this error.
> >
> > Here is the output of dmesg:
> > (3130,1):o2net_connect_expired:1659 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.
> > (4670,1):dlm_request_join:1033 ERROR: status = -107
> > (4670,1):dlm_try_to_join_domain:1207 ERROR: status = -107
> > (4670,1):dlm_join_domain:1485 ERROR: status = -107
> > (4670,1):dlm_register_domain:1732 ERROR: status = -107
> > (4670,1):o2cb_cluster_connect:302 ERROR: status = -107
> > (4670,1):ocfs2_dlm_init:2753 ERROR: status = -107
> > (4670,1):ocfs2_mount_volume:1274 ERROR: status = -107
> > ocfs2: Unmounting device (253,2) on (node 0)
> > (3130,0):o2net_connect_expired:1659 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.
> > (5558,1):dlm_request_join:1033 ERROR: status = -107
> > (5558,1):dlm_try_to_join_domain:1207 ERROR: status = -107
> > (5558,1):dlm_join_domain:1485 ERROR: status = -107
> > (5558,1):dlm_register_domain:1732 ERROR: status = -107
> > (5558,1):o2cb_cluster_connect:302 ERROR: status = -107
> > (5558,1):ocfs2_dlm_init:2753 ERROR: status = -107
> > (5558,1):ocfs2_mount_volume:1274 ERROR: status = -107
> > ocfs2: Unmounting device (253,2) on (node 0)
> >
> > So I figured that it must be a firewall issue. I first disabled
> > iptables on both machines and got the same results, so I started
> > iptables, adding an exception on both machines:
> > `iptables -A INPUT -p tcp --dport 7777 -j ACCEPT ; service iptables save`
> >
> > The machines can ping each other, and they have the exact same config:
> >
> > cluster:
> >     node_count = 2
> >     name = ocfs2
> > node:
> >     ip_port = 7777
> >     ip_address = 10.128.255.3
> >     number = 0
> >     name = m3.c12.jiveip.net
> >     cluster = ocfs2
> > node:
> >     ip_port = 7777
> >     ip_address = 10.128.7.33
> >     number = 1
> >     name = pbx_33.c12.jiveip.net
> >     cluster = ocfs2
> >
> > I then decided to use tcpdump to see what's up (on both machines):
> > `tcpdump -i eth0 port 7777 -v`
> >
> > Here is a TCP dump showing port 7777 is not blocked (I added an
> > exception in iptables)
> >
> > (Node 0)
> > 13:13:11.711539 IP (tos 0x0, ttl 64, id 18286, offset 0, flags [DF], proto: TCP (6), length: 60)
> >     10.128.7.33.47601 > 10.128.255.3.cbt: S, cksum 0xd272 (correct), 3820380795:3820380795(0) win 5840 <mss 1460,sackOK,timestamp 4294911253 0,nop,wscale 6>
> > 13:13:14.710703 IP (tos 0x0, ttl 64, id 18287, offset 0, flags [DF], proto: TCP (6), length: 60)
> >     10.128.7.33.47601 > 10.128.255.3.cbt: S, cksum 0xc6ba (correct), 3820380795:3820380795(0) win 5840 <mss 1460,sackOK,timestamp 4294914253 0,nop,wscale 6>
> > 13:13:14.711213 IP (tos 0x0, ttl 64, id 2241, offset 0, flags [DF], proto: TCP (6), length: 60)
> >     10.128.7.33.54763 > 10.128.255.3.cbt: S, cksum 0xd2ae (correct), 3862378508:3862378508(0) win 5840 <mss 1460,sackOK,timestamp 4294914253 0,nop,wscale 6>
> >
> > (Node 1)
> > 13:13:09.956999 IP (tos 0x0, ttl 64, id 18286, offset 0, flags [DF], proto: TCP (6), length: 60)
> >     10.128.7.33.47601 > 10.128.255.3.cbt: S, cksum 0xd272 (correct), 3820380795:3820380795(0) win 5840 <mss 1460,sackOK,timestamp 4294911253 0,nop,wscale 6>
> > 13:13:12.956999 IP (tos 0x0, ttl 64, id 18287, offset 0, flags [DF], proto: TCP (6), length: 60)
> >     10.128.7.33.47601 > 10.128.255.3.cbt: S, cksum 0xc6ba (correct), 3820380795:3820380795(0) win 5840 <mss 1460,sackOK,timestamp 4294914253 0,nop,wscale 6>
> > 13:13:12.956999 IP (tos 0x0, ttl 64, id 2241, offset 0, flags [DF], proto: TCP (6), length: 60)
> >     10.128.7.33.54763 > 10.128.255.3.cbt: S, cksum 0xd2ae (correct), 3862378508:3862378508(0) win 5840 <mss 1460,sackOK,timestamp 4294914253 0,nop,wscale 6>
>
> References
>
> Visible links
> 1. http://kernel.org/
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users@oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users