Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected while mounting....

2009-01-15 Thread Brian Kroth
Add one for --sport (source port) as well and I think you'll be OK.  Actually, since
my cluster traffic all goes over a separate switch, I usually just allow
all traffic in/out of eth1.

Brian
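
As a rough illustration of the two options Brian describes, the sketch below only builds iptables argument lists and prints them; it does not touch the firewall. `build_rules` is a hypothetical helper, and the port value 7777 is an example only, because the port number in this thread did not survive archiving. `--sport` is the iptables spelling of the source-port match.

```python
def build_rules(port=None, iface=None, op="-A"):
    """Return iptables argument lists; nothing is executed.

    port  -- cluster TCP port (the value used below is only an example)
    iface -- if given, accept everything on that interface instead
    op    -- "-A" appends to the chain (as in the thread), "-I" inserts at the top
    """
    if iface is not None:
        # Option 2: dedicated cluster interface, allow all traffic in and out.
        return [
            ["iptables", op, "INPUT", "-i", iface, "-j", "ACCEPT"],
            ["iptables", op, "OUTPUT", "-o", iface, "-j", "ACCEPT"],
        ]
    if port is None:
        raise ValueError("need a port or an interface")
    # Option 1: match the cluster port both as destination and as source.
    return [
        ["iptables", op, "INPUT", "-p", "tcp", "--dport", str(port), "-j", "ACCEPT"],
        ["iptables", op, "INPUT", "-p", "tcp", "--sport", str(port), "-j", "ACCEPT"],
    ]


if __name__ == "__main__":
    for rule in build_rules(port=7777) + build_rules(iface="eth1"):
        print(" ".join(rule))
```

Printing the commands first makes it easy to review them before running anything as root.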

Bret Palsson b...@getjive.com 2009-01-15 08:12:
So it looks like iptables is what is stopping it from working. After
disabling iptables completely for 1 minute and then trying to mount on node 1,
it worked fine.
So my new question is: why did `iptables -A INPUT -p tcp --dport  -j
ACCEPT ; service iptables save` not allow ocfs2 to talk?  What do people
add to their iptables?
-Bret
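
One possible explanation, offered as an assumption and not as a diagnosis of these machines: `-A` appends a rule to the end of the chain, and iptables acts on the first rule that matches with a terminating target. If an earlier rule already rejects the packet, an appended ACCEPT is never reached, whereas `-I` inserts at the top. The toy model below shows only that ordering effect; the chain contents and the port value are invented for illustration.

```python
def first_match(chain, packet):
    """Return the target of the first rule whose fields all match the packet."""
    for match, target in chain:
        if all(packet.get(key) == value for key, value in match.items()):
            return target
    return "POLICY"  # fell off the end; the chain's default policy would apply


# Hypothetical existing chain: allow ssh, then reject everything else.
base = [
    ({"proto": "tcp", "dport": 22}, "ACCEPT"),
    ({}, "REJECT"),  # an empty match applies to every packet
]

PORT = 7777  # example value only
accept = ({"proto": "tcp", "dport": PORT}, "ACCEPT")
packet = {"proto": "tcp", "dport": PORT}

appended = base + [accept]   # models: iptables -A INPUT ... -j ACCEPT
inserted = [accept] + base   # models: iptables -I INPUT ... -j ACCEPT

if __name__ == "__main__":
    print("appended:", first_match(appended, packet))  # REJECT
    print("inserted:", first_match(inserted, packet))  # ACCEPT
```

On a real host, `iptables -L INPUT -n --line-numbers` shows where a rule actually landed in the chain.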

Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected while mounting....

2009-01-14 Thread Bret Palsson
node 0 (and FS) OCFS2 1.4.1 2.6.18-92.1.22.el5xen
node 1 OCFS2 1.5 2.6.28-vs2.3.0.36.4

Output of Node 1 {
OCFS2 Node Manager 1.5.0
OCFS2 DLM 1.5.0
ocfs2: Registered cluster interface o2cb
OCFS2 DLMFS 1.5.0
OCFS2 User DLM kernel interface loaded
device eth0 entered promiscuous mode
OCFS2 1.5.0
}
On Jan 14, 2009, at 1:41 PM, Sunil Mushran wrote:

 versions? kernel and fs.

 Bret Palsson wrote:
 Does anyone have any idea what to try next? Here are the steps I have
 taken and the problem: (I wanted to post my question on the first
 line before I explained the problem and what I have tried)

 --

 Node 0 has the file system mounted just fine and works great.

 When trying to mount on Node 1: `mount.ocfs2 /dev/mapper/data /cluster/data`
 I get this error after about 30 seconds: mount.ocfs2: Transport
 endpoint is not connected while mounting /dev/mapper/data on
 /cluster/data. Check 'dmesg' for more information on this error.


 Here is the output of dmesg:
 (3130,1):o2net_connect_expired:1659 ERROR: no connection established
 with node 0 after 30.0 seconds, giving up and returning errors.
 (4670,1):dlm_request_join:1033 ERROR: status = -107
 (4670,1):dlm_try_to_join_domain:1207 ERROR: status = -107
 (4670,1):dlm_join_domain:1485 ERROR: status = -107
 (4670,1):dlm_register_domain:1732 ERROR: status = -107
 (4670,1):o2cb_cluster_connect:302 ERROR: status = -107
 (4670,1):ocfs2_dlm_init:2753 ERROR: status = -107
 (4670,1):ocfs2_mount_volume:1274 ERROR: status = -107
 ocfs2: Unmounting device (253,2) on (node 0)
 (3130,0):o2net_connect_expired:1659 ERROR: no connection established
 with node 0 after 30.0 seconds, giving up and returning errors.
 (5558,1):dlm_request_join:1033 ERROR: status = -107
 (5558,1):dlm_try_to_join_domain:1207 ERROR: status = -107
 (5558,1):dlm_join_domain:1485 ERROR: status = -107
 (5558,1):dlm_register_domain:1732 ERROR: status = -107
 (5558,1):o2cb_cluster_connect:302 ERROR: status = -107
 (5558,1):ocfs2_dlm_init:2753 ERROR: status = -107
 (5558,1):ocfs2_mount_volume:1274 ERROR: status = -107
 ocfs2: Unmounting device (253,2) on (node 0)
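
For readers unfamiliar with the `status = -107` lines: kernel functions return negative errno values, and on Linux 107 is ENOTCONN, which is the same "Transport endpoint is not connected" text that mount.ocfs2 prints. A minimal sketch for decoding such a status follows; `decode_status` is a hypothetical helper, and errno numbers are platform-specific, so run it on Linux to match the logs above.

```python
import errno
import os


def decode_status(status):
    """Map a negative kernel status such as -107 to (symbolic name, message)."""
    code = abs(status)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


if __name__ == "__main__":
    # On Linux this reports ENOTCONN for -107.
    print(decode_status(-107))
```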


 So I figured that it must be a firewall issue. I first disabled
 iptables on both machines and got the same results, so I started
 iptables again, adding an exception on both machines:
 `iptables -A INPUT -p tcp --dport  -j ACCEPT ; service iptables save`

 The machines can ping each other, and they have the exact same
 config:
 cluster:
  node_count = 2
  name = ocfs2
 node:
  ip_port = 
  ip_address = 10.128.255.3
  number = 0
  name = m3.c12.jiveip.net
  cluster = ocfs2
 node:
  ip_port = 
  ip_address = 10.128.7.33
  number = 1
  name = pbx_33.c12.jiveip.net
  cluster = ocfs2
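
Since the `ip_port` values above did not survive archiving, a quick sanity check of the config file is worth doing on each node. The sketch below is a hypothetical checker, not part of the OCFS2 tools: it parses the `stanza:` and `key = value` layout shown above and flags an empty `ip_port`, duplicate node numbers, or a `node_count` that disagrees with the number of node stanzas. The port 7777 in the sample text is an assumed example value.

```python
SAMPLE = """\
cluster:
 node_count = 2
 name = ocfs2
node:
 ip_port = 7777
 ip_address = 10.128.255.3
 number = 0
 name = m3.c12.jiveip.net
 cluster = ocfs2
node:
 ip_port = 7777
 ip_address = 10.128.7.33
 number = 1
 name = pbx_33.c12.jiveip.net
 cluster = ocfs2
"""


def parse_cluster_conf(text):
    """Parse 'stanza:' headers followed by 'key = value' lines into dicts."""
    stanzas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            current = {"_type": line[:-1]}
            stanzas.append(current)
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return stanzas


def check_nodes(stanzas):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    nodes = [s for s in stanzas if s["_type"] == "node"]
    for node in nodes:
        if not node.get("ip_port"):
            problems.append("node %s: ip_port is empty" % node.get("name"))
    numbers = [n.get("number") for n in nodes]
    if len(set(numbers)) != len(numbers):
        problems.append("duplicate node numbers")
    for cluster in (s for s in stanzas if s["_type"] == "cluster"):
        if cluster.get("node_count") != str(len(nodes)):
            problems.append("node_count does not match the number of node stanzas")
    return problems


if __name__ == "__main__":
    print(check_nodes(parse_cluster_conf(SAMPLE)))
```

Running the same check against the file from both nodes and comparing the parsed output is a cheap way to confirm they really are identical.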


 I then decided to use tcpdump to see what's up (on both machines):
 `tcpdump -i eth0 port  -v`

 Here is a tcpdump capture showing port  is not blocked (I added an
 exception in iptables)
 (Node 0)
 13:13:11.711539 IP (tos 0x0, ttl  64, id 18286, offset 0, flags [DF],
 proto: TCP (6), length: 60) 10.128.7.33.47601 > 10.128.255.3.cbt: S,
 cksum 0xd272 (correct), 3820380795:3820380795(0) win 5840 <mss
 1460,sackOK,timestamp 4294911253 0,nop,wscale 6>
 13:13:14.710703 IP (tos 0x0, ttl  64, id 18287, offset 0, flags [DF],
 proto: TCP (6), length: 60) 10.128.7.33.47601 > 10.128.255.3.cbt: S,
 cksum 0xc6ba (correct), 3820380795:3820380795(0) win 5840 <mss
 1460,sackOK,timestamp 4294914253 0,nop,wscale 6>
 13:13:14.711213 IP (tos 0x0, ttl  64, id 2241, offset 0, flags [DF],
 proto: TCP (6), length: 60) 10.128.7.33.54763 > 10.128.255.3.cbt: S,
 cksum 0xd2ae (correct), 3862378508:3862378508(0) win 5840 <mss
 1460,sackOK,timestamp 4294914253 0,nop,wscale 6>

 (Node 1)
 13:13:09.956999 IP (tos 0x0, ttl  64, id 18286, offset 0, flags [DF],
 proto: TCP (6), length: 60) 10.128.7.33.47601 > 10.128.255.3.cbt: S,
 cksum 0xd272 (correct), 3820380795:3820380795(0) win 5840 <mss
 1460,sackOK,timestamp 4294911253 0,nop,wscale 6>
 13:13:12.956999 IP (tos 0x0, ttl  64, id 18287, offset 0, flags [DF],
 proto: TCP (6), length: 60) 10.128.7.33.47601 > 10.128.255.3.cbt: S,
 cksum 0xc6ba (correct), 3820380795:3820380795(0) win 5840 <mss
 1460,sackOK,timestamp 4294914253 0,nop,wscale 6>
 13:13:12.956999 IP (tos 0x0, ttl  64, id 2241, offset 0, flags [DF],
 proto: TCP (6), length: 60) 10.128.7.33.54763 > 10.128.255.3.cbt: S,
 cksum 0xd2ae (correct), 3862378508:3862378508(0) win 5840 <mss
 1460,sackOK,timestamp 4294914253 0,nop,wscale 6>
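
Both captures show only SYN segments (flag `S`) from 10.128.7.33 and no reply from 10.128.255.3, so a successful ping does not prove that the TCP port is reachable. A minimal sketch of a TCP-level check follows; `tcp_reachable` is a hypothetical helper, and it is only meaningful if the cluster stack on the target node is already listening on the configured port.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

From node 1 this could be called as `tcp_reachable("10.128.255.3", port)`, with `port` taken from `ip_port` in the cluster config. A result of False with the firewall up and True with it down points at the firewall rules and not at OCFS2 itself.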





 ___
 Ocfs2-devel mailing list
 ocfs2-de...@oss.oracle.com
 http://oss.oracle.com/mailman/listinfo/ocfs2-devel




___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users


Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected while mounting....

2009-01-14 Thread Sunil Mushran
What about the dmesg on node 1?

Now ideally we want the fs versions to be the same on all nodes.
However as we have not changed the protocol since 1.4.1, this
should still work.




Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected while mounting....

2009-01-14 Thread Bret Palsson
Output of Node 0 {

OCFS2 Node Manager 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 0f78045c75c0174e50e4cf0934bf9eae)
OCFS2 DLM 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 4ce8fae327880c466761f40fb7619490)
OCFS2 DLMFS 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 4ce8fae327880c466761f40fb7619490)
OCFS2 User DLM kernel interface loaded
SELinux: initialized (dev ocfs2_dlmfs, type ocfs2_dlmfs), not configured for labeling
eth3: no IPv6 routers present
OCFS2 1.4.1 Tue Dec 16 19:18:02 PST 2008 (build 3fc82af4b5669945497b322b6aabd031)
ocfs2_dlm: Nodes in domain (8B2CCF82F1BA4A70B587580B23D9D7F7): 0
kjournald starting.  Commit interval 5 seconds
ocfs2: Mounting device (253,3) on (node 0, slot 0) with ordered data mode.
SELinux: initialized (dev dm-3, type ocfs2), not configured for labeling
ocfs2_dlm: Nodes in domain (222B65A090D6477481AD30DE9FCE7961): 0
kjournald starting.  Commit interval 5 seconds
ocfs2: Mounting device (253,2) on (node 0, slot 0) with ordered data mode.
SELinux: initialized (dev dm-2, type ocfs2), not configured for labeling
ocfs2_dlm: Nodes in domain (0425C0367AF547E989864A46F3DBD6E6): 0
kjournald starting.  Commit interval 5 seconds
ocfs2: Mounting device (253,4) on (node 0, slot 0) with ordered data mode.
SELinux: initialized (dev dm-4, type ocfs2), not configured for labeling
}

Output of Node 1 {
OCFS2 Node Manager 1.5.0
OCFS2 DLM 1.5.0
ocfs2: Registered cluster interface o2cb
OCFS2 DLMFS 1.5.0
OCFS2 User DLM kernel interface loaded
device eth0 entered promiscuous mode
OCFS2 1.5.0
}



Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected while mounting....

2009-01-14 Thread Michael Moody
I know it sounds stupid,

I had this error, and similar dmesg output when I simply didn't have the 
mountpoint existing (in my case, I mount /dev/sdc1 to /mnt/www, and /mnt/www 
didn't exist, I had the same output). It's worth checking at least, though I'm 
sure you already have.

Michael

Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected while mounting....

2009-01-14 Thread Sunil Mushran
I hate cut-paste's because I have no idea whether I can trust it
or not. A misspelled 0 and 1 makes a whole world of difference.

But the following seems to indicate that the configuration is bad.

(3130,1):o2net_connect_expired:1659 ERROR: no connection established
with node 0 after 30.0 seconds, giving up and returning errors.
(4670,1):dlm_request_join:1033 ERROR: status = -107
(4670,1):dlm_try_to_join_domain:1207 ERROR: status = -107
(4670,1):dlm_join_domain:1485 ERROR: status = -107
(4670,1):dlm_register_domain:1732 ERROR: status = -107
(4670,1):o2cb_cluster_connect:302 ERROR: status = -107
(4670,1):ocfs2_dlm_init:2753 ERROR: status = -107
(4670,1):ocfs2_mount_volume:1274 ERROR: status = -107
ocfs2: Unmounting device (253,2) on (node 0)

Why is the mount failing on node 0? I thought it was mounted on
node 0?

Maybe best if you file a bugzilla and attach the /var/log/messages
of both nodes. Indicate the time you did the mount.

Sunil


Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected while mounting....

2009-01-14 Thread Sunil Mushran
AFAIR, mount will typically error out with something like mountpoint
does not exist. It should.

Michael Moody wrote:

 I know it sounds stupid,

 I had this error, and similar dmesg output when I simply didn’t have 
 the mountpoint existing (in my case, I mount /dev/sdc1 to /mnt/www, 
 and /mnt/www didn’t exist, I had the same output). It’s worth checking 
 at least, though I’m sure you already have.

 Michael




Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected while mounting....

2009-01-14 Thread Michael Moody
Well, the last time this happened to me, the error was not "mountpoint does not
exist". But, as that's been a while, it's very possible that it does now.

Michael

-Original Message-
From: Sunil Mushran [mailto:sunil.mush...@oracle.com]
Sent: Wednesday, January 14, 2009 4:30 PM
To: Michael Moody
Cc: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected 
while mounting

AFAIR, mount will typically error out with something like mountpoint
does not exist. It should.

Michael Moody wrote:

 I know it sounds stupid,

 I had this error, and similar dmesg output when I simply didn't have
 the mountpoint existing (in my case, I mount /dev/sdc1 to /mnt/www,
 and /mnt/www didn't exist, I had the same output). It's worth checking
 at least, though I'm sure you already have.

 Michael




Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected while mounting....

2009-01-14 Thread Bret Palsson
Can I get the source for DLM 1.5.0 and build it on my other machines?  
If so where do I grab it?

Thanks,

Bret


Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected while mounting....

2009-01-14 Thread Sunil Mushran
It's part and parcel of the fs. If you want mainline linux,
go to http://kernel.org.

Bret Palsson wrote:
 Can I get the source for DLM 1.5.0 and build it on my other machines? 
 If so where do I grab it?

 Thanks,

 Bret

 On Jan 14, 2009, at 4:28 PM, Sunil Mushran wrote:

 I hate cut-paste's because I have no idea whether I can trust it
 or not. A misspelled 0 and 1 makes a whole world of difference.

 But the following seems to indicate that the configuration is bad.

 (3130,1):o2net_connect_expired:1659 ERROR: no connection established
 with node 0 after 30.0 seconds, giving up and returning errors.
 (4670,1):dlm_request_join:1033 ERROR: status = -107
 (4670,1):dlm_try_to_join_domain:1207 ERROR: status = -107
 (4670,1):dlm_join_domain:1485 ERROR: status = -107
 (4670,1):dlm_register_domain:1732 ERROR: status = -107
 (4670,1):o2cb_cluster_connect:302 ERROR: status = -107
 (4670,1):ocfs2_dlm_init:2753 ERROR: status = -107
 (4670,1):ocfs2_mount_volume:1274 ERROR: status = -107
 ocfs2: Unmounting device (253,2) on (node 0)
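
The `status = -107` repeated in the log above is a negated errno value. On Linux, errno 107 is ENOTCONN, whose message is the same "Transport endpoint is not connected" text that mount.ocfs2 prints. A minimal sketch for decoding such status values follows; `describe_status` is a hypothetical helper, not part of any OCFS2 tool, and the numeric mapping depends on the platform the script runs on.

```python
import errno
import os


def describe_status(status: int) -> str:
    """Map a negative kernel status (e.g. -107) to its errno name and message.

    Hypothetical helper for reading dmesg output; the number-to-name mapping
    comes from the local platform's errno table.
    """
    code = abs(status)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{status}: {name} ({os.strerror(code)})"


# On a Linux host this is expected to report ENOTCONN for -107.
print(describe_status(-107))
```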

 Why is the mount failing on node 0? I thought it was mounted on
 node 0?

 Maybe best if you file a bugzilla and attach the /var/log/messages
 of both nodes. Indicate the time you did the mount.
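
Since o2net reports that no connection to node 0 could be established, one quick check before filing the bug is whether a plain TCP connection from node 1 to node 0's cluster address and port succeeds at all. The sketch below assumes the address and port are taken from the node's entry in /etc/ocfs2/cluster.conf (`ip_address` and `ip_port`); `can_connect` is a hypothetical helper name.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: host and port should be the values listed for the
    remote node in /etc/ocfs2/cluster.conf. A False result means the connect
    was refused, timed out, or was otherwise blocked (for example by a
    firewall rule), but it does not say which.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run it on each node against the other node's address. If it fails in one direction only, a firewall or routing problem on that path is a likely candidate.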

 Sunil

 Bret Palsson wrote:
 Output of Node 0 {

 OCFS2 Node Manager 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 0f78045c75c0174e50e4cf0934bf9eae)
 OCFS2 DLM 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 4ce8fae327880c466761f40fb7619490)
 OCFS2 DLMFS 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 4ce8fae327880c466761f40fb7619490)
 OCFS2 User DLM kernel interface loaded
 SELinux: initialized (dev ocfs2_dlmfs, type ocfs2_dlmfs), not configured for labeling
 eth3: no IPv6 routers present
 OCFS2 1.4.1 Tue Dec 16 19:18:02 PST 2008 (build 3fc82af4b5669945497b322b6aabd031)
 ocfs2_dlm: Nodes in domain (8B2CCF82F1BA4A70B587580B23D9D7F7): 0
 kjournald starting.  Commit interval 5 seconds
 ocfs2: Mounting device (253,3) on (node 0, slot 0) with ordered data mode.
 SELinux: initialized (dev dm-3, type ocfs2), not configured for labeling
 ocfs2_dlm: Nodes in domain (222B65A090D6477481AD30DE9FCE7961): 0
 kjournald starting.  Commit interval 5 seconds
 ocfs2: Mounting device (253,2) on (node 0, slot 0) with ordered data mode.
 SELinux: initialized (dev dm-2, type ocfs2), not configured for labeling
 ocfs2_dlm: Nodes in domain (0425C0367AF547E989864A46F3DBD6E6): 0
 kjournald starting.  Commit interval 5 seconds
 ocfs2: Mounting device (253,4) on (node 0, slot 0) with ordered data mode.
 SELinux: initialized (dev dm-4, type ocfs2), not configured for labeling
 }

 Output of Node 1 {
 OCFS2 Node Manager 1.5.0
 OCFS2 DLM 1.5.0
 ocfs2: Registered cluster interface o2cb
 OCFS2 DLMFS 1.5.0
 OCFS2 User DLM kernel interface loaded
 device eth0 entered promiscuous mode
 OCFS2 1.5.0
 }


 On Jan 14, 2009, at 3:58 PM, Sunil Mushran wrote:

 What about the dmesg on node 1?

 Now ideally we want the fs versions to be the same on all nodes.
 However as we have not changed the protocol since 1.4.1, this
 should still work.
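
To see at a glance which module versions each node loaded, the version banners can be pulled out of the dmesg text and compared. This is only a sketch: `module_versions` and `mismatches` are hypothetical helpers, the regular expression assumes banners of the form shown in this thread ("OCFS2 DLM 1.4.1 ..." or "OCFS2 1.5.0"), and a mismatch is informational rather than proof of a problem.

```python
import re

# Matches banner lines such as "OCFS2 Node Manager 1.4.1 ..." or "OCFS2 1.5.0".
# The component name is optional; the bare "OCFS2 x.y.z" line is the fs itself.
BANNER = re.compile(r"^\s*OCFS2 (Node Manager|DLMFS|DLM)?\s*(\d+\.\d+\.\d+)\b")


def module_versions(dmesg_text: str) -> dict:
    """Return {component: version} for every OCFS2 banner line found."""
    versions = {}
    for line in dmesg_text.splitlines():
        m = BANNER.match(line)
        if m:
            versions[m.group(1) or "OCFS2"] = m.group(2)
    return versions


def mismatches(a: dict, b: dict) -> dict:
    """Return {component: (version_a, version_b)} where the two nodes differ."""
    return {k: (a[k], b[k]) for k in sorted(a.keys() & b.keys()) if a[k] != b[k]}


# Banner lines copied from the node outputs quoted in this thread.
node0 = """OCFS2 Node Manager 1.4.1 Tue Dec 16 19:18:05 PST 2008
OCFS2 DLM 1.4.1 Tue Dec 16 19:18:05 PST 2008
OCFS2 DLMFS 1.4.1 Tue Dec 16 19:18:05 PST 2008
OCFS2 User DLM kernel interface loaded
OCFS2 1.4.1 Tue Dec 16 19:18:02 PST 2008"""
node1 = """OCFS2 Node Manager 1.5.0
OCFS2 DLM 1.5.0
ocfs2: Registered cluster interface o2cb
OCFS2 DLMFS 1.5.0
OCFS2 User DLM kernel interface loaded
OCFS2 1.5.0"""

for component, (v0, v1) in mismatches(module_versions(node0), module_versions(node1)).items():
    print(f"{component}: node 0 has {v0}, node 1 has {v1}")
```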

 Bret Palsson wrote:
 node 0 (and FS) OCFS2 1.4.1 2.6.18-92.1.22.el5xen
 node 1 OCFS2 1.5 2.6.28-vs2.3.0.36.4

 Output of Node 1 {
 OCFS2 Node Manager 1.5.0
 OCFS2 DLM 1.5.0
 ocfs2: Registered cluster interface o2cb
 OCFS2 DLMFS 1.5.0
 OCFS2 User DLM kernel interface loaded
 device eth0 entered promiscuous mode
 OCFS2 1.5.0
 }
 On Jan 14, 2009, at 1:41 PM, Sunil Mushran wrote:


 versions? kernel and fs.

 Bret Palsson wrote:

 Does anyone have any idea what to try next? Here are the steps I have
 taken and the problem: (I wanted to post my question on the first
 line before I explained the problem and what I have tried)

 --

 Node 0 has the file system mounted just fine and works great.

 When trying to mount on Node 1: `mount.ocfs2 /dev/mapper/data /cluster/data`
 I get this error after about 30 seconds: mount.ocfs2: Transport endpoint
 is not connected while mounting /dev/mapper/data on /cluster/data.
 Check 'dmesg' for more information on this error.


 Here is the output of dmesg:
 (3130,1):o2net_connect_expired:1659 ERROR: no connection established
 with node 0 after 30.0 seconds, giving up and returning errors.
 (4670,1):dlm_request_join:1033 ERROR: status = -107
 (4670,1):dlm_try_to_join_domain:1207 ERROR: status = -107
 (4670,1):dlm_join_domain:1485 ERROR: status = -107
 (4670,1):dlm_register_domain:1732 ERROR: status = -107
 (4670,1):o2cb_cluster_connect:302 ERROR: status = -107
 (4670,1):ocfs2_dlm_init:2753 ERROR: status = -107
 (4670,1):ocfs2_mount_volume:1274 ERROR: status = -107
 ocfs2: Unmounting device (253,2) on (node 0)
 (3130,0):o2net_connect_expired:1659 ERROR: no connection established
 with node 0 after 30.0 seconds, giving up and returning errors.
 (5558,1):dlm_request_join:1033 ERROR: status = -107
 (5558,1):dlm_try_to_join_domain:1207 ERROR: status = -107
 (5558,1):dlm_join_domain:1485 ERROR: status = -107
 (5558,1):dlm_register_domain:1732 ERROR: