Hi

I am currently testing GlusterFS with replication.
I am running Ubuntu Hardy with packages from the PPA on launchpad.net. The version is glusterfs 2.0.6.

I have 2 machines, each exporting 1 brick. This is the server config I'm using on both:
----8<----8<----8<----8<----8<----8<----8<----8<----8<----
volume posix
 type storage/posix
 option directory /home/export/
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume cache
  type performance/io-cache
  subvolumes locks
end-volume

volume brick
  type performance/io-threads
  option thread-count 8
  subvolumes cache
end-volume

### Add network serving capability to above brick.
volume server
 type protocol/server
 option transport-type tcp
 subvolumes brick
 option auth.addr.brick.allow * # Allow access to "brick" volume
end-volume
----8<----8<----8<----8<----8<----8<----8<----8<----8<----

I then have 2 clients (which happen to be the same 2 machines) that connect to both bricks and replicate them using this config:

----8<----8<----8<----8<----8<----8<----8<----8<----8<----
### Add client feature and attach to remote subvolume of server1
volume brick1
 type protocol/client
 option transport-type tcp
 option remote-host 172.19.45.102      # IP address of the remote brick
 option remote-subvolume brick        # name of the remote volume
end-volume

### Add client feature and attach to remote subvolume of server2
volume brick2
 type protocol/client
 option transport-type tcp
 option remote-host 172.19.45.103      # IP address of the remote brick
 option remote-subvolume brick        # name of the remote volume
end-volume

volume replicate
 type cluster/replicate
 subvolumes brick1 brick2
end-volume
----8<----8<----8<----8<----8<----8<----8<----8<----8<----
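To double-check which brick each client volume points at, the client volfile above can be read programmatically. This is only a minimal sketch, assuming Python is available on the client; `parse_volfile` and `CLIENT_VOLFILE` are hypothetical names introduced for illustration and are not part of GlusterFS.

```python
def parse_volfile(text):
    """Parse a GlusterFS 2.x-style volfile into {volume_name: details}.

    Hypothetical helper: it only understands the keywords used in the
    configs in this post (volume, type, option, subvolumes, end-volume).
    """
    volumes = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        words = line.split()
        keyword = words[0]
        if keyword == "volume":
            current = {"type": None, "options": {}, "subvolumes": []}
            volumes[words[1]] = current
        elif keyword == "end-volume":
            current = None
        elif current is None:
            continue  # ignore anything outside a volume block
        elif keyword == "type":
            current["type"] = words[1]
        elif keyword == "option":
            current["options"][words[1]] = " ".join(words[2:])
        elif keyword == "subvolumes":
            current["subvolumes"] = words[1:]
    return volumes


# The client volfile from this post, copied verbatim.
CLIENT_VOLFILE = """
### Add client feature and attach to remote subvolume of server1
volume brick1
 type protocol/client
 option transport-type tcp
 option remote-host 172.19.45.102      # IP address of the remote brick
 option remote-subvolume brick        # name of the remote volume
end-volume

### Add client feature and attach to remote subvolume of server2
volume brick2
 type protocol/client
 option transport-type tcp
 option remote-host 172.19.45.103      # IP address of the remote brick
 option remote-subvolume brick        # name of the remote volume
end-volume

volume replicate
 type cluster/replicate
 subvolumes brick1 brick2
end-volume
"""
```

Calling `parse_volfile(CLIENT_VOLFILE)` gives one entry per volume, so the `remote-host` of `brick1` and `brick2` and the `subvolumes` of `replicate` can be compared against what is actually running.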

If I start the 2 servers up and then mount both clients, everything works fine. I have shared storage which is replicated to each host.

If I shut one brick down, the client on that same machine also dies and I get strange errors:
----8<----8<----8<----8<----8<----8<----8<----8<----8<----
# cd /mnt/gluster
bash: cd: /mnt/gluster: Transport endpoint is not connected
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.5G  1.1G  7.9G  13% /
varrun                125M   68K  125M   1% /var/run
varlock               125M     0  125M   0% /var/lock
udev                  125M   44K  125M   1% /dev
devshm                125M     0  125M   0% /dev/shm
df: `/mnt/gluster': Transport endpoint is not connected
# mount
/dev/sda1 on / type ext3 (rw,relatime,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
/sys on /sys type sysfs (rw,noexec,nosuid,nodev)
varrun on /var/run type tmpfs (rw,noexec,nosuid,nodev,mode=0755)
varlock on /var/lock type tmpfs (rw,noexec,nosuid,nodev,mode=1777)
udev on /dev type tmpfs (rw,mode=0755)
devshm on /dev/shm type tmpfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
securityfs on /sys/kernel/security type securityfs (rw)
/etc/glusterfs/glusterfs.vol on /mnt/gluster type fuse.glusterfs (rw,allow_other,default_permissions,max_read=131072)
----8<----8<----8<----8<----8<----8<----8<----8<----8<----
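The "Transport endpoint is not connected" message is the text for the `ENOTCONN` errno, so the dead mount can also be detected from a script. A minimal sketch, assuming Python on the client; `mount_state` is a hypothetical helper name and `/mnt/gluster` is the mount point used above.

```python
import errno
import os


def mount_state(path):
    """Return "ok", "disconnected", or "error: <reason>" for a path.

    "disconnected" means stat() failed with ENOTCONN, which is the errno
    behind "Transport endpoint is not connected".
    """
    try:
        os.stat(path)
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            return "disconnected"
        return "error: " + os.strerror(exc.errno)
    return "ok"
```

For example, `mount_state("/mnt/gluster")` would be expected to return `"disconnected"` while the mount is in the state shown above, and `"ok"` while it is working.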

Here is a copy of the client debug log:
[2009-10-01 08:16:15] D [glusterfsd.c:354:_get_specfp] glusterfs: loading volume file /etc/glusterfs/glusterfs.vol
================================================================================
Version      : glusterfs 2.0.6 built on Aug 31 2009 20:14:31
TLA Revision : v2.0.6
Starting Time: 2009-10-01 08:16:15
Command line : glusterfs --log-level=DEBUG --volfile=/etc/glusterfs/glusterfs.vol /mnt/gluster/
PID          : 17884
System name  : Linux
Nodename     : cj-cpt-molb01
Kernel Release : 2.6.24-24-server
Hardware Identifier: i686

Given volfile:
+------------------------------------------------------------------------------+
  1: ### Add client feature and attach to remote subvolume of server1
  2: volume brick1
  3:  type protocol/client
  4:  option transport-type tcp
  5:  option remote-host 172.19.45.102      # IP address of the remote brick
  6:  option remote-subvolume brick        # name of the remote volume
  7: end-volume
  8:
  9: ### Add client feature and attach to remote subvolume of server2
 10: volume brick2
 11:  type protocol/client
 12:  option transport-type tcp
 13:  option remote-host 172.19.45.103      # IP address of the remote brick
 14:  option remote-subvolume brick        # name of the remote volume
 15: end-volume
 16:
 17: volume replicate
 18:  type cluster/replicate
 19:  subvolumes brick1 brick2
 20: end-volume

+------------------------------------------------------------------------------+
[2009-10-01 08:16:15] D [glusterfsd.c:1205:main] glusterfs: running in pid 17884
[2009-10-01 08:16:15] D [client-protocol.c:5952:init] brick1: defaulting frame-timeout to 30mins
[2009-10-01 08:16:15] D [client-protocol.c:5963:init] brick1: defaulting ping-timeout to 10
[2009-10-01 08:16:15] D [transport.c:141:transport_load] transport: attempt to load file /usr/lib/glusterfs/2.0.6/transport/socket.so
[2009-10-01 08:16:15] D [transport.c:141:transport_load] transport: attempt to load file /usr/lib/glusterfs/2.0.6/transport/socket.so
[2009-10-01 08:16:15] D [client-protocol.c:5952:init] brick2: defaulting frame-timeout to 30mins
[2009-10-01 08:16:15] D [client-protocol.c:5963:init] brick2: defaulting ping-timeout to 10
[2009-10-01 08:16:15] D [transport.c:141:transport_load] transport: attempt to load file /usr/lib/glusterfs/2.0.6/transport/socket.so
[2009-10-01 08:16:15] D [transport.c:141:transport_load] transport: attempt to load file /usr/lib/glusterfs/2.0.6/transport/socket.so
[2009-10-01 08:16:15] D [client-protocol.c:6280:notify] brick1: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-10-01 08:16:15] D [client-protocol.c:6280:notify] brick1: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-10-01 08:16:15] D [client-protocol.c:6280:notify] brick2: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-10-01 08:16:15] D [client-protocol.c:6280:notify] brick2: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-10-01 08:16:15] D [client-protocol.c:6280:notify] brick1: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-10-01 08:16:15] D [client-protocol.c:6280:notify] brick1: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-10-01 08:16:15] D [client-protocol.c:6280:notify] brick2: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-10-01 08:16:15] D [client-protocol.c:6280:notify] brick2: got GF_EVENT_PARENT_UP, attempting connect on transport
[2009-10-01 08:16:15] N [glusterfsd.c:1224:main] glusterfs: Successfully started
[2009-10-01 08:16:15] D [client-protocol.c:6294:notify] brick1: got GF_EVENT_CHILD_UP
[2009-10-01 08:16:15] D [client-protocol.c:6294:notify] brick1: got GF_EVENT_CHILD_UP
[2009-10-01 08:16:15] N [client-protocol.c:5559:client_setvolume_cbk] brick1: Connected to 172.19.45.102:6996, attached to remote volume 'brick'.
[2009-10-01 08:16:15] N [afr.c:2203:notify] replicate: Subvolume 'brick1' came back up; going online.
[2009-10-01 08:16:15] N [client-protocol.c:5559:client_setvolume_cbk] brick1: Connected to 172.19.45.102:6996, attached to remote volume 'brick'.
[2009-10-01 08:16:15] N [afr.c:2203:notify] replicate: Subvolume 'brick1' came back up; going online.
[2009-10-01 08:16:15] D [client-protocol.c:6294:notify] brick2: got GF_EVENT_CHILD_UP
[2009-10-01 08:16:15] D [client-protocol.c:6294:notify] brick2: got GF_EVENT_CHILD_UP
[2009-10-01 08:16:15] N [client-protocol.c:5559:client_setvolume_cbk] brick2: Connected to 172.19.45.103:6996, attached to remote volume 'brick'.
[2009-10-01 08:16:15] N [client-protocol.c:5559:client_setvolume_cbk] brick2: Connected to 172.19.45.103:6996, attached to remote volume 'brick'.
[2009-10-01 08:17:24] N [client-protocol.c:6246:notify] brick1: disconnected
[2009-10-01 08:17:27] E [socket.c:745:socket_connect_finish] brick1: connection to 172.19.45.102:6996 failed (Connection refused)
[2009-10-01 08:17:27] E [socket.c:745:socket_connect_finish] brick1: connection to 172.19.45.102:6996 failed (Connection refused)
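The log shows brick1 refusing connections on port 6996 but says nothing further about brick2. To confirm from the client side which bricks still accept TCP connections, a plain socket probe can be used. This is a minimal sketch, assuming Python on the client; `BRICKS`, `brick_reachable` and `check_bricks` are hypothetical names, and the addresses and port are taken from the log above. It only tests that the port is open, not that the GlusterFS handshake succeeds.

```python
import socket

# Brick endpoints as they appear in the client log above.
BRICKS = {
    "brick1": ("172.19.45.102", 6996),
    "brick2": ("172.19.45.103", 6996),
}


def brick_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts and unreachable hosts.
        return False


def check_bricks(bricks=BRICKS):
    """Probe every brick and return {name: reachable}."""
    return {
        name: brick_reachable(host, port)
        for name, (host, port) in bricks.items()
    }
```

Running `check_bricks()` on the affected client while one server is stopped would show whether the surviving brick is still reachable at the TCP level, which helps separate a network problem from a problem inside the replicate translator.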



Any ideas?


--
Adrian Moisey
Systems Designer | CareerJunction | Better jobs. More often.
Web: www.careerjunction.co.za | Email: [email protected]
Phone: +27 21 818 8621 | Mobile: +27 82 858 7830 | Fax: +27 21 818 8855
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users