OK, using glusterfs-volgen I rebuilt the config files and got the cluster working again.
Performance has improved from 1.6 to about 24; NFS on the same machines is at about 11, and going straight to the disks is 170+.
There still seems to be a HUGE performance loss from using a network file system.
I must still be doing something wrong, because I have several problems:
1. If I create a file on the client while one of the servers is down, the file is still missing when that server comes back up.
I thought the servers were supposed to be replicated; why don't they re-sync? (See the self-heal command after this list.)
2. If server 1 is down and server 2 is up, it takes about 30 seconds to do a df on the client. That is WAY too long; lots of my applications will time out in less than that.
How do I get the client to behave the same when one of the servers is down? (See the ping-timeout note after this list.)
3. On the server, if the network is down and I try to do a df, it hangs my shell (the server has the glusterfs volume mounted as a client).
This is typical behavior for NFS too; is there any way to time out instead of hanging? (See the timeout wrapper after this list.)
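A few things I am trying for these, based on what I have read; treat the option names, defaults, and paths here as my assumptions, not confirmed facts.

For #1, my understanding is that replicate only heals a file when it gets looked up, so after the dead server comes back I force a lookup of everything from the client mount to trigger self-heal:

  find /mnt/glusterfs -noleaf -print0 | xargs --null stat > /dev/null

(/mnt/glusterfs is just my client mount point; adjust to yours.)

For #2, I am experimenting with a shorter ping timeout so a dead server is declared down faster, by adding this to each protocol/client volume in client.vol below (assuming my release supports the option):

  option ping-timeout 10   # my assumption: seconds of silence before the server is marked down

For #3, as a stop-gap I wrap df in a timeout (GNU coreutils) so my shell at least gets control back:

  timeout 5 df -h /mnt/glusterfs || echo "df timed out"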
Here are my new config files:
----- server.vol -----
volume tcb_posix
type storage/posix
option directory /mnt/tcb_data
end-volume
volume tcb_locks
type features/locks
subvolumes tcb_posix
end-volume
volume tcb_brick
type performance/io-threads
option thread-count 8
subvolumes tcb_locks
end-volume
volume tcb_server
type protocol/server
option transport-type tcp
option auth.addr.tcb_brick.allow *
option transport.socket.listen-port 50001
option transport.socket.nodelay on
subvolumes tcb_brick
end-volume
----------------------------------------
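For reference, I start the server daemon on each box with this volfile like so (the paths are just where I happen to keep things):

  glusterfsd -f /etc/glusterfs/server.vol -l /var/log/glusterfs/server.log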
----- client.vol -----
volume tcb_remote_glust1
type protocol/client
option transport-type tcp
option remote-host x.x.x.x
option transport.socket.nodelay on
option transport.remote-port 50001
option remote-subvolume tcb_brick
end-volume
volume tcb_remote_glust2
type protocol/client
option transport-type tcp
option remote-host y.y.y.y
option transport.socket.nodelay on
option transport.remote-port 50001
option remote-subvolume tcb_brick
end-volume
volume tcb_mirror
type cluster/replicate
subvolumes tcb_remote_glust1 tcb_remote_glust2
end-volume
volume tcb_writebehind
type performance/write-behind
option cache-size 4MB
subvolumes tcb_mirror
end-volume
volume tcb_readahead
type performance/read-ahead
option page-count 4
subvolumes tcb_writebehind
end-volume
volume tcb_iocache
type performance/io-cache
option cache-size `grep 'MemTotal' /proc/meminfo | awk '{print $2 * 0.2 / 1024}' | cut -f1 -d.`MB
option cache-timeout 1
subvolumes tcb_readahead
end-volume
volume tcb_quickread
type performance/quick-read
option cache-timeout 1
option max-file-size 64kB
subvolumes tcb_iocache
end-volume
volume tcb_statprefetch
type performance/stat-prefetch
subvolumes tcb_quickread
end-volume
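And I mount it on the clients with (again, my paths; the backtick in the io-cache cache-size above just works out to roughly 20% of total RAM, in MB):

  glusterfs -f /etc/glusterfs/client.vol /mnt/glusterfs

or, I believe, the equivalent fstab entry:

  /etc/glusterfs/client.vol  /mnt/glusterfs  glusterfs  defaults  0  0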
^C
Chad wrote:
Ok, I tried to change over to this, but now I just get:
[2010-02-24 09:30:41] E [authenticate.c:234:gf_authenticate] auth: no authentication module is interested in accepting remote-client 10.0.0.24:1007
[2010-02-24 09:30:41] E [server-protocol.c:5822:mop_setvolume] tcb_remote: Cannot authenticate client from 10.0.0.24:1007
I am sure it is something simple; I just don't know what.
Is this a port problem? The port in the log is 1007, but the server is on 50001.
Here are my config files:
----- server.vol: -----
volume tcb_posix-export
type storage/posix
option directory /mnt/tcb_data
end-volume
volume tcb_locks-export
type features/locks
subvolumes tcb_posix-export
end-volume
volume tcb_export
type performance/io-threads
option thread-count 8
subvolumes tcb_locks-export
end-volume
volume tcb_remote
type protocol/server
option transport-type tcp
option transport.socket.listen-port 50001
option transport.socket.nodelay on
subvolumes tcb_export tcb_locks-export
option auth.ip.tcb_locks-export.allow 10.0.0.*,10.0.20.*,10.0.30.*,192.168.1.*,192.168.20.*,192.168.30.*,127.0.0.1
option auth.ip.tcb_export.allow 10.0.0.*,10.0.20.*,10.0.30.*,192.168.1.*,192.168.20.*,192.168.30.*,127.0.0.1
end-volume
----- client.vol -----
volume tcb_remote1
type protocol/client
option transport-type tcp
option remote-port 50001
option remote-host 10.0.0.24
option remote-subvolume tcb_remote
end-volume
volume tcb_remote2
type protocol/client
option transport-type tcp
option remote-port 50001
option remote-host 10.0.0.25
option remote-subvolume tcb_remote
end-volume
volume tcb_mirror
type cluster/afr
subvolumes tcb_remote1 tcb_remote2
end-volume
volume tcb_wb
type performance/write-behind
option cache-size 1MB
subvolumes tcb_mirror
end-volume
volume tcb_ioc
type performance/io-cache
option cache-size 32MB
subvolumes tcb_wb
end-volume
volume tcb_iothreads
type performance/io-threads
option thread-count 16
subvolumes tcb_ioc
end-volume
^C
Chad wrote:
I finally got the servers transported 2000 miles, set up, wired, and booted.
Here are the vol files.
Just to reiterate, the issues are slow read/write performance and clients hanging when one server goes down.
### glusterfs.vol ###
############################################
# Start tcb_cluster
############################################
# the exported volume to mount # required!
volume tcb_cluster
type protocol/client
option transport-type tcp/client
option remote-host glustcluster
option remote-port 50001
option remote-subvolume tcb_cluster
end-volume
############################################
# Start cs_cluster
############################################
# the exported volume to mount # required!
volume cs_cluster
type protocol/client
option transport-type tcp/client
option remote-host glustcluster
option remote-port 50002
option remote-subvolume cs_cluster
end-volume
############################################
# Start pbx_cluster
############################################
# the exported volume to mount # required!
volume pbx_cluster
type protocol/client
option transport-type tcp/client
option remote-host glustcluster
option remote-port 50003
option remote-subvolume pbx_cluster
end-volume
---------------------------------------------------
### glusterfsd.vol ###
#############################################
# Start tcb_data cluster
#############################################
volume tcb_local
type storage/posix
option directory /mnt/tcb_data
end-volume
volume tcb_locks
type features/locks
option mandatory-locks on # enables mandatory locking on all files
subvolumes tcb_local
end-volume
# dataspace on remote machine, look in /etc/hosts to see that
volume tcb_locks_remote
type protocol/client
option transport-type tcp
option remote-port 50001
option remote-host 192.168.1.25
option remote-subvolume tcb_locks
end-volume
# automatic file replication translator for dataspace
volume tcb_cluster_afr
type cluster/replicate
subvolumes tcb_locks tcb_locks_remote
end-volume
# the actual exported volume
volume tcb_cluster
type performance/io-threads
option thread-count 256
option cache-size 128MB
subvolumes tcb_cluster_afr
end-volume
volume tcb_cluster_server
type protocol/server
option transport-type tcp
option transport.socket.listen-port 50001
option auth.addr.tcb_locks.allow *
option auth.addr.tcb_cluster.allow *
option transport.socket.nodelay on
subvolumes tcb_cluster
end-volume
#############################################
# Start cs_data cluster
#############################################
volume cs_local
type storage/posix
option directory /mnt/cs_data
end-volume
volume cs_locks
type features/locks
option mandatory-locks on # enables mandatory locking on all files
subvolumes cs_local
end-volume
# dataspace on remote machine, look in /etc/hosts to see that
volume cs_locks_remote
type protocol/client
option transport-type tcp
option remote-port 50002
option remote-host 192.168.1.25
option remote-subvolume cs_locks
end-volume
# automatic file replication translator for dataspace
volume cs_cluster_afr
type cluster/replicate
subvolumes cs_locks cs_locks_remote
end-volume
# the actual exported volume
volume cs_cluster
type performance/io-threads
option thread-count 256
option cache-size 128MB
subvolumes cs_cluster_afr
end-volume
volume cs_cluster_server
type protocol/server
option transport-type tcp
option transport.socket.listen-port 50002
option auth.addr.cs_locks.allow *
option auth.addr.cs_cluster.allow *
option transport.socket.nodelay on
subvolumes cs_cluster
end-volume
#############################################
# Start pbx_data cluster
#############################################
volume pbx_local
type storage/posix
option directory /mnt/pbx_data
end-volume
volume pbx_locks
type features/locks
option mandatory-locks on # enables mandatory locking on all files
subvolumes pbx_local
end-volume
# dataspace on remote machine, look in /etc/hosts to see that
volume pbx_locks_remote
type protocol/client
option transport-type tcp
option remote-port 50003
option remote-host 192.168.1.25
option remote-subvolume pbx_locks
end-volume
# automatic file replication translator for dataspace
volume pbx_cluster_afr
type cluster/replicate
subvolumes pbx_locks pbx_locks_remote
end-volume
# the actual exported volume
volume pbx_cluster
type performance/io-threads
option thread-count 256
option cache-size 128MB
subvolumes pbx_cluster_afr
end-volume
volume pbx_cluster_server
type protocol/server
option transport-type tcp
option transport.socket.listen-port 50003
option auth.addr.pbx_locks.allow *
option auth.addr.pbx_cluster.allow *
option transport.socket.nodelay on
subvolumes pbx_cluster
end-volume
--
^C
Smart Weblications GmbH - Florian Wiessner wrote:
Hi,
On 16.02.2010 01:58, Chad wrote:
I am new to glusterfs and to this list, so please let me know if I have made any mistakes in posting this; I am not sure what your standards are.
I came across glusterfs last week; it was super easy to set up and test, and it is almost exactly what I want/need.
I set up two "glusterfs servers" that serve a mirrored RAID 5 disk, partitioned into three 500GB partitions, to six clients.
I am using round-robin DNS, but I also tried heartbeat and ldirectord (see details below).
Each server has two NICs: one for the clients, and the other with a crossover cable connecting the two servers. Both NICs are 1000Mbps.
There are only two issues.
#1. When one of the servers goes down, the clients hang at least for a little while (more testing is needed); I am not sure if the clients can recover at all.
#2. The read/write tests I performed came in at 1.6 when using glusterfs, NFS on all the same machines came in at 11, and a direct test on the data server came in at 111. How do I improve the performance?
Please share your vol files. I don't understand why you would need load balancers.
###############################################
My glusterfs set-up:
2 Supermicro servers with dual 3.0GHz Xeon CPUs, 8GB RAM, and 4 x 750GB Seagate SATA HDs, 3 in RAID 5 with 1 hot spare (the data servers).
Why not use RAID 10? Same capacity, better speed.
6 Supermicro servers with dual 2.8GHz AMD CPUs, 4GB RAM, and 2 x 250GB Seagate SATA HDs in RAID 1 (the client machines).
glusterfs is set up with round-robin DNS to handle the load balancing of the two data servers.
AFAIK there is no need to set up DNS round robin or load balancing for the gluster servers; glusterfs should take care of that itself. But without your volfiles I can't give any hints.
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users