I have four servers, absolutely identical, each connected to the same pair of 
switches: one interface is on a 100Mb switch, the other is on a 1Gb switch. I 
access the nodes via the 100Mb port; gluster is configured on the 1Gb port. 
The nodes are all loaded with Scientific Linux 6.3, Virtualization Host, with 
glusterfs-3.2.7 from EPEL. The nodes are 2-socket quad-core AMD servers (so 8 
cores total) with 6x 300GB internal drives. I'm using LVM on top of h/w RAID0, 
and have a 1.5TB xfs brick on each node. I have libvirtd running, but no VMs 
created yet.

I initially configured each pair of servers as a separate cluster with a 1x2 
replicated volume. Mounting the volumes as glusterfs from localhost, dd tests 
give me ~90MB/s, which is pretty decent for a 1Gb network (max ~125MB/s). So I 
tear that all down, join all four nodes together, and create a 2x2 
distributed-replicated volume. This is where it gets interesting. The dd test 
from the first node is at full speed, consistent with before. The second 
node's dd test is half-speed. The third node is back to full speed. The fourth 
node is back to half-speed. When I look in the bricks directly, I see that the 
slower nodes had their file placed in a replica pair that did not include a 
local brick.
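For reference, the tests are along these lines (the mountpoint, file name, and 
transfer size here are assumptions for illustration, not my exact commands):

```shell
#!/bin/sh
# Hypothetical version of the sequential-write test. conv=fdatasync makes
# dd flush to the brick before reporting, so the MB/s figure is honest.
ddtest() {
    mnt=$1                          # e.g. a glusterfs mount of vol1
    dd if=/dev/zero of="$mnt/file1" bs=1M count=256 conv=fdatasync 2>&1
    rm -f "$mnt/file1"
}

# Usage against the real mount would be something like:
#   mount -t glusterfs localhost:/vol1 /mnt/vol1
#   ddtest /mnt/vol1
```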

For example:
gluster volume create vol1 replica 2 transport tcp server1:/brick1 
server2:/brick2 server3:/brick3 server4:/brick4

server1:/brick1 and server2:/brick2 are the first replica pair
server3:/brick3 and server4:/brick4 are the second replica pair

server1.. file1 goes into brick1/brick2 - fast
server2.. file2 goes into brick3/brick4 - slow
server3.. file3 goes into brick3/brick4 - fast
server4.. file4 goes into brick1/brick2 - slow

So I delete that volume and create another:
gluster volume create vol2 replica 2 transport tcp server2:/brick2 
server3:/brick3 server4:/brick4 server1:/brick1

server2:/brick2 and server3:/brick3 are the first replica pair
server4:/brick4 and server1:/brick1 are the second replica pair

server2.. file2 goes into brick2/brick3 - fast
server3.. file3 goes into brick2/brick3 - fast
server4.. file4 goes into brick4/brick1 - fast
server1.. file1 goes into brick4/brick1 - fast

So now I'm left thinking: seriously, WTF? I remove all the output files and 
try four consecutive tests from the same node, writing file1, file2, file3, 
and file4. Sure enough, two of them are fast and two are slow: the fast ones 
are placed in that node's "own" replica pair and the slow ones in the other. I 
also notice that every time I delete the files and recreate them, they land in 
the same replica pair each time, no matter what order I create them in. I've 
tried this with nfs mounts as well (instead of glusterfs), and the results are 
the same.
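In case anyone wants to reproduce the placement check, it's just a matter of 
listing the file in each brick directory. A sketch (the brick paths are 
assumptions; on the real cluster each brick lives on its own server, so you'd 
run the check over ssh there):

```shell
#!/bin/sh
# Hypothetical helper: report which brick directories contain a given file.
where_is() {
    f=$1; shift                 # file name, then one or more brick paths
    for b in "$@"; do
        if [ -e "$b/$f" ]; then
            echo "$f present in $b"
        fi
    done
}

# e.g.:  where_is file1 /brick1 /brick2 /brick3 /brick4
```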

Has anyone seen this behavior before? Is this a known issue or a 
misconfiguration?
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
