Responses inline.

Jeff White
Linux/Unix Systems Engineer
University of Pittsburgh - CSSD
[email protected]


On 10/17/2011 01:11 PM, Jeff Shaw wrote:
Hello Gluster users,
Before I put Gluster into production, I am wondering how it determines whether 
a byte can be written, and where I should look in the source code to change 
these behaviors. My experiences are with glusterfs 3.2.4 on CentOS 6 64-bit.

Suppose I have a Gluster volume made up of four 1 MB bricks, like this:

Volume Name: test
Type: Distributed-Replicate
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gluster0-node0:/brick0
Brick2: gluster0-node1:/brick1
Brick3: gluster0-node0:/brick2
Brick4: gluster0-node1:/brick3

The mounted Gluster volume will report that the size of the volume is 2 MB, 
which creates a false impression that it can hold a 2 MB file. This isn't too 
bad, since people are used to a file system's maximum file size being smaller 
than the file system's maximum total size.
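
The arithmetic behind those numbers can be sketched as follows. This is only a simplified model, not Gluster's actual statfs logic: the function name volume_capacity is made up for illustration, and it assumes that bricks are grouped into replica sets in the order listed and that a replica set can safely hold only as much as its smallest brick.

```python
def volume_capacity(brick_sizes, replica_count):
    """Return (total usable size, largest single file) for a
    distributed-replicate layout, in the same unit as brick_sizes.

    Assumption: consecutive bricks form a replica set, and each set
    is limited by its smallest brick. Files are not striped, so one
    file must fit inside a single replica set.
    """
    if replica_count <= 0 or len(brick_sizes) % replica_count != 0:
        raise ValueError("brick count must be a multiple of replica_count")
    replica_sets = [
        brick_sizes[i:i + replica_count]
        for i in range(0, len(brick_sizes), replica_count)
    ]
    per_set = [min(s) for s in replica_sets]
    return sum(per_set), max(per_set)


# Four 1 MB bricks in a 2 x 2 layout, sizes in MB.
total_mb, largest_file_mb = volume_capacity([1, 1, 1, 1], 2)
```

Under these assumptions the volume totals 2 MB, while the largest file that fits is 1 MB.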

Scenario 1: One brick runs out of space first.

Taking this a step further, suppose brick0 is actually 2 MB, and I attempt to 
copy a 2 MB file to the Gluster volume. If Gluster chooses to copy the 
file to brick0 and brick1, then the copy succeeds, although brick1 only stores 
half the file. When brick0 fails, only half of the file is available for 
reading. It would be better if Gluster failed to continue writing when one 
brick in the replication group ran out of space.
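
The behavior asked for here amounts to "every replica must have room, or the write fails". A minimal sketch of that rule, assuming the file size is known up front (a real write arrives as a stream, so Gluster could not check it this simply) and using hypothetical helper names free_bytes and replicas_can_hold:

```python
import os


def free_bytes(path):
    """Bytes available to unprivileged users on the file system holding path."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def replicas_can_hold(replica_paths, nbytes, free_fn=free_bytes):
    """True only if every brick in the replica set has room for nbytes.

    free_fn is injectable so the rule can be exercised without real bricks.
    """
    return all(free_fn(p) >= nbytes for p in replica_paths)
```

With a 2 MB brick0 and a 1 MB brick1, a 2 MB file would be refused under this rule instead of being half-written to brick1.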

Scenario 2: One brick is unmounted.

Suppose after Scenario 1 completes, brick0 goes offline. Then, a user attempts 
to retrieve the 2 MB file. The user receives the file fragment. Because 
gluster0-node0:/brick0 is unmounted, the file doesn't exist at that location, 
and so the gateway copies the file fragment from gluster0-node1:/brick1 onto 
gluster0-node0:/brick0. Then, even worse, the user starts copying files onto 
the Gluster volume. All the files destined for the first replication group 
appear under /brick0, even though it's unmounted. This eventually will fill up 
the root file system.

I think to fix this, when creating a file, Gluster should make sure that the 
file system that the brick was originally created on is mounted.

I had an idea for this already: http://bugs.gluster.com/show_bug.cgi?id=3578
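
As an illustration of the kind of check being discussed (not necessarily what the bug report proposes, and not existing Gluster behavior), one option is a marker file written into the brick directory when the brick is created. If the brick's file system is later unmounted, the marker disappears along with it and the brick can be treated as offline. The file name .brick-marker and both function names are hypothetical.

```python
import os

MARKER = ".brick-marker"  # hypothetical name, not something Gluster uses


def mark_brick(brick_path, marker=MARKER):
    """Create the marker once, at brick creation time."""
    with open(os.path.join(brick_path, marker), "w") as f:
        f.write("brick\n")


def brick_is_available(brick_path, marker=MARKER):
    """True only if the marker is visible, i.e. the original file
    system (and not the bare directory underneath it) is present."""
    return os.path.isfile(os.path.join(brick_path, marker))
```

Unlike a plain "is this a mount point" test, this also works when the brick is a subdirectory of a larger mount.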

Also, perhaps Gluster should only allow bricks to be created at mount points.

I think this would be too limiting. Some people might have a large /data mount point but only want /data/gluster to hold the Gluster files.

A colleague of mine suggested mounting all the Gluster bricks inside a 
directory on another file system that is read-only.

This would be more complicated for a Gluster admin to set up, but it could be possible. You could also mount tmpfs or something similar at /data and then mount your real storage at /data/gluster. That might work, although I don't think tmpfs itself works with Gluster, so I'm not sure what would happen if /data/gluster were unmounted and Gluster suddenly found itself on an unsupported filesystem type.

In any case, I wouldn't make this a requirement; I would just offer it as one way an admin could design their servers if they wish.

Gluster's source code is quite large, so if someone could point me to the right 
files to edit, I'd be happy to change its behavior to match what I expect.

Thanks,
Jeff
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
