Date: Wed, 16 Feb 2011 12:40:35 -0500
From: "William L. Sebok" <[email protected]>
Subject: [Gluster-users] "remove-brick" command SHOULD migrate data
To: Rahul C S <[email protected]>
Cc: Mark Wolfire <[email protected]>, [email protected],
        Kwang-Ho Park <[email protected]>, "Derek C. Richardson" <[email protected]>,
        Randall Perrine <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii

On Tue, Feb 15, 2011 at 11:49:06PM -0600, Rahul C S wrote:
> For the last question,
> the "remove-brick" command does not migrate data; the data in that brick
> cannot be accessed from the client, unlike "replace-brick", which actually
> migrates data from one brick to another.
I strongly suggest, as an enhancement, a version of remove-brick that actually
does migrate the data.  This would be *extremely* useful in dealing with a
distributed/replicated filesystem when a computer and/or brick is dead or
likely to be down for an extended period (I configure bricks to be replicated
between different computers).  On startup, the remove-brick command could
estimate whether the data would fit on the remaining bricks.  If, after the
migration started, it turned out that the data did not all fit, there would
still be no loss as long as the last file movement wasn't fully committed;
the command could then be aborted.
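The on-startup fit estimate could be sketched roughly like this.  This is only
an illustration of the idea, not anything GlusterFS provides; the `check_fit`
function, the brick paths, and the 1K-block accounting are all my own
invention, and a real migration would also need per-brick headroom:

```shell
# check_fit: report whether the space used on the brick to be removed
# would fit in the free space of the remaining bricks (1K blocks).
# Sketch only; a real check would leave headroom on each brick.
check_fit() {
    brick_to_remove=$1; shift
    used_kb=$(du -sk "$brick_to_remove" | awk '{print $1}')
    free_kb=0
    for b in "$@"; do
        # available 1K blocks on the filesystem holding this brick
        f=$(df -Pk "$b" | awk 'NR==2 {print $4}')
        free_kb=$((free_kb + f))
    done
    if [ "$used_kb" -lt "$free_kb" ]; then
        echo "fits: ${used_kb}K used, ${free_kb}K free"
        return 0
    else
        echo "does not fit: ${used_kb}K used, ${free_kb}K free"
        return 1
    fi
}
# e.g. (hypothetical paths):  check_fit /export/brick3 /export/brick1 /export/brick2
```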

This would be no different in concept or risks (or usefulness) from reducing
the size of a partition with gparted.  I would make the remove-brick command
remove a brick without migrating its data only if some "force" option were in
effect.  I have trouble seeing why one would otherwise want to use the
remove-brick command and throw away the data, except in some dire emergency.

Bill Sebok      Computer Software Manager, Univ. of Maryland, Astronomy
        Internet: [email protected]    URL: http://furo.astro.umd.edu/


-----------------------------

Hello All-
I would like to add my support to this feature request. At first I assumed that remove-brick did migrate data, until I looked into it more carefully. Unless I have got the wrong end of the stick, the only way to shrink a distributed or distributed-replicated volume at the moment is to perform the following steps.

1) Tell the users to stop using the volume, even though they will still be able to mount it and write to it
2) Remove the brick (and its mirror if appropriate) with remove-brick
3) Remove all the link files from the backend filesystem of the brick that has just been removed, using "find /brick/path -size 0b -perm 1000 -exec /bin/rm -v {} \;" or similar
4) Copy the files from the backend filesystem to the mount point of the volume
5) Tell the users it is safe to carry on using the volume again.
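Steps 3 and 4 above could be wrapped up roughly as follows.  The
`shrink_cleanup` name and the example paths are my own; the link files being
deleted are the zero-length, sticky-bit-only (mode 1000) pointer files that
the distribute translator leaves on the backend:

```shell
# shrink_cleanup: after "remove-brick", strip the zero-length sticky-bit
# link files from the removed brick's backend, then copy the surviving
# real files back into the volume through a client mount point.
shrink_cleanup() {
    brick=$1   # backend directory of the removed brick
    mount=$2   # client mount point of the (shrunk) volume
    # delete the zero-length files whose mode is exactly the sticky bit
    find "$brick" -type f -size 0 -perm 1000 -exec rm -v {} \;
    # copy the remaining real files in via the client mount
    (cd "$brick" && cp -a . "$mount"/)
}
# e.g. (hypothetical paths):  shrink_cleanup /export/brick3 /mnt/myvol
```

Note that `cp -a` will happily overwrite newer files at the destination, which
is exactly the hazard described below; it is only safe while nothing else is
writing to the volume.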

That would be quite risky in my department because the volumes are auto-mounted on many different clients, and even if everyone has bothered to read my email telling them to stop using the volume they might accidentally leave a process running that is writing files to it. If some of those files have the same names as files that were on the removed brick, we could end up with a situation where the new files from the running process are overwritten by old versions being copied from the removed brick. To avoid this scenario, and other potential disasters I haven't thought of yet, I would have to do the following to safely shrink a volume.

1) Take the volume off line and then delete it
2) Create a new volume with a temporary name, containing all the bricks from the original volume except the brick (and its mirror if appropriate) I want to remove
3) Remove the link files from the backend filesystem of the brick that has just been removed
4) Copy the files from the backend filesystem to the mount point of the temporary volume
5) Delete the temporary volume, and re-create it using the original name so it can be auto-mounted.
6) Put the volume on line again.
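The volume-level steps above (1-2 and 5-6) amount to a stop/delete/create
cycle run twice.  The sketch below only echoes each gluster command so the
sequence can be reviewed; the volume, server, and brick names are
hypothetical, and steps 3-4 (link-file cleanup and the copy back in) would
happen between the two halves:

```shell
# Dry-run sketch of the delete-and-recreate shrink sequence.
run() { echo "+ $*"; }    # change "echo \"+ $*\"" to "\"$@\"" to really run

safe_shrink() {
    vol=$1; tmpvol=$2; shift 2    # remaining args: the bricks to keep
    # steps 1-2: take the original volume down, rebuild under a temp name
    run gluster volume stop "$vol"
    run gluster volume delete "$vol"
    run gluster volume create "$tmpvol" replica 2 "$@"
    run gluster volume start "$tmpvol"
    # ... steps 3-4 here: clean link files, copy data in via the mount ...
    # steps 5-6: re-create under the original name so auto-mounts work
    run gluster volume stop "$tmpvol"
    run gluster volume delete "$tmpvol"
    run gluster volume create "$vol" replica 2 "$@"
    run gluster volume start "$vol"
}
# e.g. safe_shrink myvol myvol-tmp serverA:/export/b1 serverB:/export/b1
```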

If there is an easier way of safely shrinking a distributed or distributed-replicated volume please let me know.

Regards,
-Dan.
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users