Re: Cloning a Btrfs partition

2011-12-07 Thread Freddie Cash
On Wed, Dec 7, 2011 at 10:35 AM, BJ Quinn b...@placs.net wrote:
 I've got a 6TB btrfs array (two 3TB drives in a RAID 0). It's about 2/3 full 
 and has lots of snapshots. I've written a script that runs through the 
 snapshots and copies the data efficiently (rsync --inplace --no-whole-file) 
 from the main 6TB array to a backup array, creating snapshots on the backup 
 array and then continuing on copying the next snapshot. Problem is, it looks 
 like it will take weeks to finish.

 I've tried simply using dd to clone the btrfs partition, which technically 
 appears to work, but then it appears that the UUID between the arrays is 
 identical, so I can only mount one or the other. This means I can't continue 
 to simply update the backup array with the new snapshots created on the main 
 array (my script is capable of catching up the backup array with the new 
 snapshots, but if I can't mount both arrays...).

 Any suggestions?

Until an analog of zfs send is added to btrfs (and I believe there
are some side projects ongoing to add something similar), your only
option is the one you are currently using via rsync.
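For reference, the ZFS incremental replication I'm thinking of looks
roughly like this (pool, filesystem, and host names are just
placeholders):

  zfs snapshot pool/data@today
  zfs send -i pool/data@yesterday pool/data@today | \
      ssh backupbox zfs recv backuppool/data

Only the blocks that changed between the two snapshots go over the
wire, which is why an equivalent for btrfs would remove the need to
walk every snapshot with rsync.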

--
Freddie Cash
fjwc...@gmail.com


Re: btrfs subvolume snapshot syntax too smart

2011-04-04 Thread Freddie Cash
On Mon, Apr 4, 2011 at 12:47 PM, Goffredo Baroncelli kreij...@libero.it wrote:
 On 04/04/2011 09:09 PM, krz...@gmail.com wrote:
 I understand btrfs intent but same command run twice should not give
 diffrent results. This really makes snapshot automation hard


 root@sv12 [/ssd]# btrfs subvolume snapshot /ssd/sub1 /ssd/5
 Create a snapshot of '/ssd/sub1' in '/ssd/5'
 root@sv12 [/ssd]# btrfs subvolume snapshot /ssd/sub1 /ssd/5
 Create a snapshot of '/ssd/sub1' in '/ssd/5/sub1'
 root@sv12 [/ssd]# btrfs subvolume snapshot /ssd/sub1 /ssd/5
 Create a snapshot of '/ssd/sub1' in '/ssd/5/sub1'
 ERROR: cannot snapshot '/ssd/sub1'

 The same is true for cp:

 # cp -rf /ssd/sub1 /ssd/5       -> copy sub1 as 5
 # cp -rf /ssd/sub1 /ssd/5       -> copy sub1 in 5

 However you are right. It could be fixed easily adding a switch like
 --script, which force to handle the last part of the destination as
 the name of the subvolume, raising an error if it already exists.

 subvolume snapshot is the only command which suffers of this kind of
 problem ?

Isn't this a situation where supporting a trailing / would help?

For example, with the / at the end, the destination means put the
snapshot into this folder.  Thus btrfs subvolume snapshot /ssd/sub1
/ssd/5/ would create a sub1 snapshot inside the 5/ folder.  Running it
a second time would error out, since /ssd/5/sub1/ already exists.  And
if the 5/ folder doesn't exist, it would error out.

And without the / at the end, the destination means name the snapshot
this.  Thus btrfs subvolume snapshot /ssd/sub1 /ssd/5 would create a
snapshot named /ssd/5 if it doesn't already exist, and error out if it
does (so running the command a second time errors out).

Or, something along those lines.  Similar to how other apps work
with/without a trailing /.

-- 
Freddie Cash
fjwc...@gmail.com


Re: efficiency of btrfs cow

2011-03-06 Thread Freddie Cash
On Sun, Mar 6, 2011 at 8:02 AM, Fajar A. Nugraha l...@fajar.net wrote:
 On Sun, Mar 6, 2011 at 10:46 PM, Brian J. Murrell br...@interlinx.bc.ca 
 wrote:
 # cp -al /backup/previous-backup/ /backup/current-backup
 # rsync -aAHX ... --exclude /backup / /backup/current-backup

 The shortcoming of this of course is that it just takes 1 byte in a
 (possibly huge) file to require that the whole file be recopied to the
 backup.

 If you have snapshots anyway, why not :
 - create a snapshot before each backup run
 - use the same directory (e.g. just /backup), no need to cp anything
 - add --inplace to rsync

You may also want to test with and without --no-whole-file.
That's most useful when the two filesystems are on the same system and
should reduce the amount of data copied around, as it forces rsync to
only use file deltas.  This is very much a win on ZFS, which is also
CoW, so it should be a win on Btrfs.
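For example, something along these lines (paths are placeholders,
untested):

  rsync --archive --inplace --no-whole-file --delete-during \
      /pool/data/ /backup/data/

Without --no-whole-file, rsync defaults to --whole-file for local
copies and just rewrites each changed file in full.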


-- 
Freddie Cash
fjwc...@gmail.com


Re: btrfs wishlist

2011-03-01 Thread Freddie Cash
On Tue, Mar 1, 2011 at 10:39 AM, Chris Mason chris.ma...@oracle.com wrote:
 Excerpts from Roy Sigurd Karlsbakk's message of 2011-03-01 13:35:42 -0500:

 - Pool-like management with multiple RAIDs/mirrors (VDEVs)

 We have a pool of drives now, I'm not sure exactly what the vdevs are.

This functionality is in btrfs already, but it's using different
terminology and configuration methods.

In ZFS, the lowest level in the storage stack is the physical block device.

You group these block devices together into a virtual device (aka
vdev).  The possible vdevs are:
  - single disk vdev, with no redundancy
  - mirror vdev, with any number of devices (n-way mirroring)
  - raidz1 vdev, single-parity redundancy
  - raidz2 vdev, dual-parity redundancy
  - raidz3 vdev, triple-parity redundancy
  - log vdev, separate device for journaling, or as a write cache
  - cache vdev, separate device that acts as a read cache

A ZFS pool is made up of a collection of the vdevs.

For example, a simple, non-redundant pool setup for a laptop would be:
  zpool create laptoppool da0

To create a pool with a dual-parity vdev using 8 disks:
  zpool create mypool raidz2 da0 da1 da2 da3 da4 da5 da6 da7

To later add to the existing pool:
  zpool add mypool raidz2 da8 da9 da10 da11 da12 da13 da14 da15

Later, you create your ZFS filesystems on top of the pool.
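For example (dataset names are just examples):

  zfs create mypool/home
  zfs create -o compression=on mypool/backups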

With btrfs, you set up the redundancy and the filesystem all in one
shot, thus combining the vdev with the pool (aka filesystem).
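As far as I understand it, the btrfs equivalent of the zpool + zfs
steps above is a single step, something like (device names are
examples):

  mkfs.btrfs -m raid10 -d raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde
  mount /dev/sdb /storage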

ZFS has better separation of the different layers (device, pool,
filesystem), and better tools for working with them (zpool / zfs) but
similar functionality is (or at least appears to be) in btrfs already.

Using device mapper / md underneath btrfs also gives you a similar setup to ZFS.

-- 
Freddie Cash
fjwc...@gmail.com


Re: Synching a Backup Server

2011-01-24 Thread Freddie Cash
On Sat, Jan 22, 2011 at 5:45 AM, Hugo Mills hugo-l...@carfax.org.uk wrote:
 On Fri, Jan 21, 2011 at 11:28:19AM -0800, Freddie Cash wrote:
 So, is Btrfs pooled storage or not?  Do you throw 24 disks into a
 single Btrfs filesystem, and then split that up into separate
 sub-volumes as needed?

   Yes, except that the subvolumes aren't quite as separate as you
 seem to think that they are. There's no preallocation of storage to a
 subvolume (in the way that LVM works), so you're only limited by the
 amount of free space in the whole pool. Also, data stored in the pool
 is actually free for use by any subvolume, and can be shared (see the
 deeper explanation below).

Ah, perfect, that I understand.  :)  It's the same with ZFS:  you add
storage to a pool, filesystems in the pool are free to use as much as
is available, and you don't have to pre-allocate or partition or
anything like that.  ZFS supports quotas and reservations, though, so
you can (if you want/need) allocate bytes to specific filesystems.

  From the looks of things, you don't have to
 partition disks or worry about sizes before formatting (if the space
 is available, Btrfs will use it).  But it also looks like you still
 have to manage disks.

 Or, maybe it's just that the initial creation is done via mkfs (as in,
 formatting a partition with a filesystem) that's tripping me up after
 using ZFS for so long (zpool creates the storage pool, manages the
 disks, sets up redundancy levels, etc;  zfs creates filesystems and
 volumes, and sets properties; no newfs/mkfs involved).

   So potentially zpool -> mkfs.btrfs, and zfs -> btrfs. However, I
 don't know enough about ZFS internals to know whether this is a
 reasonable analogy to make or not.

That's what I figured.  It's not a perfect analogue, but it's close
enough.  Clears things up a bit.

The big difference is that ZFS separates storage management (the pool)
from filesystem management, while btrfs creates a pool underneath
one filesystem and allows you to split it up via sub-volumes.
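So the sub-volume step, if I have it right, is just something like
(assuming the btrfs filesystem is mounted at /mnt/pool):

  btrfs subvolume create /mnt/pool/home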

I think I'm figuring this out.  :)

   Note that the actual file data, and the management of its location
 on the disk (and its replication), is completely shared across
 subvolumes. The same extent may be used multiple times by different
 files, and those files may be in any subvolumes on the filesystem. In
 theory, the same extent could even appear several times in the same
 file. This sharing is how snapshots and COW copies are implemented.
 It's also the basis for Josef's dedup implementation.

That's similar to how ZFS works, only they use blocks instead of
extents, but it works in a similar manner.

I think I've got this mostly figured out.

Now, to just wait for multiple parity redundancy (RAID5/6/+) support
to hit the tree, so I can start playing around with it.  :)

Thanks for taking the time to explain some things.  Sorry if I came
across as being harsh or whatnot.

-- 
Freddie Cash
fjwc...@gmail.com


Re: Synching a Backup Server

2011-01-21 Thread Freddie Cash
On Sun, Jan 9, 2011 at 10:30 AM, Hugo Mills hugo-l...@carfax.org.uk wrote:
 On Sun, Jan 09, 2011 at 09:59:46AM -0800, Freddie Cash wrote:
 Let see if I can match up the terminology and layers a bit:

 LVM Physical Volume == Btrfs disk == ZFS disk / vdevs
 LVM Volume Group == Btrfs filesystem == ZFS storage pool
 LVM Logical Volume == Btrfs subvolume == ZFS volume
 'normal' filesysm == Btrfs subvolume (when mounted) == ZFS filesystem

 Does that look about right?

   Kind of. The thing is that the way that btrfs works is massively
 different to the way that LVM works (and probably massively different
 to the way that ZFS works, but I don't know much about ZFS, so I can't
 comment there). I think that trying to think of btrfs in LVM terms is
 going to lead you to a large number of incorrect conclusions. It's
 just not a good model to use.

My biggest issue trying to understand Btrfs is figuring out the layers involved.

With ZFS, it's extremely easy:

disks --> vdev --> pool --> filesystems

With LVM, it's fairly easy:

disks --> volume group --> volumes --> filesystems

But, Btrfs doesn't make sense to me:

disks --> filesystem --> sub-volumes???

So, is Btrfs pooled storage or not?  Do you throw 24 disks into a
single Btrfs filesystem, and then split that up into separate
sub-volumes as needed?  From the looks of things, you don't have to
partition disks or worry about sizes before formatting (if the space
is available, Btrfs will use it).  But it also looks like you still
have to manage disks.

Or, maybe it's just that the initial creation is done via mkfs (as in,
formatting a partition with a filesystem) that's tripping me up after
using ZFS for so long (zpool creates the storage pool, manages the
disks, sets up redundancy levels, etc;  zfs creates filesystems and
volumes, and sets properties; no newfs/mkfs involved).

It looks like ZFS, Btrfs, and LVM should work in a similar manner, but
the overloaded terminology (pool, volume, sub-volume, and filesystem
mean different things in all three) and the new terminology that's
only in Btrfs are confusing.

 Just curious, why all the new terminology in btrfs for things that
 already existed?  And why are old terms overloaded with new meanings?
 I don't think I've seen a write-up about that anywhere (or I don't
 remember it if I have).

   The main awkward piece of btrfs terminology is the use of RAID to
 describe btrfs's replication strategies. It's not RAID, and thinking
 of it in RAID terms is causing lots of confusion. Most of the other
 things in btrfs are, I think, named relatively sanely.

No, the main awkward piece of btrfs terminology is overloading
"filesystem" to mean "collection of disks" and creating "sub-volume"
to mean "filesystem".  At least, that's how it looks from way over
here.  :)

 Perhaps it's time to start looking at separating the btrfs pool
 creation tools out of mkfs (or renaming mkfs.btrfs), since you're
 really building a a storage pool, and not a filesystem.  It would
 prevent a lot of confusion with new users.  It's great that there's a
 separate btrfs tool for manipulating btrfs setups, but mkfs.btrfs is
 just wrong for creating the btrfs setup.

   I think this is the wrong thing to do. I hope my explanation above
 helps.

As I understand it, mkfs.btrfs is used to create the initial
filesystem across X disks with Y redundancy.  For everything else
afterward, the btrfs tool is used to add disks, create snapshots,
delete snapshots, change redundancy settings, create sub-volumes, etc.
Why not just add a create option to btrfs and retire mkfs.btrfs
completely?  Or rework mkfs.btrfs to create sub-volumes of an existing
btrfs setup?

What would be great is an image that shows the layers in
Btrfs and how they interact with the userspace tools.

Having a set of graphics that compared the layers in Btrfs with the
layers in the normal Linux disk/filesystem partitioning scheme, and
the LVM layering, would be best.

There's lots of info in the wiki, but no images, ASCII-art, graphics,
etc.  Trying to picture this mentally is not working.  :)

-- 
Freddie Cash
fjwc...@gmail.com


Re: Backup Command

2011-01-21 Thread Freddie Cash
On Fri, Jan 21, 2011 at 11:07 AM,  cac...@quantum-sci.com wrote:

 Well thanks to some help from you guys I seem to have my backup server almost 
 fully running and functional with rsync.  Amazing functions, this 
 snapshotting and rsync.

 I still don't know why I cannot remove snapshots though. (Debian Testing with 
 2.6.32-28)

 And I don't know how to reach out from the backup server to the HTPC and stop 
 MythTV there, so I can export the Mysql database safely, from a cron job on 
 the backup server.  Suggestions?

Simplified, but workable:

#!/bin/sh

ssh someu...@mythtv.pc /path/to/some/script stop

/path/to/your/rsync/script

ssh someu...@mythtv.pc /path/to/some/script start

The above script would be your backup wrapper script, which gets called by cron.

On the HTPC, /path/to/some/script would be a script that takes a
single argument (stop or start).

The stop argument would stop MythTV using its init script, then do
whatever you need to do to the database (stop it, dump it, whatever).

The start argument would do the reverse, starting the database and MythTV.
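A rough sketch of /path/to/some/script on the HTPC (the init script
name, database name, and dump path are guesses; adjust to match your
setup):

#!/bin/sh
# stop: shut down MythTV and dump its database; start: the reverse.
case "$1" in
    stop)
        /etc/init.d/mythtv-backend stop
        # mythconverg is the usual MythTV database name
        mysqldump -u root --password=secret mythconverg > /var/backups/mythconverg.sql
        ;;
    start)
        /etc/init.d/mythtv-backend start
        ;;
    *)
        echo "Usage: $0 stop|start"
        exit 1
        ;;
esac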

And, /path/to/your/rsync/script would call your actual backups
script that runs rsync.

-- 
Freddie Cash
fjwc...@gmail.com


Re: Synching a Backup Server

2011-01-09 Thread Freddie Cash
On Sun, Jan 9, 2011 at 7:32 AM, Alan Chandler
a...@chandlerfamily.org.uk wrote:
 I think I start to get it now.  Its the fact that subvolumes can be
 snapshotted etc without mounting them that is the difference.  I guess I am
 too used to thinking like LVM and I was thinking subvolumes where like an
 LV.  They are, but not quite the same.

Let's see if I can match up the terminology and layers a bit:

LVM Physical Volume == Btrfs disk == ZFS disk / vdevs
LVM Volume Group == Btrfs filesystem == ZFS storage pool
LVM Logical Volume == Btrfs subvolume == ZFS volume
'normal' filesystem == Btrfs subvolume (when mounted) == ZFS filesystem

Does that look about right?

LVM: A physical volume is the lowest layer in LVM and they are
combined into a volume group which is then split up into logical
volumes, and formatted with a filesystem.

Btrfs: A bunch of disks are formatted into a btrfs filesystem
which is then split up into sub-volumes (sub-volumes are
auto-formatted with a btrfs filesystem).

ZFS: A bunch of disks are combined into virtual devices, then combined
into a ZFS storage pool, which can be split up into either volumes
formatted with any filesystem, or ZFS filesystems.

Just curious, why all the new terminology in btrfs for things that
already existed?  And why are old terms overloaded with new meanings?
I don't think I've seen a write-up about that anywhere (or I don't
remember it if I have).

Perhaps it's time to start looking at separating the btrfs pool
creation tools out of mkfs (or renaming mkfs.btrfs), since you're
really building a storage pool, and not a filesystem.  It would
prevent a lot of confusion with new users.  It's great that there's a
separate btrfs tool for manipulating btrfs setups, but mkfs.btrfs is
just wrong for creating the btrfs setup.
-- 
Freddie Cash
fjwc...@gmail.com


Re: Various Questions

2011-01-08 Thread Freddie Cash
On Sat, Jan 8, 2011 at 5:25 AM, Carl Cook cac...@quantum-sci.com wrote:

 In addition to the questions below, if anyone has a chance could you advise 
 on why my destination drive has more data  than the source after this command:
 # rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* 
 /home
 sending incremental file list

What happens if you delete /home, then run the command again, but
without the *?  You generally don't use wildcards for the source or
destination when using rsync.  You just tell it which directory to
start in.
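Something like this instead (note the trailing slash on the source,
which tells rsync to copy the contents of /media/disk rather than the
directory itself):

  rsync --hard-links --delete --inplace --archive --numeric-ids \
      /media/disk/ /home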

If you do an ls /home and an ls /media/disk, are they different?

-- 
Freddie Cash
fjwc...@gmail.com


Re: Various Questions

2011-01-07 Thread Freddie Cash
On Fri, Jan 7, 2011 at 9:15 AM, Carl Cook cac...@quantum-sci.com wrote:
 How do you know what options to rsync are on by default?  I can't find this 
 anywhere.  For example, it seems to me that --perms -ogE  --hard-links and 
 --delete-excluded should be on by default, for a true sync?

Who cares which ones are on by default?  List the ones you want to use
on the command-line, every time.  That way, if the defaults change,
your setup won't.

 If using the  --numeric-ids switch for rsync, do you just have to manually 
 make sure the IDs and usernames are the same on source and destination 
 machines?

You use the --numeric-ids switch so that it *doesn't* matter if the
IDs/usernames are the same.  It just sends the ID number on the wire.
Sure, if you do an ls on the backup box, the username will appear to
be messed up.  But if you compare the user ID assigned to the file,
and the user ID in the backed-up etc/passwd file, they are correct.
Then, if you ever need to restore the HTPC from backups, the
etc/passwd file is transferred over, the user IDs are transferred
over, and when you do an ls on the HTPC, everything matches up
correctly.

 For files that fail to transfer, wouldn't it be wise to use  
 --partial-dir=DIR to at least recover part of lost files?

Or, just run rsync again, if the connection is dropped.

 The rsync man page says that rsync uses ssh by default, but is that the case? 
  I think -e may be related to engaging ssh, but don't understand the 
 explanation.

Does it matter what the default is, if you specify exactly how you
want it to work on the command-line?

 So for my system where there is a backup server, I guess I run the rsync 
 daemon on the backup server which presents a port, then when the other 
 systems decide it's time for a backup (cron) they:
 - stop mysql, dump the database somewhere, start mysql;
 - connect to the backup server's rsync port and dump their data to 
 (hopefully) some specific place there.
 Right?

That's one way (push backups).  It works ok for small numbers of
systems being backed up.  But get above a handful of machines, and it
gets very hard to time everything so that you don't hammer the disks
on the backup server.

Pull backups (where the backup server does everything) work better,
in my experience.  Then you just script things up once, run one
script, worry about one schedule, and everything is stored on the
backups server.  No need to run rsync daemons everywhere; just run the
rsync client, using -e ssh, and let it do everything.

If you need it to run a script on the remote machine first, that's
easy enough to do (see the sketch after this list):
  - ssh to remote system, run script to stop DBs, dump DBs, snapshot
FS, whatever
  - then run rsync
  - ssh to remote system, run script to start DBs, delete snapshot, whatever
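For example, a minimal pull-backup wrapper run from cron on the
backup box could look like this (hostnames, paths, and the pre/post
script names are all placeholders):

#!/bin/sh
# Pull backup of one remote host, run from cron on the backup server.
HOST=htpc

ssh root@${HOST} /usr/local/sbin/pre-backup.sh    # stop/dump DBs, snapshot, etc.

rsync --archive --hard-links --numeric-ids --delete-during \
    --exclude-from=/etc/backups/exclude.${HOST} \
    -e ssh root@${HOST}:/ /backups/${HOST}/

ssh root@${HOST} /usr/local/sbin/post-backup.sh   # restart DBs, clean up, etc.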

You're starting to over-think things.  Keep it simple, don't worry
about defaults, specify everything you want to do, and do it all from
the backups box.

-- 
Freddie Cash
fjwc...@gmail.com


Re: Synching a Backup Server

2011-01-06 Thread Freddie Cash
On Thu, Jan 6, 2011 at 9:35 AM, Carl Cook cac...@quantum-sci.com wrote:

 I am setting up a backup server for the garage, to back up my HTPC in case of 
 theft or fire.  The HTPC has a 4TB RAID10 array (mdadm, JFS), and will be 
 connected to the backup server using GB ethernet.  The backup server will 
 have a 4TB BTRFS RAID0 array.  Debian Testing running on both.

 I want to keep a duplicate copy of the HTPC data, on the backup server, and I 
 think a regular full file copy is not optimal and may take days to do.  So 
 I'm looking for a way to sync the arrays at some interval.  Ideally the sync 
 would scan the HTPC with a CRC check to look for differences, copy over the 
 differences, then email me on success.

 Is there a BTRFS tool that would do this?

No, but there's a great tool called rsync that does exactly what you want.  :)

This is (basically) the same setup we use at work to backup all our
remote Linux/FreeBSD systems to a central backups server (although our
server runs FreeBSD+ZFS).

Just run rsync on the backup server, tell it to connect via ssh to the
remote server, and rsync / (root filesystem) into /backups/htpc/ (or
whatever directory you want).  Use an exclude file to exclude the
directories you don't want backed up (like /proc, /sys, /dev).

If you are comfortable compiling software, then you should look into
adding the HPN patches to OpenSSH, and enabling the None cipher.  That
will give you a 30-40% increase in network throughput.

After the rsync completes, snapshot the filesystem on the backup
server, using the current date for the name.

Then repeat the rsync process the next day, into the exact same
directory.  Only files that have changed will be transferred.  Then
snapshot the filesystem using the current date.

And repeat ad nauseum.  :)

Some useful rsync options to read up on:
  --hard-links
  --numeric-ids
  --delete-during
  --delete-excluded
  --archive

The first time you run the rsync command, it will take awhile, as it
transfers every file on the HTPC to the backups server.  However, you
can stop and restart this process as many times as you like.  rsync
will just pick up where it left off.

 Also with this system, I'm concerned that if there is corruption on the HTPC, 
 it could be propagated to the backup server.  Is there some way to address 
 this?  Longer intervals to sync, so I have a chance to discover?

Using snapshots on the backup server allows you to go back in time to
recover files that may have been accidentally deleted, or to recover
files that have been corrupted.

Be sure to use rsync 3.x, as that will start transferring data a *lot*
sooner, shortening the overall time needed for the sync.  rsync 2.x
scans the entire remote filesystem first, builds a list of files, then
compares that list to the files on the backup server.  rsync 3.x scans
a couple directories, then starts transferring data while scanning
ahead.

Once you have a working command-line for rsync, adding it to a script
and then using cron to schedule it completes the setup.
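Roughly like this (untested; the snapshot only works if
/backups/htpc/current is itself a btrfs subvolume, and the date format
is just an example):

#!/bin/sh
# Nightly rsync of the HTPC, then a dated btrfs snapshot of the result.
rsync --archive --hard-links --numeric-ids --delete-during --delete-excluded \
    --exclude-from=/etc/backups/exclude.htpc \
    -e ssh root@htpc:/ /backups/htpc/current/

btrfs subvolume snapshot /backups/htpc/current /backups/htpc/$(date +%Y-%m-%d)

A single crontab entry (say, 0 3 * * * /usr/local/sbin/backup-htpc.sh)
then runs it nightly.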

Works beautifully.  :)  Saved our bacon several times over the past 2 years.
-- 
Freddie Cash
fjwc...@gmail.com


Re: Synching a Backup Server

2011-01-06 Thread Freddie Cash
On Thu, Jan 6, 2011 at 11:33 AM, Marcin Kuk marcin@gmail.com wrote:
 Rsync is good, but not for all cases. Be aware of databases files -
 you should do snapshot filesystem before rsyncing.

We script a dump of all databases before the rsync runs, so we get
both text and binary backups.  If restoring the binary files doesn't
work, then we just suck in the text dumps.

If the remote system supports snapshots, doing a snapshot before the
rsync runs is a good idea, though.  It'll be nice when more
filesystems support in-line snapshots.  The LVM method is pure crap.

-- 
Freddie Cash
fjwc...@gmail.com


Re: Synching a Backup Server

2011-01-06 Thread Freddie Cash
On Thu, Jan 6, 2011 at 12:07 PM, C Anthony Risinger anth...@extof.me wrote:
 On Thu, Jan 6, 2011 at 1:47 PM, Freddie Cash fjwc...@gmail.com wrote:
 On Thu, Jan 6, 2011 at 11:33 AM, Marcin Kuk marcin@gmail.com wrote:
 Rsync is good, but not for all cases. Be aware of databases files -
 you should do snapshot filesystem before rsyncing.

 We script a dump of all databases before the rsync runs, so we get
 both text and binary backups.  If restoring the binary files doesn't
 work, then we just suck in the text dumps.

 If the remote system supports snapshots, doing a snapshot before the
 rsync runs is a good idea, though.  It'll be nice when more
 filesystems support in-line snapshots.  The LVM method is pure crap.

 do you also use the --in-place option for rsync?  i would think this
 is critical to getting the most out of btrfs folding backups, ie.
 the most reuse between snapshots?  im able to set this exact method up
 for my home network, thats why i ask... i have a central server that
 runs everything, and i want to sync a couple laptops and netbooks
 nightly, and a few specific directories whenever they change.  btrfs
 on both ends.

Yes, we do use --inplace, forgot about that one.

Full rsync command used:
${rsync} ${rsync_options} \
  --exclude-from=${defaultsdir}/${rsync_exclude} ${rsync_exclude_server} \
  --rsync-path=${rsync_path} \
  --rsh="${ssh} -p ${rsync_port} -i ${defaultsdir}/${rsync_key}" \
  --log-file=${logdir}/${rsync_server}.log \
  ${rsync_user}@${rsync_server}:${basedir}/ \
  ${backupdir}/${sitedir}/${serverdir}/${basedir}/

Where rsync_options is:
--archive --delete-during --delete-excluded --hard-links --inplace
--numeric-ids --stats

 better yet, any chance you'd share some scripts? :-)

A description of what we use, including all scripts, is here:
http://forums.freebsd.org/showthread.php?t=11971

 as for the DB stuff, you definitely need to snapshot _before_ rsync.  roughly:

 ) read lock and flush tables
 ) snapshot
 ) unlock tables
 ) mount snapshot
 ) rsync from snapshot

Unfortunately, we don't use btrfs or LVM on remote servers, so there's
no snapshotting available during the backup run.  In a perfect world,
btrfs would be production-ready, ZFS would be available on Linux, and
we'd no longer need the abomination called LVM.  :)

Until then, DB text dumps are our fall-back.  :)

-- 
Freddie Cash
fjwc...@gmail.com


Re: Synching a Backup Server

2011-01-06 Thread Freddie Cash
On Thu, Jan 6, 2011 at 1:06 PM, Gordan Bobic gor...@bobich.net wrote:
 Unfortunately, we don't use btrfs or LVM on remote servers, so there's
 no snapshotting available during the backup run.  In a perfect world,
 btrfs would be production-ready, ZFS would be available on Linux, and
 we'd no longer need the abomination called LVM.  :)

 As a matter of fact, ZFS _IS_ available on Linux:
 http://zfs.kqinfotech.com/

Available, usable, and production-ready are not synonymous.  :)
ZFS on Linux is not even in the experimental/testing stage right now.

ZFS-fuse is good for proof-of-concept stuff, but chokes on heavy
usage, especially with dedupe enabled.  We tried it for a couple weeks
to see what was available in ZFS versions above 14, but couldn't keep
it running for more than a day or two at a time.  Supposedly, things
are better now, but I wouldn't trust 15 TB of backups to it.  :)

The Lawrence Livermore ZFS module for Linux doesn't support ZFS
filesystems yet, only ZFS volumes.  It should be usable as an LVM
replacement, though, or as an iSCSI target box.  Haven't tried it yet.

The Middle-East (forget which country it's from) ZFS module for Linux
is in the private beta stage, but only available for a few distros and
kernel versions, and is significantly slower than ZFS on FreeBSD.
Hopefully, it will enter public beta this year, it sounds promising.
Don't think I'd trust 15 TB of backups to it for at least another
year, though.

If btrfs gets dedupe, nicer disk management (it's hard to go back to
non-pooled storage now), a working fsck (or similar), and integration
into Debian, then we may look at that as well.  :)

-- 
Freddie Cash
fjwc...@gmail.com


Re: Synching a Backup Server

2011-01-06 Thread Freddie Cash
On Thu, Jan 6, 2011 at 1:42 PM, Carl Cook cac...@quantum-sci.com wrote:
 On Thu 06 January 2011 11:16:49 Freddie Cash wrote:
  Also with this system, I'm concerned that if there is corruption on the 
  HTPC, it could be propagated to the backup server.  Is there some way to 
  address this?  Longer intervals to sync, so I have a chance to discover?

 Using snapshots on the backup server allows you to go back in time to
 recover files that may have been accidentally deleted, or to recover
 files that have been corrupted.

 How?  I can see that rsync will not transfer the files that have not changed, 
 but I assume it transfers the changed ones.  How can you go back in time?  Is 
 there like a snapshot file that records the state of all files there?

I don't know the specifics of how it works in btrfs, but it should be
similar to how ZFS does it.  The gist of it is:

Each snapshot gives you a point-in-time view of the entire filesystem.
 Each snapshot can be mounted (ZFS is read-only; btrfs is read-only or
read-write).  So, you mount the snapshot for 2010-12-15 onto /mnt,
then cd to the directory you want (/mnt/htpc/home/fcash/videos/) and
copy the file out that you want to restore (cp coolvid.avi ~/).

With ZFS, things are nice and simple:
  - each filesystem has a .zfs/snapshot directory
  - in there are sub-directories, each named after the snapshot name
  - cd into the snapshot name, the OS auto-mounts the snapshot, and off you go

Btrfs should be similar?  Don't know the specifics.
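If the backup box keeps the dated snapshots as plain directories
(like the setup described earlier in this thread), restoring a single
file is presumably nothing more than (paths are made up):

  cd /backups/snapshots/2010-12-15/htpc/home/fcash/videos/
  cp coolvid.avi ~/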

How it works internally is some of the magic and the beauty of
Copy-on-Write filesystems.  :)
-- 
Freddie Cash
fjwc...@gmail.com


Re: Synching a Backup Server

2011-01-06 Thread Freddie Cash
On Thu, Jan 6, 2011 at 1:44 PM, Carl Cook cac...@quantum-sci.com wrote:
 On Thu 06 January 2011 12:07:17 C Anthony Risinger wrote:
 as for the DB stuff, you definitely need to snapshot _before_ rsync.  
 roughly:

 ) read lock and flush tables
 ) snapshot
 ) unlock tables
 ) mount snapshot
 ) rsync from snapshot

 ie. the same as whats needed for LVM:

 http://blog.dbadojo.com/2007/09/mysql-backups-using-lvm-snapshots.html

 to get the DB file on disk consistent prior to archiving.

 I'm a little alarmed by this.  Running a mysql server for MythTV database.   
 Do these operations need to somehow be done before rsync?  Or Else?

 I don't understand what you're saying.

The simplest solution is to write a script that creates a mysqldump of
all databases into a directory, and add that to cron so that it runs
at the same time every day, 10-15 minutes before the rsync run starts.
That way, the rsync to the backup server picks up both the text dump
of the database(s) and the binary files under /var/lib/mysql/* (the
actual running database).

When you need to restore the HTPC due to failed harddrive or what not,
you just rsync everything back to the new harddrive and try to run
MythTV.  If things work, great, done.  If something is wonky, then
delete all the MySQL tables/databases, and use the dump file to
recreate things.

Something like this:
#!/bin/bash
# Backup mysql databases.
#
# Take a list of databases, and dump each one to a separate file.

debug=0

while getopts hv OPTION; do
    case ${OPTION} in
        h)
            echo "Usage: $0 [-h] [-v]"
            echo ""
            echo "-h  show this help blurb"
            echo "-v  be verbose about what's happening"
            exit 0
            ;;
        v)
            debug=1
            ;;
    esac
done

for I in $( mysql -u root --password=blahblahblah -Bse "show databases" ); do
    OUTFILE=/var/backups/$I.sql
    if [ $debug = 1 ]; then
        echo -n "Doing backup for $I:"
    fi

    /usr/bin/mysqldump -u root --password=blahblahblah --opt $I > $OUTFILE
    /bin/chmod 600 $OUTFILE

    if [ $debug = 1 ]; then
        echo " done."
    fi
done

exit 0

That will create a text dump of everything in each database, one file
per database.  Each dump can be fed to the mysql command to recreate
the database at a later date.
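Restoring one of those dumps later is just something like (create the
empty database first if it doesn't already exist):

  mysql -u root --password=blahblahblah somedatabase < /var/backups/somedatabase.sql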

-- 
Freddie Cash
fjwc...@gmail.com


Re: Offline Deduplication for Btrfs

2011-01-05 Thread Freddie Cash
On Wed, Jan 5, 2011 at 11:46 AM, Josef Bacik jo...@redhat.com wrote:
 Dedup is only usefull if you _know_ you are going to have duplicate 
 information,
 so the two major usecases that come to mind are

 1) Mail server.  You have small files, probably less than 4k (blocksize) that
 you are storing hundreds to thousands of.  Using dedup would be good for this
 case, and you'd have to have a small dedup blocksize for it to be usefull.

 2) Virtualized guests.  If you have 5 different RHEL5 virt guests, chances are
 you are going to share data between them, but unlike with the mail server
 example, you are likely to find much larger chunks that are the same, so you'd
 want a larger dedup blocksize, say 64k.  You want this because if you did just
 4k you'd end up with a ridiculous amount of framentation and performance would
 go down the toilet, so you need a larger dedup blocksize to make for better
 performance.

You missed out on the most obvious, and useful, use case for dedupe:
  central backups server.

Our current backup server does an rsync backup of 127 servers every
night into a single ZFS pool.  90+ of those servers are identical
Debian installs (school servers), 20-odd of those are identical
FreeBSD installs (firewalls/routers), and the rest are mail/web/db
servers (Debian, Ubuntu, RedHat, Windows).

Just as a test, we copied a week of backups to a Linux box running
ZFS-fuse with dedupe enabled, and had a combined dedupe/compress
ratio in the low double-digits (11 or 12x, something like that).  Now
we're just waiting for ZFSv22+ to hit FreeBSD to enable dedupe on the
backups server.

For backups, and central storage for VMs, online dedupe is a massive
win.  Offline, maybe.  Either way, dedupe is worthwhile.

-- 
Freddie Cash
fjwc...@gmail.com


Re: Offline Deduplication for Btrfs

2011-01-05 Thread Freddie Cash
On Wed, Jan 5, 2011 at 12:15 PM, Josef Bacik jo...@redhat.com wrote:
 Yeah for things where you are talking about sending it over the network or
 something like that every little bit helps.  I think deduplication is far more
 interesting and usefull at an application level than at a filesystem level.  
 For
 example with a mail server, there is a good chance that the files will be
 smaller than a blocksize and not be able to be deduped, but if the application
 that was storing them recognized that it had the same messages and just linked
 everything in its own stuff then that would be cool.  Thanks,

Cyrus IMAP and Zimbra (probably a lot of others) already do that,
hard-linking identical message bodies.  The e-mail server use-case
for dedupe is pretty much covered already.


-- 
Freddie Cash
fjwc...@gmail.com


Re: Offline Deduplication for Btrfs

2011-01-05 Thread Freddie Cash
On Wed, Jan 5, 2011 at 5:03 PM, Gordan Bobic gor...@bobich.net wrote:
 On 01/06/2011 12:22 AM, Spelic wrote:
 Definitely agree that it should be a per-directory option, rather than per
 mount.

JOOC, would the dedupe table be done per directory, per mount, per
sub-volume, or per volume?  The larger the pool of data to check
against, the better your dedupe ratios will be.

I'm not up-to-date on all the terminology that btrfs uses, and how it
compares to ZFS (disks --> vdevs --> pool --> filesystem/volume), so the
terms above may be incorrect.  :)

In the ZFS world, dedupe is done pool-wide in that any block in the
pool is a candidate for dedupe, but the dedupe property can be
enabled/disabled on a per-filesystem basis.  Thus, only blocks in
filesystems with the dedupe property enabled will be deduped.  But
blocks from any filesystem can be compared against.

 This is the point I was making - you end up paying double the cost in disk
 I/O and the same cost in CPU terms if you do it offline. And I am not
 convniced the overhead of calculating checksums is that great. There are
 already similar overheads in checksums being calculated to enable smart data
 recovery in case of silent disk corruption.

 Now that I mentioned, that, it's an interesting point. Could these be
 unified? If we crank up the checksums on files a bit, to something suitably
 useful for deduping, it could make the deduping feature almost free.

This is what ZFS does.  Every block in the pool has a checksum
attached to it.  Originally, the default algorithm was fletcher2, with
fletcher4 and sha256 as alternates.  When dedupe was enabled, the
default was changed to fletcher4.  Dedupe also came with the option to
enable/disable a byte-for-byte verify when the hashes match.

By switching the checksum algorithm for the pool to sha256 ahead of
time, you can enable dedupe, and get the dedupe checksumming for free.
 :)
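In ZFS terms that's just two properties (pool name is an example):

  zfs set checksum=sha256 mypool
  zfs set dedup=on mypool

(dedup=verify gets you the byte-for-byte comparison mentioned above.)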

 Also, the OS is small even if identical on multiple virtual images, how
 much is going to occupy anyway? Less than 5GB per disk image usually.
 And that's the only thing that would be deduped because data likely to
 be different on each instance. How many VMs running you have? 20? That's
 at most 100GB saved one-time at the cost of a lot of fragmentation.

 That's also 100GB fewer disk blocks in contention for page cache. If you're
 hitting the disks, you're already going to slow down by several orders of
 magnitude. Better to make the caching more effective.

If you set up your VMs as diskless images, using NFS off a storage
server running whatever FS with dedupe, you can get a lot more out
of it than using disk image files (where you have all the block sizes
and alignment to worry about).  And then you can use all the fancy
snapshotting, cloning, etc. features of whatever FS as well.

-- 
Freddie Cash
fjwc...@gmail.com


Re: Appending data to the middle of a file using btrfs-specific features

2010-12-06 Thread Freddie Cash
On Mon, Dec 6, 2010 at 11:14 AM, Nirbheek Chauhan
nirbheek.chau...@gmail.com wrote:
 As an aside, my primary motivation for this was that doing an
 incremental backup of things like git bare repositories and databases
 using btrfs subvolume snapshots is expensive w.r.t. disk space. Even
 though rsync calculates a binary delta before transferring data, it
 has to write everything out (except if just appending). So in that
 case, each incremental backup is hardly so.

Since btrfs is Copy-on-Write, have you experimented with --inplace on
the rsync command-line?  That way, rsync writes the changes over top
of the existing file, thus allowing btrfs to only write out the blocks
that have changed, via CoW?

We do this with our ZFS rsync backups, and found disk usage to go way
down compared to the default "write out new data to a new file, rename
over top" method that rsync uses.

There's also the --no-whole-file option which causes rsync to only
send delta changes for existing files, another useful feature with CoW
filesystems.

-- 
Freddie Cash
fjwc...@gmail.com


Re: Appending data to the middle of a file using btrfs-specific features

2010-12-06 Thread Freddie Cash
On Mon, Dec 6, 2010 at 12:30 PM, Nirbheek Chauhan
nirbheek.chau...@gmail.com wrote:
 On Tue, Dec 7, 2010 at 1:05 AM, Freddie Cash fjwc...@gmail.com wrote:
 On Mon, Dec 6, 2010 at 11:14 AM, Nirbheek Chauhan
 nirbheek.chau...@gmail.com wrote:
 As an aside, my primary motivation for this was that doing an
 incremental backup of things like git bare repositories and databases
 using btrfs subvolume snapshots is expensive w.r.t. disk space. Even
 though rsync calculates a binary delta before transferring data, it
 has to write everything out (except if just appending). So in that
 case, each incremental backup is hardly so.

 Since btrfs is Copy-on-Write, have you experimented with --inplace on
 the rsync command-line?  That way, rsync writes the changes over-top
 of the existing file, thus allowing btrfs to only write out the blocks
 that have changed, via CoW?

 We do this with our ZFS rsync backups, and found disk usage to go way
 down over the default write out new data to new file, rename overtop
 method that rsync uses.

 There's also the --no-whole-file option which causes rsync to only
 send delta changes for existing files, another useful feature with CoW
 filesystems.

 I had tried the --inplace option, but it didn't seem to do anything
 for me, so I didn't explore that further. However, after following
 your suggestion and retrying with --no-whole-file, I see that the
 behaviour is quite different! It seems that --whole-file is enabled by
 default for local file transfers, and so --inplace had no effect.

Yes, correct, --whole-file is used for local transfers since it's
assumed you have all the disk I/O in the world, so why try to limit
the amount of data transferred.  :)

 But the behaviour of --inplace is not entirely to write out *only* the
 blocks that have changed. From what I could make out, it does the
 following:

 (1) Calculate a delta b/w the src and trg files
 (2) Seek to the first difference in the target file
 (3) Start writing data

That may be true, I've never looked into the actual algorithm(s) that
rsync uses.  Just played around with CLI options until we found the
set that works best in our situation (--inplace --delete-during
--no-whole-file --numeric-ids --hard-links --archive, over SSH with
HPN patches).

 I'm glossing over the final step because I didn't look deeper, but I
 think you can safely assume that after the first difference, all data
 is rewritten. So this is halfway between rewrite the whole file and
 write only the changed bits into the file. It doesn't actually use
 any CoW features from what I can see. There is lots of room for btrfs
 reflinking magic. :)

 Note that I tested this behaviour on a btrfs partition with a vanilla
 rsync-3.0.7 tarball; the copy you use with ZFS might be doing some CoW
 magic.

All the CoW magic is handled by the filesystem, and not the tools on
top.  If the tool only updates X bytes, which fit into 1 block on the
fs, then only that 1 block gets updated via CoW.

Personally, I don't think the tools need to be updated to understand
CoW or to integrate with the underlying FS.  Instead, they should just
operate on blocks of X size, and let the FS figure out what to do.

Otherwise, you end up with an "rsync for ZFS", an "rsync for BtrFS",
an "rsync for FAT32", etc.

But, I'm just a lowly sysadmin, what do I know about filesystem internals?  ;)


-- 
Freddie Cash
fjwc...@gmail.com


Re: What to do about subvolumes?

2010-12-01 Thread Freddie Cash
On Wed, Dec 1, 2010 at 11:35 AM, Hugo Mills hugo-l...@carfax.org.uk wrote:
 On Wed, Dec 01, 2010 at 12:38:30PM -0500, Josef Bacik wrote:
 If you delete your subvolume A, like use the btrfs tool to delete it, you 
 will
 only be stuck with what you changed in snapshot B.  So if you only changed 
 5gig
 worth of information, and you deleted the original subvolume, you would have
 5gig charged to your quota.

   This doesn't work, though, if the owners of the original and
 new subvolume are different:

 Case 1:

  * Porthos creates 10G data.
  * Athos makes a snapshot of Porthos's data.
  * A sysadmin (Richelieu) changes the ownership on Athos's snapshot of
   Porthos's data to Athos.
  * Porthos deletes his copy of the data.

 Case 2:

  * Porthos creates 10G of data.
  * Athos makes a snapshot of Porthos's data.
  * Porthos deletes his copy of the data.
  * A sysadmin (Richelieu) changes the ownership on Athos's snapshot of
   Porthos's data to Athos.

 Case 3:

  * Porthos creates 10G data.
  * Athos makes a snapshot of Porthos's data.
  * Aramis makes a snapshot of Porthos's data.
  * A sysadmin (Richelieu) changes the ownership on Athos's snapshot of
   Porthos's data to Athos.
  * Porthos deletes his copy of the data.

 Case 4:

  * Porthos creates 10G data.
  * Athos makes a snapshot of Porthos's data.
  * Aramis makes a snapshot of Athos's data.
  * Porthos deletes his copy of the data.
   [Consider also Richelieu changing ownerships of Athos's and Aramis's
   data at alternative points in this sequence]

   In each of these, who gets charged (and how much) for their copy of
 the data?

  The idea is you are only charged for what blocks
 you have on the disk.  Thanks,

   My point was that it's perfectly possible to have blocks on the
 disk that are effectively owned by two people, and that the person to
 charge for those blocks is, to me, far from clear. You either end up
 charging twice for a single set of blocks on the disk, or you end up
 in a situation where one person's actions can cause another person's
 quota to fill up. Neither of these is particularly obvious behaviour.

As a sysadmin and as a user, quotas shouldn't be about physical
blocks of storage used but should be about logical storage used.

IOW, if the filesystem is compressed, using 1 GB of physical space to
store 10 GB of data, my quota used should be 10 GB.

Similar for deduplication.  The quota is based on the storage *before*
the file is deduped.  Not after.

Similar for snapshots.  If UserA has 10 GB of quota used, I snapshot
their filesystem, then my quota used would be 10 GB as well.  As
data in my snapshot changes, my quota used is updated to reflect
that (change 1 GB of data compared to snapshot, use 1 GB of quota).

You have to (or at least should) keep two sets of stats for storage usage:
  - logical amount used (real file size, before compression, before
de-dupe, before snapshots, etc)
  - physical amount used (what's actually written to disk)

User-level quotas are based on the logical storage used.
Admin-level quotas (if you want to implement them) would be based on
physical storage used.

Thus, the output of things like df, du, and ls would show the logical
storage used and file sizes.  And you would have an additional
option to those apps (--real or something) to show the actual
storage used and file sizes as stored on disk.

Trying to make quotas and disk usage utilities to work based on what's
physically on disk is just backwards, imo.  And prone to a lot of
confusion.

-- 
Freddie Cash
fjwc...@gmail.com


Re: What to do about subvolumes?

2010-12-01 Thread Freddie Cash
On Wed, Dec 1, 2010 at 1:28 PM, Hugo Mills hugo-l...@carfax.org.uk wrote:
 On Wed, Dec 01, 2010 at 12:24:28PM -0800, Freddie Cash wrote:
 On Wed, Dec 1, 2010 at 11:35 AM, Hugo Mills hugo-l...@carfax.org.uk wrote:
   The idea is you are only charged for what blocks
  you have on the disk.  Thanks,
 
    My point was that it's perfectly possible to have blocks on the
  disk that are effectively owned by two people, and that the person to
  charge for those blocks is, to me, far from clear. You either end up
  charging twice for a single set of blocks on the disk, or you end up
  in a situation where one person's actions can cause another person's
  quota to fill up. Neither of these is particularly obvious behaviour.

 As a sysadmin and as a user, quotas shouldn't be about physical
 blocks of storage used but should be about logical storage used.

 IOW, if the filesystem is compressed, using 1 GB of physical space to
 store 10 GB of data, my quota used should be 10 GB.

 Similar for deduplication.  The quota is based on the storage *before*
 the file is deduped.  Not after.

 Similar for snapshots.  If UserA has 10 GB of quota used, I snapshot
 their filesystem, then my quota used would be 10 GB as well.  As
 data in my snapshot changes, my quota used is updated to reflect
 that (change 1 GB of data compared to snapshot, use 1 GB of quota).

   So if I've got 10G of data, and I snapshot it, I've just used
 another 10G of quota?

Sorry, forgot the per user bit above.

If UserA has 10 GB of data, then UserB snapshots it, UserB's quota
usage is 10 GB.

If UserA has 10 GB of data and snapshots it, then only 10 GB of quota
usage is used, as there is 0 difference between the snapshot and the
filesystem.  As UserA modifies data, their quota usage increases by
the amount that is modified (ie 10 GB data, snapshot, modify 1 GB data
== 11 GB quota usage).

If you combine the two scenarios, you end up with:
  - UserA has 10 GB of data == 10 GB quota usage
  - UserB snapshots UserA's filesystem (clone), so UserB has 10 GB
quota usage (even though 0 blocks have changed on disk)
  - UserA snapshots UserA's filesystem == no change to quota usage (no
blocks on disk have changed)
  - UserA modifies 1 GB of data in the filesystem == 1 GB new quota
usage (11 GB total) (1 GB of blocks owned by UserA have changed, plus
the 10 GB in the snapshot)
  - UserB still only has 10 GB quota usage, since their snapshot
hasn't changed (0 blocks changed)

If UserA deletes their filesystem and all their snapshots, freeing up
11 GB of quota usage on their account, UserB's quota will still be 10
GB, and the blocks on the disk aren't actually removed (still
referenced by UserB's snapshot).

Basically, within a user's account, only the data unique to a snapshot
should count toward the quota.

Across accounts, the original (root) snapshot would count completely
to the new user's quota, and then only data unique to subsequent
snapshots would count.

I hope that makes it more clear.  :)  All the different layers and
whatnot get confusing.  :)

-- 
Freddie Cash
fjwc...@gmail.com


Re: Blog: BTRFS is effectively stable

2010-10-30 Thread Freddie Cash
On Fri, Oct 29, 2010 at 4:38 PM, Chris Samuel ch...@csamuel.org wrote:
 A friend of mine who builds storage systems designed for HPC
 use has been keeping an eye on btrfs and has just done some
 testing of it with 2.6.36 and seems to like what he sees in
 terms of stability.

That's a *very* misleading conclusion to come to based solely on a
single file I/O test.  It's more realistic to say "stable under fio
load in ideal conditions".

For example:
  No device-yanking tests were done.
  No power-cord-yanking tests were done.
  No device cables were yanked, shaken, or plugged/unplugged in rapid
succession.
  No "dd the raw device underneath the filesystem while doing file
I/O" tests were done.
  No recovery tests were done.

IOW, you can't really say it's stable across the board like that.

-- 
Freddie Cash
fjwc...@gmail.com


Re: Francis Galiegue would like your help testing a survey

2010-09-28 Thread Freddie Cash
So far, very nice.  Some comments inline below.

On Tue, Sep 28, 2010 at 8:07 AM, Francis Galiegue fgalie...@gmail.com wrote:
 On Tue, Sep 28, 2010 at 16:57, David Pottage da...@electric-spoon.com wrote:
 On 28/09/10 15:27, Francis Galiegue wrote:

 Here is a preview of the survey.

 I have not included *all* feature requests yet, otherwise it wouldn't fit
 on a screen :), but I think I have chosen the most important ones.

 Please comment!

 Click on the following link to test this survey:

 http://appv3.sgizmo.com/testsurvey/survey?id=376617crc=98980edfce58a795c966488276754ddb


 A lot of the questions are dependent on if the user is a btrfs user or not.
 It would be nice ask that as a first question, and then to hide some
 questions depending on the answer.


 Yep, but the problem is, as far as I can see, you don't have this
 option for the type of account (free) I'm using on the site :/

Perhaps add a separate choice (Do not currently use btrfs) to each
question after number 6?  That way, non-users like me can just breeze
through the rest of the survey, but you still get the information from
us for the first half of the survey.


-- 
Freddie Cash
fjwc...@gmail.com


Re: btrfs user survey?

2010-09-27 Thread Freddie Cash
On Mon, Sep 27, 2010 at 3:44 PM, Chris Ball c...@laptop.org wrote:
    Well, all in all, you get the idea, and I'm probably not the guy
    to craft questions for such a survey. But having input from as
    large a panel of users as possible would be a nice thing to have.

 Your questions are fine -- I might add:

 * Rank the following future features in importance, 4 == most important
 [ ] working fsck
 [ ] GUIs for userspace actions e.g. snapshots
 [ ] data deduplication
 [ ] hot data relocation

You're missing RAID levels above 1, and deduplication.  :)  (And
probably a few others.)

 Please use something like Google Spreadsheets (which has a forms
 option) if you're going to run such a survey, rather than having
 everyone reply on-list -- we shouldn't bother this list with any
 results other than the final summary.

Something like SurveyMonkey (http://www.surveymonkey.com) or
SurveyGizmo (http://www.surveygizmo.com) or similar would be better,
as it does all the reporting for you, builds the nice survey interface
with checkboxes, radio buttons, text fields, etc.  And they're still
free (as in beer).

-- 
Freddie Cash
fjwc...@gmail.com


Re: remote mirroring in the works?

2010-08-30 Thread Freddie Cash
On Mon, Aug 30, 2010 at 2:15 PM, Fred van Zwieten fvzwie...@gmail.com wrote:
 I just glanced over the DRBD/LVM combi, but I don't see it being
 functionally equal to SnapMirror. Let me (try to) explain how
 snapmirror works:

 On system A there is a volume (vol1). We let this vol1(A) replicate
 thru SnapMirror to vol1(B). This is done by creating a snap vol1sx(A)
 and replicate all changed blocks between this snapshot (x) and the
 previous snapshot (x-1). The first time, there is no x-1 and the whole
 volume will be replicated, but after this initial full copy, only
 the changed blocks between the two snapshot's are being replicated to
 system B. This is also called snap based replication. Why we want
 this? Easy. To support consistent DB snap's. The proces works by first
 putting the DB in a consistent mode (depends on DB implementation),
 create a snapshot, let the DB continue, replicate the changes. This
 way a DB consistent state will be replicated. The cool thing about the
 NetApp implementation is that on system B the snap's (x, x-1, x-2,
 etc) are also available. When there is trouble, you can choose to
 online the DB on system B on any of the snap's, or, even cooler, to
 replicate one of those snap's back to system A, doing a block based
 rollback at the filesystem level.

In the ZFS world, this would be the zfs send and zfs recv
functionality, in case anyone wants to read up on how it works over
there for ideas on how it could be implemented for btrfs in the
future.

-- 
Freddie Cash
fjwc...@gmail.com


Re: Is there a more aggressive fixer than btrfsck?

2010-06-29 Thread Freddie Cash
On Tue, Jun 29, 2010 at 3:37 AM, Daniel Kozlowski
dan.kozlow...@gmail.com wrote:
 On Mon, Jun 28, 2010 at 10:31 PM, Rodrigo E. De León Plicet
 rdele...@gmail.com wrote:
 On Mon, Jun 28, 2010 at 8:48 AM, Daniel Kozlowski
 dan.kozlow...@gmail.com wrote:
 Sean Bartell wingedtachikoma at gmail.com writes:

  Is there a more aggressive filesystem restorer than btrfsck?  It simply
  gives up immediately with the following error:
 
  btrfsck: disk-io.c:739: open_ctree_fd: Assertion `!(!tree_root-node)'
  failed.

 btrfsck currently only checks whether a filesystem is consistent. It
 doesn't try to perform any recovery or error correction at all, so it's
 mostly useful to developers. Any error handling occurs while the
 filesystem is mounted.


 Is there any plan to implement this functionality. It would seem to me to 
 be a
 pretty basic feature that is missing ?

 If Btrfs aims to be at least half of what ZFS is, then it will not
 impose a need for fsck at all.

 Read No, ZFS really doesn't need a fsck at the following URL:

 http://www.c0t0d0s0.org/archives/6071-No,-ZFS-really-doesnt-need-a-fsck.html


 Interesting idea. it would seem to me however that the functionality
 described in that article is more concerned with a bad transaction
 rather then something like a hardware failure where a block written
 more then 128 transactions ago is now corrupted and consiquently the
 entire partition is now unmountable( that is what I think i am looking
 at with BTRFS )

In the ZFS case, this is handled by checksumming and redundant data,
and can be discovered (and fixed) either by reading the affected data
block (the checksum mismatch is detected, the data is read from a
redundant data block, and the correct data is written over the
incorrect data) or by running a scrub.

Self-healing, checksumming, and data redundancy eliminate the need for
online (or offline) fsck.
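On the ZFS side that's just (pool name is an example):

  zpool scrub mypool
  zpool status -v mypool

The scrub walks every block, verifies its checksum, and rewrites
anything that fails from a redundant copy.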

Automatic transaction rollback at boot eliminates the need for fsck at
boot, as there is no such thing as a dirty filesystem.  Either the
data is on disk and correct, or it doesn't exist.  Yes, you may lose
data.  But you will never have a corrupted filesystem.

Not sure how things work for btrfs.


-- 
Freddie Cash
fjwc...@gmail.com