You would need to move up to 15K RPM drives to build a very large array, and the cost climbs steeply as the array grows.
As Les said, look at a ZFS array with block-level dedup. I have a 3 TB setup right now, and I have been running backups against a Unix server and two Linux servers in my main office here to see how the dedup works:
opensolaris:~$ zpool list
NAME       SIZE  ALLOC   FREE  CAP  DEDUP   HEALTH  ALTROOT
rpool       74G  5.77G  68.2G   7%  1.00x   ONLINE  -
storage   3.06T  1.04T  2.02T  66%  19.03x  ONLINE  -
This is just rsync(1) pulling data over to /storage/host1, which is a ZFS fileset on the storage pool; there is one fileset per host.
My script is very simple at this point:

zfs snapshot storage/host1@`date +%Y.%m.%d-%M.%S`
rsync -aHXA --exclude-from=/etc/backups/host1excludes.conf host1:/ /storage/host1
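If you back up several hosts, a slightly fuller wrapper might look like this (just a sketch; the host list and the per-host exclude-file naming are my assumptions, not taken from the script above):

#!/bin/sh
# Hypothetical multi-host wrapper: snapshot each fileset, then pull fresh data.
STAMP=`date +%Y.%m.%d-%M.%S`
for HOST in host1 host2 host3; do
    # Snapshot first so the snapshot preserves the previous backup run.
    zfs snapshot storage/$HOST@$STAMP
    rsync -aHXA --exclude-from=/etc/backups/${HOST}excludes.conf \
        $HOST:/ /storage/$HOST
done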
To build the pool and fileset:

format          # lists all available disks
zpool status    # shows which disks are already in pools
zpool create storage mirror disk1 disk2 disk3 ... spare disk11 cache disk12 log disk13
# cache is a high-RPM disk or SSD, basically a massive buffer for IO caching
# log is the transaction (intent) log; it doesn't need much space, but IO
# matters, so use a high-RPM disk or a small SSD
# cache and log are optional and mainly improve performance when you're using
# slower storage drives like my 7200 RPM SATA drives
zfs create -o dedup=on -o compression=on storage/host1
# use dedup=verify instead of dedup=on if you want byte-for-byte verification
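Once the pool and fileset exist, it's worth a sanity check that the properties took effect; these are standard zfs/zpool commands, nothing specific to my setup:

zfs get dedup,compression storage/host1   # should show dedup=on, compression=on
zpool get dedupratio storage              # pool-wide dedup ratio as a property
zpool list storage                        # SIZE/ALLOC/FREE plus the DEDUP column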
Dedup is very, very good for writes BUT requires a big CPU. Don't re-purpose your old P3 for this.
Compression is actually going to help your write performance, assuming you have a fast CPU: it reduces the IO load, and ZFS will re-order writes on the fly.
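If you have CPU headroom to spare, you can trade more CPU time for a better ratio by using gzip instead of the default algorithm; this is a standard ZFS property value, though the level here is just an example:

zfs set compression=gzip-6 storage/host1   # gzip-1 through gzip-9; plain "on" uses the lighter lzjb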
Dedup is all in-line, so it reduces IO load for anything with common blocks. It is also block-level, not file-level, so a large file with slight changes will still get deduped.
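One caveat worth watching as the pool grows: the dedup table (DDT) has to stay resident in RAM/ARC to keep writes fast. zdb can summarize it; a sketch, assuming the pool name from above:

zdb -DD storage   # prints DDT histograms: unique vs. duplicated blocks and in-core table size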
Dedup + compression really needs a fast dual-core or quad-core.
If you look at my zpool list above you can see my dedup ratio at 19.03x with 1.04 T allocated, which effectively means I'm getting about 19 TB of logical data into roughly 1 TB of physical space (1.04 T x 19.03 ~ 19.8 T). My servers have relatively few files that change, and the large files get appended to, so I really only store the changes.
Snapshots are almost instant and can be browsed at /storage/host1/.zfs/snapshot/. They are labeled by the @`date ...` suffix, so I get a folder per date. These are read-only snapshots and can be shared via Samba or NFS.
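Browsing them looks like plain directories. An illustrative listing (only the second snapshot name is taken from the real output below; the first is made up):

ls /storage/host1/.zfs/snapshot/
2010.02.18-20.15  2010.02.19-48.33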
opensolaris:/storage/host1/.zfs/snapshot# zfs list -t snapshot
NAME                             USED  AVAIL  REFER  MOUNTPOINT
rpool/ROOT/opensolaris@install   270M      -  3.26G  -
storage/host1@2010.02.19-48.33
zfs set sharesmb=on storage/host1
-or-
zfs set sharenfs=on storage/host1

(Sharing is set on the fileset itself, not on a snapshot; the read-only snapshots are then reachable under the share's .zfs/snapshot directory.)
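From a Linux client the NFS export then mounts like any other; the server name here just matches my prompt, and the mount point is arbitrary:

mount -t nfs opensolaris:/storage/host1 /mnt/backup
ls /mnt/backup/.zfs/snapshot/   # the read-only snapshots live under here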
If you don't want to go pure OpenSolaris, then look at Nexenta. It is a functional OpenSolaris/Debian-Ubuntu hybrid with ZFS, and it has dedup. It does not currently share via iSCSI, so keep that in mind. I believe it also uses the full Samba package for SMB shares, while OpenSolaris can use the native in-kernel CIFS server, which is faster than Samba.
OpenSolaris can also join Active Directory, though you need to extend your AD schema. If you do, you can give a privileged user UID and GID mappings in AD, and then you can access the windows1/C$ shares. I would create a backup user and use Restricted Groups in Group Policy to make it a local administrator on the machines (but not a domain admin). You would probably want to figure out how to take a VSS snapshot and rsync that over instead of the live filesystem, because you will hit tons of file locks if you don't.
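I haven't scripted that part, but on Server 2008 the stock diskshadow tool can create and expose a shadow copy that you could then rsync from. An untested sketch (on 2003 you would need vshadow.exe from the VSS SDK instead):

# shadow.dsh, a diskshadow script:
set context persistent nowriters
add volume C: alias SysVol
create
expose %SysVol% X:

Run it with "diskshadow /s shadow.dsh", rsync X:\ over, then delete the exposed shadow afterwards.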
good luck
On Fri, Feb 19, 2010 at 6:51 AM, Les Mikesell <lesmikes...@gmail.com> wrote:
> Ralf Gross wrote:
> >
> > I think I'll have to look for a different solution; I just can't imagine
> > a pool with > 10 TB.
>
> Backuppc's usual scaling issues are with the number of files/links more
> than total size, so the problems may be different when you work with huge
> files. I thought someone had posted here about using nfs with a common
> archive and several servers running the backups but I've forgotten the
> details about how he avoided conflicts and managed it. Maybe this would be
> the place to look at opensolaris with zfs's new block-level de-dup and a
> simpler rsync copy.