Re: [BackupPC-users] Advice on creating duplicate backup server
Just playing devil's advocate here, as this conversation has already chosen its direction: would it be reasonable to compare two filesystems based on their journal? I am assuming that essentially all BackupPC installations are on a journaled filesystem or could be upgraded to one. Wouldn't replaying that journal to a tar file, sending the tar file to the remote host, then restoring the journal be pretty efficient? I have no experience with this, just wanted to throw another concept out there. On Mon, Dec 8, 2008 at 10:49 PM, Jeffrey J. Kosowsky <[EMAIL PROTECTED]> wrote: > Holger Parplies wrote at about 04:10:17 +0100 on Tuesday, December 9, 2008: > > Hi, > > > > Jeffrey J. Kosowsky wrote on 2008-12-08 09:37:16 -0500 [Re: > [BackupPC-users] Advice on creating duplicate backup server]: > > > > > > It just hit me that given the known architecture of the pool and cpool > > > directories shouldn't it be possible to come up with a scheme that > > > works better than either rsync (which can choke on too many hard > > > links) or 'dd' (which has no notion of incremental and requires you > > > to resize the filesystem etc.). > > > > yes, that hit someone on the list several years ago (I don't remember > the > > name, sorry). I implemented the idea he sketched (well, more or less, > there's > > some work left to make it really useful). > > > > > My thought is as follows: > > > 1. First, recurse through the pc directory to create a list of > > >files/paths and the corresponding pool links. > > >Note that finding the pool links can be done in one of several > > >ways: > > >- Method 1: Create a sorted list of pool files (which should be > > > significantly shorter than the list of all files due to the > > > nature of pooling and therefore require less memory than rsync) > > > and then look up the links. > > > > Wrong. You need one entry per inode that points to an arbitrary path > (the > > first one you copy). 
Every file(*) is in the pool, meaning a list of all > pool > > files is exactly what you need. A different way to look at it: every > file with > > a link count > 1 is a pooled file, and it's these files that cause > rsync&co > > problems, not single link files. (Well, yes, rsync pre-3 needed a > complete > > list of all files.) > OK. I had assumed (wrongly) that rsync needed to keep track of each > file that is hard-linked, not just one copy. > Still, there are some savings by knowing that you can find your one > copy in the pool and you don't have to look at all through the pc tree. > > > > (*) Files that are not in the pool: > > 1.) 0-byte files. They take up no file system blocks, so pooling > them > > saves only inodes. Not pooling them makes things simpler. > > 2.) log files (they get appended to; that would make pooling > somewhat > > difficult; besides, what chance is there of a pool hit?), > > backups files (including backups.old) > > attrib files are pooled, contrary to popular belief, and that makes > > sense, because they are often identical with the same attrib file > from > > the previous backup(s). > Yes. I am aware of this from the routines I wrote to check/fix pool > consistency and missing links to the pool > > > > > > The algorithm I implemented is somewhat similar: > > 1.) Walk pool/, cpool/ and pc/, printing information on the files and > > directories to a file (which will be quite large; by default I put > it > > on the destination pool FS, because there should be large amounts of > > space there). > > 2.) Sort the file with the 'sort' command. 
The lines in the file are > > designed such that they will be sorted into a meaningful order: > > - directories first, so I can create them and subsequently not worry > > about whether the place I want to copy/link a file to already exists > > or not > > - files next, sorted by inode number, with the (c)pool file > preceding its > > pc/ links > > The consequence is that I get all references to one inode on > adjacent > > lines. The first time, I copy the file. For the repetitions, I > link to > > the first copy. All I need to keep in memory is something like one > line > > from the file list, one "previous inode number", one "file name of > > previous inode". > > 'sort' handles huge files quite nicely, but it seems to create large > > (amounts of) files under /tmp.
Re: [BackupPC-users] Advice on creating duplicate backup server
Holger Parplies wrote at about 04:10:17 +0100 on Tuesday, December 9, 2008: > Hi, > > Jeffrey J. Kosowsky wrote on 2008-12-08 09:37:16 -0500 [Re: [BackupPC-users] > Advice on creating duplicate backup server]: > > > > It just hit me that given the known architecture of the pool and cpool > > directories shouldn't it be possible to come up with a scheme that > > works better than either rsync (which can choke on too many hard > > links) or 'dd' (which has no notion of incremental and requires you > > to resize the filesystem etc.). > > yes, that hit someone on the list several years ago (I don't remember the > name, sorry). I implemented the idea he sketched (well, more or less, there's > some work left to make it really useful). > > > My thought is as follows: > > 1. First, recurse through the pc directory to create a list of > >files/paths and the corresponding pool links. > >Note that finding the pool links can be done in one of several > >ways: > >- Method 1: Create a sorted list of pool files (which should be > > significantly shorter than the list of all files due to the > > nature of pooling and therefore require less memory than rsync) > > and then look up the links. > > Wrong. You need one entry per inode that points to an arbitrary path (the > first one you copy). Every file(*) is in the pool, meaning a list of all pool > files is exactly what you need. A different way to look at it: every file > with > a link count > 1 is a pooled file, and it's these files that cause rsync&co > problems, not single link files. (Well, yes, rsync pre-3 needed a complete > list of all files.) OK. I had assumed (wrongly) that rsync needed to keep track of each file that is hard-linked, not just one copy. Still, there are some savings by knowing that you can find your one copy in the pool and you don't have to look through the pc tree at all. > > (*) Files that are not in the pool: > 1.) 0-byte files. They take up no file system blocks, so pooling them > saves only inodes. 
Not pooling them makes things simpler. > 2.) log files (they get appended to; that would make pooling somewhat > difficult; besides, what chance is there of a pool hit?), > backups files (including backups.old) > attrib files are pooled, contrary to popular belief, and that makes > sense, because they are often identical with the same attrib file from > the previous backup(s). Yes. I am aware of this from the routines I wrote to check/fix pool consistency and missing links to the pool. > > > The algorithm I implemented is somewhat similar: > 1.) Walk pool/, cpool/ and pc/, printing information on the files and > directories to a file (which will be quite large; by default I put it > on the destination pool FS, because there should be large amounts of > space there). > 2.) Sort the file with the 'sort' command. The lines in the file are > designed such that they will be sorted into a meaningful order: > - directories first, so I can create them and subsequently not worry > about whether the place I want to copy/link a file to already exists > or not > - files next, sorted by inode number, with the (c)pool file preceding > its > pc/ links > The consequence is that I get all references to one inode on adjacent > lines. The first time, I copy the file. For the repetitions, I link to > the first copy. All I need to keep in memory is something like one line > from the file list, one "previous inode number", one "file name of > previous inode". > 'sort' handles huge files quite nicely, but it seems to create large > (amounts of) files under /tmp, possibly under $TMPDIR if you set that > (not > sure). You need to make sure you've got the space, but if you're copying > a > multi-GB/TB pool, you probably have. My guess is that the necessary > amount > of space roughly equals the size of the file I'm sorting. > 3.) 
Walk the sorted file, line by line, creating directories and copying > files (with File::Copy::cp, but I plan to change that to PoolWrite, so I can > add (part of) one pool to an existing second pool, or something that > communicates over TCP/IP, so I can copy to a different machine) and > linking files (with Perl function link()). > In theory, a pool could also be compressed or uncompressed on the fly > (uncompressed for copying to zfs, for instance). Yes... I was thinking very similarly, though
Re: [BackupPC-users] Advice on creating duplicate backup server
Hi, Jeffrey J. Kosowsky wrote on 2008-12-08 09:37:16 -0500 [Re: [BackupPC-users] Advice on creating duplicate backup server]: > > It just hit me that given the known architecture of the pool and cpool > directories shouldn't it be possible to come up with a scheme that > works better than either rsync (which can choke on too many hard > links) or 'dd' (which has no notion of incremental and requires you > to resize the filesystem etc.). yes, that hit someone on the list several years ago (I don't remember the name, sorry). I implemented the idea he sketched (well, more or less, there's some work left to make it really useful). > My thought is as follows: > 1. First, recurse through the pc directory to create a list of >files/paths and the corresponding pool links. >Note that finding the pool links can be done in one of several >ways: >- Method 1: Create a sorted list of pool files (which should be > significantly shorter than the list of all files due to the >nature of pooling and therefore require less memory than rsync) >and then look up the links. Wrong. You need one entry per inode that points to an arbitrary path (the first one you copy). Every file(*) is in the pool, meaning a list of all pool files is exactly what you need. A different way to look at it: every file with a link count > 1 is a pooled file, and it's these files that cause rsync&co problems, not single link files. (Well, yes, rsync pre-3 needed a complete list of all files.) (*) Files that are not in the pool: 1.) 0-byte files. They take up no file system blocks, so pooling them saves only inodes. Not pooling them makes things simpler. 2.) log files (they get appended to; that would make pooling somewhat difficult; besides, what chance is there of a pool hit?), backups files (including backups.old) attrib files are pooled, contrary to popular belief, and that makes sense, because they are often identical with the same attrib file from the previous backup(s). 
The algorithm I implemented is somewhat similar:
1.) Walk pool/, cpool/ and pc/, printing information on the files and directories to a file (which will be quite large; by default I put it on the destination pool FS, because there should be large amounts of space there).
2.) Sort the file with the 'sort' command. The lines in the file are designed such that they will be sorted into a meaningful order:
- directories first, so I can create them and subsequently not worry about whether the place I want to copy/link a file to already exists or not
- files next, sorted by inode number, with the (c)pool file preceding its pc/ links
The consequence is that I get all references to one inode on adjacent lines. The first time, I copy the file. For the repetitions, I link to the first copy. All I need to keep in memory is something like one line from the file list, one "previous inode number", one "file name of previous inode". 'sort' handles huge files quite nicely, but it seems to create large (amounts of) files under /tmp, possibly under $TMPDIR if you set that (not sure). You need to make sure you've got the space, but if you're copying a multi-GB/TB pool, you probably have. My guess is that the necessary amount of space roughly equals the size of the file I'm sorting.
3.) Walk the sorted file, line by line, creating directories and copying files (with File::Copy::cp, but I plan to change that to PoolWrite, so I can add (part of) one pool to an existing second pool, or something that communicates over TCP/IP, so I can copy to a different machine) and linking files (with Perl function link()). In theory, a pool could also be compressed or uncompressed on the fly (uncompressed for copying to zfs, for instance). 
Once again, because people seem to be determined to miss the point: it's *not* processing by sorted inode numbers in order to save disk seeks that is the point, it's the fact that the 'link' system call takes two paths

    link $source_path, $dest_path;  # to use Perl notation

while the 'stat' system call gives you only an inode number. To link a filename to a previously copied inode, you need to know the name you copied it to. A general purpose tool can't know when it will need the information, so it needs to keep information on all inodes with link count > 1 it has encountered. You can keep a mapping of inode_number->file_name in memory for a few thousand files, but not for hundreds of millions. By sorting the list by inode number, I can be sure that I'll never need the info for one inode again once I've reached the next inode, so I only have to keep info for one file in memory, regardless of the total number of files.
Re: [BackupPC-users] Advice on creating duplicate backup server
You could mess around with LVM snapshots. I hear that you can make an LVM snapshot and rsync that over, then restore it to the backup LVM. I have not tried this but have seen examples around the net.

Have you tried rsync 3? It works for me. I don't quite have 3TB so I can't really advise you on that size; I'm not sure where the line is on file count that rsync 3 can't handle.

ZFS would be ideal for this but you have to make the leap to a solaris/opensolaris kernel. ZFS-FUSE is completely non-functional for BackupPC as it will crash as soon as you start hitting the filesystem and the delayed write caching kicks in. ZFS on FreeBSD is not mature enough and tends to crash out with heavy IO. With zfs it works something like this: http://blogs.sun.com/clive/resource/zfs_repl.ksh
You can send a full zfs snapshot like

    zfs send /pool/[EMAIL PROTECTED] | ssh remotehost zfs recv -v /remotepool/remotefs

or send an incremental afterwards with

    zfs send -i /pool/[EMAIL PROTECTED] | ssh remotehost zfs recv -F -v /remotepool/remotefs

Feel free to compress the ssh stream with -C if you like, but I would first check your bandwidth usage and see if you are using the whole thing. If not, then the compression will slow you down. The real downside here is the switch to Solaris if you are a Linux person. You can also try Nexenta, which is the opensolaris kernel on a debian/ubuntu userland complete with apt. You also get filesystem-level compression with ZFS so you don't need to compress your pool. This should make recovering files outside of BackupPC a little more convenient.

How is a tape taking 1-2 weeks? 3TB in one week works out to roughly 5MB/s. If you are that IO constrained, nothing is going to work right for you. How full is your pool?

You could also consider not keeping a copy of the pool remotely but rather pulling a tar backup off the BackupPC system on some schedule and sending that to the remote machine for storage. 
The problem with using NBD or anything like that and using 'dd' is that there is no resume support, and with 3TB you are likely to get errors every now and then. Even with a full T1 you are stuck at at least 6 hours with theoretical numbers and are probably looking at 50% more than that. As far as some other scheme for syncing up the pools, hardlinks will get you. You could use find to traverse the entire pool and take some info down on each file such as name, size, type etc. and then use some fancy perl to sort this out into manageable groups and then use rsync on individual files. On Mon, Dec 8, 2008 at 7:37 AM, Jeffrey J. Kosowsky <[EMAIL PROTECTED]> wrote: > Stuart Luscombe wrote at about 10:02:04 + on Monday, December 8, 2008: > > Hi there, > > > > > > > > I've been struggling with this for a little while now so I thought it > about > > time I got some help! > > > > > > > > We currently have a server running BackupPC v3.1.0 which has a pool of > > around 3TB and we've got to a stage where a tape backup of the pool > is > > taking 1-2 weeks, which isn't effective at all. The decision was made > to > > buy a server that is an exact duplicate of our current one and have it > > hosted in another building, as a 2 week old backup isn't ideal in the > event > > of a disaster. > > > > > > > > I've got the OS (CentOS) installed on the new server and have installed > > BackupPC v3.1.0, but I'm having problems working out how to sync the > pool > > with the main backup server. I managed to rsync the cpool folder without > any > > real bother, but the pool folder is the problem, if I try an rsync it > > eventually dies with an 'out of memory' error (the server has 8GB), and > a cp > > -a didn't seem to work either, as the server filled up, assumedly as > it's > > not copying the hard links correctly? > > > > > > > > So my query here really is am I going the right way about this? 
If not, > > what's the best method to take so that say once a day the duplicate > server > > gets updated. > > > > > > > > Many Thanks > It just hit me that given the known architecture of the pool and cpool > directories shouldn't it be possible to come up with a scheme that > works better than either rsync (which can choke on too many hard > links) or 'dd' (which has no notion of incremental and requires you > to resize the filesystem etc.). > > My thought is as follows: > 1. First, recurse through the pc directory to create a list of > files/paths and the corresponding pool links. > Note that finding the pool links can be done in one of several > ways: > - Method 1: Create a sorted list of pool files (which should be > significantly shorter than the list of all files due to the > nature of pooling and therefore require less memory than rsync) > and then look up the links. > - Method 2: Calculate the md5sum file path of the file to determine > where it is in the pool. Where necessary, distinguish among > chain duplicates > - Method 3: Not possible yet, but would be possible if the md5sum file paths were appended to compressed backups.
Re: [BackupPC-users] Advice on creating duplicate backup server
Stuart Luscombe wrote at about 10:02:04 + on Monday, December 8, 2008: > Hi there, > > > > I've been struggling with this for a little while now so I thought it about > time I got some help! > > > > We currently have a server running BackupPC v3.1.0 which has a pool of > around 3TB and we've got to a stage where a tape backup of the pool is > taking 1-2 weeks, which isn't effective at all. The decision was made to > buy a server that is an exact duplicate of our current one and have it > hosted in another building, as a 2 week old backup isn't ideal in the event > of a disaster. > > > > I've got the OS (CentOS) installed on the new server and have installed > BackupPC v3.1.0, but I'm having problems working out how to sync the pool > with the main backup server. I managed to rsync the cpool folder without any > real bother, but the pool folder is the problem, if I try an rsync it > eventually dies with an 'out of memory' error (the server has 8GB), and a cp > -a didn't seem to work either, as the server filled up, assumedly as it's > not copying the hard links correctly? > > > > So my query here really is am I going the right way about this? If not, > what's the best method to take so that say once a day the duplicate server > gets updated. > > > > Many Thanks It just hit me that given the known architecture of the pool and cpool directories shouldn't it be possible to come up with a scheme that works better than either rsync (which can choke on too many hard links) or 'dd' (which has no notion of incremental and requires you to resize the filesystem etc.). My thought is as follows: 1. First, recurse through the pc directory to create a list of files/paths and the corresponding pool links. 
Note that finding the pool links can be done in one of several ways:
- Method 1: Create a sorted list of pool files (which should be significantly shorter than the list of all files due to the nature of pooling and therefore require less memory than rsync) and then look up the links.
- Method 2: Calculate the md5sum file path of the file to determine where it is in the pool. Where necessary, distinguish among chain duplicates.
- Method 3: Not possible yet, but would be possible if the md5sum file paths were appended to compressed backups. This would add very little to the storage but it would allow you to very easily determine the right link. If so then you could just read the link path from the file. Files with only 1 link (i.e. no hard links) would be tagged for straight copying.
2. Then rsync *just* the pool -- this should be no problem since by definition there are no hard links within the pool itself.
3. Finally, run through the list generated in #1 to create the new pc directory by creating the necessary links (and for files with no hard links, just copy/rsync them).

The above could also be easily adapted to allow for "incremental" syncing. Specifically, in #1, you would use rsync to just generate a list of *changed* files in the pc directory. In #2, you would continue to use rsync to just sync *changed* pool entries. In #3 you would only act on the shortened incremental sync list generated in #1.

The more I think about it, the more I LIKE the idea of appending the md5sum file paths to compressed pool files (Method #3) since this would make the above very fast. (Note if I were implementing this, I would also include the chain number in cases where there are multiple files with the same md5sum path, and of course then BackupPC_nightly would have to adjust this any time it changed around the chain numbering.) 
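For Method 2, mapping an md5sum file path to its pool location is mechanical. A sketch (Python for illustration), assuming the BackupPC 3.x pool layout in which the first three hex digits of the digest name three subdirectory levels:

```python
import os

def pool_path(pool_dir, digest_hex):
    """Return the expected pool location for a hex MD5 digest,
    fanning out on its first three hex digits, e.g.
    <pool>/d/4/1/d41d8... Chain duplicates carry a '_<n>' suffix
    and would still have to be resolved by comparing file contents."""
    return os.path.join(pool_dir, digest_hex[0], digest_hex[1],
                        digest_hex[2], digest_hex)
```

This is the lookup direction only; deciding which member of a collision chain a pc/ file links to is the extra work Method 2 has to do.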
Even without the above, Method #1 would still be much less memory intensive than rsync, and Method #2, while potentially a little slow, would require very little memory and wouldn't be nearly that bad if you are doing incremental backups. -- Just as an FYI, if anyone wants to implement Method #2, here is the routine I use to generate the md5sum file path from a (compressed) file (note that it is based on the analogous uncompressed version in Lib.pm).

use BackupPC::Lib;
use BackupPC::Attrib;
use BackupPC::FileZIO;

use constant _128KB => 131072;
use constant _1MB => 1048576;

# Compute the MD5 digest of a compressed file. This is the compressed
# file version of the Lib.pm function File2MD5.
# For efficiency we don't use the whole file for big files:
# - for files <= 256K we use the file size and the whole file.
# - for files <= 1M we use the file size, the first 128K and
#   the last 128K.
# - for files > 1M, we use the file size, the first 128K and
#   the 8th 128K (ie: the 128K up to 1MB).
# See the documentation for a discussion of the tradeoffs in
# how much data we use and how many collisions we get.
#
# Returns the MD5 digest (a hex string).
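The comment above spells out the size-dependent digest scheme. Here is the same logic sketched over an uncompressed byte string (the real routine reads the compressed file through BackupPC::FileZIO, and exactly how the size is mixed into the digest is my assumption here, so these digests will not match real pool file names):

```python
import hashlib

_128KB = 131072
_256KB = 2 * _128KB
_1MB = 1048576

def partial_md5(data):
    """Size-dependent partial MD5 following the scheme in the comment
    above, applied to plain bytes for clarity. ASSUMPTION: the size is
    mixed in as a decimal string; the real Perl code may differ."""
    size = len(data)
    md5 = hashlib.md5()
    md5.update(str(size).encode())            # the file size always goes in
    if size <= _256KB:
        md5.update(data)                      # small file: hash everything
    elif size <= _1MB:
        md5.update(data[:_128KB])             # first 128K ...
        md5.update(data[-_128KB:])            # ... and last 128K
    else:
        md5.update(data[:_128KB])             # first 128K ...
        md5.update(data[_1MB - _128KB:_1MB])  # ... and the 8th 128K block
    return md5.hexdigest()
```

Note the consequence the original documentation warns about: two large files that agree in size, first 128K and the 8th 128K block get the same digest even if they differ elsewhere, which is why collision chains exist in the pool.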
Re: [BackupPC-users] Advice on creating duplicate backup server
Hi, > > Instead of trying to sync the pool, can't you just run a second > > BackupPC server that also backs up your machines? > If you don't need the current backup history on the redundant server, save > yourself the pain of the initial pool copy and just follow this path - > presuming network and client load constraints allow you to. What do you guys think of DRBD as a solution to this?
Benefits:
- fast synchronization
- backup server failover possible
- no changes to BackupPC itself or the backup strategy
Caveats:
- the very same device parameters on both servers: identical file system and size etc.
- if not on a cluster file system, only one side gets to read/modify the data
Thomas -- SF.Net email is Sponsored by MIX09, March 18-20, 2009 in Las Vegas, Nevada. The future of the web can't happen without you. Join us at MIX09 to help pave the way to the Next Web now. Learn more and register at http://ad.doubleclick.net/clk;208669438;13503038;i?http://2009.visitmix.com/___ BackupPC-users mailing list BackupPC-users@lists.sourceforge.net List:https://lists.sourceforge.net/lists/listinfo/backuppc-users Wiki:http://backuppc.wiki.sourceforge.net Project: http://backuppc.sourceforge.net/
Re: [BackupPC-users] Advice on creating duplicate backup server
Holger Parplies wrote: > Hi, > > Nils Breunese (Lemonbit) wrote on 2008-12-08 12:23:40 +0100 [Re: > [BackupPC-users] Advice on creating duplicate backup server]: >> Stuart Luscombe wrote: >> >>> I've got the OS (CentOS) installed on the new server and have >>> installed BackupPC v3.1.0, but I'm having problems working out how >>> to sync the pool with the main backup server. > > I can't help you with *keeping* the pools in sync (other than recommending to How about my personal favourite - enbd or nbd. I've had success using this to mirror a fileserver with its "hot standby" partner... Simply set up all your drives as needed in your "master", format, use as needed (as you already have), then set up your drives in your slave (ie, raid1/5/6/etc) but don't format them. Then set up nbd/enbd (very simple, run one command on the slave and one on the master). Now, the tricky part, follow carefully:
1) umount the drive on the master.
2) create a raid1 array on the master with one device being your filesystem you unmounted in (1) and the second device "missing"
3) hot-add the device from nbd (/dev/nbd/0) to your new raid1 array
4) Use mdadm to configure the device in (3) as write-mostly or write-only if possible.
Now you have a real-time mirror on a remote machine. If everything goes pear-shaped, you do something like this:
1) Make sure the master is dead...
2) kill nbd on the slave
3) mount the device you used for nbd as /var/lib/backuppc
4) start backuppc on the slave
(PS, this assumes you have some method of sync'ing the other system configs between the two machines (hint - rsync)...) You may need to experiment a bit, but perhaps LVM + snapshots might help as well. Of course, the simplest method to ensure off-site and up-to-date backups is to simply run a second independent backuppc server, assuming you have enough time + bandwidth... Hope that helps... 
Regards, Adam
Re: [BackupPC-users] Advice on creating duplicate backup server
Hi, Nils Breunese (Lemonbit) wrote on 2008-12-08 12:23:40 +0100 [Re: [BackupPC-users] Advice on creating duplicate backup server]: > Stuart Luscombe wrote: > > > I've got the OS (CentOS) installed on the new server and have > > installed BackupPC v3.1.0, but I'm having problems working out how > > to sync the pool with the main backup server. I can't help you with *keeping* the pools in sync (other than recommending to run the backups from both servers, like Nils said), but I may be able to help you with an initial copy - presuming 'dd' doesn't work, which would be the preferred method. Can you mount either the old pool on the new machine or the new pool on the old machine via NFS? Or even better, put both disk sets in one machine for copying? You would need to shut down BackupPC for the duration of the copy - is that feasible? 3TB means you're facing about 10 hours even with 'dd', fast hardware and no intervening network - anything more complicated will obviously take longer. Your pool size is 3TB - how large is the file system it is on? Is the destination device at least the same size? How many files are there in your pool? > > I managed to rsync the > > cpool folder without any real bother, but the pool folder is the > > problem, Err, 'pool' or 'pc'? ;-) > > and a cp -a didn't seem to work > > either, as the server filled up, assumedly as it's not copying the > > hard links correctly? That is an interesting observation. I was always wondering exactly in which way cp would fail. > > So my query here really is am I going the right way about this? If > > not, what's the best method to take so that say once a day the > > duplicate server gets updated. Well, Dan, zfs? ;-) Presuming we can get an initial copy done (does anyone have any ideas on how to *verify* a 3TB pool copy?), would migrating the BackupPC servers to an Opensolaris kernel be an option, or is that too "experimental"? > Check the archives for a *lot* of posts on this subject. 
The general > conclusion is that copying or rsyncing big pools just doesn't work > because of the large number of hardlinks used by BackupPC. Using rsync > 3.x instead of 2.x seems to need a lot less memory, but it just ends > at some point. Because the basic problem for *any general purpose tool* remains: you need a full inode number to file name mapping for *all files* (there are next to no files with only one link in a pool FS), meaning *at least* something like 50 bytes per file, probably significantly more. You do the maths. Apparently, cp simply ignores hardlinks once malloc() starts failing, but I'm just guessing. This doesn't mean it can't be done. It just means *general purpose tools* will start to fail at some point. > A lot of people run into this when they want to migrate > their pool to another machine or bigger hard drive. In that case the > usual advice is to use dd to copy the partition and then grow the > filesystem once it's copied over. The only problem being that this limits you to the same FS with the same parameters (meaning if you've set up an ext3 FS with too high or too low inodes to block ratio, you can't fix it this way). And the fact remains that copying huge amounts of data simply takes time. > Instead of trying to sync the pool, can't you just run a second > BackupPC server that also backs up your machines? If you don't need the current backup history on the redundant server, save yourself the pain of the initial pool copy and just follow this path - presuming network and client load constraints allow you to. One other thing: is your pool size due to the amount of backed up data or due to a long backup history? If you just want to ensure you have a recent version of your data (but not the complete backup history) in the event of a catastrophe, archives (rather than a copy of the complete pool) may be what you're looking for. Regards, Holger
Re: [BackupPC-users] Advice on creating duplicate backup server
Stuart Luscombe wrote: > > I’ve got the OS (CentOS) installed on the new server and have > installed BackupPC v3.1.0, but I’m having problems working out how > to sync the pool with the main backup server. I managed to rsync the > cpool folder without any real bother, but the pool folder is the > problem, if I try an rsync it eventually dies with an ‘out of > memory’ error (the server has 8GB), and a cp –a didn’t seem to work > either, as the server filled up, assumedly as it’s not copying the > hard links correctly? > > So my query here really is am I going the right way about this? If > not, what’s the best method to take so that say once a day the > duplicate server gets updated. Check the archives for a *lot* of posts on this subject. The general conclusion is that copying or rsyncing big pools just doesn't work because of the large number of hardlinks used by BackupPC. Using rsync 3.x instead of 2.x seems to need a lot less memory, but it just ends at some point. A lot of people run into this when they want to migrate their pool to another machine or bigger hard drive. In that case the usual advice is to use dd to copy the partition and then grow the filesystem once it's copied over. dd'ing your complete pool every day isn't going to work either I guess. Instead of trying to sync the pool, can't you just run a second BackupPC server that also backs up your machines? Nils Breunese.
Re: [BackupPC-users] Advice on creating duplicate backup server
Stuart Luscombe wrote:
> Hi there,
>
> I've been struggling with this for a little while now, so I thought it
> about time I got some help!
>
> We currently have a server running BackupPC v3.1.0 which has a pool of
> around 3TB, and we've got to a stage where a tape backup of the pool is
> taking 1-2 weeks, which isn't effective at all. The decision was made to
> buy a server that is an exact duplicate of our current one and have it
> hosted in another building, as a 2-week-old backup isn't ideal in the
> event of a disaster.
>
> I've got the OS (CentOS) installed on the new server and have installed
> BackupPC v3.1.0, but I'm having problems working out how to sync the
> pool with the main backup server. I managed to rsync the cpool folder
> without any real bother, but the pool folder is the problem: if I try an
> rsync it eventually dies with an 'out of memory' error (the server has
> 8GB), and a cp -a didn't seem to work either, as the server filled up,
> presumably because it's not copying the hard links correctly?
>
> So my query here really is: am I going the right way about this? If not,
> what's the best method to take so that, say, once a day the duplicate
> server gets updated?
>
> Many Thanks

All of the folders have to be on the same filesystem, and they should all be synced at once; otherwise rsync won't know about the hard links. Also, have you tried 'cp -a --preserve=all'? It should be mostly redundant, but may be worth a shot.

Best regards,
Johan
[BackupPC-users] Advice on creating duplicate backup server
Hi there,

I've been struggling with this for a little while now, so I thought it about time I got some help!

We currently have a server running BackupPC v3.1.0 which has a pool of around 3TB, and we've got to a stage where a tape backup of the pool is taking 1-2 weeks, which isn't effective at all. The decision was made to buy a server that is an exact duplicate of our current one and have it hosted in another building, as a 2-week-old backup isn't ideal in the event of a disaster.

I've got the OS (CentOS) installed on the new server and have installed BackupPC v3.1.0, but I'm having problems working out how to sync the pool with the main backup server. I managed to rsync the cpool folder without any real bother, but the pool folder is the problem: if I try an rsync it eventually dies with an 'out of memory' error (the server has 8GB), and a cp -a didn't seem to work either, as the server filled up, presumably because it's not copying the hard links correctly?

So my query here really is: am I going the right way about this? If not, what's the best method to take so that, say, once a day the duplicate server gets updated?

Many Thanks

--
Stuart Luscombe
Systems Administrator
Dementia Research Centre
8-11 Queen Square
WC1N 3BG London
Direct: 08451 555 000 72 3875
Web: http://www.dementia.ion.ucl.ac.uk
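(Editorial aside: the "server filled up" symptom above is what happens when hard links are not carried over -- each pooled file lands on disk once per hard link instead of once. Even cp -a only preserves links within a single copy run, so copying cpool and pc in separate commands duplicates everything. A quick sketch with throwaway paths:)

```shell
# Hard links survive only when the whole tree is copied in one command.
# Copying pool/ and pc/ in separate runs (even with cp -a) duplicates
# the data, which is why the destination fills up.
set -e
work=$(mktemp -d)
mkdir -p "$work/data/pool" "$work/data/pc"
echo "pooled data" > "$work/data/pool/file"
ln "$work/data/pool/file" "$work/data/pc/file"  # hard link, same inode

cp -a "$work/data" "$work/whole"        # one run: link preserved
mkdir "$work/split"
cp -a "$work/data/pool" "$work/split/"  # separate runs: link broken,
cp -a "$work/data/pc" "$work/split/"    # data stored twice

stat -c %i "$work/whole/pool/file" "$work/whole/pc/file"  # same inode
stat -c %i "$work/split/pool/file" "$work/split/pc/file"  # different inodes
rm -rf "$work"
```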