[BackupPC-users] Moving lots of data on a client
I have a client on which about 100 GB of data has been moved from one directory to another -- otherwise it's exactly the same. As I understand it, since the data has been moved, BackupPC 3 will transfer all the data again (and discard it once it realizes the data is already in the pool), i.e. it does not skip the transfer of each file even though the checksum is identical to an existing file in the pool. I am using the rsync transfer method.

Is there a workaround to prevent all 100 GB of data from being transferred again?

Regards,
Raman
Re: [BackupPC-users] Moving lots of data on a client
On Tue, Aug 20, 2013 at 1:23 PM, Raman Gupta rocketra...@gmail.com wrote:
> I have a client on which about 100 GB of data has been moved from one directory to another -- otherwise it's exactly the same. As I understand it, since the data has been moved, BackupPC 3 will transfer all the data again (and discard it once it realizes the data is already in the pool), i.e. it does not skip the transfer of each file even though the checksum is identical to an existing file in the pool. I am using the rsync transfer method. Is there a workaround to prevent all 100 GB of data from being transferred again?

You should be able to do a matching move in the latest full backup tree under the pc directory for that host, if you understand the filename mangling (precede with 'f', and URI-encode %, /, etc.), then force a full backup. Note that this will break your ability to restore back to the old location for the altered full and its subsequent incrementals.

--
Les Mikesell
lesmikes...@gmail.com
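[Editor's note: as a rough sketch of the matching move Les describes, with everything hypothetical -- host client1, share /data, latest full backup number 123, data moved on the client from /data/olddir to /data/newdir, BackupPC data directory /var/lib/backuppc, and server-side user backuppc:

    # Path components are mangled: a leading 'f' is added and characters such
    # as '%' and '/' are URI-encoded, so the share /data appears as f%2fdata
    # and the directory olddir appears as folddir.
    cd /var/lib/backuppc/pc/client1/123/f%2fdata
    sudo -u backuppc mv folddir fnewdir
    # Then force a full backup of client1 from the web UI (or BackupPC_dump -f).

]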
Re: [BackupPC-users] Moving lots of data on a client
On Tue, Aug 20, 2013 at 02:23:38PM -0400, Raman Gupta wrote:
> I have a client on which about 100 GB of data has been moved from one directory to another -- otherwise it's exactly the same. As I understand it, since the data has been moved, BackupPC 3 will transfer all the data again (and discard it once it realizes the data is already in the pool), i.e. it does not skip the transfer of each file even though the checksum is identical to an existing file in the pool.

That is also my understanding.

> I am using the rsync transfer method. Is there a workaround to prevent all 100 GB of data from being transferred again?

Your mileage may vary on this, but changing the structure of the data on the BackupPC system to match what is currently present on the client should work.

Assume you moved the data from client1:/location/dir1 to client1:/location/dir2, and the share is /location (dir1 and dir2 are subdirectories of it).

Find the number of the last full backup and the last incremental backup for the host in the web interface. (In theory any filled backup should work in place of the full backup, but you want a tree that has all the files in their original structure; regular (unfilled) incremental backups contain only the files changed since the last full backup.)

Assuming your BackupPC tree containing the subdirectories cpool, pool, pc and so on is at BACKUPPC, do the following:

    cd BACKUPPC/pc/client1/<full backup number>/f%2flocation

(%2f is an encoded /, and the leading f is how BackupPC identifies backed-up files/directories, as opposed to metadata such as the attrib and backupInfo files.) Then:

    sudo -u backup cp -rl fdir1 ../<incremental backup number>/f%2flocation/fdir2

where backup is the user BackupPC runs as. This creates a copy of the fdir1 tree, recursively (the -r flag), with hard links for every file (the -l, lowercase L, flag). The target is fdir2 in the last incremental backup, and the last incremental backup is going to be the reference backup for the next full backup. The copy may fail to link some files if the maximum number of links to a file is exceeded; if that's the case, the next full backup should just copy those files into place.

Once this is done, start a new full backup, and with luck it will see the files in the fdir2 tree and start comparing them with checksums rather than copying them. You may want to start the full backup using the BackupPC_dump command directly with verbose mode turned on, so you can get more details on what's happening/transferring.

(If the move involved directory-level changes, say from /location/dir1 to /location2/subdir1/dir2, you will have to create the appropriate directory structure under the last incremental directory. In this case it would be BACKUPPC/pc/client1/<last incremental number>/f%2flocation2/fsubdir1, and then use:

    sudo -u backup cp -rl fdir1 ../<incremental backup number>/f%2flocation2/fsubdir1/fdir2

)

I think I got the essentials here; the last time I did this was with a data move between two different hosts, and it was a few years ago. IIRC the attrib file isn't used for determining what directories are in the reference backup, so it can be ignored for this process.

You may want to wait a day or so to see if anybody else has some comments, but I claim this should be safe since no data is harmed by the copy/link 8-). Standard disclaimers apply. Let us know how/if it works.
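[Editor's note: a minimal sketch of that last step -- forcing a verbose full dump from the command line -- assuming BackupPC_dump lives under BackupPC's install directory (the path varies by distribution) and the server runs as user backup, with client1 as a hypothetical host name:

    # -v: verbose output, -f: force a full backup of the named host
    sudo -u backup /usr/local/BackupPC/bin/BackupPC_dump -v -f client1

]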
--
-- rouilj
John Rouillard
System Administrator
Renesys Corporation
603-244-9084 (cell) 603-643-9300 x 111
Re: [BackupPC-users] Moving lots of data on a client
On Tue, 20 Aug 2013 14:23:38 -0400 Raman Gupta rocketra...@gmail.com wrote:
> I have a client on which about 100 GB of data has been moved from one directory to another -- otherwise it's exactly the same. As I understand it, since the data has been moved, BackupPC 3 will transfer all the data again (and discard it once it realizes the data is already in the pool), i.e. it does not skip the transfer of each file even though the checksum is identical to an existing file in the pool. I am using the rsync transfer method. Is there a workaround to prevent all 100 GB of data from being transferred again?

I think the workaround is to use rsync as the transfer method ;-) At least once you have added the checksum-seed= parameter to your config, it should calculate the checksums on the client, compare them with the server's database, and only transfer contents that differ.

Otherwise I would not manually fiddle with the dirs on the server; it's far less stress and risk of error if you just let BackupPC do its thing, even if that means transferring the files again...

Have fun,
Arnold
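[Editor's note: for reference, a minimal sketch of enabling rsync checksum caching in a BackupPC 3 config.pl, assuming rsync 2.6.3 or later on both ends; the existing default arguments are elided here and 32761 is the seed value used in the BackupPC documentation:

    # Append to the existing $Conf{RsyncArgs} and $Conf{RsyncRestoreArgs} lists
    $Conf{RsyncArgs} = [
        # ... the existing default arguments ...
        '--checksum-seed=32761',
    ];
    $Conf{RsyncRestoreArgs} = [
        # ... the existing default arguments ...
        '--checksum-seed=32761',
    ];

]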
Re: [BackupPC-users] Moving lots of data on a client
On 08/20/2013 03:27 PM, John Rouillard wrote:
> On Tue, Aug 20, 2013 at 02:23:38PM -0400, Raman Gupta wrote:
>> I have a client on which about 100 GB of data has been moved from one directory to another -- otherwise it's exactly the same. As I understand it, since the data has been moved, BackupPC 3 will transfer all the data again (and discard it once it realizes the data is already in the pool), i.e. it does not skip the transfer of each file even though the checksum is identical to an existing file in the pool.
>
> That is also my understanding.
>
>> I am using the rsync transfer method. Is there a workaround to prevent all 100 GB of data from being transferred again?
>
> Your mileage may vary on this, but changing the structure of the data on the backuppc system to match what is currently present on the client should work. Assuming you moved the data from:

Thanks John (and Les) for this suggestion. This was my initial thought as well, but I wanted to get feedback from the list before trying it.

I used mv/cp -rl as suggested to create the target structure in the last full. I ignored attrib files as suggested by John (clearly, the manipulation of the last full breaks the attrib data, but that is fine for this temporary hack).

I then deleted all the incrementals after the last full using J. Kosowski's deleteBackup script, just to be sure they didn't mess something up. Since they didn't see the changes made in the full, I think they were corrupted anyway.

I then ran a new full backup manually, and it worked fine.

Lastly, I debated whether to manually reverse the mv/cp operations made in the prior last full, to restore it to its original state (with correct locations and attrib files), or simply delete that full backup. I think either approach would have been fine, but I opted to simply delete it to avoid any potential screw-ups.

All in all, saved a day or so worth of data transfer successfully.

Regards,
Raman
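[Editor's note: roughly, the sequence described above might look like the following sketch, with hypothetical host client1, share /location, last full backup number 100, data moved from dir1 to dir2 on the client, BackupPC data directory /var/lib/backuppc, and server-side user backuppc; the deleteBackup script invocation is only indicated as a comment because its options are not shown in this thread:

    # 1. Recreate the new layout inside the last full backup, using hard links
    cd /var/lib/backuppc/pc/client1/100/f%2flocation
    sudo -u backuppc cp -rl fdir1 fdir2    # or mv fdir1 fdir2 for a pure rename
    # 2. Delete the incrementals taken after full #100
    #    (done here with J. Kosowski's deleteBackup script; see its own usage notes)
    # 3. Run a new full backup of client1 from the web UI or via BackupPC_dump -f

]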
Re: [BackupPC-users] Moving lots of data on a client
On 08/20/2013 03:28 PM, Arnold Krille wrote:
> On Tue, 20 Aug 2013 14:23:38 -0400 Raman Gupta rocketra...@gmail.com wrote:
>> I have a client on which about 100 GB of data has been moved from one directory to another -- otherwise it's exactly the same. As I understand it, since the data has been moved, BackupPC 3 will transfer all the data again (and discard it once it realizes the data is already in the pool), i.e. it does not skip the transfer of each file even though the checksum is identical to an existing file in the pool. I am using the rsync transfer method. Is there a workaround to prevent all 100 GB of data from being transferred again?
>
> I think the workaround is to use rsync as the transfer method ;-) At least once you have added the checksum-seed= parameter to your config, it should calculate the checksums on the client, compare them with the server's database, and only transfer contents that differ.

No, checksum-seed doesn't help here; BPC transfers all the data again. I think checksum-seed caches the checksums on the server, but if BPC thinks the file doesn't exist on the server side at all (which it doesn't, since it has moved locations), then checksum-seed is irrelevant. Hopefully BPC 4 will be smarter -- I think I saw a post on backuppc-devel from Craig indicating that it will be.

> Otherwise I would not manually fiddle with the dirs on the server; it's far less stress and risk of error if you just let BackupPC do its thing, even if that means transferring the files again...

I went ahead with the fiddling -- I'm a bit of a daredevil at heart :)

Regards,
Raman