[BackupPC-users] Moving lots of data on a client

2013-08-20 Thread Raman Gupta
I have a client on which about 100 GB of data has been moved from one
directory to another -- otherwise it's exactly the same.

As I understand it, since the data has been moved, BackupPC 3 will
transfer all the data again (and discard it once it realizes the data
is already in the pool) -- i.e., it does not skip the transfer of a
file even though its checksum is identical to that of an existing file
in the pool.

I am using the rsync transfer method.

Is there a workaround to prevent all 100 GB of data from being
transferred again?

Regards,
Raman



Re: [BackupPC-users] Moving lots of data on a client

2013-08-20 Thread Les Mikesell
On Tue, Aug 20, 2013 at 1:23 PM, Raman Gupta rocketra...@gmail.com wrote:
 I have a client on which about 100 GB of data has been moved from one
 directory to another -- otherwise it's exactly the same.

 As I understand it, since the data has been moved, BackupPC 3 will
 transfer all the data again (and discard it once it realizes the data
 is already in the pool) i.e. it does not skip the transfer of each
 file even though the checksum is identical to an existing file in the
 pool.

 I am using the rsync transfer method.

 Is there a workaround to prevent all 100 GB of data from being
 transferred again?

You should be able to do a matching move in the latest full backup
tree under the pc directory for that host if you understand the
filename mangling (precede with 'f', and URI-encode %, /, etc.), then
force a full backup.  Note that this will break your ability to
restore back to the old location for the altered full and its
subsequent increments.
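
For example (just a sketch, not something I've verified on your setup --
the TopDir path, host name, backup number and BackupPC user below are
made up, and the share is assumed to be /location; substitute your own):

  # latest full assumed to be #123 for host client1; TopDir assumed to
  # be /var/lib/backuppc and the BackupPC user to be 'backuppc'
  cd /var/lib/backuppc/pc/client1/123/f%2flocation
  sudo -u backuppc mv fdir1 fdir2

then force the full backup from the web interface (Start Full Backup).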

-- 
   Les Mikesell
 lesmikes...@gmail.com



Re: [BackupPC-users] Moving lots of data on a client

2013-08-20 Thread John Rouillard
On Tue, Aug 20, 2013 at 02:23:38PM -0400, Raman Gupta wrote:
 I have a client on which about 100 GB of data has been moved from one
 directory to another -- otherwise it's exactly the same.
 
 As I understand it, since the data has been moved, BackupPC 3 will
 transfer all the data again (and discard it once it realizes the data
 is already in the pool) i.e. it does not skip the transfer of each
 file even though the checksum is identical to an existing file in the
 pool.

That is also my understanding.
 
 I am using the rsync transfer method.
 
 Is there a workaround to prevent all 100 GB of data from being
 transferred again?

Your mileage may vary on this, but changing the structure of the data
on the backuppc system to match what is currently present on the
client should work.  Assuming you moved the data from:

   client1:/location/dir1

to

  client1:/location/dir2

and the share is /location (with dir1 and dir2 as subdirectories of it).

Find the number of the last full backup and the last incremental
backup for the host in the web interface. (In theory any filled backup
should work in place of the full backup, but you want a tree that has
all the files in their original structure. Regular (unfilled)
incremental backups will only have the files changed since the last
full backup.)

Assuming your backuppc tree containing the subdirectories cpool, pool,
pc and so on is at BACKUPPC, do the following:

 cd BACKUPPC/pc/client1/<full backup number>/f%2flocation

(%2f is an encoded '/', and the leading f is how backuppc marks
backed-up files/directories, as opposed to metadata like the attrib
and backupInfo files.)

  sudo -u backup cp -rl fdir1 ../../<incremental backup number>/f%2flocation/fdir2

where backup is the user backuppc runs as. This creates a copy of the
fdir1 tree recursively (the -r flag), with hard links for every file
(the -l flag, a lowercase L). The target is fdir2 in the last
incremental backup, which will be the reference backup for the next
full backup.

The copy may fail to link some files if the maximum number of hard
links to a file is exceeded. If that's the case, the next full
backup should just copy those files into place.

Once this is done, start a new full backup and, with luck, it will see
the files in the fdir2 tree and compare them with checksums rather
than copying them. You may want to start the full backup using the
BackupPC_dump command directly with verbose mode turned on so you can
get more details on what's happening/transferring.
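
Putting the steps together, a rough sketch (the backup numbers 120/124,
the /var/lib/backuppc TopDir, the 'backuppc' user and the BackupPC_dump
path are all made up or installation-dependent -- adjust for your system):

  # last full = #120, last incremental = #124, host = client1, share = /location
  cd /var/lib/backuppc/pc/client1/120/f%2flocation
  sudo -u backuppc cp -rl fdir1 ../../124/f%2flocation/fdir2

  # then force a full backup with verbose output
  sudo -u backuppc /usr/share/BackupPC/bin/BackupPC_dump -v -f client1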

(If the move involved directory-level changes, say from /location/dir1
to /location2/subdir1/dir2, you will have to create the appropriate
directory structure under the last incremental backup's directory. In
this case that would be
BACKUPPC/pc/client1/<last incremental number>/f%2flocation2/fsubdir1,
and then use:

  sudo -u backup cp -rl fdir1 ../../<incremental backup number>/f%2flocation2/fsubdir1/fdir2
)
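
Reusing the made-up numbers and paths from the sketch above, that extra
step would look something like:

  cd /var/lib/backuppc/pc/client1/124
  sudo -u backuppc mkdir -p f%2flocation2/fsubdir1
  sudo -u backuppc cp -rl ../120/f%2flocation/fdir1 f%2flocation2/fsubdir1/fdir2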

I think I got the essentials here; the last time I did this it was for
a data move between two different hosts, and it was a few years ago.
IIRC the attrib file isn't used for determining what directories are in
the reference backup, so it can be ignored for this process.

You may want to wait a day or so to see if anybody else has some
comments, but I claim this should be safe since no data is harmed by
the copy/link 8-).

Standard disclaimers apply.

Let us know how/if it works.

-- 
-- rouilj

John Rouillard   System Administrator
Renesys Corporation  603-244-9084 (cell)  603-643-9300 x 111



Re: [BackupPC-users] Moving lots of data on a client

2013-08-20 Thread Arnold Krille
On Tue, 20 Aug 2013 14:23:38 -0400 Raman Gupta rocketra...@gmail.com
wrote:
 I have a client on which about 100 GB of data has been moved from one
 directory to another -- otherwise it's exactly the same.
 
 As I understand it, since the data has been moved, BackupPC 3 will
 transfer all the data again (and discard it once it realizes the data
 is already in the pool) i.e. it does not skip the transfer of each
 file even though the checksum is identical to an existing file in the
 pool.
 
 I am using the rsync transfer method.
 
 Is there a workaround to prevent all 100 GB of data from being
 transferred again?

I think the workaround is to use rsync as the transfer method ;-) At
least when you add the checksum-seed= parameter to your config, it
should calculate the checksums on the client, compare them with the
server's database, and only transfer content that differs.
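
(For reference -- and from memory, so double-check against the docs --
the usual way to turn that on in BackupPC 3 is to enable rsync checksum
caching by adding the fixed seed value to the rsync argument lists in
config.pl or the per-host config, roughly:

  $Conf{RsyncArgs} = [
      # ... keep the existing entries from your config ...
      '--checksum-seed=32761',
  ];
  $Conf{RsyncRestoreArgs} = [
      # ... likewise ...
      '--checksum-seed=32761',
  ];

and, if I remember right, the cached checksums only get populated as
files are (re)written to the pool with the option enabled, so the
benefit shows up on later backups.)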

Otherwise I would not manually fiddle with the dirs on the server; it's
far less stress and risk of error if you just let backuppc do its
thing. Even if that means transferring the files again...

Have fun,

Arnold




Re: [BackupPC-users] Moving lots of data on a client

2013-08-20 Thread Raman Gupta
On 08/20/2013 03:27 PM, John Rouillard wrote:
 On Tue, Aug 20, 2013 at 02:23:38PM -0400, Raman Gupta wrote:
 I have a client on which about 100 GB of data has been moved from one
 directory to another -- otherwise it's exactly the same.

 As I understand it, since the data has been moved, BackupPC 3 will
 transfer all the data again (and discard it once it realizes the data
 is already in the pool) i.e. it does not skip the transfer of each
 file even though the checksum is identical to an existing file in the
 pool.
 
 That is also my understanding.
  
 I am using the rsync transfer method.

 Is there a workaround to prevent all 100 GB of data from being
 transferred again?
 
 Your mileage may vary on this, but changing the structure of the data
 on the backuppc system to match what is currently present on the
 client should work.  Assuming you moved the data from:

Thanks John (and Les) for this suggestion. This was my initial thought
as well, but I wanted to get feedback from the list before trying it.

I used mv/cp -rl as suggested to create the target structure in the
last full. I ignored attrib files as suggested by John (clearly, the
manipulation of the last full breaks the attrib data, but that is fine
for this temporary hack).

I then deleted all the incrementals after the last full using J.
Kosowski's deleteBackup script, just to be sure they didn't mess
anything up. Since they didn't reflect the changes made to the full, I
think they were effectively corrupted anyway.

I then ran a new full backup manually, and it worked fine.

Lastly, I debated whether to manually reverse the mv/cp operations
made in the prior last full, to restore it to its original state (with
correct locations and attrib files), or to simply delete that full
backup. I think either approach would have been fine, but I opted to
simply delete it to avoid any potential screw-ups.

All in all, this successfully saved a day or so worth of data transfer.

Regards,
Raman



Re: [BackupPC-users] Moving lots of data on a client

2013-08-20 Thread Raman Gupta
On 08/20/2013 03:28 PM, Arnold Krille wrote:
 On Tue, 20 Aug 2013 14:23:38 -0400 Raman Gupta rocketra...@gmail.com
 wrote:
 I have a client on which about 100 GB of data has been moved from one
 directory to another -- otherwise it's exactly the same.

 As I understand it, since the data has been moved, BackupPC 3 will
 transfer all the data again (and discard it once it realizes the data
 is already in the pool) i.e. it does not skip the transfer of each
 file even though the checksum is identical to an existing file in the
 pool.

 I am using the rsync transfer method.

 Is there a workaround to prevent all 100 GB of data from being
 transferred again?
 
 I think the workaround is to use rsync as transfer ;-) At least when you
 added the checksum-seed= parameter to your config, it should
 calculate the checksums on the client and compare with the server's
 database and only transfer contents that differ.

No, checksum-seed doesn't help here. BPC transfers all the data again.
I think checksum-seed caches the checksums on the server, but if BPC
thinks the file doesn't exist on the server side at all (which it
doesn't since it has moved locations), then checksum-seed is irrelevant.

Hopefully BPC 4 will be smarter -- I think I saw a post on
backuppc-devel from Craig indicating that it will be.

 Otherwise I would not manually fiddle with the dirs on the server; it's
 far less stress and risk of error if you just let backuppc do its
 thing. Even if that means transferring the files again...

I went ahead with the fiddling -- I'm a bit of a daredevil at heart :)

Regards,
Raman
