On 11/15 10:23, Craig Barratt wrote:
> Here's the plan.  Can anyone see a problem with doing this?
> BackupPC_nightly and BackupPC_link are still mutually exclusive.
> But BackupPC_dump can overlap BackupPC_nightly.

This would be great, because under the current scheme, when backups take
more than 24 hours, they hold up the rest of the backup process. So it's
impossible to just 'trickle charge' a large (but fairly static) machine
that's on the other end of a slow link, letting the first backup trickle in
over the course of a week or so, while other backups run fine. You have to
tend the thing over the course of a couple of weeks, slowly allowing the
server to back up more & more of the remote machine.

> There is also the opposite race condition: the process doing the
> removing has already checked the link count and will remove the
> file.  Another process makes a new link before the remove.  The
> link succeeds (link count is now 2), and the remove succeeds too.

I was under the impression that filesystems under Linux simply let unlink()
decrement the link count on a file; when the count reaches 0, the file is
deleted. (I'm hardly a kernel expert, though.)
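
A quick way to check that impression is to watch st_nlink while linking and
unlinking. This is only a minimal sketch in Python; the file names are made
up, and it says nothing about how BackupPC itself does things:

```python
import os
import tempfile


def link_count_demo():
    """Return the link counts seen after create, link and unlink,
    plus the data still readable through the surviving name."""
    counts = []
    with tempfile.TemporaryDirectory() as d:
        pool = os.path.join(d, "pool_file")      # hypothetical pool entry
        backup = os.path.join(d, "backup_file")  # hypothetical link from a backup

        with open(pool, "w") as f:
            f.write("data")
        counts.append(os.stat(pool).st_nlink)    # one name for the file

        os.link(pool, backup)                    # second hard link
        counts.append(os.stat(pool).st_nlink)

        os.unlink(pool)                          # drop the pool name only
        counts.append(os.stat(backup).st_nlink)  # the other name survives

        with open(backup) as f:
            content = f.read()
    return counts, content
```

If that holds, the data in the quoted race would survive under the new link,
though the pool name itself would be gone.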

> Perhaps there is a strategy using rename:
> 
>   - removing process (BackupPC_nightly):
> 
>       - find each file with just 1 link, then:
> 
>       - rename that file.  (This is a new step.)
> 
>       - recheck the number of links.  If it is still 1, then remove the
>         file.  Otherwise, rename it back.  (This is a new step.)
> 
>       - the rename might fail if a new link was made between the
>         two renames.  New links into the pool are only made by
>         BackupPC_link, and only a single copy of that ever runs.
>         So if BackupPC_nightly never runs with BackupPC_link (this
>         is already the case) then this should never happen.
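
To make sure I'm reading the quoted steps right, here is a rough sketch of
them in Python. The function name, the ".pending" suffix and the demo file
names are my own inventions, not anything from BackupPC:

```python
import os
import tempfile


def remove_if_unreferenced(path, suffix=".pending"):
    """Remove path only if nothing else links to it.

    Returns True if the file was removed, False if it was kept.
    The suffix is a hypothetical temporary name, assumed unused.
    """
    if os.stat(path).st_nlink != 1:
        return False                 # still referenced by a backup

    tmp = path + suffix
    os.rename(path, tmp)             # new step: take it out of the pool name

    if os.stat(tmp).st_nlink == 1:   # new step: recheck the link count
        os.unlink(tmp)
        return True

    # A link was made between the first check and the rename: put it back.
    # Note that on POSIX, os.rename silently replaces whatever is at `path`.
    os.rename(tmp, path)
    return False


def demo():
    """Run the check on one unreferenced and one referenced file."""
    with tempfile.TemporaryDirectory() as d:
        lonely = os.path.join(d, "lonely")
        shared = os.path.join(d, "shared")
        for name in (lonely, shared):
            with open(name, "w") as f:
                f.write("x")
        os.link(shared, os.path.join(d, "backup_copy"))  # second link

        removed_lonely = remove_if_unreferenced(lonely)
        removed_shared = remove_if_unreferenced(shared)
        return removed_lonely, removed_shared, sorted(os.listdir(d))
```

The demo only exercises the two easy cases; the put-it-back branch needs a
second process making a link at just the wrong moment.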

At first glance this would seem to slow things down considerably, since it
performs at least twice as many operations on each file. However, on
consideration, I think it's a little better than that.

- In the majority of cases, we won't be removing large percentages of the
  files every night. (I wonder what percentage of the load of
  BackupPC_nightly is the process of finding files to be removed, and what
  is the process of actually removing the ones that have a refcount of 1?)

- With modern machines/OSes which cache disk writes very aggressively,
  renaming a file and then deleting it immediately will likely never hit
  disk in most cases. The OS simply keeps track of the change as if it had
  been written, even if it never was. (My external USB drive can display the
  drive activity light for a couple of minutes after the OS has 'finished'
  writing a multi-GB file to it.)

So it will be slightly slower, but it will be a win for busy backup servers,
or for backups which take multiple days to run.


-- 
Carl Soderstrom
Systems Administrator
Real-Time Enterprises
www.real-time.com

