Thanks a lot for your e-mail, and sorry for my late response. A partition on my customer's computers holds more than 900 GB of data, which is backed up incrementally by rdiff-backup on a daily basis. The average daily volume of changed data to be backed up is about 300 MB.
The rdiff-backup "backup window" is set to 52 weeks. Currently I am running rdiff-backup-delete on "many" files that were backed up in the first run because rdiff-backup had been launched without an exclude option. I use rdiff-backup-delete because I do not want to compromise the integrity of the backup.

To my surprise, for the above-mentioned partition (and its backup) the subdirectory "rdiff-backup-data" contains more than 153,000 files ending in ".gz". Deleting a single file with rdiff-backup-delete takes 20 - 25 minutes, because the program has to open a huge number of *.gz files, drop the information about the file to be deleted, and create a new version of each such "*.gz" file.

Questions / new feature requests:

1. How can the performance of rdiff-backup-delete be improved significantly? Any hints?
2. Are there plans for the next release of rdiff-backup-delete to improve its performance by rewriting parts of the source code?
3. Is it possible to run multiple copies of rdiff-backup-delete simultaneously without any problems (deadlocks, collisions, etc.)?
4. Does rdiff-backup use a database (e.g. sqlite3) so that it knows which of the "*.gz" files contain information about the file to be deleted?
5. How can open MS Office files be excluded from the backup? The names of these files usually start with "~$". Do I have to escape the "$" in the exclude pattern? Could you provide a built-in exclude option that automatically excludes such files?

I think rdiff-backup is a very good backup tool; however, there is also a big need for an optimized version of rdiff-backup-delete to get rid of unwanted files and directories.

Please note that the administrator is responsible for deciding which files / directories to drop from rdiff-backup.

Best regards
Dieter

> > I had a chat with my customer an hour ago: They want the most recent
> > version and its immediate predecessor ("the two most recent version
> > of any file") to be always available at any time.
>
> If these are Office files, I suggest you use Office's own option: see
> https://answers.microsoft.com/en-us/msoffice/forum/all/my-computer-keeps-making-backups-of-my-microsoft/79411263-08e6-4f7c-b67d-c75933513550 ,
> but check the option rather than unchecking it. Then you can make
> backups, if desired, using any system you want, not necessarily
> rdiff-backup.
>
> Note that what Office considers an "immediate predecessor" might not
> be what the client wants. If you take a file made yesterday, change it
> and save it, then make another change and save it two minutes later,
> the immediate predecessor is the version that you made two minutes
> ago, not the version made yesterday.

--------------------------------------------------------------
PGP/GPG Key fingerprint: BF12 CD6F EDC4 9FBA C933 316B 2C81 0BEF 4324 8513
--------------------------------------------------------------
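
P.S. regarding question 5: to make the "~$" concern concrete, here is a minimal sketch of the matching I have in mind. It uses Python's fnmatch module purely as a stand-in for glob matching; whether rdiff-backup's own matcher behaves identically is an assumption that needs checking against its documentation. The file names are made up, and the "**/" prefix in the quoted pattern is likewise only my guess at how such a pattern would be anchored. The second half shows the part I am fairly sure about: "$", "~" and "*" mean something to a POSIX shell, so the pattern has to be quoted on the command line.

```python
import fnmatch
import shlex

# Hypothetical file names; "~$" prefixes the lock file Office keeps
# next to a document while it is open.
names = ["~$report.docx", "report.docx", "~$budget.xlsx", "notes~$.txt"]

# Glob for "name starts with ~$". In fnmatch globs only *, ? and [...]
# are special, so "$" is matched literally and needs no escaping.
LOCK_PATTERN = "~$*"


def is_office_lock_file(name: str) -> bool:
    """Return True if the bare file name matches the lock-file glob."""
    return fnmatch.fnmatchcase(name, LOCK_PATTERN)


excluded = [n for n in names if is_office_lock_file(n)]
kept = [n for n in names if not is_office_lock_file(n)]

# The shell, unlike the glob matcher, would expand "$" and "*".
# shlex.quote wraps the pattern in single quotes for a POSIX shell.
# The "**/" prefix is an assumed way of matching in any directory.
quoted = shlex.quote("**/" + LOCK_PATTERN)

print("excluded:", excluded)
print("kept:    ", kept)
print("shell-safe pattern:", quoted)
```

If the real exclude option accepts globs of this kind, the quoted form is what I would pass on the command line, so that the shell hands the pattern through unchanged.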