On Tue, Sep 10, 2019 at 03:25:12PM -0400, Joe Steele wrote:
> On 9/9/2019 9:53 PM, Walt Mankowski wrote:
> > I found a file named
> >
> > rdiff-backup-data/current_mirror.2019-09-08T03:01:02-04:00.data
> >
> > which contained
> >
> > 4351
> >
> > I moved it out of the way and reran the backup command. This time it
> > threw an exception. The output is in the attached log file.
> >
> (Some of the following echoes what Eric Lavarde wrote a few minutes ago.)
>
> Moving a current_mirror file out of the way is never a good thing to do.
> Having 2 current_mirror files is how rdiff-backup knows that the last
> backup failed and that a regression is necessary in order to reestablish
> a consistent state for the backup repository.
>
> Fortunately, it looks as though your attempt to run another backup after
> removing the current_mirror file did not get anywhere (based on your log).
>
> I suggest putting the 'current_mirror.2019-09-08T03:01:02-04:00.data' back
> in place (and possibly restarting systemd-resolved, as commented further
> below). After that, I would look to see what current_mirror files you now
> have. My guess is that you will find the following:
>
> current_mirror.2019-09-07T03:01:01-04:00.data
> current_mirror.2019-09-08T03:01:02-04:00.data
> current_mirror.2019-09-09T21:46:29-04:00.data
>
> 9/7/19 is your last good backup. 9/8 was the backup that failed. 9/9 was
> your most recent attempt to fix things.
>
> *Assuming* that I am correct about the current_mirror files that exist,
> I would remove the last of those files
> (current_mirror.2019-09-09T21:46:29-04:00.data). Yes, that's contrary to
> my admonition above. But rdiff-backup cannot deal with 3 such files, and
> this last file is from your most recent backup that did not get anywhere,
> according to your log.
>
> I would then again try 'rdiff-backup --check-destination-dir' (and cross
> your fingers).
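
For anyone who finds this thread later, the concrete version of that advice
for my setup would be roughly

$ ls -l /backup/scruffy/rdiff-backup-data/current_mirror.*.data
$ sudo rdiff-backup --check-destination-dir /backup/scruffy

(substitute your own repository path). Two current_mirror.*.data files mean
the last backup didn't finish and a regression is pending; one means the
repository is in a consistent state.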
Oops! What I ended up doing was moving the file back and removing these
files:

current_mirror.2019-09-09T21:46:28-04:00.data
file_statistics.2019-09-09T21:46:28-04:00.data.gz

Then, since I hadn't read your email yet, I reran the backup command again:

$ sudo rdiff-backup -v9 --print-statistics --exclude-filelist /usr/local/etc/rdiff_exclude / /backup/scruffy 2>&1 | tee rdiff-backup3.txt

That seems to have done the trick. It printed out a bunch of text and
finally said

Previous backup seems to have failed, regressing destination now.

Looks like I lucked out! So I'm just going to let it run now and see what
happens.

> Your original concern was that this was taking forever (12+ hours and
> counting). For what it is worth, my experience is that regressions do
> take many hours (depending on the size of your current mirror), and they
> leave you wondering if anything is actually happening.
>
> It seems like 296 GB would take me 4-8 hours to regress (I can't really
> remember -- it's been a while). If your backup is 527 GB (i.e., that's
> what shows up for 'MirrorFileSize' in your session_statistics.* files),
> then yes, I imagine that would take quite some time to regress. There are
> probably other factors besides size that affect the speed -- disk speed,
> processor speed, load, etc. I don't know whether rdiff-backup logging
> verbosity is a factor or not -- I would think that it might be.

Thanks. It's really good to know the time wasn't out of line. Some sort of
diagnostic messages, especially at -v9, would be really helpful for knowing
that it's still working!

> None of the above addresses your problem with "No space left on device".
> I would try to restore your repository to a consistent state before
> investigating that further. (Of course, the really frustrating thing is
> that if the backup fails again, you are forced to wait many hours while
> you repeat the regression of the failed backup.)

Agreed. But considering that at this point I haven't even been able to
regress the backups, I'm willing to cross that bridge when I come to it.
There was some weirdness with my box that day, including GNOME crashing,
so maybe it just got into a weird state. Even if it fails again, I'm no
worse off than I am now.

> <snip>
>
> > On Mon, Sep 09, 2019 at 08:17:04PM -0400, Walt Mankowski wrote:
> > > I ran
> > >
> > > $ sudo rdiff-backup -v9 --print-statistics --exclude-filelist
> > > /usr/local/etc/rdiff_exclude / /backup/scruffy 2>&1 | tee rdiff-backup.txt
> > >
> > > This time it exited right away. I've attached the log file, where the
> > > key message is
> > >
> > > Fatal Error: It appears that a previous rdiff-backup session with
> > > process id 4351 is still running.
> > >
> > > Process 4351 is /lib/systemd/systemd-resolved
> > >
>
> It would seem that you had a bit of bad luck in that a process ID that
> had been used for a crashed rdiff-backup session happened to now be in
> use again for an unrelated process (systemd-resolved).

That is bizarre! Especially because it was a systemd process with a lowish
PID, I figured it had been running since boot.

> > > Is it safe to rerun it with --force?
> > >
>
> Using --force would have gotten around the Fatal Error, but it would have
> also forced other things to happen that you may not want. In this
> instance, I would probably have restarted systemd-resolved so that it
> used a different PID. That should have gotten rdiff-backup past that
> particular error.

Good to know.
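
For the record, the workaround you describe would look something like this
(plain ps/systemctl, nothing rdiff-backup-specific):

$ ps -p 4351 -o pid,comm                   # confirm what currently owns the recorded PID
$ sudo systemctl restart systemd-resolved  # resolved comes back under a new PID
$ ps -p 4351 -o pid,comm                   # should now come back empty

After that, the stale-PID check should no longer trip, without needing --force.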
> > > > > The latter was when I killed it when I woke up and saw that both of
> > > > > them were running.
> > > > >
>
> That's interesting. That points out that rdiff-backup does not check
> whether a regression is already in progress before starting another one.
> That needs fixing.
>
> --Joe
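
Until that's fixed, it's probably worth checking by hand that nothing is
still running before kicking off another backup or regression; something
like

$ pgrep -af rdiff-backup

(or any other process listing) should show whether an earlier rdiff-backup
is still at work.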