On 9/9/2019 9:53 PM, Walt Mankowski wrote:
I found a file named
rdiff-backup-data/current_mirror.2019-09-08T03:01:02-04:00.data
which contained
4351
I moved it out of the way and reran the backup command. This time it
threw an exception. The output is in the attached log file.
(Some of the following echoes what Eric Lavarde wrote a few minutes ago.)
Moving a current_mirror file out of the way is never a good thing to do.
Having 2 current_mirror files is how rdiff-backup knows that the last
backup failed and that a regression is necessary in order to reestablish
a consistent state for the backup repository.
Fortunately, it looks as though your attempt to run another backup after
removing the current_mirror file did not get anywhere (based on your log).
I suggest putting the 'current_mirror.2019-09-08T03:01:02-04:00.data'
back in place (and possibly restart systemd-resolved, as commented
further below). After that, I would look to see what current mirror
files you now have. My guess is that you will find the following:
current_mirror.2019-09-07T03:01:01-04:00.data
current_mirror.2019-09-08T03:01:02-04:00.data
current_mirror.2019-09-09T21:46:29-04:00.data
9/7/19 is your last good backup. 9/8 was the backup that failed. 9/9
was your most recent attempt to fix things.
*Assuming* that I am correct about the current_mirror files that exist,
then I would remove the last of those files
(current_mirror.2019-09-09T21:46:29-04:00.data). Yes, that's contrary
to my admonition above. But rdiff-backup cannot deal with 3 such files,
and this last file is from your most recent backup that did not get
anywhere, according to your log.
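For a read-only look at which marker files exist before touching anything, here is a minimal Python sketch. It assumes the marker names follow the current_mirror.<ISO timestamp>.data pattern seen above. The function names (marker_time, sort_markers, describe, list_markers) are made up for illustration and are not part of rdiff-backup. It deletes nothing.

```python
import re
from datetime import datetime
from pathlib import Path

# Matches names such as current_mirror.2019-09-08T03:01:02-04:00.data
MARKER_RE = re.compile(r"^current_mirror\.(.+)\.data$")


def marker_time(name):
    """Parse the timestamp embedded in a marker file name."""
    match = MARKER_RE.match(name)
    if match is None:
        raise ValueError(f"not a current_mirror marker: {name}")
    return datetime.fromisoformat(match.group(1))


def sort_markers(names):
    """Keep only current_mirror markers and order them oldest first."""
    return sorted((n for n in names if MARKER_RE.match(n)), key=marker_time)


def describe(markers):
    """Summarize what the number of markers suggests (per this thread)."""
    count = len(markers)
    if count == 1:
        return "one marker: repository looks consistent"
    if count == 2:
        return "two markers: last backup failed, regression needed"
    return f"{count} markers: unexpected, inspect by hand"


def list_markers(data_dir):
    """Read marker names from an rdiff-backup-data directory (read-only)."""
    return sort_markers(p.name for p in Path(data_dir).iterdir())
```

Calling describe(list_markers("/backup/scruffy/rdiff-backup-data")) on the repository from this thread would report how many markers are present. With the three files guessed above, it would return the "unexpected" message.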
I would then again try 'rdiff-backup --check-destination-dir' (and cross
your fingers).
Your original concern was that this was taking forever (12+ hours and
counting). For what it is worth, my experience is that regressions do
take many hours (depending on the size of your current mirror), and they
leave you wondering if anything is actually happening.
It seems like 296 GB would take me 4-8 hours to regress (I can't really
remember -- it's been a while). If your backup is 527 GB (i.e., that's
what shows up for 'MirrorFileSize' in your session_statistics.* files),
then yes, I imagine that would take quite some time to regress. There
are probably other factors besides size that affect the speed -- disk
speed, processor speed, load, etc. I don't know whether rdiff-backup's
logging verbosity is a factor, but I would think that it might be.
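As a back-of-the-envelope check, here is my 296 GB figure scaled linearly to 527 GB. Linear scaling is purely an assumption (it ignores the other factors just mentioned), and scale_hours is only an illustrative helper.

```python
def scale_hours(ref_gb, ref_hours, target_gb):
    """Scale a reference duration linearly by repository size (an assumption)."""
    return ref_hours * target_gb / ref_gb


# 296 GB took roughly 4-8 hours; the repository in question is 527 GB.
low = scale_hours(296, 4, 527)
high = scale_hours(296, 8, 527)
print(f"{low:.1f} to {high:.1f} hours")  # prints "7.1 to 14.2 hours"
```

On that rough basis, a regression that is still running after 12 hours is within the range one might expect.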
None of the above addresses your problem with "No space left on device".
I would try to restore your repository to a consistent state before
investigating that further. (Of course, the really frustrating thing is
that if the backup fails again, you are forced to wait many hours while
you repeat the regression of the failed backup.)
<snip>
On Mon, Sep 09, 2019 at 08:17:04PM -0400, Walt Mankowski wrote:
I ran
$ sudo rdiff-backup -v9 --print-statistics \
    --exclude-filelist /usr/local/etc/rdiff_exclude \
    / /backup/scruffy 2>&1 | tee rdiff-backup.txt
This time it exited right away. I've attached the log file, where the
key message is
Fatal Error: It appears that a previous rdiff-backup session with
process id 4351 is still running.
Process 4351 is /lib/systemd/systemd-resolved
It would seem that you had a bit of bad luck in that a process ID that
had been used for a crashed rdiff-backup session happened to now be in
use again for an unrelated process (systemd-resolved).
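To see which process currently owns the PID recorded in a marker file, something like the following Python sketch could be used. It assumes Linux with procfs mounted at /proc, and it assumes the PID is the last integer in the marker file (the file quoted earlier held just 4351; I have not verified the exact format across versions). The helper names are hypothetical and not part of rdiff-backup.

```python
import re
from pathlib import Path


def read_marker_pid(text):
    """Return the last integer found in the marker contents, or None."""
    numbers = re.findall(r"\d+", text)
    return int(numbers[-1]) if numbers else None


def process_command(pid, proc_root="/proc"):
    """Return the command line of pid from procfs, or None if no such process."""
    try:
        raw = (Path(proc_root) / str(pid) / "cmdline").read_bytes()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
    return raw.replace(b"\0", b" ").decode(errors="replace").strip()


def looks_like_rdiff_backup(command):
    """Crude check: does the command line mention rdiff-backup at all?"""
    return command is not None and "rdiff-backup" in command
```

For the case in this thread, process_command(4351) would have shown /lib/systemd/systemd-resolved, and looks_like_rdiff_backup would return False for it, which is the hint that the PID was simply reused.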
Is it safe to rerun it with --force?
Using --force would have gotten around the Fatal Error, but it would
have also forced other things to happen that you may not want. In this
instance, I would have probably restarted systemd-resolved so that it
used a different PID. That should have gotten rdiff-backup past that
particular error.
<snip>
On Mon, Sep 9, 2019 at 7:47 PM Walt Mankowski <walt...@pobox.com> wrote:
On Mon, Sep 09, 2019 at 07:38:52PM -0400, Patrik Dufresne wrote:
Hum, this is strange. It should not fail with a "no space left on
device".
Agreed! That's why I originally thought it must have been some sort of
USB glitch.
Could you provide the log generated with -v9? Please provide the full
command line you used.
So kill the run with -v8?
What is the filesystem of your USB drive?
ext4
If you try to run the backup again do you have an error?
In fact that happened last night. My normal nightly backup kicked in
while a previous attempt at running --check-destination-dir was still
running. The cronjob reported:
Previous backup seems to have failed, regressing destination now.
Fatal Error: Killed with signal 15
The latter is from when I killed it after I woke up and saw that both of
them were running.
That's interesting. It shows that rdiff-backup does not check whether
a regression is already in progress before starting another one. That
needs fixing.
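Until that is fixed in rdiff-backup itself, a cron wrapper can guard against overlapping runs. This is a sketch of one common approach (an advisory flock on a lock file), not something rdiff-backup does. It assumes a POSIX system, and acquire_lock, AlreadyRunning, and any lock path passed in are hypothetical names.

```python
import fcntl
import os


class AlreadyRunning(Exception):
    """Raised when another holder already has the lock."""


def acquire_lock(lock_path):
    """Take an exclusive, non-blocking advisory lock on lock_path.

    Returns the open file object; keep it open for as long as the
    backup or regression runs. Closing it (or exiting) releases the lock.
    """
    handle = open(lock_path, "a+")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        raise AlreadyRunning(lock_path)
    # Record our PID for humans; only done once the lock is ours.
    handle.seek(0)
    handle.truncate()
    handle.write(f"{os.getpid()}\n")
    handle.flush()
    return handle
```

A wrapper would call acquire_lock on a path of its choosing before launching rdiff-backup and exit quietly on AlreadyRunning. This only helps if every invocation, including a manual --check-destination-dir, goes through the same wrapper.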
--Joe
_______________________________________________
rdiff-backup-users mailing list at rdiff-backup-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki