Eric Mountain wrote:
On Thursday 05 April 2007 20:32, Brian spake thus:
Brian wrote:
Hi,
I am using dirvish to back up various machines in
my local network. It works pretty well, but sometimes the dirvish
process returns RC=139 or RC=145, and I am trying to debug
RC=139 is more than likely a SEGV, especially based on the logs you provided
in your mail last June, as I see no code in Dirvish 1.2 that would return 139
explicitly.
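For reference, a shell reports a process killed by signal N as exit status 128 + N, so 139 corresponds to signal 11. Below is a minimal Python sketch of that decoding. The helper name is made up for illustration, and the result is only a heuristic: a program can also exit on its own with a value above 128, as Dirvish does for its own exit-code ranges.

```python
import signal

def describe_status(status):
    """Describe a shell-style exit status (hypothetical helper)."""
    if status > 128:
        signum = status - 128
        try:
            # Map the signal number to its symbolic name on this platform.
            return "killed by " + signal.Signals(signum).name
        except ValueError:
            return "killed by unknown signal %d" % signum
    return "exited with code %d" % status

print(describe_status(139))
print(describe_status(137))
```

On Linux, 139 decodes to SIGSEGV and 137 to SIGKILL.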
Do you get core dumps? If so, from which process: perl/dirvish, or rsync?
And what does a debugger have to say (stack trace, etc.)?
If you don't have any core dumps, what does "ulimit -c" return for the user
(root?) running dirvish+rsync on the backup server?
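The same limit can be read from inside a process. This is a small sketch using Python's standard `resource` module (Unix only); it reports the limit of the process that runs it, so it would need to run as the same user and from the same environment as dirvish to be meaningful. Note that `getrlimit` reports bytes, whereas the shell's `ulimit -c` typically reports blocks.

```python
import resource

def core_limit():
    """Return the (soft, hard) core-file size limits for this process."""
    return resource.getrlimit(resource.RLIMIT_CORE)

soft, hard = core_limit()
if soft == 0:
    print("core dumps disabled (soft limit is 0)")
elif soft == resource.RLIM_INFINITY:
    print("core dumps unlimited")
else:
    print("core files limited to %d bytes" % soft)
```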
Anyway if I start a backup then just kill (-9) rsync on the dirvish
machine, I get a RC=0 from dirvish. The tree is then only partially
filled, files and directories missing etc.
The summary file ends with "Status: Success", the log file ends with the
file name of a large file that was new, so it had to be transferred.
Haven't dug into this too much, but if you kill rsync with SIGKILL, then it's
going to exit with status 137. Dirvish doesn't have that in
its %RSYNC_CODES, so doesn't find it there, and since there will likely be no
error message in the rsync output (since it was just blown away by an
external cause), I guess dirvish gets tricked into thinking everything was
OK. You would probably get the same effect sending SIGSEGV and friends
(since dirvish does not try to trap them).
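The effect can be reproduced without rsync. The Python sketch below starts a sleeping child, kills it with SIGKILL, and converts the result to the shell-style status 137. `KNOWN_CODES` is a hypothetical stand-in, not Dirvish's actual `%RSYNC_CODES` table, and `classify` only illustrates the safer default of treating an unlisted status as a failure; it is not how Dirvish is implemented.

```python
import signal
import subprocess
import sys

def kill_and_report():
    """Start a sleeping child, SIGKILL it, and return its statuses."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.send_signal(signal.SIGKILL)
    returncode = child.wait()  # Python reports a negative signal number
    # Convert to the 128 + N convention a shell would show.
    shell_status = 128 - returncode if returncode < 0 else returncode
    return returncode, shell_status

# Hypothetical stand-in for a table of known rsync exit codes.
KNOWN_CODES = {0: "success", 23: "partial transfer", 24: "files vanished"}

def classify(status):
    # Anything not in the table is reported as a failure, never as success.
    return KNOWN_CODES.get(status, "unknown status %d: treat as failed" % status)

rc, shell = kill_and_report()
print(rc, shell, classify(shell))
```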
So, is it correct that I get an RC=0 from dirvish, for this?
Open a bug report (on the wiki I suppose).
How can I get more debugging output, only via the rsync options?
man rsync.
That's not going to help if the problem is ahead of rsync.
Dang,
the other RC was 149, not 145, and I can provoke that with invalid rsync
options. So could it be that in some cases dirvish is calling rsync with
incorrect parameters? I guess I can try the rsync command directly next time
it goes wrong.
Exit 149 means the rsync process exited with a non-fatal error (from Dirvish's
point of view). See "man dirvish". You should be able to see the reason for
the failure in the file called "log" under the image directory.
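A quick way to look at the end of that file is sketched below in Python. It assumes the log is an uncompressed text file named "log" directly under the image directory; the path in the usage comment is hypothetical.

```python
import os
from collections import deque

def tail_log(image_dir, count=20):
    """Return the last `count` lines of the 'log' file in image_dir."""
    path = os.path.join(image_dir, "log")
    with open(path, errors="replace") as handle:
        # deque with maxlen keeps only the final `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]

# Hypothetical image path; substitute your own bank/vault/image.
# for line in tail_log("/backup/bank/vault/image"):
#     print(line)
```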
Cheers,
Eric
Hi Eric,
Thanks for the help.
No core dumps, as "ulimit -c" gives back 0, so I will set that to 80000000
in my startup scripts. I have now increased the verbosity of rsync to
-vvv, probably not required, but -v was giving me nothing useful.
When it breaks, the log only shows a single instance of the rsync
command that was issued, summary is empty (filesize = 0), and fsbuffer
is still there. The exclude file is still there too. Tree is completely
empty.
So I guess now I'll wait and see what the next error shows, and if I
have a coredump see if that can tell me something.
Cheers Brian