>I did a reboot on the MP-RAS box to start with a clean slate and here is
>what I've found so far.
>...
>WARNING: edaf6.irs.sat: selfcheck request timed out. Host down?
>...
>(brought to you by Amanda 2.4.3b1)
>...
>SERVICE selfcheck
>OPTIONS ;
>GNUTAR /home/tp 0 OPTIONS |;bsd-auth;srvcomp-fast;index;exclude-list=exclude.txt;
>GNUTAR /eec/msg/logs 0 OPTIONS |;bsd-auth;srvcomp-fast;index;exclude-list=exclude.txt;
>GNUTAR /eec/data/test 0 OPTIONS |;bsd-auth;srvcomp-fast;index;exclude-list=exclude.txt;
>GNUTAR /eec/var/inf_dumps 0 OPTIONS |;bsd-auth;srvcomp-fast;index;exclude-list=exclude.txt;
>GNUTAR /eec/var/inf_archive 0 OPTIONS |;bsd-auth;srvcomp-fast;index;exclude-list=/usr/local/lib/amanda/exclude.gtar;
>GNUTAR /eec/var 0 OPTIONS |;bsd-auth;srvcomp-fast;index;exclude-list=/usr/local/lib/amanda/exclude.gtar;
>GNUTAR /eec 0 OPTIONS |;bsd-auth;srvcomp-fast;index;exclude-list=exclude.txt;
>...

This is a normal looking selfcheck packet, asking it to verify that things look OK for those file systems (or directories) using GNU tar as the dump program. So far, so good.

The rest of the amandad log, however, indicates selfcheck is not getting done (and amcheck on the other side is re-sending the packets).

># cat selfcheck.20020321121249.debug
>...
>selfcheck: debug 1 pid 3459 ruid 22234 euid 22234 start time Thu Mar 21 12:12:49 2002
>/usr/local/libexec/selfcheck: version 2.4.3b2-20020308
>selfcheck: exclude list file "/home/tp/exclude.txt" does not exist, ignoring
>selfcheck: checking disk /home/tp
>selfcheck: device /home/tp
>selfcheck: OK
>selfcheck: exclude list file "/eec/msg/logs/exclude.txt" does not exist, ignoring
>selfcheck: checking disk /eec/msg/logs
>selfcheck: device /eec/msg/logs
>selfcheck: OK
>selfcheck: exclude list file "/eec/data/test/exclude.txt" does not exist, ignoring
>selfcheck: checking disk /eec/data/test
>selfcheck: device /eec/data/test
>selfcheck: OK
>selfcheck: exclude list file "/eec/var/inf_dumps/exclude.txt" does not exist, ignoring
>selfcheck: checking disk /eec/var/inf_dumps
>selfcheck: device /eec/var/inf_dumps
>selfcheck: OK

This also looks normal, as far as it goes.

>selfcheck was still running or is hung after results returned to "relay"
># ps -ef | grep selfcheck
> amanda 3459 3458 66 12:12:49 ? 10:05 /usr/local/libexec/selfcheck

But the fact that it is still running, and that the log file did not go beyond /eec/var/inf_dumps, is bad. It's also probably not a good sign that it has used 10 minutes of time -- selfcheck should be done almost immediately.

You can kill off the selfcheck process (if you can). It's obviously hung.

>I'm assuming it has something to do with this oddly named file of 0 bytes:
>selfcheck._eec_var_inf__archive.20020321121250.exclude

Those are (new with 2.4.3) temp files used to hold the merged exclusion patterns.
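
The comparison between the request packet and the debug log can also be done mechanically: walk the disks in request order and report the first one with no "checking disk" line in the log. This is only a minimal sketch. It assumes the debug log uses exactly the "selfcheck: checking disk ..." lines shown above, and the function name first_unchecked_disk is made up for illustration (it is not part of Amanda).

```shell
# first_unchecked_disk LOGFILE DISK...
# Prints the first disk (in the order given) that has no
# "selfcheck: checking disk <disk>" line in LOGFILE.
# Returns 1 and prints nothing if every disk was checked.
# Hypothetical helper for illustration, not part of Amanda.
first_unchecked_disk() {
    log=$1
    shift
    for disk in "$@"; do
        # -F: fixed string, -x: whole-line match, so that /eec/var
        # is not satisfied by the line for /eec/var/inf_dumps.
        if ! grep -qxF "selfcheck: checking disk $disk" "$log"; then
            echo "$disk"
            return 0
        fi
    done
    return 1
}

# Example call, using the disks from the request packet:
#   first_unchecked_disk selfcheck.20020321121249.debug \
#       /home/tp /eec/msg/logs /eec/data/test /eec/var/inf_dumps \
#       /eec/var/inf_archive /eec/var /eec
```

If the debug file holds only what was quoted, the first disk reported would be /eec/var/inf_archive, i.e. the entry right after the last "OK".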
Note that /eec/var/inf_archive was the first file system to use an explicit exclude file, "/usr/local/lib/amanda/exclude.gtar". Based on the log messages and the presence of the temp file, we can deduce that Amanda detected the existence of the exclude file and was trying to copy it to the temp file.

Is there anything "magic" about access to that file on the MP-RAS box that would prevent Amanda from being able to read it? Can you "cat" it, for instance?

If you (temporarily) comment out /eec/var/inf_archive and /eec/var from your disklist (both of which try to use that exclude file), does amcheck work?

>Brian Davidson

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
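
A slightly safer variant of the "cat" test looks at what kind of object the exclude file is before reading it, so that a FIFO or device node does not hang the shell the way it may be hanging selfcheck. This is a rough sketch only: check_exclude_file is a made-up name, not an Amanda tool, and it should be run as the same user selfcheck runs as ("amanda" in the ps output). It cannot detect every cause of a blocked read (a stale network mount, for example, could still stall the tests themselves).

```shell
# check_exclude_file PATH
# Reports whether PATH looks like something a plain read can complete on.
# Hypothetical helper for illustration, not part of Amanda.
check_exclude_file() {
    f=$1
    if [ ! -e "$f" ]; then
        echo "missing"
    elif [ ! -f "$f" ]; then
        # FIFOs, device nodes, directories: a read may block or fail.
        echo "not a regular file"
    elif [ ! -r "$f" ]; then
        echo "not readable"
    else
        # Regular and readable, so reading it is safe; count its lines.
        echo "readable, $(wc -l < "$f" | tr -d ' ') lines"
    fi
}

# Example call:
#   check_exclude_file /usr/local/lib/amanda/exclude.gtar
```

Anything other than a "readable" result would be worth following up with "ls -l" on the file.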
