Hi,
I'm using Amanda 2.4.2p2 on a Linux box (RH 6.2, kernel 2.2.19, GNU tar
1.13.17) to back up home directories on a NetApp filer mounted over NFS.
With up to and including 171 disklist entries of type root-tar,
everything is OK. amcheck complains that the home directories are not
accessible (amanda has uid 37), but runtar can read them because it runs
with euid 0 (the NFS export has no root squashing). amcheck takes about
3 seconds to check these entries.
If I add some more disklist entries of the same type, amcheck hangs for
a minute (ctimeout 60) and then reports "selfcheck request timed out.
Host down?"
/tmp/amanda gets three more files: amanda.<datetime>.debug, amcheck...
and selfcheck...
With up to 171 entries, selfcheck.<datetime>.debug grows to 28387 bytes
and contains 171 "could not access" lines (expected, because of the NFS
permissions). With 172 entries, it stops at 16427 bytes and contains
only 100 "could not access" lines. The last line of the disklist is
checked first.
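The byte and line counts above can be reproduced with a small shell helper. This is only a minimal sketch: the function name summarize_selfcheck is made up for illustration and is not part of Amanda, and it assumes each "could not access" message sits on a single line in the debug file.

```shell
#!/bin/sh
# Hypothetical helper (not part of Amanda): print the size of a selfcheck
# debug file and how many "could not access" lines it contains.
summarize_selfcheck() {
    file=$1
    # wc -c reports the size in bytes; tr strips padding some wc versions add.
    size=$(wc -c < "$file" | tr -d ' ')
    # grep -c counts matching lines (it prints 0 when there are none).
    denied=$(grep -c 'could not access' "$file")
    echo "$size bytes, $denied could-not-access lines"
}
```

For example, `summarize_selfcheck /tmp/amanda/selfcheck.20010511233745.debug` prints both numbers for one debug file, which makes it easy to compare a run with 171 entries against a run with 172.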
/tmp/amanda/selfcheck... ends with:
selfcheck: checking disk /home/User/cb
selfcheck: device /home/User/cb
selfcheck: could not access /home/User/cb (/home/User/cb): Permission denied
selfcheck: checking disk /home/User/ca
selfcheck: device /home/User/ca
After adding one or more lines to the disklist file, only the last 100
lines get checked, and then an amandad process and a selfcheck process
are left hanging around:
$ ps x
PID TTY STAT TIME COMMAND
28833 pts/2 S 0:00 -bash
28854 pts/2 S 0:00 emacs -nw disklist
29000 pts/1 S 0:00 -bash
29149 ? S 0:00 amandad
29151 ? S 0:00 /usr/libexec/amanda/selfcheck
29182 pts/3 S 0:00 -bash
29227 pts/3 S 0:00 less selfcheck.20010511233745.debug
29230 pts/1 R 0:00 ps x
Killing selfcheck spawns another selfcheck process, and that one's debug
file also stops after checking the last 100 disklist lines.
$ kill 29151
$ ps x
PID TTY STAT TIME COMMAND
28833 pts/2 S 0:00 -bash
28854 pts/2 S 0:00 emacs -nw disklist
29000 pts/1 S 0:00 -bash
29182 pts/3 S 0:00 -bash
29231 ? S 0:00 amandad
29233 ? S 0:00 /usr/libexec/amanda/selfcheck
29234 pts/1 R 0:00 ps x
$ kill 29233
$ ps x
PID TTY STAT TIME COMMAND
28833 pts/2 S 0:00 -bash
28854 pts/2 S 0:00 emacs -nw disklist
29000 pts/1 S 0:00 -bash
29182 pts/3 S 0:00 -bash
29238 ? S 0:00 amandad
29240 ? S 0:00 /usr/libexec/amanda/selfcheck
29241 pts/1 R 0:00 ps x
$ kill 29240
$ ps x
PID TTY STAT TIME COMMAND
28833 pts/2 S 0:00 -bash
28854 pts/2 S 0:00 emacs -nw disklist
29000 pts/1 S 0:00 -bash
29182 pts/3 S 0:00 -bash
29244 ? S 0:00 amandad
29246 ? D 0:00 /usr/libexec/amanda/selfcheck
29247 pts/1 R 0:00 ps x
$ kill 29246
$ ps x
PID TTY STAT TIME COMMAND
28833 pts/2 S 0:00 -bash
28854 pts/2 S 0:00 emacs -nw disklist
29000 pts/1 S 0:00 -bash
29182 pts/3 S 0:00 -bash
29251 pts/1 R 0:00 ps x
Only after the fourth kill are amandad and selfcheck both gone.
Any ideas?