(Sorry for the quickfire question submissions but I'm not a big fan of
combining issues into one message)
While running backups I noticed the following in the amstatus output:
Using /usr/local/amanda/logfiles/amdump
From Tue Dec 8 20:05:01 CST 2009
server1.utexas.edu:/home 0 66880m dumping 66880m (100.00%) (20:44:22) (waiting for holding disk space)
server2.utexas.edu:/home 1 57426m dumping 51399m ( 89.50%) (20:44:22)
server3.utexas.edu:/     1 43605m dumping 43605m (100.00%) (20:44:22) (waiting for holding disk space)
This was also in the amstatus output:
SUMMARY           part       real  estimated
                             size       size
partition       :   91
estimated       :   91               537826m
flush           :    0         0m
failed          :    0                    0m           (  0.00%)
wait for dumping:   60               218028m           ( 40.54%)
dumping to tape :    0                    0m           (  0.00%)
dumping         :    4    246146m    306954m ( 80.19%) ( 45.77%)
dumped          :   27     12843m     12843m (100.00%) (  2.39%)
wait for writing:    1     12709m     12709m (100.00%) (  2.36%)
wait to flush   :    0         0m         0m (100.00%) (  0.00%)
writing to tape :    0         0m         0m (  0.00%) (  0.00%)
failed to tape  :    0         0m         0m (  0.00%) (  0.00%)
taped           :   26       134m       134m (100.34%) (  0.03%)
  tape 1        :   26       134m       134m (  0.09%) daily-38 (52 chunks)
15 dumpers idle : runq
taper idle
network free kps: 718381
HOLDING SPACE : 27m ( 0.01%)
The thing is ... there's plenty of holding disk space available:
[ama...@amanda amanda]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 77G 9.3G 64G 13% /
/dev/sda3 367G 255G 93G 74% /hold <--- AMANDA HOLDING DISK
And the amdump log gives the following:
driver: hdisk-state time 8903.996 hdisk 0: free 28352 dumpers 3
driver: result time 8904.002 from chunker1: RQ-MORE-DISK 01-00002
find diskspace: not enough diskspace. Left with 3395968 K
find diskspace: not enough diskspace. Left with 622400 K
find diskspace: not enough diskspace. Left with 2204288 K
find diskspace: not enough diskspace. Left with 3395968 K
driver: state time 9630.580 free kps: 718381 space: 28352 taper: idle idle-dumpers: 15 qlen tapeq: 0 runq: 59 roomq: 3 wakeup: 0 driver-idle: no-diskspace
driver: interface-state time 9630.580 if default: free 718381
driver: hdisk-state time 9630.580 hdisk 0: free 28352 dumpers 2
driver: result time 9630.597 from chunker2: RQ-MORE-DISK 02-00003
find diskspace: not enough diskspace. Left with 2911936 K
find diskspace: not enough diskspace. Left with 622400 K
find diskspace: not enough diskspace. Left with 2204288 K
find diskspace: not enough diskspace. Left with 3395968 K
find diskspace: not enough diskspace. Left with 2911936 K
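To make the figures in the log excerpt easier to compare with the amstatus output, here is a minimal Python sketch that pulls the numbers out of a few of the lines above and converts them to megabytes. It assumes the values are in kilobytes (the "K" suffix suggests so for the "Left with" lines; for the hdisk-state "free" value this is an assumption). The function names are made up for illustration, and the sketch only converts units; it does not interpret what "Left with" means.

```python
import re

# A few lines copied from the amdump excerpt above.
LOG = """\
driver: hdisk-state time 8903.996 hdisk 0: free 28352 dumpers 3
find diskspace: not enough diskspace. Left with 3395968 K
find diskspace: not enough diskspace. Left with 622400 K
find diskspace: not enough diskspace. Left with 2204288 K
"""

def kb_to_mb(kb):
    """Convert kilobytes to megabytes (1 MB = 1024 KB)."""
    return kb / 1024

def parse_free_kb(log):
    """Return the 'free' value from every hdisk-state line (assumed KB)."""
    return [int(m) for m in re.findall(r"hdisk \d+: free (\d+)", log)]

def parse_left_with_kb(log):
    """Return the 'Left with N K' values from the find-diskspace lines."""
    return [int(m) for m in re.findall(r"Left with (\d+) K", log)]

free_mb = [kb_to_mb(kb) for kb in parse_free_kb(LOG)]
left_mb = [kb_to_mb(kb) for kb in parse_left_with_kb(LOG)]

print(free_mb)  # [27.6875]
print(left_mb)  # [3316.375, 607.8125, 2152.625]
```

Under that assumption, the 28352 reported as free is about 27 MB, which lines up with the "HOLDING SPACE : 27m" line in the amstatus summary.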
Some other messages I've found suggest it's a problem with splitting dumps
between tapes. Any idea why I'm getting these messages? I've included my
amanda.conf below.
org "daily"                   # your organization name for reports
dumpuser "amanda"             # the user to run dumps under
inparallel 15                 # maximum dumpers that will run in parallel (max 63)
maxdumps 2
dumporder "SSSSSSSSSSSSSSS"   # specify the priority order of each dumper
taperalgo largest             # The algorithm used to choose which dump image to send
displayunit "m"               # Possible values: "k|m|g|t"
netusage 750000 Kbps          # maximum net bandwidth for Amanda, in KB per sec
dumpcycle 7 days              # the number of days in the normal dump cycle
runspercycle 0                # the number of amdump runs in dumpcycle days
runtapes 10                   # number of tapes to be used in a single run of amdump
holdingdisk hd1 {
    directory "/hold/amandaholdingdisk"   # where the holding disk is
    use -35Gb                             # how much space can we use on it
    chunksize 10Gb                        # size of chunk if you want big dump to be
}
define tapetype vtape {
comment "Amanda Virtual Tape, capacity 150G"
length 155000 mbytes
}
define dumptype global {
program "GNUTAR"
index yes
record yes
maxdumps 2
holdingdisk yes
tape_splitsize 10 Gb
fallback_splitsize 2 Gb
compress none
auth "ssh"
ssh_keys "/home/amanda/.ssh/id_rsa_amdump"
}
define interface eth0 {
comment "1000 Mbps ethernet"
use 700000 kbps
}
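For reference, the amanda.conf man page describes a negative `use` value as "all available space minus that value", and zero as "all available space". A small Python sketch of that rule, applied to the `use -35Gb` setting above and the 93G that df reported as available, might look like this. The function name is hypothetical, capping a positive value at the available space is an assumption, and since df was run in the middle of the backup, the 93G figure already excludes whatever was sitting on the holding disk at that moment.

```python
def usable_holding_gb(avail_gb, use_gb):
    """Space the holding disk may use, following the man page wording.

    use > 0  : that amount (capped at what is available; the cap is an assumption)
    use == 0 : all available space
    use < 0  : all available space minus abs(use)
    """
    if use_gb > 0:
        return min(use_gb, avail_gb)
    if use_gb == 0:
        return avail_gb
    return max(avail_gb + use_gb, 0)

# 93G is the "Avail" column for /hold in the df output; -35 is the `use` setting.
print(usable_holding_gb(93, -35))  # 58
```

So with these numbers, roughly 58G of the 93G shown by df would be usable, not the full amount.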
When I look at the DLEs being dumped with amstatus I see:
Using /usr/local/amanda/logfiles/amdump
From Tue Dec 8 20:05:01 CST 2009
server1.utexas.edu:/home 0 66880m dumping 66880m (100.00%) (20:44:22) (waiting for holding disk space)
server2.utexas.edu:/home 1 57426m dumping 51399m ( 89.50%) (20:44:22)
server3.utexas.edu:/     1 43605m dumping 43605m (100.00%) (20:44:22) (waiting for holding disk space)
At most I've seen 6 items being dumped at any one time. The load average on
the server is less than 1 and the five-minute average on the NIC is 70Mbps.
There's over 80GB left on the holding disk and all dumps are set to use the
holding disk. I found a previous message describing a similar problem with
inparallel that was fixed by raising the netusage option. I increased this
setting but still only see a few DLEs being dumped.
Any idea what's going on?
By the way, the "waiting for holding disk space" status is a separate
question, especially since there is more than sufficient holding space.
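One sanity check that might help narrow this down is to compare the sizes amstatus reports with what df shows as used on /hold. The Python sketch below adds up the "real size" figures for the "dumping" and "wait for writing" rows of the summary and converts the total to gigabytes. Treating those two rows as the data currently occupying the holding disk is an assumption, as is treating amstatus "m" and df "G" as binary units.

```python
MB_PER_GB = 1024

# "real size" column from the amstatus SUMMARY, in MB.
dumping_mb = 246146           # row "dumping"
wait_for_writing_mb = 12709   # row "wait for writing"

# "Used" column for /hold from df -h, in GB (rounded by df).
hold_used_gb = 255

# Assumption: these two rows are what currently sits on the holding disk.
on_holding_gb = (dumping_mb + wait_for_writing_mb) / MB_PER_GB

print(round(on_holding_gb, 1))                 # 252.8
print(round(hold_used_gb - on_holding_gb, 1))  # 2.2
```

If that assumption holds, nearly all of the 255G used on /hold is accounted for by dumps that are in progress or waiting to be written.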