Thanks John, I think you are absolutely right to question the dump
program. The inconsistency may be due to the dump command itself, but it's
more likely that the processes it depends on are the main cause, and I don't
know how to find out what those processes are.
>
> >Anytime a filesystem failed to backup, the tapedrive seemed to be idle forever
> >until the READ_TIMEOUT period lapsed, i.e no activity shown.
>
> That could be normal. Looking at the amdump.<NN> file, are the file
> systems that time out being done with PORT-DUMP (direct to tape) or
> FILE-DUMP (through the holding disk)? If they are going through the
> holding disk, it could be normal for the tape to be idle waiting on
> something to do.
>
It's being done with PORT-DUMP as I don't use the holding disk. However, I
did try using the holding disk, and the result was no different.
> What happens if you do something like this:
>
> /sbin/dump 0f - > /dev/null /
This is what I get when it completes successfully:
DUMP: Date of this level 0 dump: Tue Dec 12 16:02:22 2000
DUMP: Date of last level 0 dump: the epoch
DUMP: Dumping /dev/sda1 (/) to standard output
DUMP: Label: none
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 1990332 tape blocks.
DUMP: Volume 1 started at: Tue Dec 12 16:02:29 2000
DUMP: dumping (Pass III) [directories]
DUMP: dumping (Pass IV) [regular files]
DUMP: Volume 1 completed at: Tue Dec 12 16:05:27 2000
DUMP: Volume 1 took 0:02:58
DUMP: Volume 1 transfer rate: 12022 KB/s
DUMP: 2139918 tape blocks (2089.76MB)
DUMP: finished in 178 seconds, throughput 12022 KBytes/sec
DUMP: Date of this level 0 dump: Tue Dec 12 16:02:22 2000
DUMP: Date this dump completed: Tue Dec 12 16:05:27 2000
DUMP: Average transfer rate: 12022 KB/s
DUMP: DUMP IS DONE
but when it wasn't successful, it got stuck after the "Pass IV" step and
waited for something indefinitely.
>
> or this:
>
> /sbin/dump 0f - / | rsh localhost "cat > /dev/null"
>
> What version of dump are you using? You really, really, really want
> to get the latest stuff from sourceforge. It was reportedly pretty bad
> for a while but has gotten a good maintainer now and is much better.
>
It's version 0.4b19, which came with RedHat 7.0, so I guess it's pretty current.
>
> Clearly the dumps are getting started and moving some data, so something
> must be freezing up. I'm not sure how to track this down other than to
> try and catch it in the act and run gcore on the various programs and
> then a debugger to see what they are waiting on.
>
Yes, it would be great to know exactly how to track it down. Which
programs are you referring to here, John?
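In the meantime, one way to narrow it down might be to check what each
candidate process is sleeping on while the backup is hung. Below is a minimal
sketch, assuming a Linux procps `ps` that supports the `wchan` column. The
function names `filter_suspects` and `list_suspects` are made up for
illustration, and the default pattern is only my guess at the relevant
programs (dump plus the Amanda sendbackup, dumper, taper, driver and amandad
processes).

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Amanda or dump.

# filter_suspects: read "ps" output on stdin, keep the header line and any
# row whose command name (5th column) matches the pattern.
# The default pattern is an assumption about which programs matter.
filter_suspects() {
    pattern="${1:-dump|sendbackup|dumper|taper|driver|amandad}"
    awk -v pat="$pattern" 'NR == 1 || $5 ~ pat'
}

# list_suspects: show PID, parent PID, state, kernel wait channel and
# command line for the matching processes (assumes procps "ps").
list_suspects() {
    ps -eo pid,ppid,stat,wchan:32,args | filter_suspects "${1:-}"
}
```

Running `list_suspects` while the dump is hung should show the state and wait
channel of each matching process; whichever one looks blocked could then be
examined with gcore and a debugger as suggested above.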
> John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
Thanks,
Hien.