Chris,
for larger file systems I've moved to "server estimate"; it's less
accurate, but it takes the entire estimate phase out of the equation.
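For what it's worth, that's a one-line dumptype change (a sketch only;
"server-est" is a name I made up, and I'm assuming the usual "global"
parent dumptype from the example configs):

  define dumptype server-est {
      global
      estimate server    # planner guesses from history; no client walk
  }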
We have had a lot of success with pigz rather than regular gzip,
as it'll take advantage of the multiple CPUs and parallelize the
compression, which is often our bottleneck during actual dumping.
On one system I cut a DLE's dump time from 13 to 8 hours, a huge
savings (I think those were the numbers; I can look them up...).
ZFS will allow effectively unlimited capacity, and enough files per
directory to choke access. We have backups that run very badly here,
with literally several hundred thousand files PER directory, and
multiple such directories.
For backups themselves, I do use snapshots where I can on my
ZFS file systems.
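Something along these lines, using your pool name from the df output
below (the snapshot name "amanda" is arbitrary):

  zfs snapshot J4500-pool1/herbarium@amanda
  # dump the frozen view under .zfs instead of the live tree:
  #   /export/herbarium/.zfs/snapshot/amanda
  # (you may need 'zfs set snapdir=visible' for .zfs to show up in ls)
  zfs destroy J4500-pool1/herbarium@amanda    # once the dump is done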
On Wed, Apr 03, 2013 at 11:26:01AM -0400, Chris Hoogendyk wrote:
This seems like an obvious "read the FAQ" situation, but . . .
I'm running Amanda 3.3.2 on a Sun T5220 with Solaris 10 and a J4500
"jbod" disk array with multipath SAS. It all should be fast and is on
the local server, so there isn't any network path outside localhost
for the DLEs that are giving me trouble. They are zfs on raidz1 with
five 2TB drives. Gnutar is v1.23. This server is successfully backing
up several other servers as well as many more DLEs on the localhost.
Output to an AIT5 tape library.
I've upped the etimeout to 1800 and the dtimeout to 3600, which both
seem outrageously long (jumped from the default 5 minutes to 30
minutes, and from the default 30 minutes to an hour).
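For reference, that's these two lines in amanda.conf (values are in
seconds):

  etimeout 1800    # estimate timeout, up from the default 300
  dtimeout 3600    # dump inactivity timeout, up from the default 1800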
The filesystem (DLE) that is giving me trouble (hasn't backed up in a
couple of weeks) is /export/herbarium, which looks like:
marlin:/export/herbarium# df -k .
Filesystem             kbytes      used       avail       capacity  Mounted on
J4500-pool1/herbarium
                       2040109465  262907572  1777201893  13%  /export/herbarium
marlin:/export/herbarium# find . -type f | wc -l
2806
marlin:/export/herbarium# find . -type d | wc -l
140
marlin:/export/herbarium#
So, it is only 262G and only has 2806 files. Shouldn't be that big a
deal. They are typically tif scans.
One thought that hits me: possibly, because it is over 200G of tif
scans, compression is causing trouble? But this is just getting
estimates, with output going to /dev/null.
Here is a segment from the very end of the sendsize debug file from
April 1 (the debug file ends after these lines):
Mon Apr 1 08:05:49 2013: thd-32a58: sendsize: .....
Mon Apr 1 08:05:49 2013: thd-32a58: sendsize: estimate time for /export/herbarium level 0: 26302.500
Mon Apr 1 08:05:49 2013: thd-32a58: sendsize: estimate size for /export/herbarium level 0: 262993150 KB
Mon Apr 1 08:05:49 2013: thd-32a58: sendsize: waiting for runtar "/export/herbarium" child
Mon Apr 1 08:05:49 2013: thd-32a58: sendsize: after runtar /export/herbarium wait
Mon Apr 1 08:05:49 2013: thd-32a58: sendsize: getting size via gnutar for /export/herbarium level 1
Mon Apr 1 08:05:49 2013: thd-32a58: sendsize: Spawning "/usr/local/libexec/amanda/runtar runtar daily /usr/local/etc/amanda/tools/gtar --create --file /dev/null --numeric-owner --directory /export/herbarium --one-file-system --listed-incremental /usr/local/var/amanda/gnutar-lists/localhost_export_herbarium_1.new --sparse --ignore-failed-read --totals ." in pipeline
Mon Apr 1 10:16:17 2013: thd-32a58: sendsize: Total bytes written: 77663795200 (73GiB, 9.5MiB/s)
Mon Apr 1 10:16:17 2013: thd-32a58: sendsize: .....
Mon Apr 1 10:16:17 2013: thd-32a58: sendsize: estimate time for /export/herbarium level 1: 7827.571
Mon Apr 1 10:16:17 2013: thd-32a58: sendsize: estimate size for /export/herbarium level 1: 75843550 KB
Mon Apr 1 10:16:17 2013: thd-32a58: sendsize: waiting for runtar "/export/herbarium" child
Mon Apr 1 10:16:17 2013: thd-32a58: sendsize: after runtar /export/herbarium wait
Mon Apr 1 10:16:17 2013: thd-32a58: sendsize: done with amname /export/herbarium dirname /export/herbarium spindle 45002
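Doing the arithmetic on that log (my numbers, rounded):

  level 1: 77663795200 bytes / 9.5 MiB/s  ~= 7800 s   ~= 2.2 hours
  level 0: 262993150 KB / 26302.5 s       ~= 9.8 MiB/s, ~= 7.3 hours

So even the raised etimeout of 1800 seconds is nowhere near the
roughly seven hours the level 0 estimate actually takes at that scan
rate, which is why I'd take the estimate phase out entirely.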