On Thu, Jan 07, 2016 at 04:00:23PM -0500, Focus1 IT wrote:
> Hi,
>
> This is my first post to the amanda-users list, I'm hoping the community
> can help me resolve an issue that has rendered our backup-set 90%
> ineffective. In short, I've got about 65 DLEs that are not being backed
> up, 60 of which reside on a remote host and 1 that is on the same machine
> as amandabackup. Amdump runs nightly and completes backup of other DLEs
> properly, amreport indicates the failing DLEs are 1kB to 10kB in size
> regardless of the backup level assigned to them each run.
>
> An example of an amreport that includes this failure:

IIRC as amanda uses tar, it will not cross a mount point.
And I'm not sure about following symbolic links, but I
don't think it will backup the target of a symlink either.
Might either of these be the problem?

I worked around the mount point problem in the past by
specifically including the mount point. For example if
/usr were the filesystem being backed up in the DLE and
/usr/local were a separate filesystem that I wanted in
the same backup, that DLE had something like an
"include ./local" directive.

jl

>
> DUMP SUMMARY:
>                                                     DUMPER STATS  TAPER STATS
> HOSTNAME   DISK              L ORIG-kB OUT-kB COMP% MMM:SS  KB/s  MMM:SS KB/s
> ---------- ----------------- - ------- ------ ----- ------ -----  ------ ----
> 10.6.1.209 /mnt/IT           0      10     10    --   0:00 339.2    0:02  0.0
> 10.6.1.209 /mnt/aeiland      0      10     10    --   0:00 307.3    0:02  0.0
> 10.6.1.209 /mnt/applications 0      10     10    --   0:00 544.7    0:02  0.0
> 10.6.1.209 /mnt/archive-epp  0      10     10    --   0:00 687.4    0:02  0.0
> 10.6.1.209 /mnt/aswoboda     0      10     10    --   0:00 642.8    0:02  0.0
> 10.6.1.209 /mnt/bmcfarlane   0      10     10    --   0:00 632.2    0:02  0.0
> 10.6.1.209 /mnt/bmenzie      0      10     10    --   0:00 669.3    0:02  0.0
> 10.6.1.209 /mnt/brodriguez   0      10     10    --   0:00 107.2    0:02  0.0
> 10.6.1.209 /mnt/cczinski     0      10     10    --   0:00 641.0    0:02  0.0
>
>
> Some history:
>
> We've used this backup-set for ~2 years, adjusting DLEs as necessary for
> organizational changes, and it has functioned as expected. In early
> November the machine that hosts amanda-server experienced a RAID5 array
> failure that necessitated we rebuild the array from scratch. The initial
> array used an EXT filesystem, the new array is formatted ZFS. After
> recovering from this hardware failure amdump seemed to operate normally for
> a couple of weeks... and then things started to fall apart.
>
> Initially I noticed a notification in the nightly amreport that our holding
> disk was missing as we had failed to recreate the proper directory
> structure after rebuilding the failed array.
> After creating the missing
> directory that error was no longer present in the logs. Shortly thereafter
> the bulk of our DLEs started being dumped incorrectly and I've been unable
> to determine why. I'm hoping the list will be able to provide me with some
> insight as to what triggered the error and assist me in restoring our
> backup capabilities.
>
> Here are our config files:
>
> *AMANDA.CONF: *http://kickasspastes.com/4946/
>
> *DISKLIST: *http://kickasspastes.com/4947/
>
> All of the DLEs on host 10.6.1.209 (a FreeNAS box) are failing.
> One of four DLEs on localhost is failing (/srv/backups).
>
> Here is a view of file attributes for /srv on localhost:
>
> ls -la /srv/
> total 12
> drwxr-xr-x  3 root root 4096 Nov  6 08:20 .
> drwxr-xr-x 30 root root 4096 Dec 19 06:41 ..
> lrwxrwxrwx  1 root root   13 Nov  6 08:14 amanda -> /tank/amanda/
> lrwxrwxrwx  1 root root   14 Nov  6 08:14 backups -> /tank/backups/
> drwxr-xr-x  3 root root 4096 Nov  6 08:20 mnt
> lrwxrwxrwx  1 root root   14 Nov  6 08:17 reports -> /tank/reports/
> lrwxrwxrwx  1 root root   13 Nov  6 08:19 server -> /tank/server/
>
> and the underlying ZFS array /srv mountpoints are symlinked to:
>
> ls -la /tank/
> total 30
> drwxr-xr-x  6 root         root            6 Nov  6 08:18 .
> drwxr-xr-x 30 root         root         4096 Dec 19 06:41 ..
> drwxr-xr-x  4 amandabackup amandabackup    4 Dec 16 11:49 amanda
> drwxr-xr-x  8 merkin       merkin          8 Nov  6 08:56 backups
> drwxr-xr-x  6 amandabackup amandabackup    6 Nov  6 08:17 reports
> drwxr-xr-x  6 amandabackup amandabackup    9 Jan  7 01:45 server
>
>
>
> Here are some examples of logfiles from days when the failures occur:
>
> *AMREPORT: *http://kickasspastes.com/4951/
>
> *PLANNER.DEBUG:* http://kickasspastes.com/4948/
>
> *AMANDA PLANNER CONSOLE OUTPUT PIPED TO A TEXTFILE: *
> http://kickasspastes.com/4949/
>
> *AMANDA PLANNER CONSOLE OUTPUT SAMPLE TWO (INCLUDES "NO TRY" LINES): *
> http://kickasspastes.com/4950/
>
>
> A couple notes: You'll notice I've set "etimeout" to a very high value in
> amanda.conf.
> I had to raise the value for this setting because estimates
> were timing out for localhost:/srv/amanda/state and amdump was failing.
> I've read about amanda, tar, and ZFS filesystems and am under the
> impression they require special dumptype declarations, and if the wrong
> version of tar is used timeouts can occur... *(We are currently using
> "encrypted-gnutar-local" for /srv mountpoints).*.. what confounds me is
> that the system ran fine for almost a month, then started timing out, so
> I'm not entirely sure where the blame lies... is this a ZFS issue? What is
> causing planner to calculate incorrect estimates for mountpoints on the
> other host(?), nothing changed on that machine at all.
>
> Any help resolving this issue would be greatly appreciated. I need to get
> these backups moving again ASAP.
>
> Thanks in advance,
>
> JS
>
> --
>
>
> CONFIDENTIALITY NOTICE: This email is covered by the Electronic
> Communications Privacy Act, 18 U.S.C. 2510-2521 and is legally privileged.
> This communication may also contain material protected and governed by the
> Health insurance Portability and Accountability Act of 1996 (HIPAA). This
> e-mail is only for the personal and confidential use of the individuals to
> which it is addressed and contains confidential information. If you are not
> the intended recipient, you are notified that you have received this
> document in error, and that any reading, distributing, copying or
> disclosure is unauthorized.
>
> If you are not the intended recipient Please notify Hatteras Printing Inc.
> by calling (313) 624-3300 and destroy the message immediately.
> Additionally, please do not print this email unless it is absolutely
> necessary.
>>> End of included message <<<

--
Jon H. LaBadie                  [email protected]
 11226 South Shore Rd.          (703) 787-0688 (H)
 Reston, VA  20190              (703) 935-6720 (C)
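
One way to test the symlink and mount point theory is to classify each DLE path before the next amdump run. The following is only a minimal Python sketch, not anything shipped with Amanda: the helper names `classify_dle_path` and `nested_mountpoints` are made up for illustration, and the paths in the usage section are the /srv entries from the listing quoted above. It relies on the fact that GNU tar, by default, stores a symlink as a link and does not descend into its target.

```python
import os


def classify_dle_path(path):
    """Classify a DLE path as 'symlink', 'missing', 'mountpoint' or 'directory'.

    Hypothetical helper for illustration; it is not part of Amanda.
    """
    # A trailing slash makes the OS resolve the link, so strip it first.
    path = path.rstrip(os.sep) or os.sep
    if os.path.islink(path):
        return "symlink"
    if not os.path.exists(path):
        return "missing"
    if os.path.ismount(path):
        return "mountpoint"
    return "directory"


def nested_mountpoints(path):
    """Return immediate subdirectories of path that are separate mount points."""
    real = os.path.realpath(path)
    found = []
    for name in sorted(os.listdir(real)):
        child = os.path.join(real, name)
        if os.path.isdir(child) and os.path.ismount(child):
            found.append(child)
    return found


if __name__ == "__main__":
    # Paths taken from the /srv listing in the quoted message; adjust as needed.
    for dle in ("/srv/amanda", "/srv/backups", "/srv/reports", "/srv/server"):
        print(dle, "->", classify_dle_path(dle))
```

If a DLE comes back as "symlink", a tiny dump of just the link itself would be consistent with the 10 kB sizes in the report, and pointing the disklist entry at the real directory (for example /tank/backups) may be worth trying. If `nested_mountpoints` lists anything, those are the candidates for an explicit include as described above.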
