> On Jan 8, 2016, at 11:56 AM, Focus1 IT <[email protected]> wrote:
>
> Thanks for the replies, appreciate it.
>
> Re: tar and filesystem boundaries: our disklist worked fine up until the failure, so I don't think that is our issue, but I tried adding an include directive to our config and ran amdump; the estimate for that DLE was still wrong.
>
> Re: DLEs: I've posted a link to a paste of our disklist; we have DLEs for each specific mount.
>
> Re: tar version: I'm going to look into this in a bit, because the one issue that has bothered me more than the rest is the failure of all of the FreeNAS mounts. There were no changes to that host prior to the failure; we just suddenly lost the ability to back up 60+ directories overnight.
>
> JS
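[Editor's note: for anyone comparing notes, the include workaround Jon describes below is normally expressed in the disklist with an inline dumptype. A minimal sketch using Jon's /usr example — the hostname and the "comp-user-tar" base dumptype are placeholders, the include path is relative to the DLE's top directory, and includes are only honored by GNUTAR-based dumptypes, not program "DUMP":]

```
# Hypothetical DLE: client.example.com is a placeholder hostname.
# /usr is the DLE; the separately mounted /usr/local is pulled into
# the same dump via an include, which tar sees as "./local".
client.example.com /usr {
    comp-user-tar            # any GNUTAR-based dumptype from amanda.conf
    include "./local"
}
```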
The fact that it ALSO includes a DLE on your server (failing) says a lot, too. I'm just not sure *what*.

Deb

> On Jan 8, 2016 12:37 PM, "Debra S Baddorf" <[email protected]> wrote:
>
> > On Jan 8, 2016, at 12:26 AM, Jon LaBadie <[email protected]> wrote:
> >
> > On Thu, Jan 07, 2016 at 04:00:23PM -0500, Focus1 IT wrote:
> >> Hi,
> >>
> >> This is my first post to the amanda-users list; I'm hoping the community can help me resolve an issue that has rendered our backup set 90% ineffective. In short, I've got about 65 DLEs that are not being backed up, 60 of which reside on a remote host and 1 that is on the same machine as amandabackup. Amdump runs nightly and completes backups of other DLEs properly; amreport indicates the failing DLEs are 1 kB to 10 kB in size regardless of the backup level assigned to them each run.
> >>
> >> An example of an amreport that includes this failure:
> >
> > IIRC, as amanda uses tar, it will not cross a mount point. And I'm not sure about following symbolic links, but I don't think it will back up the target of a symlink either.
> >
> > Might either of these be the problem?
> >
> > I worked around the mount point problem in the past by specifically including the mount point. For example, if /usr were the filesystem being backed up in the DLE and /usr/local were a separate filesystem that I wanted in the same backup, that DLE had something like an "include ./local" directive.
> >
> > jl
>
> As I read his note, I think he's already got a separate DLE for each mounted volume. That ought to work. I have had some troubles using "dump" on ZFS disks (i.e. it doesn't work at all) and find that I have to back those up with tar instead. JS, did you say you ARE using tar for these DLEs?
>
> Although it might be interesting for him to check his backups (*IF* they're present) and see if a new version of tar was auto-installed at the point when things started to fail.
> Or check an even older backup, see what version of tar was present then, and just compare it to the current live version.
>
> Deb Baddorf
>
> >>
> >> DUMP SUMMARY:
> >>                                              DUMPER STATS          TAPER STATS
> >> HOSTNAME   DISK              L ORIG-kB OUT-kB COMP% MMM:SS   KB/s MMM:SS  KB/s
> >> ---------- ----------------- - ------- ------ ----- ------ ------ ------ -----
> >> 10.6.1.209 /mnt/IT           0      10     10    --   0:00  339.2   0:02   0.0
> >> 10.6.1.209 /mnt/aeiland      0      10     10    --   0:00  307.3   0:02   0.0
> >> 10.6.1.209 /mnt/applications 0      10     10    --   0:00  544.7   0:02   0.0
> >> 10.6.1.209 /mnt/archive-epp  0      10     10    --   0:00  687.4   0:02   0.0
> >> 10.6.1.209 /mnt/aswoboda     0      10     10    --   0:00  642.8   0:02   0.0
> >> 10.6.1.209 /mnt/bmcfarlane   0      10     10    --   0:00  632.2   0:02   0.0
> >> 10.6.1.209 /mnt/bmenzie      0      10     10    --   0:00  669.3   0:02   0.0
> >> 10.6.1.209 /mnt/brodriguez   0      10     10    --   0:00  107.2   0:02   0.0
> >> 10.6.1.209 /mnt/cczinski     0      10     10    --   0:00  641.0   0:02   0.0
> >>
> >> Some history:
> >>
> >> We've used this backup set for ~2 years, adjusting DLEs as necessary for organizational changes, and it has functioned as expected. In early November the machine that hosts amanda-server experienced a RAID5 array failure that necessitated rebuilding the array from scratch. The initial array used an EXT filesystem; the new array is formatted ZFS. After recovering from this hardware failure, amdump seemed to operate normally for a couple of weeks... and then things started to fall apart.
> >>
> >> Initially I noticed a notification in the nightly amreport that our holding disk was missing, as we had failed to recreate the proper directory structure after rebuilding the failed array. After creating the missing directory, that error was no longer present in the logs. Shortly thereafter the bulk of our DLEs started being dumped incorrectly, and I've been unable to determine why.
> >> I'm hoping the list will be able to provide me with some insight as to what triggered the error and assist me in restoring our backup capabilities.
> >>
> >> Here are our config files:
> >>
> >> *AMANDA.CONF:* http://kickasspastes.com/4946/
> >>
> >> *DISKLIST:* http://kickasspastes.com/4947/
> >>
> >> All of the DLEs on host 10.6.1.209 (a FreeNAS box) are failing. One of four DLEs on localhost is failing (/srv/backups).
> >>
> >> Here is a view of file attributes for /srv on localhost:
> >>
> >> ls -la /srv/
> >> total 12
> >> drwxr-xr-x  3 root root 4096 Nov  6 08:20 .
> >> drwxr-xr-x 30 root root 4096 Dec 19 06:41 ..
> >> lrwxrwxrwx  1 root root   13 Nov  6 08:14 amanda -> /tank/amanda/
> >> lrwxrwxrwx  1 root root   14 Nov  6 08:14 backups -> /tank/backups/
> >> drwxr-xr-x  3 root root 4096 Nov  6 08:20 mnt
> >> lrwxrwxrwx  1 root root   14 Nov  6 08:17 reports -> /tank/reports/
> >> lrwxrwxrwx  1 root root   13 Nov  6 08:19 server -> /tank/server/
> >>
> >> and the underlying ZFS array /srv mountpoints are symlinked to:
> >>
> >> ls -la /tank/
> >> total 30
> >> drwxr-xr-x  6 root         root            6 Nov  6 08:18 .
> >> drwxr-xr-x 30 root         root         4096 Dec 19 06:41 ..
> >> drwxr-xr-x  4 amandabackup amandabackup    4 Dec 16 11:49 amanda
> >> drwxr-xr-x  8 merkin       merkin          8 Nov  6 08:56 backups
> >> drwxr-xr-x  6 amandabackup amandabackup    6 Nov  6 08:17 reports
> >> drwxr-xr-x  6 amandabackup amandabackup    9 Jan  7 01:45 server
> >>
> >> Here are some examples of logfiles from days when the failures occur:
> >>
> >> *AMREPORT:* http://kickasspastes.com/4951/
> >>
> >> *PLANNER.DEBUG:* http://kickasspastes.com/4948/
> >>
> >> *AMANDA PLANNER CONSOLE OUTPUT PIPED TO A TEXT FILE:* http://kickasspastes.com/4949/
> >>
> >> *AMANDA PLANNER CONSOLE OUTPUT SAMPLE TWO (INCLUDES "NO TRY" LINES):* http://kickasspastes.com/4950/
> >>
> >> A couple of notes: you'll notice I've set "etimeout" to a very high value in amanda.conf.
> >> I had to raise the value for this setting because estimates were timing out for localhost:/srv/amanda/state and amdump was failing. I've read about amanda, tar, and ZFS filesystems, and I am under the impression that they require special dumptype declarations, and that if the wrong version of tar is used, timeouts can occur *(we are currently using "encrypted-gnutar-local" for /srv mountpoints)*. What confounds me is that the system ran fine for almost a month and then started timing out, so I'm not entirely sure where the blame lies. Is this a ZFS issue? What is causing planner to calculate incorrect estimates for mountpoints on the other host? Nothing changed on that machine at all.
> >>
> >> Any help resolving this issue would be greatly appreciated. I need to get these backups moving again ASAP.
> >>
> >> Thanks in advance,
> >>
> >> JS
> >>
> >> --
> >>
> >> CONFIDENTIALITY NOTICE: This email is covered by the Electronic Communications Privacy Act, 18 U.S.C. 2510-2521 and is legally privileged. This communication may also contain material protected and governed by the Health Insurance Portability and Accountability Act of 1996 (HIPAA). This e-mail is only for the personal and confidential use of the individuals to which it is addressed and contains confidential information. If you are not the intended recipient, you are notified that you have received this document in error, and that any reading, distributing, copying or disclosure is unauthorized.
> >>
> >> If you are not the intended recipient, please notify Hatteras Printing Inc. by calling (313) 624-3300 and destroy the message immediately. Additionally, please do not print this email unless it is absolutely necessary.
>
> >>>> End of included message <<<
>
> > --
> > Jon H. LaBadie                 [email protected]
> > 11226 South Shore Rd.
> > (703) 787-0688 (H)
> > Reston, VA 20190               (703) 935-6720 (C)
>
> --
> You received this message because you are subscribed to the Google Groups "Focus1 IT" group.
> To view this discussion on the web visit https://groups.google.com/a/focus1data.com/d/msgid/Focus1IT/126BCFEB-AE2F-4648-BFBD-90E3C90083BB%40fnal.gov.
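[Editor's note: as a concrete starting point for Deb's tar-version comparison, something along these lines run on the affected client would show the live tar version and whether tar was auto-updated around the time the failures began. The yum log path is an assumption for a Red Hat-style host; Debian-based systems would check /var/log/dpkg.log* instead.]

```shell
# Show the tar version currently in use on the client.
tar --version | head -n 1

# Look for tar installs/updates in the package manager history.
# Log path is an assumption for a yum-based system; adjust as needed.
grep -hi 'tar' /var/log/yum.log* 2>/dev/null || true
```

Comparing that version string against the tar binary restored from a known-good backup (per Deb's suggestion) would confirm or rule out an automatic upgrade as the trigger.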

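[Editor's note: on the etimeout question, rather than one very large global timeout, the usual approach for ZFS-backed DLEs is a GNUTAR dumptype with a cheaper estimate method. A sketch only — the option names are standard amanda.conf dumptype settings, "encrypted-gnutar-local" is the dumptype JS mentions, and the values are illustrative, not tuned for this site:]

```
define dumptype zfs-gnutar {
    encrypted-gnutar-local   # inherit the existing dumptype's settings
    program "GNUTAR"         # ZFS has no workable dump(8); use tar
    estimate calcsize        # cheaper estimates than a full dry-run tar
}

etimeout 1800                # per-DLE estimate timeout, in seconds
```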