backup errors
Hi,

OS: Solaris 8
Hardware: Sun Fire V120

The Amanda backup is failing regularly. The following report was generated:

These dumps were to tape ARCH_DE2_003.
*** A TAPE ERROR OCCURRED: [[writing file: Bad file number]].
Some dumps may have been left in the holding disk.
Run amflush to flush them to tape.
The next tape Amanda expects to use is: a new tape.

FAILURE AND STRANGE DUMP SUMMARY:
  jafar2 /home/host lev 0 FAILED [out of tape]
  jafar2 /home/host lev 0 FAILED [data write: Broken pipe]
  jafar2 /home/host lev 0 FAILED [dump to tape failed]
  serv5 /raid/sql1_mysql_var lev 0 FAILED [out of tape]
  serv5 /raid/sql1_mysql_var lev 0 FAILED [data write: Broken pipe]
  serv5 /raid/sql1_mysql_var lev 0 FAILED [dump to tape failed]
  serv5 /raid/mysql_backup lev 0 FAILED [out of tape]
  serv5 /raid/mysql_backup lev 0 FAILED [data write: Broken pipe]
  serv5 /raid/mysql_backup lev 0 FAILED [dump to tape failed]
  serv3 /home/host lev 0 FAILED [out of tape]
  serv3 /home/host lev 0 FAILED [data write: Broken pipe]
  serv3 /home/host lev 0 FAILED [dump to tape failed]
  serv5 /raid/clring_rpts/Fortisdown lev 0 FAILED [out of tape]
  serv5 /raid/clring_rpts/Fortisdown lev 0 FAILED [data write: Broken pipe]
  serv5 /raid/clring_rpts/Fortisdown lev 0 FAILED [dump to tape failed]

STATISTICS:
                            Total
                            ---
Estimate Time (hrs:min)     0:07
Run Time (hrs:min)          0:57
Dump Time (hrs:min)         0:39
Output Size (GB)            4.71
Original Size (GB)          4.71
Filesystems Dumped          6
Avg Dump Rate (MB/s)        2.05
Tape Time (hrs:min)         0:32
Tape Size (GB)              4.71
Tape Used (%)               9.65
Filesystems Taped           6
Avg Tp Write Rate (MB/s)    2.52

FAILED AND STRANGE DUMP DETAILS:

/-- jafar2 /home/host lev 0 FAILED [data write: Broken pipe]
sendbackup: start [jafar2:/home/host level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/sbin/ufsrestore -f... -
sendbackup: info end
| DUMP: Writing 32 Kilobyte records
| DUMP: Date of this level 0 dump: Fri Jun 18 00:56:44 2004
| DUMP: Date of last level 0 dump: the epoch
| DUMP: Dumping /dev/rdsk/c0t0d0s7 (jafar2:/home/host) to standard output.
| DUMP: Mapping (Pass I) [regular files]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Estimated 11526716 blocks (5628.28MB) on 0.08 tapes.
| DUMP: Dumping (Pass III) [directories]
| DUMP: Dumping (Pass IV) [regular files]
\

/-- serv5 /raid/sql1_mysql_var lev 0 FAILED [data write: Broken pipe]
sendbackup: start [serv5:/raid/sql1_mysql_var level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/sbin/ufsrestore -f... -
sendbackup: info end
| DUMP: Writing 32 Kilobyte records
| DUMP: Date of this level 0 dump: Fri Jun 18 01:06:06 2004
| DUMP: Date of last level 0 dump: the epoch
| DUMP: Dumping /dev/rdsk/c2t6d2s0 (serv5:/raid/sql1_mysql_var) to standard output.
| DUMP: Mapping (Pass I) [regular files]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Estimated 9108784 blocks (4447.65MB) on 0.07 tapes.
\

/-- serv5 /raid/mysql_backup lev 0 FAILED [data write: Broken pipe]
sendbackup: start [serv5:/raid/mysql_backup level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/sbin/ufsrestore -f... -
sendbackup: info end
| DUMP: Writing 32 Kilobyte records
| DUMP: Date of this level 0 dump: Fri Jun 18 01:06:43 2004
| DUMP: Date of last level 0 dump: the epoch
| DUMP: Dumping /dev/rdsk/c2t6d1s5 (serv5:/raid/mysql_backup) to standard output.
| DUMP: Mapping (Pass I) [regular files]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Estimated 17408044 blocks (8500.02MB) on 0.13 tapes.
| DUMP: Dumping (Pass III) [directories]
\

/-- serv3 /home/host lev 0 FAILED [data write: Broken pipe]
sendbackup: start [serv3:/home/host level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/sbin/ufsrestore -f... -
sendbackup: info end
| DUMP: Writing 32 Kilobyte records
| DUMP: Date of this level 0 dump: Fri Jun 18 01:07:04 2004
| DUMP: Date of last level 0 dump: the epoch
| DUMP: Dumping /dev/rdsk/c0t0d0s7 (serv3a:/home/host) to standard output.
| DUMP: Mapping (Pass I) [regular files]
| DUMP: Mapping (Pass II) [directories]
| DUMP: Estimated 9341590 blocks (4561.32MB) on 0.07 tapes.
| DUMP: Dumping (Pass III) [directories]
\

/-- serv5 /raid/clring_rpts/Fortisdown lev 0 FAILED [data write: Broken pipe]
sendbackup: start [serv5:/raid/clring_rpts/Fortisdown level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/sbin/ufsrestore -f... -
sendbackup: info end
| DUMP: Writing 32 Kilobyte records
| DUMP: Date of this level 0 dump: Fri Jun 18 01:07:15 2004
| DUMP: Date of last level 0 dump: the epoch
| DUMP: Dumping /dev/rdsk/c2t6d2s1
Re: backup errors
Sal Sharief wrote:

> FAILURE AND STRANGE DUMP SUMMARY:
>   jafar2 /home/host lev 0 FAILED [out of tape]

It says "out of tape", but a tape error is more likely.

> ...
> Tape Size (GB)       4.71
> Tape Used (%)        9.65
> Filesystems Taped    6

Am I correct that your tapesize is configured to be about 45 Gbyte? (What kind of tape is that? I've never seen a tape with such a capacity.)

> ...
> taper: tape ARCH_DE2_003 kb 6878912 fm 7 writing file: I/O error

While writing the 7th backup image to tape, there was an I/O error around 6878912 Kbyte, or about 6.6 Gbyte. That is far from the expected capacity of 45 Gbyte, so it's probably a tape error.

Try a cleaning tape. Look for SCSI errors in /var/adm/messages.

(PS: What version of Amanda has only one column in the statistics section? It must be quite old.)

--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  [EMAIL PROTECTED]
***
* I think I've got the hang of it now: exit, ^D, ^C, ^\, ^Z, ^Q, F6,  *
* quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye,     *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt, abort, hangup,   *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e, kill -1 $$, shutdown,   *
* kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ...         *
* ... Are you sure? ... YES ... Phew ... I'm out                      *
***
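The capacity reasoning in the reply above can be checked with a few lines of arithmetic. This is only a sketch: it assumes that "Tape Used (%)" is the amount taped divided by the configured tape length, and that 1 Gbyte = 1024 * 1024 Kbyte. The function names are made up for illustration and are not part of Amanda.

```python
# Figures copied from the report and the taper line quoted above.
tape_size_gb = 4.71        # "Tape Size (GB)": amount taped in this run
tape_used_pct = 9.65       # "Tape Used (%)": assumed share of the configured length
error_offset_kb = 6878912  # "kb 6878912" from the taper line


def implied_tape_length_gb(written_gb: float, used_pct: float) -> float:
    """Configured tape length implied by the amount taped and the percent used."""
    return written_gb / (used_pct / 100.0)


def kb_to_gb(kb: int) -> float:
    """Convert Kbyte to Gbyte, assuming 1 Gbyte = 1024 * 1024 Kbyte."""
    return kb / (1024 * 1024)


configured = implied_tape_length_gb(tape_size_gb, tape_used_pct)
failed_at = kb_to_gb(error_offset_kb)

print(f"implied configured tape length: {configured:.1f} GB")
print(f"write error offset:             {failed_at:.2f} GB")
print(f"fraction of tape reached:       {failed_at / configured:.0%}")
```

Under those assumptions the implied tape length lands in the same range as the "about 45 Gbyte" estimate, and the write error sits well inside the first fifth of the tape, which is why a media or drive problem is a more plausible reading than a genuinely full tape.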
Re: backup errors
--- Paul Bijnens [EMAIL PROTECTED] wrote:

> Sal Sharief wrote:
> > FAILURE AND STRANGE DUMP SUMMARY:
> >   jafar2 /home/host lev 0 FAILED [out of tape]
>
> It says out of tape, but a tape error is more likely.
>
> > Tape Size (GB)       4.71
> > Tape Used (%)        9.65
> > Filesystems Taped    6
>
> Am I correct that your tapesize is configured to be about 45 Gbyte.
> (What kind of tape is that, I've never seen a tape with such a capacity.)
>
> > taper: tape ARCH_DE2_003 kb 6878912 fm 7 writing file: I/O error
>
> While writing the 7th backup image to tape, there was an IO error
> around 6878912 Kbyte or about 6.6 Gbyte. That seems to be far from
> the expected capacity of 45 Gbyte, so it's probably a tape error.
>
> Try a cleaning tape. Look for scsi errors in /var/adm/messages.

--snip from /var/adm/messages
Jun 17 20:38:57 serv5c scsi: [ID 107833 kern.notice] /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED],1/[EMAIL PROTECTED],0 (st11):
Jun 17 20:38:57 serv5c Fixed record length (1024 byte blocks) I/O
Jun 17 20:39:12 serv5c scsi: [ID 193665 kern.info] sgen26 at glm1: target a lun 0
Jun 17 20:39:12 serv5c genunix: [ID 936769 kern.info] sgen26 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED],1/[EMAIL PROTECTED],0
Jun 18 01:04:52 serv5c scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED],1/[EMAIL PROTECTED],0 (st11):
Jun 18 01:04:52 serv5c Error for Command: write Error Level: Fatal
Jun 18 01:04:52 serv5c scsi: [ID 107833 kern.notice] Requested Block: 1938880 Error Block: 1938880
Jun 18 01:04:52 serv5c scsi: [ID 107833 kern.notice] Vendor: EXABYTE Serial Number: 8E002106
Jun 18 01:04:52 serv5c scsi: [ID 107833 kern.notice] Sense Key: Media Error
Jun 18 01:04:52 serv5c scsi: [ID 107833 kern.notice] ASC: 0xc (write error), ASCQ: 0x0, FRU: 0x0
-- end snip --

I have a 15-slot Exabyte 215M tape library with 225m AME with SmartClean cartridges. I have tried to run backups on more than 3 tapes and got the same error each time. These tapes are approximately four months old.

I went ahead and ordered a cleaning cartridge. Exabyte claimed that I wouldn't need a cleaning tape since the media comes with SmartClean built in. Now they have backed out of that claim after creating an incident report. Go figure.

> (PS. what version of amanda has only one column in the statistics
> section? Must be quite old.)

It's Amanda version 2.4.2p2, modified to show data in MB and minus other stats.
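For anyone following the "look for scsi errors in /var/adm/messages" advice on a busy host, a small filter can pull the interesting fields out of the st driver messages. The following is a rough sketch, not a Solaris or Amanda tool: the field names, the regular expressions, and the `summarize_scsi_error` helper are all assumptions based only on the four lines excerpted above, so other drivers or OS releases may word things differently.

```python
import re

# Sample lines copied from the /var/adm/messages excerpt above.
SAMPLE = """\
Jun 18 01:04:52 serv5c Error for Command: write Error Level: Fatal
Jun 18 01:04:52 serv5c scsi: [ID 107833 kern.notice] Requested Block: 1938880 Error Block: 1938880
Jun 18 01:04:52 serv5c scsi: [ID 107833 kern.notice] Sense Key: Media Error
Jun 18 01:04:52 serv5c scsi: [ID 107833 kern.notice] ASC: 0xc (write error), ASCQ: 0x0, FRU: 0x0
"""

# Hypothetical field extractors, written against the wording seen in SAMPLE.
FIELD_PATTERNS = {
    "command": re.compile(r"Error for Command:\s*(\S+)"),
    "level": re.compile(r"Error Level:\s*(\S+)"),
    "error_block": re.compile(r"Error Block:\s*(\d+)"),
    "sense_key": re.compile(r"Sense Key:\s*(.+?)\s*$"),
    "asc": re.compile(r"ASC:\s*(0x[0-9a-fA-F]+)"),
}


def summarize_scsi_error(lines):
    """Collect the first value seen for each known field across the given lines."""
    summary = {}
    for line in lines:
        for name, pattern in FIELD_PATTERNS.items():
            match = pattern.search(line)
            if match and name not in summary:
                summary[name] = match.group(1)
    return summary


summary = summarize_scsi_error(SAMPLE.splitlines())
for name, value in summary.items():
    print(f"{name}: {value}")
```

To run it against a real log, the lines could be read from a file with `open(path).read().splitlines()` instead of `SAMPLE`. A fatal write with a "Media Error" sense key points at the cartridge or the drive heads rather than at the Amanda configuration.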
Re: new samba backup errors
On Saturday 27 July 2002 01:52, Michael Perry wrote:

> I decided to focus on getting smbclient backups working with amanda so
> read the docs/samba stuff and added relevant entries to my disklist
> file and also ran amcheck a few times. I am running amanda on a Debian
> Unstable system with a very recent CVS version of Samba that is in the
> unstable tree of debian. The tape device is a ecrix vxa-1 which has
> worked flawlessly backing up bsd and linux systems.
>
> The evening backup just concluded and I got the backup report from
> amanda. The message is filled with errors like this:
>
> ? ERROR: string overflow by 10 in safe_strcpy [\Documents and
> +Settings\mperry\Local Settings\Temp]
> ? ERROR: string overflow by 18 in safe_strcpy [\Documents and
> +Settings\mperry\Local Settings\Temp]
> ? ERROR: string overflow by 16 in safe_strcpy [\Documents and
> +Settings\mperry\Local Settings\Temp]
> ? ERROR: string overflow by 20 in safe_strcpy [\Documents and
> +Settings\mperry\Local Settings\Temp]
>
> and several NT sharing violations.

I wonder if this might be related to the eol conventions differences between the machines, some using a lf, some a cr, and some a combo of both. I haven't fooled with it myself as I don't have any winderz boxes here, but ISTR there is a line in smb.conf that tells samba to translate that stuff. man smb.conf comes to mind...

OTOH, I could be full of it... In which case, ignore my mutterings. :-)

> Being a newbie at the samba and smbclient backups, I am wondering what
> these repeated errors mean. Most of them occur in the Documents and
> Settings folder. The system being backed up is a Windows 2000 pro box.
>
> Thanks for any enlightenment. BTW, I set up /etc/amandapass per the
> readme documentation to backup the C$ share and I am an administrator
> on that system with an admin login. Amcheck runs with no problems at
> all.

--
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
99.09% setiathome rank, not too shabby for a WV hillbilly
new samba backup errors
I decided to focus on getting smbclient backups working with Amanda, so I read the docs/samba material, added the relevant entries to my disklist file, and ran amcheck a few times. I am running Amanda on a Debian Unstable system with a very recent CVS version of Samba from the Debian unstable tree. The tape device is an Ecrix VXA-1, which has worked flawlessly backing up BSD and Linux systems.

The evening backup just concluded and I got the backup report from Amanda. The message is filled with errors like this:

? ERROR: string overflow by 10 in safe_strcpy [\Documents and
+Settings\mperry\Local Settings\Temp]
? ERROR: string overflow by 18 in safe_strcpy [\Documents and
+Settings\mperry\Local Settings\Temp]
? ERROR: string overflow by 16 in safe_strcpy [\Documents and
+Settings\mperry\Local Settings\Temp]
? ERROR: string overflow by 20 in safe_strcpy [\Documents and
+Settings\mperry\Local Settings\Temp]

and several NT sharing violations.

Being a newbie at Samba and smbclient backups, I am wondering what these repeated errors mean. Most of them occur in the Documents and Settings folder. The system being backed up is a Windows 2000 Pro box.

Thanks for any enlightenment. BTW, I set up /etc/amandapass per the README documentation to back up the C$ share, and I am an administrator on that system with an admin login. Amcheck runs with no problems at all.

--
Michael Perry
[EMAIL PROTECTED]
Re: backup errors...
Hi, thanks for all your help. I broke down and recompiled/reinstalled the amanda client and that seems to have cleared up the problem, amcheck no longer reports any errors. I'll find out if the backups run tonight. Thanks Again! -Josh
backup errors...
Hi there.

I'm having a problem getting Amanda to back up one of our servers. When running amcheck I get the following error:

ERROR: nushtel: [access as operator not allowed from operator@backupservername]

I found the faq-o-matic entry for this issue and checked all of my settings against the advice found there. The .amandahosts file is in operator's home directory, has the correct permissions, and appears to be in the correct format (it is in fact a copy of a .amandahosts file from another server being backed up by the same backup server on our network).

Looking at /tmp/amanda/amandad.debug on the server I'm trying to back up, I see some things that concern me. First, the BUILT_MACH line lists the wrong hostname: when I built Amanda this server was named new-server.ourdomain.com, but now that the box has gone into production its name has changed to the name of the machine it replaced. The other bit that bugs me is that it lists the DEFAULT_SERVER and DEFAULT_TAPE_SERVER as itself, using the old hostname.

It does seem to pass the self-checks though:

Amanda 2.4 REQ HANDLE 002-385A0508 SEQ 1000143707
SECURITY USER operator
SERVICE selfcheck
OPTIONS ;
DUMP wd0g 0 OPTIONS |;bsd-auth;index;
DUMP wd0h 0 OPTIONS |;bsd-auth;index;
DUMP wd0a 0 OPTIONS |;bsd-auth;index;

Oh yes, the backup server is a Red Hat box, and the machine I'm backing up is running BSDi 4.2.

Any clues as to what I can do about this?

Thanks
Josh
Re: backup errors...
> Looking at /tmp/amanda/amandad.debug on the server I'm trying to backup
> I see somethings that concern me, namely: the BUILT_MACH line lists the
> wrong hostname ...

That's just documentation about where Amanda was built. It's not used for anything.

> ... The other bit that bugs me is that it lists the DEFAULT_SERVER and
> DEFAULT_TAPE_SERVER as itself, using the old hostname.

Those values are only used by amrecover in case you don't set them with the command line switches.

> it does seem to pass the self-checks though:
>
> Amanda 2.4 REQ HANDLE 002-385A0508 SEQ 1000143707
> SECURITY USER operator
> SERVICE selfcheck
> OPTIONS ;
> DUMP wd0g 0 OPTIONS |;bsd-auth;index;
> DUMP wd0h 0 OPTIONS |;bsd-auth;index;
> DUMP wd0a 0 OPTIONS |;bsd-auth;index;

That's just the incoming packet. It doesn't say anything about whether the checks worked or not. And if you're getting this:

ERROR: nushtel: [access as operator not allowed from operator@backupservername]

then selfcheck is not even being run by amandad.

What else is in /tmp/amanda/amandad*debug?

What version of Amanda are you using? Newer versions will tell you about (e.g.) permissions problems and the like.

What's in ~operator/.amandahosts? It should be:

backupservername operator

and the names should be **exactly** as the error message reports them (i.e. if it gives a fully qualified name, the file should have a fully qualified name).

> Josh

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
Re: backup errors...
On Mon, 10 Sep 2001, John R. Jackson wrote:

> What's in ~operator/.amandahosts? It should be:
>
> backupservername operator

Here's what's in .amandahosts:

console.corecom.net operator

console is the name of the backup server. Here is the exact wording of the amcheck error:

ERROR: nushtel: [access as operator not allowed from [EMAIL PROTECTED]]

nushtel is the name of the server I'm trying to back up.

> What else is in /tmp/amanda/amandad*debug?
>
> What version of Amanda are you using? Newer versions will tell you
> about (e.g.) permissions problems and the like.

We are using version 2.4.1p1. Here are the full contents of the amandad.debug file:

amandad: debug 1 pid 28573 ruid 5 euid 5 start time Mon Sep 10 11:25:14 2001
amandad: version 2.4.1p1
amandad: build: VERSION=Amanda-2.4.1p1
amandad:        BUILT_DATE=Tue Jul 24 04:28:27 AKDT 2001
amandad:        BUILT_MACH=BSD/OS new-nushtel.corecom.net 4.2 BSDI BSD/OS 4.2 Kernel #0: Wed Oct 25 17:38:20 MDT 2000 [EMAIL PROTECTED]:/mnt/proto/4.2-i386/usr/src/sys/compile/GENERIC i386
amandad:        CC=gcc
amandad: paths: bindir=/usr/local/bin sbindir=/usr/local/sbin
amandad:        libexecdir=/usr/local/libexec mandir=/usr/local/man
amandad:        CONFIG_DIR=/usr/local/etc/amanda DEV_PREFIX=/dev/
amandad:        RDEV_PREFIX=/dev/r DUMP=/sbin/dump
amandad:        RESTORE=/sbin/restore
amandad:        COMPRESS_PATH=/usr/contrib/bin/gzip
amandad:        UNCOMPRESS_PATH=/usr/contrib/bin/gzip
amandad:        MAILER=/usr/bin/Mail
amandad: defs:  DEFAULT_SERVER=new-nushtel.corecom.net
amandad:        DEFAULT_CONFIG=DailySet1
amandad:        DEFAULT_TAPE_SERVER=new-nushtel.corecom.net
amandad:        DEFAULT_TAPE_DEVICE=/dev/nrst0 HAVE_MMAP HAVE_SYSVSHM
amandad:        LOCKING=POSIX_FCNTL DEBUG_CODE BSD_SECURITY
amandad:        CLIENT_LOGIN=operator FORCE_USERID HAVE_GZIP
amandad:        COMPRESS_SUFFIX=.gz COMPRESS_FAST_OPT=--fast
amandad:        COMPRESS_BEST_OPT=--best UNCOMPRESS_OPT=-dc
got packet:
Amanda 2.4 REQ HANDLE 002-385A0508 SEQ 1000150230
SECURITY USER operator
SERVICE selfcheck
OPTIONS ;
DUMP wd0g 0 OPTIONS |;bsd-auth;index;
DUMP wd0h 0 OPTIONS |;bsd-auth;index;
DUMP wd0a 0 OPTIONS |;bsd-auth;index;
sending ack:
Amanda 2.4 ACK HANDLE 002-385A0508 SEQ 1000150230
bsd security: remote host console.corecom.net user operator local user operator
check failed: [access as operator not allowed from [EMAIL PROTECTED]]
amandad: sending REP packet:
Amanda 2.4 REP HANDLE 002-385A0508 SEQ 1000150230
ERROR [access as operator not allowed from [EMAIL PROTECTED]]
amandad: got ack:
Amanda 2.4 ACK HANDLE 002-385A0508 SEQ 1000150230
amandad: pid 28573 finish time Mon Sep 10 11:25:14 2001
Re: backup errors...
> Here's what's in .amandahosts: ...

OK, that looks right.

> We are using version 2.4.1p1. ...

OK, that version does not report anything else useful.

What does the following say:

  ls -ld ~operator/.amandahosts
  ls -ld ~operator
  ls -ld ~operator/..
  ls -ld ~operator/../..

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
Re: backup errors...
OK, then here are a couple of other things to try.

Is the .amandahosts file being accessed when you do amcheck? For instance, does "ls -lu" show a time change? Note that ls only shows times to the minute, so if you have some other utility that shows more accurate atime output, that would be useful.

If it is being accessed, you need to look at the contents again. Make sure you don't have any extra whitespace at the beginning or end of the line (I seem to recall fixing a bug w.r.t. that a while back). And make sure there aren't unprintable characters hiding. It should handle either blanks or tabs between the host name and user name, but you might try switching between what you have and the other.

If that does not help, go to the common-src directory of your Amanda build and run "make security". Take that program (security) back over to the client and run it **as operator**. It will ask you for the remote host and remote user. Enter those exactly as the amcheck error message reports them and see what the program has to say. It basically does a verbose check just like amandad (using the same code).

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
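The content checks described above (stray whitespace, hidden unprintable characters, exact host and user match) are easy to get wrong by eye. Here is a minimal sketch of them in Python. It is emphatically not Amanda's parser and does not replace the "security" test program from common-src; `inspect_amandahosts` is a made-up helper, and the exact, case-sensitive comparison is an assumption drawn from the advice that the names should match the error message exactly.

```python
def inspect_amandahosts(text, remote_host, remote_user):
    """Return (matched, warnings) for the body of an .amandahosts-style file.

    Illustrative only: this mimics the manual checks suggested in the
    thread, not the logic Amanda itself uses.
    """
    matched = False
    warnings = []
    # Split on "\n" only, so a stray carriage return stays visible in the line.
    for number, raw in enumerate(text.split("\n"), start=1):
        if not raw:
            continue
        if raw != raw.strip():
            warnings.append(f"line {number}: leading or trailing whitespace")
        if any(not ch.isprintable() and ch not in " \t" for ch in raw):
            warnings.append(f"line {number}: unprintable character")
        # Blanks or tabs may separate the host name from the user name.
        if raw.split() == [remote_host, remote_user]:
            matched = True
    return matched, warnings


sample = "console.corecom.net operator\n"
print(inspect_amandahosts(sample, "console.corecom.net", "operator"))
print(inspect_amandahosts(sample, "console", "operator"))
```

On the client, the real file could be read with `open(path, newline="").read()` so that carriage returns are not translated away before the check. A file copied from a DOS-formatted source, for example, would show up here as both a whitespace warning and an unprintable-character warning on the affected line.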
Re: backup errors...
On Mon, 10 Sep 2001, John R. Jackson wrote:

> What does the following say:
>
>   ls -ld ~operator/.amandahosts

-r 1 operator operator 28 Sep 10 14:14 /usr/local/amanda/.amandahosts

>   ls -ld ~operator

drwxrwxr-x 2 operator wheel 512 Sep 10 14:14 /usr/local/amanda

>   ls -ld ~operator/..

drwxr-xr-x 13 root wheel 512 Sep 7 09:11 /usr/local/amanda/..

>   ls -ld ~operator/../..

drwxr-xr-x 23 root wheel 512 Aug 22 16:03 /usr/local/amanda/../..

Yes, operator's home dir is set as /usr/local/amanda; that's just kinda where it ended up getting standardized on here...

thanks!
-Josh