> > 1) Can anybody tell me what the "error redirecting stderr to fd 51: Bad file
> > number" means? I googled the message but found nothing.
>
> The program amandad opens data/mesg/index to file descriptor 50/51/52
> and then execs sendbackup. And sendbackup connects those
> filedescriptors to the needed streams. The message file descriptor
> needs to connected to the stderr of gnutar, but when tried, sendbackup
> notices that filedescriptor 51 is not valid (not open?).
> Really weird.
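
For anyone puzzling over the quoted message: "Bad file number" is, as far as I know, the Solaris text for errno EBADF, which is what dup2() returns when the source descriptor is not open. The sketch below is not Amanda code; it only reproduces that failure mode. The names try_redirect and demo are made up for illustration, and a spare descriptor stands in for stderr so the real fd 2 is left alone.

```python
import errno
import os


def try_redirect(src_fd, dst_fd):
    """Try to make dst_fd a copy of src_fd; return 'ok' or the errno name."""
    try:
        os.dup2(src_fd, dst_fd)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"


def demo():
    """Redirect once from an open descriptor, then once after closing it."""
    read_end, write_end = os.pipe()
    # Hypothetical stand-in for stderr, so the process's real fd 2 is untouched.
    spare = os.dup(write_end)

    good = try_redirect(write_end, spare)

    # Simulate a parent that never handed over (or already closed) the fd.
    os.close(write_end)
    bad = try_redirect(write_end, spare)

    os.close(read_end)
    os.close(spare)
    return good, bad


if __name__ == "__main__":
    print(demo())
```

The second call fails because the kernel checks that the source descriptor is open before duplicating it, which matches the "not valid (not open?)" reading above.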
I believe that we may be running into a similar problem using 2.5.0p1
on Solaris (server and clients). Unfortunately, in our case we do *NOT*
get any warnings about stderr redirection. Instead, some of our dump
files contain:

  1) a normal Amanda header
  2) an ASCII list of files that looks suspiciously like the output of
     the index command

rather than the expected header followed by a gnu tar file (most of the
dump files are okay; it's only the odd few that seem to have a
problem). When this happens, the index file for this volume is empty.
Obviously, this is a problem from a recovery point of view :-( !

My (unconfirmed) guess is that somehow the data/mesg/index file
descriptors are getting mixed up and that the index output is ending up
where the data should be. The code mentions "scheduling" various
operations on these file descriptors, so perhaps there is a race
condition somewhere?

!! WARNING !!

The only visible symptom of this issue on our platform is that amverify
reports "End-of-Information detected." (rather than "End-of-Tape
detected.") and exits before checking all of the files on the tape.
There are no

In my opinion, this is a serious issue since:

  1) it results in a corrupted backup for the volumes affected
  2) it is not accompanied by a clear error message, i.e. you could
     easily miss the problem if your list of volumes is long or you
     didn't happen to notice the difference between end of tape and end
     of information.

I've included below a sample output from amverify for an affected
backup:

--------------------------------------------------
Subject: m1 AMANDA VERIFY REPORT FOR MBK1_02

Tapes: MBK1_02
No errors found!

amverify m1
Mon May 1 15:39:35 EDT 2006

Loading 2 slot...
Using device /dev/rmt/0n
Volume MBK1_02, Date 20060330
Checked megawatt._net_ghoncho_vol06.20060330.1
Checked megawatt._net_ghoss_vol06.20060330.1
Checked megawatt._.20060330.1
Checked megawatt._vol02.20060330.1
Checked megawatt._net_ghoncho_vol05.20060330.0
Checked megawatt._net_ghoss_vol01.20060330.0
Checked megawatt._vol01.20060330.0
End-of-Information detected.

(NOTE: ~20 filesystems after megawatt._vol01.20060330.0 aren't listed,
despite the fact that they are on the tape. The dump file for the
volume after megawatt._vol01.20060330.0 is corrupt and contains only a
header and a list of the files that should have been backed up.)
--------------------------------------------------

=================================================================
Sean Walmsley [EMAIL PROTECTED]
Nuclear Safety Solutions Ltd. 416-592-4608 (V) 416-592-5528 (F)
700 University Ave M/S H04 J19, Toronto, Ontario, M5G 1X6, CANADA
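
P.S. Since amverify gives so little warning, here is a rough sketch of how one might check a dump file that has been copied off tape to disk. It is not part of Amanda, and it rests on assumptions I have not verified against the source: that the Amanda header occupies the first 32 KiB of the file, and that the payload is either an uncompressed tar stream (with "ustar" at offset 257 of its first block) or gzip output. The names classify_payload and classify_dump are hypothetical. An index listing where the data should be would come back as "unknown".

```python
HEADER_SIZE = 32 * 1024   # ASSUMPTION: size of the Amanda header block
TAR_MAGIC_OFFSET = 257    # position of the "ustar" magic in a tar header block
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def classify_payload(payload):
    """Guess what the bytes following the Amanda header contain."""
    if payload[:2] == GZIP_MAGIC:
        return "gzip"
    if payload[TAR_MAGIC_OFFSET:TAR_MAGIC_OFFSET + 5] == b"ustar":
        return "tar"
    # Plain text such as an index listing ends up here.
    return "unknown"


def classify_dump(path):
    """Skip the assumed header of a dump file on disk and classify the rest."""
    with open(path, "rb") as dump:
        dump.seek(HEADER_SIZE)
        return classify_payload(dump.read(512))
```

Running classify_dump over each extracted dump file and flagging every "unknown" would at least catch the corrupt volumes without relying on the end-of-tape versus end-of-information wording.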
