I just thought I would let the list in on my adventures this week.

Tue 15:00: amcheck runs out of cron and checks everything.

Tue 15:50: I get an email from amanda reporting tape failure (multiple attempts all resulting in timeout). At first I didn't read it that way. But, after looking at amanda debug logs, dropping back to amtape, and then further back to just mt, and finally going to the tape library console, I realized that I had a tape stuck in the drive (which is sealed up back inside the library).

Tue 17:00: I call Sony tech support. They walked me through some steps to try to get the tape out, but each attempt ended in an Eject Error. Then they had me run a trace using the sonytape utility. It generated a 1.4MB txt file that I sent to them. By this time it's 5:40pm. I head home. Listen to my cell phone, keep an eye on my email, fix supper. Having faith in amanda, I sleep.

Wed 00:45: *amanda runs backups*. No tape access. Saves all backups to spool drives. Reports backup results.

Wed 08:30: I call Sony tech support. They got the nori-trace.txt file. They escalate to engineering.

Wed 08:50: Sony support engineer calls while I'm biking in to work. I ask him to call me back in 15 minutes.

Wed 09:15: He calls back. Explains a tensioning problem with the tape, as reported in the trace file. Walks me through a series of procedures to get the tape out. It works. Tape actually looks alright. I load a previous tape. It seems to work. He points me to a firmware upgrade, and I load it using the sonytape utility. I tell him I'll have to do some testing and ask if he could call me back in a couple of hours.

Wed morning: amrecover gives me an error. I'm not sure if it's hardware, something with the firmware upgrade that confused amanda, or something I've done in the configuration since the last time I used amrecover on this machine. Drop back to mt and dd. I could read the tape label and the first file header. Cool. The header actually lists the full set of unix commands required to read the file off the tape.
So, perhaps foolishly, but watching the clock, I decide to reload the original tape (which "looked" alright) and see if amflush will push the data out to it. It jammed again, and now even the extra steps the engineer gave me won't get it out. Big mistake.

Wed afternoon: I had to take off on another project. Sony engineer called. I gave him a status update.

Thu 00:45: *amanda runs backups*. No tape access. Saves all backups to spool drives. Reports backup results.

Thu morning: I repeatedly try a number of things that all fail.

Thu afternoon: I call Sony tech support again. Engineer calls me right back. We have a long session with him on speaker phone as he guides me through disassembling the library, pulling the drive out, and manually operating some gears on the circuit board to push the tape carriage out. Once the tape is pushed out enough, I can grab it and pull it out. There's about 6 inches of tape hanging out where it was stuck. Put everything back together. Load tape. Looks alright. Now I have to test everything.

Thu 16:00: Checked my amanda-client.conf on this machine. There was one difference, which I fixed. Run amrecover. Look for something whose last full was at least 3 days ago and thus on tape. Recover it. It works. Since I also have backups sitting on the spool drive, try an amrecover of something that's there. That works too. Cool.

Thu 17:15: Position to next tape. Run amcheck. Nope. tapecycle is 30, and I now only have 29 tapes. Barcode a new tape. Insert it in the now-empty slot in the library (the bad tape will be sent back for replacement and never reused). Run amlabel. Run amcheck. Everything is ready.

Fri 00:45: *amanda runs backups*. Flushes everything to tape. Reports backup results.

Fri 09:15: Checking all reports. Amanda is still expecting that bad tape next. Review procedures. Run amrmtape. Completely back in business.

-----------------------

Through that whole episode, extended by my foolishness in trying to reload a tape that had already given me trouble, amanda never even hiccuped. I didn't miss a single backup. AND I didn't have to do anything about it. I didn't have to go in and ask amanda to back up anyway, or to save stuff on spool, or to flush it when it finally had access to a tape. Amanda just did everything it needed to do to keep my backups running. All I had to do was get the tape library working and replace the bad tape.

Gotta love it.

-----------------------

Coincidentally, there was simultaneously a somewhat related episode going back and forth on the bacula list. It wasn't a tape failure, it was a DVD issue. The answers were basically: "That's not how bacula does it"; no, you won't have backups on hard disk or spool; bacula won't back up until the problem is resolved; if you're on vacation, you're out of luck.

I didn't think it was appropriate to comment on their list, and I imagine that someone could figure out a way of working around these issues with more complex configurations of bacula. But I thought it was worth mentioning here, since this is the amanda list, and we can appreciate what we have.

---------------

Chris Hoogendyk

-
   O__  ---- Systems Administrator
  c/ /'_ --- Biology & Geology Departments
 (*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst

<[EMAIL PROTECTED]>

---------------

Erdös 4
