-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Arno Lehmann wrote:
> Hello,
>
> On 11/15/2006 6:22 PM, Ryan Novosielski wrote:
> Last update to myself:
>
> Seems as if what happened is that the wrong tape was inserted during the
> wrong week. For reasons unknown (if anyone can tell me where to look,
> I'd be very grateful), the tape, when inserted the wrong week, appears
> to have been written to. The volume label appears to have been changed
> to match the tape that it wanted, and written despite the fact that it
> was not writable at that time (it may have been just about ready to be
> recycled, but still, it was in the wrong pool anyway).
>
> bls indicates that the WRONG tape contains my last week's backups. The
> new tape does not appear to contain anything. I see no place to get a
> media ID for any of this stuff, but I suspect both tapes have a mediaID
> that's the same at this point.
>
>> As far as I know, media IDs are only stored in the catalog. You might
>> find something if you compare older catalog dumps, checking for changes
>> regarding that volume, but I wouldn't want to do that :-)
>
> This is seriously messed up, and even if I -- or my staff -- did
> something to cause it, I really need to know what to be careful not to
> do again.
>
>> If Bacula accidentally overwrites a tape label I would consider that a bug.
I guess I should see about filing one; I just don't really have a lot of
information to provide at this point.

>> That said, I can imagine situations where such a thing can happen (but
>> never investigated it):
>> Imagine you have a tape inserted in a drive, rewound. The SD uses
>> "always open=yes" and the polling stuff.
>
>> The tape is mounted, and thus Bacula knows for sure which tape is in the
>> drive.
>
>> If you can change the tape without unmounting from Bacula and the drive
>> doesn't inform the OS of that operation, or Bacula doesn't query that
>> status from the OS, and you change the tape in between Bacula accesses,
>> what you describe might happen.
>
>> (Keep in mind that this is mostly fiction, not science - I don't know if
>> such a thing might happen with any tape drive, OS, or Bacula without
>> indicating a bug.)

I'd think so too. However, in this case it appears as if AlwaysOpen is
off for this drive. I've seen a case where this exact thing DID happen
to someone on this mailing list. Basically the resolution was "don't do
that." However, here, I do not use that directive for this drive.
Theoretically, there's no way for this to have happened.

> Please let me know what I should provide to the list or how I should
> troubleshoot this. I'm leaving everything as-is for now so that I have
> all the evidence.
>
>> What I'd do is to examine all my tapes to find the one that is missing.
>> If there is exactly one tape label that can't be found, you know at
>> least which volume got overwritten and can invalidate the jobs on it.
>
>> Also, given the fact that your operators work by a list, you can
>> probably determine when the wrong tape was inserted, perhaps even who
>> did it :-)

I've basically done this. The result is that combined_BW1, the tape
incorrectly inserted last week, is now catalyst_BW1. catalyst_BW1,
consequently, is empty, and combined_BW1 no longer exists.
However, I still believe that in this particular case, given the course
of events, this should NOT have happened.

>> Once this is sorted out, you should use that example as reason why your
>> operators should be educated to use Bacula for tape management and not
>> their calendars ;-)

Is there really an easy way for the staff to determine "next tape"
though, when the storage devices and pools are defined in the schedule?
status dir does not show them in these cases (showing instead *unknown*).

> Thanks for any assistance you can provide.
>
>> I have seen such a thing myself, once, but that was during a beta test
>> phase where I more or less tried to get such results, and it happened
>> before the new locking mechanisms were implemented IIRC.
>
>> Arno
>
>
> Ryan Novosielski wrote:
>
>>>> Here's a followup to myself. Apparently, I have two tapes called
>>>> "catalyst_BW1" and have no idea how I could have gotten into this
>>>> situation. One of them is a brandy new tape:
>>>>
>>>> Volume Label:
>>>> Id : Bacula 1.0 immortal
>>>> VerNo : 11
>>>> VolName : catalyst_BW1
>>>> PrevVolName :
>>>> VolFile : 0
>>>> LabelType : PRE_LABEL
>>>> LabelSize : 168
>>>> PoolName : catalyst_FULL
>>>> MediaType : DDS-4
>>>> PoolType : Backup
>>>> HostName : helios
>>>> Date label written: 31-Oct-2006 08:51
>>>>
>>>> ...one of them has been around for a while, supposedly was not touched,
>>>> is empty (this is probably OK), but for some reason does not have the
>>>> right volume name anymore:
>>>>
>>>> Volume Label:
>>>> Id : Bacula 1.0 immortal
>>>> VerNo : 11
>>>> VolName : catalyst_BW1
>>>> PrevVolName :
>>>> VolFile : 0
>>>> LabelType : VOL_LABEL
>>>> LabelSize : 168
>>>> PoolName : catalyst_FULL
>>>> MediaType : DDS-4
>>>> PoolType : Backup
>>>> HostName : helios
>>>> Date label written: 31-Oct-2006 08:51
>>>>
>>>> I can't tell which media IDs these two tapes think they have. Is there
>>>> any way with any of the commands that work directly on the tapes (i.e.
>>>> NOT the catalog) to check? It doesn't look like bls is interested in
>>>> telling me.
>>>>
>>>> =R
>>>>
>>>> Ryan Novosielski wrote:
>>>>
>>>>> OK, here's how I got into this mess:
>>>>> Operations staff has a calendar for which tape goes in when -- they
>>>>> wrote it out because they really don't have the knowhow to check for
>>>>> themselves. Well, they messed up because I don't have a full backup
>>>>> scheduled for the fifth Tuesday:
>>>>> Schedule {
>>>>>   Name = "UMD-F13T-Inc"
>>>>>   Run = Level=Full Storage=helios_DAT72 1st,3rd tue at 21:00
>>>>>   Run = Level=Incremental Storage=helios_DDS 1st,3rd mon,wed-fri at 23:00
>>>>>   Run = Level=Incremental Storage=helios_DDS 2nd,4th-5th mon-fri at 23:00
>>>>> }
>>>>> Schedule {
>>>>>   Name = "UMD-F24T-Inc"
>>>>>   Run = Level=Full Storage=helios_DAT72 2nd,4th tue at 21:00
>>>>>   Run = Level=Incremental Storage=helios_DDS 2nd,4th mon,wed-fri at 23:00
>>>>>   Run = Level=Incremental Storage=helios_DDS 1st,3rd,5th mon-fri at 23:00
>>>>> }
>>>>> ...however, their calendar had a fifth Tuesday and messed up the
>>>>> rotation. Today, as a result, the wrong tape went into the drive. Here
>>>>> is my storage config:
>>>>> Device {
>>>>>   Name = helios_DAT72 #
>>>>>   Media Type = DDS-4
>>>>>   Archive Device = /dev/rmt/1lbn
>>>>>   AutomaticMount = yes; # when device opened, read it
>>>>>   AlwaysOpen = no;
>>>>>   Volume Poll Interval = 30 minutes;
>>>>>   Close on Poll = yes;
>>>>>   RemovableMedia = yes;
>>>>>   RandomAccess = no;
>>>>>   Spool Directory = /usr/local/bacula/var/spool;
>>>>> }
>>>>> I meant to set AlwaysOpen to yes here, but apparently did not -- so now
>>>>> I'm even more confused. Anyway, what happened... it tried to run a
>>>>> backup, but it looked at the tape and saw that it was used and in the
>>>>> wrong pool, and rightly refused. We noticed the error and now have the
>>>>> proper tape in the drive.
>>>>> However:
>>>>> #umount
>>>>> Using default Catalog name=MyCatalog DB=bacula
>>>>> The defined Storage resources are:
>>>>> 1: File
>>>>> 2: helios_DDS
>>>>> 3: helios_DAT72
>>>>> Select Storage resource (1-3): Unexpected question has been received.
>>>>> 3
>>>>> 3901 Device "helios_DAT72" (/dev/rmt/1lbn) is already unmounted.
>>>>> #mount
>>>>> The defined Storage resources are:
>>>>> 1: File
>>>>> 2: helios_DDS
>>>>> 3: helios_DAT72
>>>>> Select Storage resource (1-3): Unexpected question has been received.
>>>>> 3
>>>>> 3001 Mounted Volume: catalyst_BW1
>>>>> 3001 Device "helios_DAT72" (/dev/rmt/1lbn) is already mounted with
>>>>> Volume "catalyst_BW1"
>>>>> #
>>>>> ...as you can see, my tape is both "already unmounted" and "already
>>>>> mounted", and claims that the tape that is in the drive is the tape that
>>>>> I've taken out of the drive and replaced with the right tape. It has
>>>>> requested the tape I'd expect it to, but will not be convinced that it's
>>>>> in the drive:
>>>>> 15-Nov 10:37 helios-sd: CFMX-dev.2006-11-14_21.00.00 Warning: Director
>>>>> wanted Volume "combined_BW1" for device "helios_DAT72" (/dev/rmt/1lbn).
>>>>> Current Volume "catalyst_BW1" not acceptable because:
>>>>> 1998 Volume "catalyst_BW1" status is Used, not in Pool.
>>>>> 15-Nov 10:37 helios-sd: Please mount Volume "combined_BW1" on Storage
>>>>> Device "helios_DAT72" (/dev/rmt/1lbn) for Job CFMX-dev.2006-11-14_21.00.00
>>>>> I've restarted bacula already and this hasn't helped at all. This is
>>>>> version 1.38.11. I've never actually had a problem before like this that
>>>>> I couldn't figure out -- seems to be that the worst case scenario has
>>>>> you restart and that's that. I'm totally stumped here. Can anyone point
>>>>> me in the right direction?
>>>>> Thanks,
>>>>> =R
>>>>
>>>> -------------------------------------------------------------------------
>>>> Take Surveys. Earn Cash.
>>>> Influence the Future of IT
>>>> Join SourceForge.net's Techsay panel and you'll get the chance to share your
>>>> opinions on IT & business topics through brief surveys - and earn cash
>>>> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
>>>> _______________________________________________
>>>> Bacula-users mailing list
>>>> Bacula-users@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/bacula-users
>>>>

- --
 ---- _ _ _ _ ___ _ _ _
 |Y#| | | |\/| | \ |\ | | |Ryan Novosielski - Systems Programmer III
 |$&| |__| | | |__/ | \| _| |[EMAIL PROTECTED] - 973/972.0922 (2-0922)
 \__/ Univ. of Med. and Dent.|IST/AST - NJMS Medical Science Bldg - C630
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFW4EUmb+gadEcsb4RAgxYAJ9UPofJF34wiMv5tq18/l+AgNEP6ACgysBo
pAQNJbO/JenzP4zM7VdowMU=
=8dS6
-----END PGP SIGNATURE-----
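
For what it's worth, the two schedules quoted in this thread only run a Full on the 1st through 4th Tuesdays, so any month with a fifth Tuesday is a week where a hand-written tape calendar can drift out of step with Bacula. Below is a minimal sketch in plain Python (not a Bacula tool; the helper name `fifth_weekdays` is made up for illustration) that lists those dates so they can be marked on the operators' calendar ahead of time. It only looks at the calendar and knows nothing about pools, volumes, or the catalog.

```python
import calendar
from datetime import date


def fifth_weekdays(year, weekday=calendar.TUESDAY):
    """Return the dates in `year` that are the fifth `weekday` of their month.

    Hypothetical helper for illustration only; weekday uses the
    calendar module convention (Monday=0 ... Sunday=6).
    """
    found = []
    for month in range(1, 13):
        days_in_month = calendar.monthrange(year, month)[1]
        # Collect every date in this month that falls on the requested weekday.
        hits = [
            date(year, month, day)
            for day in range(1, days_in_month + 1)
            if date(year, month, day).weekday() == weekday
        ]
        # A month has either four or five of any given weekday.
        if len(hits) == 5:
            found.append(hits[4])
    return found


if __name__ == "__main__":
    for d in fifth_weekdays(2006):
        print(d.isoformat())
```

For 2006 this includes 31 October, which matches the "Date label written: 31-Oct-2006" shown in the bls output quoted above.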