Re: [Bacula-users] Destructive Tape Label Crossing! (was: Problem mounting Volume)
Last update to myself:

It seems that the wrong tape was inserted during the wrong week. For reasons unknown (if anyone can tell me where to look, I'd be very grateful), the tape appears to have been written to when it was inserted in the wrong week. The volume label appears to have been changed to match the tape that Bacula wanted, and the tape was written even though it was not writable at that time (it may have been just about ready to be recycled, but it was in the wrong pool anyway).

bls indicates that the WRONG tape contains my last week's backups. The new tape does not appear to contain anything. I see no place to get a media ID for any of this, but I suspect both tapes now have the same media ID.

This is seriously messed up, and even if I -- or my staff -- did something to cause it, I really need to know what to be careful not to do again.

Please let me know what I should provide to the list or how I should troubleshoot this. I'm leaving everything as-is for now so that I have all the evidence.

Thanks for any assistance you can provide.

Ryan Novosielski wrote:

Here's a followup to myself. Apparently, I have two tapes called catalyst_BW1 and have no idea how I could have gotten into this situation.
One of them is a brand new tape:

    Volume Label:
    Id                : Bacula 1.0 immortal
    VerNo             : 11
    VolName           : catalyst_BW1
    PrevVolName       :
    VolFile           : 0
    LabelType         : PRE_LABEL
    LabelSize         : 168
    PoolName          : catalyst_FULL
    MediaType         : DDS-4
    PoolType          : Backup
    HostName          : helios
    Date label written: 31-Oct-2006 08:51

...the other has been around for a while, supposedly was not touched, and is empty (this is probably OK), but for some reason it no longer has the right volume name:

    Volume Label:
    Id                : Bacula 1.0 immortal
    VerNo             : 11
    VolName           : catalyst_BW1
    PrevVolName       :
    VolFile           : 0
    LabelType         : VOL_LABEL
    LabelSize         : 168
    PoolName          : catalyst_FULL
    MediaType         : DDS-4
    PoolType          : Backup
    HostName          : helios
    Date label written: 31-Oct-2006 08:51

I can't tell which media IDs these two tapes think they have. Is there any way to check with any of the commands that work directly on the tapes (i.e. NOT the catalog)? It doesn't look like bls is interested in telling me.

=R

Ryan Novosielski wrote:

OK, here's how I got into this mess:

Operations staff has a calendar for which tape goes in when -- they wrote it out because they really don't have the know-how to check for themselves. Well, they messed up because I don't have a full backup scheduled for the fifth Tuesday:

    Schedule {
      Name = UMD-F13T-Inc
      Run = Level=Full Storage=helios_DAT72 1st,3rd tue at 21:00
      Run = Level=Incremental Storage=helios_DDS 1st,3rd mon,wed-fri at 23:00
      Run = Level=Incremental Storage=helios_DDS 2nd,4th-5th mon-fri at 23:00
    }

    Schedule {
      Name = UMD-F24T-Inc
      Run = Level=Full Storage=helios_DAT72 2nd,4th tue at 21:00
      Run = Level=Incremental Storage=helios_DDS 2nd,4th mon,wed-fri at 23:00
      Run = Level=Incremental Storage=helios_DDS 1st,3rd,5th mon-fri at 23:00
    }

...however, their calendar had a fifth Tuesday, which threw off the rotation. Today, as a result, the wrong tape went into the drive.
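The gap in the rotation can be checked with a short script. This is only a sketch, not something from the thread: it assumes that "1st,3rd tue" means the first and third Tuesday of the calendar month, and the helper names (`tuesdays`, `full_schedules_for`, `tuesdays_without_full`) are made up for illustration. The schedule names and Tuesday ordinals are copied from the two Schedule resources above.

```python
import calendar
from datetime import date

# Tuesday ordinals on which each schedule runs a Full backup,
# copied from the two Schedule resources quoted above.
FULL_TUESDAYS = {
    "UMD-F13T-Inc": {1, 3},
    "UMD-F24T-Inc": {2, 4},
}


def tuesdays(year, month):
    """Return (ordinal, date) pairs for every Tuesday in the month."""
    result = []
    days_in_month = calendar.monthrange(year, month)[1]
    for day in range(1, days_in_month + 1):
        d = date(year, month, day)
        if d.weekday() == calendar.TUESDAY:
            # Assumption: "Nth tue" means the Nth Tuesday of the month.
            result.append(((day - 1) // 7 + 1, d))
    return result


def full_schedules_for(ordinal):
    """Names of the schedules that run a Full on the given Tuesday ordinal."""
    return sorted(name for name, weeks in FULL_TUESDAYS.items() if ordinal in weeks)


def tuesdays_without_full(year, month):
    """Tuesdays of the month on which neither schedule runs a Full."""
    return [d for ordinal, d in tuesdays(year, month)
            if not full_schedules_for(ordinal)]


if __name__ == "__main__":
    for ordinal, d in tuesdays(2006, 10):
        print(d.isoformat(), ordinal,
              full_schedules_for(ordinal) or "no Full scheduled")
```

October 2006 had five Tuesdays, so a hand-written calendar that simply alternates tapes every Tuesday would drift out of step with these schedules from 31 October onward.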
Here is my storage config:

    Device {
      Name = helios_DAT72                 #
      Media Type = DDS-4
      Archive Device = /dev/rmt/1lbn
      AutomaticMount = yes;               # when device opened, read it
      AlwaysOpen = no;
      Volume Poll Interval = 30 minutes;
      Close on Poll = yes;
      RemovableMedia = yes;
      RandomAccess = no;
      Spool Directory = /usr/local/bacula/var/spool;
    }

I meant to set AlwaysOpen to yes here, but apparently did not -- so now I'm even more confused.

Anyway, what happened... it tried to run a backup, but it looked at the tape and saw that it was used and in the wrong pool, and rightly refused. We noticed the error and now have the proper tape in the drive. However:

    #umount
    Using default Catalog name=MyCatalog DB=bacula
    The defined Storage resources are:
         1: File
         2: helios_DDS
         3: helios_DAT72
    Select Storage resource (1-3): Unexpected question has been received.
    3
    3901 Device helios_DAT72 (/dev/rmt/1lbn) is already unmounted.
    #mount
    The defined Storage resources are:
         1: File
         2: helios_DDS
         3: helios_DAT72
    Select Storage resource (1-3): Unexpected question has been received.
    3
    3001 Mounted Volume: catalyst_BW1
    3001 Device helios_DAT72 (/dev/rmt/1lbn) is already mounted with Volume catalyst_BW1
    #

...as you can see, my tape is both already unmounted and already mounted, and claims that the tape that is in the drive is the tape that I've taken out of the drive and replaced with the right tape. It has requested the tape I'd
Re: [Bacula-users] Destructive Tape Label Crossing!
Hello,

On 11/15/2006 6:22 PM, Ryan Novosielski wrote:
> Last update to myself: Seems as if what happened is that the wrong tape was inserted during the wrong week. For reasons unknown (if anyone can tell me where to look, I'd be very grateful), the tape, when inserted the wrong week, appears to have been written to. The volume label appears to have been changed to match the tape that it wanted, and written despite the fact that it was not writable at that time (it may have been just about ready to be recycled, but still, it was in the wrong pool anyway).
>
> bls indicates that the WRONG tape contains my last week's backups. The new tape does not appear to contain anything. I see no place to get a media ID for any of this stuff, but I suspect both tapes have a mediaID that's the same at this point.

As far as I know, media IDs are only stored in the catalog. You might find something if you compare older catalog dumps, checking for changes regarding that volume, but I wouldn't want to do that :-)

> This is seriously messed up, and even if I -- or my staff -- did something to cause it, I really need to know what to be careful not to do again.

If Bacula accidentally overwrites a tape label, I would consider that a bug.

That said, I can imagine situations where such a thing could happen (but I have never investigated it): Imagine you have a tape inserted in a drive, rewound. The SD uses Always Open = yes and the polling stuff. The tape is mounted, and thus Bacula knows for sure which tape is in the drive. If you can change the tape without unmounting it from Bacula, and the drive doesn't inform the OS of that operation or Bacula doesn't query that status from the OS, and you change the tape in between Bacula accesses, what you describe might happen.

(Keep in mind that this is mostly fiction, not science - I don't know if such a thing might happen with any tape drive, OS, or Bacula without indicating a bug.)
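The scenario imagined above can be written down as a toy model. This is purely illustrative and rests on loud assumptions: the classes `Drive` and `CachingDaemon` are hypothetical, they are not Bacula's code, and nothing here shows that Bacula actually behaves this way. The sketch only contrasts trusting a label read at mount time with re-reading the label before every write.

```python
class Drive:
    """Toy tape drive that never signals a media change to its user."""

    def __init__(self, label):
        self.label = label  # volume label on the tape currently loaded

    def swap(self, label):
        """Operator replaces the tape by hand, without an unmount."""
        self.label = label


class CachingDaemon:
    """Hypothetical storage daemon; NOT Bacula's actual logic."""

    def __init__(self, drive):
        self.drive = drive
        self.cached_label = drive.label  # label read once, at mount time

    def write_trusting_cache(self, wanted):
        """Write if the remembered label matches, without re-reading the tape."""
        if self.cached_label != wanted:
            return False
        # The write starts at the beginning of whatever tape is loaded,
        # so that tape ends up carrying the wanted label.
        self.drive.label = wanted
        return True

    def write_after_reread(self, wanted):
        """Re-read the label from the tape before deciding whether to write."""
        self.cached_label = self.drive.label
        return self.write_trusting_cache(wanted)


if __name__ == "__main__":
    drive = Drive("catalyst_BW1")
    daemon = CachingDaemon(drive)
    drive.swap("combined_BW1")  # tape change the daemon never hears about
    print(daemon.write_trusting_cache("catalyst_BW1"), drive.label)
```

In this model the stale cache lets the write go ahead and relabels the swapped-in tape, while the re-reading variant refuses. Whether any real drive, OS, or Bacula version allows the first path is exactly the open question.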
> Please let me know what I should provide to the list or how I should troubleshoot this. I'm leaving everything as-is for now so that I have all the evidence.

What I'd do is examine all my tapes to find the one that is missing. If there is exactly one tape label that can't be found, you at least know which volume got overwritten and can invalidate the jobs on it. Also, given that your operators work from a list, you can probably determine when the wrong tape was inserted, perhaps even who did it :-)

Once this is sorted out, you should use this example as a reason why your operators should be taught to use Bacula for tape management and not their calendars ;-)

> Thanks for any assistance you can provide.

I have seen such a thing myself, once, but that was during a beta test phase where I more or less tried to get such results, and it happened before the new locking mechanisms were implemented, IIRC.

Arno
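The tape audit suggested in this reply amounts to comparing two lists. The sketch below assumes both lists were gathered by hand (the volume names the catalog knows about, and the VolName read from each physical tape's label); the function name `audit_labels` and its inputs are hypothetical and not part of Bacula.

```python
from collections import Counter


def audit_labels(catalog_volumes, labels_read):
    """Compare catalog volume names with the labels found on the tapes.

    Returns three sorted lists:
      missing    - in the catalog, but on no tape (candidate overwritten volume)
      duplicated - label found on more than one tape
      unknown    - label found on a tape, but not in the catalog
    """
    counts = Counter(labels_read)
    known = set(catalog_volumes)
    missing = sorted(known - set(counts))
    duplicated = sorted(name for name, n in counts.items() if n > 1)
    unknown = sorted(set(counts) - known)
    return missing, duplicated, unknown


if __name__ == "__main__":
    # Illustrative input only: two volume names from this thread.
    catalog = ["catalyst_BW1", "combined_BW1"]
    on_tape = ["catalyst_BW1", "catalyst_BW1"]
    print(audit_labels(catalog, on_tape))
```

If exactly one name comes back as missing and one as duplicated, the missing volume is the likely victim, and the jobs recorded on it are the ones to invalidate.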
Re: [Bacula-users] Destructive Tape Label Crossing!
Arno Lehmann wrote:
> Hello,
>
> On 11/15/2006 6:22 PM, Ryan Novosielski wrote:
>> Last update to myself: Seems as if what happened is that the wrong tape was inserted during the wrong week. For reasons unknown (if anyone can tell me where to look, I'd be very grateful), the tape, when inserted the wrong week, appears to have been written to. The volume label appears to have been changed to match the tape that it wanted, and written despite the fact that it was not writable at that time (it may have been just about ready to be recycled, but still, it was in the wrong pool anyway).
>>
>> bls indicates that the WRONG tape contains my last week's backups. The new tape does not appear to contain anything. I see no place to get a media ID for any of this stuff, but I suspect both tapes have a mediaID that's the same at this point.
>
> As far as I know, media IDs are only stored in the catalog. You might find something if you compare older catalog dumps, checking for changes regarding that volume, but I wouldn't want to do that :-)
>
>> This is seriously messed up, and even if I -- or my staff -- did something to cause it, I really need to know what to be careful not to do again.
>
> If Bacula accidentally overwrites a tape label I would consider that a bug.

I guess I should see about filing one; I just don't have a lot of information to provide at this point.

> That said, I can imagine situations where such a thing can happen (but never investigated it): Imagine you have a tape inserted in a drive, rewound. The SD uses always open=yes and the polling stuff. The tape is mounted, and thus Bacula knows for sure which tape is in the drive. If you can change the tape without unmounting from Bacula and the drive doesn't inform the OS of that operation, or Bacula doesn't query that status from the OS, and you change the tape in between Bacula accesses, what you describe might happen.
>
> (Keep in mind that this is mostly fiction, not science - I don't know if such a thing might happen with any tape drive, OS, or Bacula without indicating a bug.)

I'd think so too. However, in this case it appears that AlwaysOpen is off for this drive. I've seen a case where this exact thing DID happen to someone on this mailing list. Basically, the resolution was "don't do that." Here, however, I do not use that directive for this drive. Theoretically, there's no way for this to have happened.

>> Please let me know what I should provide to the list or how I should troubleshoot this. I'm leaving everything as-is for now so that I have all the evidence.
>
> What I'd do is to examine all my tapes to find the one that is missing. If there is exactly one tape label that can't be found, you know at least which volume got overwritten and can invalidate the jobs on it. Also, given the fact that your operators work by a list, you can probably determine when the wrong tape was inserted, perhaps even who did it :-)

I've basically done this. The result is that combined_BW1, the tape incorrectly inserted last week, is now catalyst_BW1. The real catalyst_BW1, consequently, is empty, and combined_BW1 no longer exists. However, I still believe that in this particular case, given the course of events, this should NOT have happened.

> Once this is sorted out, you should use that example as reason why your operators should be educated to use Bacula for tape management and not their calendars ;-)

Is there really an easy way for the staff to determine the next tape, though, when the storage devices and pools are defined in the schedule? status dir does not show them in these cases (showing instead *unknown*).

>> Thanks for any assistance you can provide.
>
> I have seen such a thing myself, once, but that was during a beta test phase where I more or less tried to get such results, and it happened before the new locking mechanisms were implemented IIRC.
>
> Arno
Re: [Bacula-users] Destructive Tape Label Crossing!
Hi,

On 11/15/2006 10:05 PM, Ryan Novosielski wrote:
> ...
> Is there really an easy way for the staff to determine next tape though, when the storage devices and pools are defined in the schedule? status dir does not show them in these cases (showing instead *unknown*).

Why not use the mails Bacula sends when it requests a new tape?

If your problem is getting the necessary tapes before they are requested, from off-site storage or a firesafe, then you could simply keep a small number of purged tapes from each volume available. Bacula is flexible enough to accept tapes other than the ones it requests, as long as they qualify.

Other than that, a more useful schedule listing would be nice, but then we'd all want Bacula to tell us which tape it will want next when the currently scheduled one fills :-)

Arno

--
IT-Service Lehmann    [EMAIL PROTECTED]
Arno Lehmann          http://www.its-lehmann.de

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users