Re: [Veritas-bu] make_scsi_dev woes under Linux
Hi Dan and NetBackup advisors,

I realize this subject is from a very old thread (Tue, 15 Aug 2006), but the situation from 2006 most closely resembles my current problem.

I've been upgrading my RHEL3 (32-bit) Linux NetBackup 5.1MP6 servers to RHEL4 (64-bit). All of the servers are Sun X4200s. The upgrades on my media servers all went very well. My colleague and I devised a disk-cloning mechanism: we build out a set of RHEL4 disks using spare drive bays and customize them for the server we wish to upgrade. We then put the pre-built disks in the server and boot it from the new disks. The last media server we upgraded using this process needed only about 2 hours of downtime.

Now all of my media servers have been upgraded and we are attempting to use the same process to upgrade my master server. The problem we are having with the master upgrade is NetBackup's inability to control the tape library robotics. The tape drives and the changer are all visible. We have managed to get our udev/rules.d/20-local.rules file to the point where it detects the tape drives and changer even if the drives have tapes mounted. The drives and changer are configured via tpconfig; the global database synchronizes without error; the robtest utility can move, load and unload tapes. But when I tried a test restore to check NetBackup's ability to load and unload tapes, I got the following segfault:

    Apr 9 11:57:11 errol tldd[1432]: TLD(0) MountTape 040943 on drive 5, from slot 182
    Apr 9 11:57:11 errol kernel: tldd[1869]: segfault at rip 00807f4d rsp ca20 error 6
    Apr 9 11:57:11 errol tldd[1432]: DecodeMount(): TLD(0) drive 5, Actual status: Process killed by signal
    Apr 9 11:57:11 errol tldd[1432]: Unexpected response status (11) in DecodeMount

I'd appreciate any advice at this point. Could this segfault be caused by my OS upgrade from a 32-bit to a 64-bit OS? The udev/haldaemon device handling was a huge difference between RHEL3 and RHEL4 too.

Today's Topics:

   6. make_scsi_dev woes under Linux (Daniel Cox)

--
Message: 6
Date: Tue, 15 Aug 2006 11:31:38 -0500
From: Daniel Cox [EMAIL PROTECTED]
Subject: [Veritas-bu] make_scsi_dev woes under Linux
To: veritas-bu@mailman.eng.auburn.edu
Message-ID: [EMAIL PROTECTED]
Content-Type: text/plain; charset=us-ascii

We've got a few media servers running NetBackup 5.1 MP5 under Linux (RedHat AS4) and we're having no end of problems with FC-attached tape drive device mappings. I see that when NB starts it runs make_scsi_dev, which creates the following devices:

    [EMAIL PROTECTED] ~ # ls -l /dev/st
    total 0
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h0c0t0l0 -> /dev/st5
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h0c0t1l0 -> /dev/st4
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h0c0t2l0 -> /dev/st3
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h1c0t0l0 -> /dev/st1
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h1c0t1l0 -> /dev/st0
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h1c0t2l0 -> /dev/st2
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh0c0t0l0 -> /dev/nst5
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh0c0t1l0 -> /dev/nst4
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh0c0t2l0 -> /dev/nst3
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh1c0t0l0 -> /dev/nst1
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh1c0t1l0 -> /dev/nst0
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh1c0t2l0 -> /dev/nst2

There seem to be two big problems with this. First, the devices as created by the OS (st*, nst*) can change due to HBA driver upgrades, PCI bus detection order changes, somebody moving an HBA around on the system, or somebody moving a drive around in the SAN for various reasons (port-based zoning). Second, if any of those scenarios occurs, NB creates entirely different /dev/st/*, /dev/sg/* entries to represent the new host/controller/target/lun detection order. Naturally, either of these scenarios results in drive and robotic library ID mismatches, and either NetBackup refuses to start or drives go into a permanent DOWN state.
We can use 2.6 kernel udev rules to map WWNs to OS devices and always have consistent /dev/st*, /dev/sg* device names, which gets around the first problem. However, the NB auto-created devices can still change, so things occasionally break and we waste a fair amount of time putting it all back together again.

Is there some better way of handling this?

Dan-

--Kathy

Kathryn Hemness [EMAIL PROTECTED]
Infrastructure Services                phone: 530.752.6547
Campus Data Center Client Services     fax: 530.752.9154
___
Veritas-bu maillist - Veritas-bu@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
Re: [Veritas-bu] make_scsi_dev woes under Linux
In 5.1 MP1+, placing ENABLE_AUTO_PATH_CORRECTION in the vm.conf file should also help.

Justin King [EMAIL PROTECTED] 12/6/2006 12:41 PM

Near as I can tell, make_scsi_dev determines which drives are connected to the system and creates symlinks to the corresponding devices. When you load your 'st' module, you get something like this:

    Nov 30 16:08:52 kyle kernel: st: Version 20030406, bufsize 262144, max init. bufs 8, s/g segs 16
    Nov 30 16:08:52 kyle kernel: Attached scsi tape st0 at scsi1, channel 0, id 0, lun 2
    Nov 30 16:08:52 kyle kernel: Attached scsi tape st1 at scsi1, channel 0, id 0, lun 3
    Nov 30 16:08:52 kyle kernel: Attached scsi tape st2 at scsi1, channel 0, id 0, lun 4
    Nov 30 16:08:52 kyle kernel: Attached scsi tape st3 at scsi2, channel 0, id 0, lun 5
    Nov 30 16:08:52 kyle kernel: Attached scsi tape st4 at scsi2, channel 0, id 0, lun 6
    Nov 30 16:08:52 kyle kernel: Attached scsi tape st5 at scsi2, channel 0, id 0, lun 7

When make_scsi_dev is run, it creates the following files in /dev/st/ (clearly, this is my unique configuration, YMMV):

    H1C0T0L2 -> /dev/st0
    H1C0T0L3 -> /dev/st1
    H1C0T0L4 -> /dev/st2
    H2C0T0L5 -> /dev/st3
    H2C0T0L6 -> /dev/st4
    H2C0T0L7 -> /dev/st5
    NH1C0T0L2 -> /dev/nst0
    NH1C0T0L3 -> /dev/nst1
    NH1C0T0L4 -> /dev/nst2
    NH2C0T0L5 -> /dev/nst3
    NH2C0T0L6 -> /dev/nst4
    NH2C0T0L7 -> /dev/nst5

Additionally, it creates files in /dev/sg/ that point to any SCSI device (disks, tape, controllers, etc.). This appears to aid in auto-detection. Look at /proc/scsi/scsi; everything in there should map to a /dev/sgX device. Example:

    H0C0T0L0 -> /dev/sg0
    H1C0T0L0 -> /dev/sg1
    ...
    H2C0T0L7 -> /dev/sg10

In your case, you should be able to create these symlinks yourself and comment out the make_scsi_dev command.
As people have pointed out, if devices move around, then you'll have to recreate these. It isn't elegant, but it might get you up and running.

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Matthew Johnson
Sent: Tuesday, December 05, 2006 12:57 PM
To: [EMAIL PROTECTED]; Daniel Cox; veritas-bu@mailman.eng.auburn.edu
Subject: [Veritas-bu] make_scsi_dev woes under Linux

[snip]
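The mapping described in this reply, from the kernel's "Attached scsi tape" lines to the names under /dev/st/, can be sketched in Python. This is only a minimal illustration of the naming pattern shown in the thread, not the actual logic of the make_scsi_dev binary, which nobody here has seen. The function name `tape_links` is hypothetical, and the lowercase names follow the `ls -l /dev/st` listing from the original post (the example in this reply shows them in uppercase).

```python
import re

# Kernel message format as quoted in this thread, e.g.
# "Attached scsi tape st0 at scsi1, channel 0, id 0, lun 2"
ATTACH_RE = re.compile(
    r"Attached scsi tape (st\d+) at scsi(\d+), channel (\d+), id (\d+), lun (\d+)"
)


def tape_links(kernel_log):
    """Return {link path: target device} for each tape drive in the log text.

    Assumption: names follow the h<host>c<channel>t<target>l<lun> pattern,
    with an 'n' prefix for the no-rewind device, as seen in the listings.
    """
    links = {}
    for dev, host, chan, target, lun in ATTACH_RE.findall(kernel_log):
        name = f"h{host}c{chan}t{target}l{lun}"
        links[f"/dev/st/{name}"] = f"/dev/{dev}"     # rewinding device
        links[f"/dev/st/n{name}"] = f"/dev/n{dev}"   # no-rewind device
    return links
```

Feeding it the six "Attached scsi tape" lines quoted above yields twelve entries, one rewinding and one no-rewind link per drive.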
[Veritas-bu] make_scsi_dev woes under Linux
Hello all,

I am having problems with the make_scsi_dev command as well. I have read everything out there, and there are some great ideas; however, when I ran it on one particular system it crashed the system, and nowhere have I read of this happening. If anyone out there has information about the exact commands that are run within the binary, it would be great. This may be a system problem, and until I find out what is being run I am stuck between Veritas and IBM.

Thanks,
Matt

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of smpt
Sent: Tuesday, August 15, 2006 10:36 PM
To: 'Daniel Cox'; veritas-bu@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] make_scsi_dev woes under Linux

[snip]
Re: [Veritas-bu] make_scsi_dev woes under Linux
On 8/15/06, Daniel Cox [EMAIL PROTECTED] wrote:

We've got a few media servers running NetBackup 5.1 MP5 under Linux (RedHat AS4) and we're having no end of problems with FC attached tape drive device mappings. I see when NB starts it runs make_scsi_dev, which creates the following devices:

If you have people installing new HBA drivers or rearranging HBAs on your systems without your knowledge, then you clearly have procedural issues to deal with, not a technical problem.

In any event, how about just moving make_scsi_dev aside and replacing it with your own script that does what you want? All it does is make symlinks, because NetBackup is too stupid to use normal Linux device names. It wants to use crazy Solaris-style names for some unknown reason.

[snip]

--
-Tim
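A replacement script of the kind suggested in this reply might look roughly like the following Python sketch. It assumes you already have a table of link names and targets that you want (for example, one tied to your own udev names). The function `apply_links` and its `root` parameter are hypothetical and not part of NetBackup; `root` exists only so the script can be tried against a scratch directory before touching /dev.

```python
import os


def apply_links(links, root="/"):
    """Create or refresh each symlink; return the link paths that changed.

    links maps a link path such as "/dev/st/h0c0t0l0" to its target.
    A path that exists but is not a symlink is left alone: os.symlink
    raises FileExistsError in that case rather than overwriting it.
    """
    changed = []
    for link, target in sorted(links.items()):
        path = os.path.join(root, link.lstrip("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        if os.path.islink(path):
            if os.readlink(path) == target:
                continue          # already correct, nothing to do
            os.remove(path)       # stale link pointing at another device
        os.symlink(target, path)
        changed.append(link)
    return changed
```

Because the function only touches links that are missing or wrong, running it repeatedly is harmless, and the returned list shows which mappings moved since the last run.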
[Veritas-bu] make_scsi_dev woes under Linux
We've got a few media servers running NetBackup 5.1 MP5 under Linux (RedHat AS4) and we're having no end of problems with FC-attached tape drive device mappings. I see that when NB starts it runs make_scsi_dev, which creates the following devices:

    [EMAIL PROTECTED] ~ # ls -l /dev/st
    total 0
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h0c0t0l0 -> /dev/st5
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h0c0t1l0 -> /dev/st4
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h0c0t2l0 -> /dev/st3
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h1c0t0l0 -> /dev/st1
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h1c0t1l0 -> /dev/st0
    lrwxrwxrwx 1 root root 8 2006-08-15 12:28 h1c0t2l0 -> /dev/st2
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh0c0t0l0 -> /dev/nst5
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh0c0t1l0 -> /dev/nst4
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh0c0t2l0 -> /dev/nst3
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh1c0t0l0 -> /dev/nst1
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh1c0t1l0 -> /dev/nst0
    lrwxrwxrwx 1 root root 9 2006-08-15 12:28 nh1c0t2l0 -> /dev/nst2

There seem to be two big problems with this. First, the devices as created by the OS (st*, nst*) can change due to HBA driver upgrades, PCI bus detection order changes, somebody moving an HBA around on the system, or somebody moving a drive around in the SAN for various reasons (port-based zoning). Second, if any of those scenarios occurs, NB creates entirely different /dev/st/*, /dev/sg/* entries to represent the new host/controller/target/lun detection order. Naturally, either of these scenarios results in drive and robotic library ID mismatches, and either NetBackup refuses to start or drives go into a permanent DOWN state.
We can use 2.6 kernel udev rules to map WWNs to OS devices and always have consistent /dev/st*, /dev/sg* device names, which gets around the first problem. However, the NB auto-created devices can still change, so things occasionally break and we waste a fair amount of time putting it all back together again.

Is there some better way of handling this?

Dan-

Note: The information contained in this message and any attachment to it is privileged, confidential and protected from disclosure. If the reader of this message is not the intended recipient, or an employee or agent responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify the sender immediately by replying to the message, and please delete it from your system. Thank you. NYSE Group.
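One way to notice the second problem before NetBackup trips over it is to compare the links currently under /dev/st or /dev/sg against a snapshot saved when the drives were last configured. The Python sketch below is an illustration under that assumption only; `snapshot` and `diff_links` are hypothetical helper names, and nothing here is provided by NetBackup.

```python
import os


def snapshot(directory):
    """Return {link name: target} for every symlink in a directory."""
    result = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            result[name] = os.readlink(path)
    return result


def diff_links(expected, current):
    """Compare two {link name: target} maps and report what moved."""
    report = {"missing": [], "unexpected": [], "retargeted": []}
    for name in sorted(set(expected) | set(current)):
        if name not in current:
            report["missing"].append(name)        # link disappeared
        elif name not in expected:
            report["unexpected"].append(name)     # new h/c/t/l name appeared
        elif expected[name] != current[name]:
            report["retargeted"].append((name, expected[name], current[name]))
    return report
```

An empty report suggests the host/controller/target/lun order is unchanged; anything else is a hint that the drive configuration needs checking before backups start.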
Re: [Veritas-bu] make_scsi_dev woes under Linux
If your environment uses SSO, I will give you two tips.

Wait for all backups to complete and then run make_scsi_dev. Then configure all the Linux drives. After that, comment out (#) the make_scsi_dev line in the NetBackup startup script.

And do not reboot the system. If you do, you must run make_scsi_dev before NetBackup starts (and all drives must be idle).

I have installed several Linux systems this way and all are working fine.

smpt

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Daniel Cox
Sent: Tuesday, August 15, 2006 7:32 PM
To: veritas-bu@mailman.eng.auburn.edu
Subject: [Veritas-bu] make_scsi_dev woes under Linux

[snip]
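Commenting out the make_scsi_dev line, as described in this reply, can be sketched as plain text processing. The thread does not give the path or contents of the NetBackup startup script, so the Python function below works on the script text only and leaves reading and writing the file to the caller; the name `comment_out_make_scsi_dev` is hypothetical.

```python
def comment_out_make_scsi_dev(script_text):
    """Prefix every active line that mentions make_scsi_dev with '# '.

    Lines that are already comments are left untouched, so running the
    function twice gives the same result as running it once.
    """
    out = []
    for line in script_text.splitlines(keepends=True):
        stripped = line.lstrip()
        if "make_scsi_dev" in stripped and not stripped.startswith("#"):
            indent = line[: len(line) - len(stripped)]
            line = f"{indent}# {stripped}"   # keep the original indentation
        out.append(line)
    return "".join(out)
```

Keeping a copy of the original script before writing the result back is a sensible precaution, since the line has to be restored (or run by hand) after a reboot.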