Hmmm, do your media servers have multiple NICs, and are you using IP multipathing software (like in.mpathd under Solaris)? If so, make sure that you have set ACS_SSI_HOSTNAME appropriately in your vm.conf file. The ACS daemon inserts the value (or inferred value) of ACS_SSI_HOSTNAME into all communications with the ACS server. Also, if you are using ACLs on the ACS server, make sure that they match the name/IP used in ACS_SSI_HOSTNAME.
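
If it helps, here is a rough Python sketch for checking what a media server actually has configured. It assumes vm.conf lives at /usr/openv/volmgr/vm.conf, that entries look like "ACS_SSI_HOSTNAME = host", and that lines starting with "#" are comments; the function names are made up for illustration, so adjust to your install.

```python
import socket


def read_vm_conf_entry(text, key):
    """Return the value of the first `key = value` entry in vm.conf text, or None.

    Assumes one entry per line and '#' for comments (an assumption, not a spec).
    """
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the name resolves to (empty on failure)."""
    try:
        return set(socket.gethostbyname_ex(hostname)[2])
    except OSError:
        return set()


def describe(path="/usr/openv/volmgr/vm.conf"):
    """Summarize the ACS_SSI_HOSTNAME setting found in the given vm.conf."""
    try:
        with open(path) as fh:
            text = fh.read()
    except OSError as exc:
        return f"could not read {path}: {exc}"
    host = read_vm_conf_entry(text, "ACS_SSI_HOSTNAME")
    if host is None:
        return "ACS_SSI_HOSTNAME is not set; the daemon will infer a value"
    addrs = sorted(resolve_ipv4(host))
    return f"ACS_SSI_HOSTNAME = {host} resolves to {addrs or 'nothing'}"


if __name__ == "__main__":
    print(describe())
```

Run it on each media server and compare the reported name/IP with whatever the ACLs on the ACS server allow. It only reads the file and does a name lookup; it does not talk to the ACS server.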

Cheers
Mike

On 1:43:52 pm 2006-12-08 Justin Piszcz <[EMAIL PROTECTED]> wrote:
> It is 100% correct. Yep. I ran about 5 test backups to each drive
> in the robot. No problems. It is only when there is a burst of jobs.
>
> Justin.
>
> On Fri, 8 Dec 2006, Mike Dunn (veritas-bu) wrote:
>
> > Justin,
> >
> > Are you absolutely certain that you have your drive mapping done
> > properly? The fact that the job fails 30 minutes after the initial
> > mount attempt makes it sound like you are failing with a media
> > mount timeout. The most common cause (especially with ACS
> > environments) is a simple mismatch between the /dev/rmt path and
> > your ACS path (i.e. ACS,LSM,PANEL,DRIVE). The SL8500 is also very
> > difficult to address properly, since the ACS path has little
> > correlation with the physical location of the drive.
> > Probably the quickest test you can perform is to verify that your
> > jobs are being affected by the media mount timeout. If you
> > shorten the media mount timeout parameter, to say 10 minutes, your
> > jobs should fail 10 minutes after they start if the mount timeout
> > is what fails the jobs.
> > You should also track down which drives are failing to mount, and
> > see if there is a correlation.
> >
> > Cheers
> > Mike
> >
> > > Message: 7
> > > Date: Fri, 8 Dec 2006 11:08:39 -0500 (EST)
> > > From: Justin Piszcz <[EMAIL PROTECTED]>
> > > Subject: [Veritas-bu] Question posed to ACSLS/STK8500 users.
> > > To: veritas-bu@mailman.eng.auburn.edu
> > > Message-ID: <[EMAIL PROTECTED]>
> > > Content-Type: TEXT/PLAIN; charset=US-ASCII
> > >
> > > All,
> > >
> > > My group is setting up two Sun/StorageTek SL8500s. Sun did the
> > > install of ACSLS, there were no problems on their side. Each
> > > SL8500 is in its own environment. On each SL8500, we have 8
> > > media servers, connected to four drives each, giving us a total
> > > of 32 drives. For testing, I did the following. Ran a
> > > NON-MULTIPLEXED backup to each drive, to ensure each drive
> > > worked properly. To do this I kicked off four jobs in
> > > succession. When I do this, I utilize all 4 drives. I did this
> > > with each media server without a single problem. However, when
> > > testing everything together, all 32 drives, I kick off 45 jobs
> > > for example. It says there are 32 active jobs in netbackup,
> > > which is correct. The problem is, randomly, 2 or 3 jobs will
> > > hang at "Mounting MediaID.." and then the drive will go down
> > > after 30 minutes. Why is this? With an L700, I can send 500-1000 jobs
> > > to all of the drives in it and there is never a mounting
> > > problem. There is nothing wrong with any of the drives, they
> > > are brand new. I can use ACSLS and dismount the media from the
> > > drives and then re-run my earlier test backups, one at a time to
> > > each of the four drives per-media server without any issues. It
> > > is only when the robot receives a 'burst' of jobs that this
> > > happens.
> > > Has anyone experienced anything like this before?
> > >
> > > Thanks for any help and responses,
> > >
> > > Justin.
> > >
> > _______________________________________________
> > Veritas-bu maillist - Veritas-bu@mailman.eng.auburn.edu
> > http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu