Re: [Veritas-bu] Question posed to ACSLS/STK8500 users.

Mike Dunn (veritas-bu) Fri, 08 Dec 2006 12:27:36 -0800

Hmmm, do your media servers have multiple NICs, and are you using IP
multipathing software (like in.mpathd under Solaris)?  If so, then make
sure that you have set ACS_SSI_HOSTNAME appropriately in your vm.conf
file.  The ACS daemon inserts the value (or inferred value) of
ACS_SSI_HOSTNAME into all communications with the ACS server.  Also, if
you are using ACLs on the ACS server, make sure that they match the
name/IP used in ACS_SSI_HOSTNAME.
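
If it helps, here is a rough Python sketch of a sanity check for that
entry.  It assumes vm.conf holds plain "KEY = value" lines and that lines
starting with # can be ignored; the host name media1-bkp and both function
names are made up for illustration, so substitute whatever name the ACS
server actually expects from your media server.

```python
def parse_vm_conf(text):
    """Return {KEY: value} for 'KEY = value' lines.

    Assumption: blank lines and lines starting with '#' carry no settings.
    """
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


def ssi_hostname_problem(entries, expected_name):
    """Describe what looks wrong with ACS_SSI_HOSTNAME, or return None."""
    configured = entries.get("ACS_SSI_HOSTNAME")
    if configured is None:
        return "ACS_SSI_HOSTNAME is not set, so the value will be inferred"
    if configured != expected_name:
        return f"ACS_SSI_HOSTNAME is {configured!r}, expected {expected_name!r}"
    return None


# Hypothetical file contents; read your real vm.conf instead.
sample = "ACS_SSI_HOSTNAME = media1-bkp\n"
problem = ssi_hostname_problem(parse_vm_conf(sample), "media1-bkp")
print(problem or "ACS_SSI_HOSTNAME matches the expected name")
```

Run it on each media server with the name (or IP) that appears in the ACLs
on the ACS server as the expected value.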


  Cheers
  Mike


On 1:43:52 pm 2006-12-08 Justin Piszcz <[EMAIL PROTECTED]> wrote:
> It is 100% correct.  Yep.  I ran about 5 test backups to each drive
> in the robot.  No problems.  It is only when there is a burst of jobs.
>
> Justin.
>
> On Fri, 8 Dec 2006, Mike Dunn (veritas-bu) wrote:
>
> >  Justin,
> >
> >  Are you absolutely certain that you have your drive mapping done
> >  properly? The fact that the job fails 30 minutes after the initial
> >  mount attempt makes it sound like you are failing with a media
> >  mount time out.  The most common cause (especially with ACS
> >  environments) is a simple mismatch between the /dev/rmt path and
> >  your ACS path (i.e. ACS,LSM,PANEL,DRIVE).  The SL8500 is also very
> >  difficult to address properly, since the ACS path has little
> >  correlation with the physical location of the drive.
> >  Probably the quickest test you can perform is to verify that your
> >  jobs are being affected by the media mount timeout.  If you
> >  shorten the media mount timeout parameter, to say 10 minutes, your
> >  jobs should fail 10 minutes after they start if the mount timeout
> >  is what fails the jobs.
> >  You should also track down which drives are failing to mount, and
> >  see if there is a correlation.
> >
> >    Cheers
> >    Mike
> >
> >
> > >
> > >  Message: 7
> > >  Date: Fri, 8 Dec 2006 11:08:39 -0500 (EST)
> > >  From: Justin Piszcz <[EMAIL PROTECTED]>
> > >  Subject: [Veritas-bu] Question posed to ACSLS/STK8500 users.
> > >  To: veritas-bu@mailman.eng.auburn.edu
> > >  Message-ID: <[EMAIL PROTECTED]>
> > >  Content-Type: TEXT/PLAIN; charset=US-ASCII
> > >
> > >  All,
> > >
> > >  My group is setting up two Sun/StorageTek SL8500s.  Sun did the
> > >  install of ACSLS, there were no problems on their side.  Each
> > >  SL8500 is in its own environment.  On each SL8500, we have 8
> > >  media servers, connected to four drives each, giving us a total
> > >  of 32 drives.  For testing, I did the following.  Ran a
> > >  NON-MULTIPLEXED backup to each drive, to ensure each drive
> > >  worked properly.  To do this I kicked off four jobs in
> > >  succession. When I do this, I utilize all 4 drives.  I did this
> > >  with each media server without a single problem.  However, when
> > >  testing everything together, all 32 drives, I kick off 45 jobs
> > >  for example.  It says there are 32 active jobs in netbackup,
> > >  which is correct.  The problem is, randomly, 2 or 3 jobs will
> > >  hang at "Mounting MediaID.." and then the drive will go down
> > >  after 30 minutes.  Why is this?  With an L700, I can send 500-1000 jobs
> > >  to all of the drives in it and there is never a mounting
> > >  problem.  There is nothing wrong with any of the drives, they
> > >  are brand new.  I can use ACSLS and dismount the media from the
> > >  drives and then re-run my earlier test backups, one at a time to
> > >  each of the four drives per-media server without any issues.  It
> > >  is only when the robot receives a 'burst' of jobs that this
> > >  happens.
> > >  Has anyone experienced anything like this before?
> > >
> > >  Thanks for any help and responses,
> > >
> > >  Justin.
> > >
> > >
> >
> >  _______________________________________________
> >  Veritas-bu maillist  -  Veritas-bu@mailman.eng.auburn.edu
> >  http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
