In case anyone is interested: we basically fixed the problem by rebooting
one of the media servers. It appears that it was having intermittent
difficulty, or perhaps only one-way communication, with the master server. The
master server was unable to empty its queue (I've forgotten the name), so it
looked like everything was hung.

Two different people found the problem from two different angles.

On the media server we saw very slow responses to vmoprcmd and tpconfig
(among others). The most interesting thing was that the media server could
ping the physical master server but not the virtual one (it's clustered, both
on the same subnet). However, the master server could ping the media server,
and other media servers had no problem pinging both. After the reboot, the
media server was able to ping both.
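
For anyone who wants to run the same kind of check from each media server, here is a minimal sketch in Python. It is not what we ran (we just used ping by hand). It assumes a Linux host with the usual iputils ping, where -c is the packet count and -W is the reply timeout in seconds. The function names are made up for illustration, and you would pass in your own physical and virtual master names.

```python
import subprocess


def can_ping(host, count=1, timeout_s=2):
    """Return True if `host` answers an ICMP echo.

    Assumes Linux iputils `ping`: -c = packet count, -W = reply timeout (s).
    """
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_reachability(hosts, probe=can_ping):
    """Probe every name in `hosts` and return {name: reachable}.

    `probe` is injectable so the logic can be exercised without a network.
    """
    return {host: probe(host) for host in hosts}


def partially_unreachable(results):
    """Return the names that failed while at least one other name answered.

    This is the pattern described above: one master name reachable, the
    other not. If nothing answers at all, return an empty list, since that
    points to a different kind of problem.
    """
    if any(results.values()):
        return [host for host, ok in results.items() if not ok]
    return []
```

Calling check_reachability() with the physical and virtual master names on each media server, and then partially_unreachable() on the result, would flag a server that can reach one name but not the other.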

From another angle, but in parallel, someone ran some analysis tools on the
master server and identified the queue problem described above. After rebooting
the media server (Linux; it had been up for 45 days) everything worked fine.
We went from a success rate of 65% to 80% (on a good day) to 95% the first
night of the fix and 99% last night.

This was all because of one errant media server out of 60.

 

Regards,

 

Patrick Whelan

VERITAS Certified NetBackup Support Engineer for UNIX.

VERITAS Certified NetBackup Support Engineer for Windows.

 

netbac...@whelan-consulting.co.uk

 

 

_______________________________________________
Veritas-bu maillist  -  Veritas-bu@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
