Jeff
Good idea about not touching anything again...
 
A few years back, a colleague did a firmware upgrade on a tape
library on a Friday and then went on a few weeks' vacation.
 
On the Monday, a new person, totally unaware of the backups, found every
single one was failing. It took 2 weeks to resolve.
 
"Never make big changes on the last day of the week, or before you go on
vacation" springs to mind :-)
Enjoy the break!
 
Simon

________________________________

From: veritas-bu-boun...@mailman.eng.auburn.edu [mailto:veritas-bu-boun...@mailman.eng.auburn.edu] On Behalf Of Jeff Cleverley
Sent: Wednesday, January 20, 2010 6:03 PM
To: Justin Piszcz
Cc: VERITAS-BU@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Bpjobd and other failures.


Justin,

Thanks for the reply.  For whatever reason things seem to have magically
started working again.  All I did was shut down Veritas (again), turn
up the verbosity in bp.conf, and restart it.  When it first started I
still didn't have bpdbm, bpjobd, etc. running.  The vnetd log had a lot
of errors.  When I ran bpdbjobs from the command line, nothing came
back.

While looking through the bpdbm log I found no errors but a lot of
entries that looked like it was doing backups.  About 10 minutes later I
ran bpdbjobs again and everything showed up and some jobs were running.
I think these are restarts of some failed jobs, so we'll see how they do.
So far 4 of them have completed successfully.

Since I leave the country on vacation tomorrow morning I don't plan on
touching anything else on it today :-)

Thanks again for the help.

Jeff


On Wed, Jan 20, 2010 at 2:11 AM, Justin Piszcz <jpis...@lucidpixels.com>
wrote:


        Hi,
        
        Taking a shot in the dark here, for the tcp issues, try adding:
        net.ipv4.tcp_tw_reuse = 1
        net.ipv4.tcp_tw_recycle = 1
        
        to your /etc/sysctl.conf, then reboot.
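To confirm what the kernel is actually using before and after such a change, the two settings can be read back from /proc/sys. The following is a minimal sketch in Python, not something from the thread: `read_sysctl` is a hypothetical helper name, and it assumes the usual Linux mapping of a dotted sysctl name to a path under /proc/sys. A missing entry is reported as "not available" because net.ipv4.tcp_tw_recycle does not exist on newer kernels (it was removed in Linux 4.12), and it is known to cause trouble for clients behind NAT.

```python
from pathlib import Path


def read_sysctl(name, proc_root="/proc/sys"):
    """Return the current value of a sysctl as a string.

    Returns None if the kernel does not expose the setting or it
    cannot be read. Assumes the dotted name maps to a path under
    proc_root, e.g. net.ipv4.tcp_tw_reuse -> net/ipv4/tcp_tw_reuse.
    """
    path = Path(proc_root) / name.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for key in ("net.ipv4.tcp_tw_reuse", "net.ipv4.tcp_tw_recycle"):
        value = read_sysctl(key)
        shown = value if value is not None else "not available"
        print(f"{key} = {shown}")
```

Running it once before the reboot and once after shows whether the new values took effect.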
        
        For vnetd, check your /etc/xinetd.d/vnetd*
        Also check the logs to make sure xinetd is not throttling connections;
        that can happen if too many servers try to back up too fast.
        
        Justin. 


        On Tue, 19 Jan 2010, Jeff Cleverley wrote:
        
        

                Greetings,
                
                While continuing to work on this, it seems there may be issues with vnetd.
                Running netstat -a | grep vnet shows this:
                
                tcp        0      0 *:vnetd                     *:*                          LISTEN
                tcp        0      0 sgpbkp04.sgp.avagotec:35781 agt604.sgp.avagotech.:vnetd  ESTABLISHED
                tcp        0      0 sgpbkp04.sgp.avagotec:35720 sgpbkp04.sgp.avagotec:vnetd  ESTABLISHED
                tcp        0      0 sgpbkp04.sgp.avagotec:vnetd sgpbkp04.sgp.avagotec:35720  ESTABLISHED
                tcp        0      0 localhost.localdomain:vnetd sgpbkp04.sgp.avagotec:35846  TIME_WAIT
                tcp        0      0 localhost.localdomain:vnetd sgpbkp04.sgp.avagotec:35853  TIME_WAIT
                tcp        0      0 localhost.localdomain:vnetd sgpbkp04.sgp.avagotec:35839  TIME_WAIT
                unix  2      [ ACC ]     STREAM     LISTENING     146403 /usr/openv/var/vnetd/vmd.uds
                unix  2      [ ACC ]     STREAM     LISTENING     145874 /usr/openv/var/vnetd/bpcompatd.uds
                unix  2      [ ACC ]     STREAM     LISTENING     146786 /usr/openv/var/vnetd/tldcd.uds
                unix  3      [ ]         STREAM     CONNECTED     152574 /usr/openv/var/vnetd/bpcompatd.uds
                
                The TIME_WAIT entries seem to stick around for a long time.  I've restarted
                xinetd on the system and we have rebooted, but things are still wedged.
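To see how many vnetd connections sit in each state over time, the netstat output can be tallied with a short script. This is a minimal sketch in Python, not part of the original exchange: `count_tcp_states` is a hypothetical helper, and it assumes the column layout of net-tools netstat on Linux. It tolerates entries whose state has wrapped onto a later line, as happens when the output is pasted into mail.

```python
from collections import Counter

# TCP state names as printed by net-tools netstat on Linux.
TCP_STATES = {
    "LISTEN", "ESTABLISHED", "TIME_WAIT", "CLOSE_WAIT", "SYN_SENT",
    "SYN_RECV", "FIN_WAIT1", "FIN_WAIT2", "LAST_ACK", "CLOSING", "CLOSE",
}


def count_tcp_states(netstat_text):
    """Count TCP connection states in netstat output.

    An entry starts at a line beginning with "tcp"; its state may be
    on the same line or on a following line if the output was wrapped.
    """
    counts = Counter()
    pending = False  # True while a tcp entry is still waiting for its state
    for line in netstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("tcp"):
            pending = True
        elif fields[0].startswith(("udp", "unix", "raw")):
            pending = False  # a non-TCP entry ends any unfinished tcp entry
        if pending:
            for token in fields:
                if token in TCP_STATES:
                    counts[token] += 1
                    pending = False
                    break
    return counts


if __name__ == "__main__":
    import subprocess
    # Requires netstat to be installed; filter on "vnet" as in the thread.
    out = subprocess.run(["netstat", "-a"], capture_output=True, text=True).stdout
    vnet_lines = "\n".join(l for l in out.splitlines() if "vnet" in l)
    for state, n in count_tcp_states(vnet_lines).most_common():
        print(state, n)
```

Run a few minutes apart, a TIME_WAIT count that keeps growing rather than draining would support the suspicion that those entries are lingering.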
                
                Thanks,
                
                Jeff
                
                On Tue, Jan 19, 2010 at 6:00 PM, Jeff Cleverley <
                jeff.clever...@avagotech.com> wrote:
                
                

                        Greetings,
                        
                        Our environment is NB6.5.1 on a RHEL4 server.  It also has an HP-UX SAN
                        media server.  All other clients are backed up over the network.  Most
                        are RHEL4x.
                        
                        The tape library in our Singapore office failed over the weekend, which
                        caused a lot of things to fail and stay wedged.  Some jobs seemed to
                        have run but some failed with errors 13, 63, and 233.  This varied
                        across policies.  I decided to try to restart all processes and get
                        things cleaned up.  This hasn't worked well.
                        
                        When I start everything using service netbackup start or
                        /etc/init.d/netbackup start, everything looks OK.  When I look at
                        things like bpps -a, I notice that bpjobd isn't running anymore.  When
                        I try to start it manually it fails with "File size limit exceeded".
                        bpdbjobs returns no output.  I haven't been able to figure out which
                        file it is complaining about.
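"File size limit exceeded" is the message a shell normally prints when a process is killed by SIGXFSZ, which points at either a per-process file size limit (ulimit -f) or a file that has grown past what the program can write. As a rough way to narrow down the file, the sketch below prints the current limit and lists files at or above a threshold. It is only a sketch under stated assumptions: `fsize_limit` and `files_at_least` are hypothetical helpers, the directory /usr/openv/netbackup/db/jobs is an assumed location for the job database, and the 2 GiB fallback threshold is a guess at a large-file boundary, not something established in this thread.

```python
import os
import resource
import stat


def fsize_limit():
    """Return the soft RLIMIT_FSIZE in bytes, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    return None if soft == resource.RLIM_INFINITY else soft


def files_at_least(root, min_bytes):
    """Yield (size, path) for regular files under root of at least min_bytes."""
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_size >= min_bytes:
                yield st.st_size, path


if __name__ == "__main__":
    limit = fsize_limit()
    print("file size limit:", "unlimited" if limit is None else f"{limit} bytes")
    # Assumed path; adjust to wherever the job database actually lives.
    root = "/usr/openv/netbackup/db/jobs"
    # Assumption: with no ulimit set, look for files near the 2 GiB mark.
    threshold = 2**31 - 1 if limit is None else limit
    for size, path in sorted(files_at_least(root, threshold), reverse=True):
        print(size, path)
```

The limit reported is the one for the shell the script runs in, so it should be run from the same environment that is used to start bpjobd manually.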
                        
                        I'm sure I have a lot of things that need to be cleaned up.  There are
                        a lot of files in the restart and trylogs.  I was thinking it was safe
                        to move those out of the way but wanted to make sure.
                        
                        Any help on tracking the bpjobd error along with advice on cleaning up
                        all the restart and trylogs would be appreciated.  Naturally I'm
                        leaving on vacation Thursday so I need to help clean this up before I
                        go.  I won't be doing any replies to this after Wednesday night because
                        of that.
                        
                        Thanks,
                        
                        Jeff
                        
                        --
                        Jeff Cleverley
                        Unix Systems Administrator
                        4380 Ziegler Road
                        Fort Collins, Colorado 80525
                        970-288-4611

_______________________________________________
Veritas-bu maillist  -  Veritas-bu@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu