Hi Martin,

Thank you for your response and suggestion on how to get visibility into 
what is going on when the director appears to hang.  =)

I recently lowered the "Maximum Concurrent Jobs" directive in the 
Director stanza and have avoided doing a reload or query of the director 
in order to reduce the amount of activity that could be exacerbating the 
problem.  So far Bacula hasn't crashed; however, it also ran for 3 days 
immediately after the upgrade to 5.2.2, so I have a sneaking suspicion 
Bacula may still crash when more fulls kick off this weekend.  I will 
definitely use the gdb trick you mentioned to see what is going on 
if/when that happens.  I also fixed the btraceback script (it wasn't 
getting the correct location of the bacula-dir executable passed to it), 
so if/when it does crash, I'll have more clues to provide you guys.
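
For reference, here is a rough sketch of how the gdb step could be 
scripted so the backtraces can be captured without an interactive 
session.  It only builds the command line; the helper name 
gdb_backtrace_command and the PID 12345 are placeholders of mine, not 
anything from Bacula, and it assumes gdb's standard -batch, -ex and -p 
options are available on the server.

```python
import shlex


def gdb_backtrace_command(pid):
    """Build the argv for a one-shot gdb run that attaches to a running
    process and prints a backtrace of every thread.

    Hypothetical helper for illustration; it does not run gdb itself.
    """
    pid = int(pid)
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return [
        "gdb",
        "-batch",                     # exit once the commands have run
        "-ex", "thread apply all bt",  # the command Martin suggested
        "-p", str(pid),                # attach to the running process
    ]


if __name__ == "__main__":
    # 12345 is a placeholder; substitute the real bacula-dir PID.
    print(shlex.join(gdb_backtrace_command(12345)))
```

The printed command would need to be run as a user allowed to attach to 
the director process, with the output redirected to a file.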

In parallel, I'm also working to compile Bacula 5.2.3 for FreeBSD using 
the compile options from Dan Langille's port for 5.2.2.  If that goes 
well, I will do the upgrade to see if it has any effect on the issues I 
have been seeing.

Thanks again and I'll keep you guys posted.

-Jenny  =)

On 01/05/2012 04:22 AM, Martin Simmons wrote:
>>>>>> On Tue, 03 Jan 2012 16:43:24 -0800, Jenny Aquilino said:
>> I have been happily running bacula-server-5.0.3 with postgresql-9.0.5_1
>> on a FreeBSD 8 server until an ill-fated chain of events led me to
>> have to upgrade bacula-server to 5.2.2 before I had a chance to test it
>> in a development environment.  Although I followed the release notes by
>> upgrading the storage nodes and running 'update_bacula_tables,' since
>> the upgrade the server (director) has crashed twice in 5 days and I've
>> had to manually restart it a couple of times after normal queries like
>> "status storage=X" appear to hang.
>>
>> Based on analysis of Munin graphs that report things like PostgreSQL
>> connections, PostgreSQL locks, process states, network connections
>> (netstat), and memory utilization it appears that something significant
>> has changed between 5.0.3 and 5.2.2 that is leaving a very high number
>> of PostgreSQL connections in Idle state instead of being closed.  When
>> Bacula crashes the PostgreSQL connections graph shows a large number of
>> connections in "Waiting for lock" state.  At the same time looking at
>> the PostgreSQL locks graph shows a very large number of
>> "ShareRowExclusive" and "AccessShare" locks which is behaviour we didn't
>> see prior to upgrading to 5.2.2.  If anyone would like a copy of these
>> graphs I can send them to you directly or post them to the mailing list
>> if that is allowed.
>>
>> I know that 5.2.3 was released on 12/16 and saw that there was a bug fix
>> with update stats that I thought may be related to what I'm seeing;
>> however, I have not updated because 5.2.3 has yet to make it into the
>> FreeBSD ports collection.  Based on the problem I described, does this
>> sound like something that may have been fixed in 5.2.3?
> I can't answer that because the bug in question (3419) is not in the Mantis
> bug reporting system.
>
>
>> If not, does anyone have other ideas on what I can do to
>> troubleshoot?
> You could attach gdb to the director process when it hangs and see what is
> happening with
>
> thread apply all bt
>
> __Martin
>
> ------------------------------------------------------------------------------
> Ridiculously easy VDI. With Citrix VDI-in-a-Box, you don't need a complex
> infrastructure or vast IT resources to deliver seamless, secure access to
> virtual desktops. With this all-in-one solution, easily deploy virtual
> desktops for less than the cost of PCs and save 60% on VDI infrastructure
> costs. Try it free! http://p.sf.net/sfu/Citrix-VDIinabox
> _______________________________________________
> Bacula-devel mailing list
> Bacula-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-devel

-- 
=======================================================
Jennifer R. Aquilino
S&T IT Support
Lawrence Livermore National Laboratory
Mail Stop L-556
7000 East Avenue
Livermore, CA 94550

Voice: (925)-424-4585
Fax:   (925)-423-8719
Email: aquili...@llnl.gov
========================================================


