On Wednesday 12 September 2007 22:53, Dan Langille wrote:
> On 12 Sep 2007 at 22:42, Kern Sibbald wrote:
> > On Wednesday 12 September 2007 21:25, Dan Langille wrote:
> > > On 12 Sep 2007 at 20:15, Martin Simmons wrote:
> > > > >>>>> On Wed, 12 Sep 2007 12:00:59 -0400, Dan Langille said:
> > > > >
> > > > > > After I encountered repeated cores of bacula-sd, Eric Bollengier
> > > > > > suggested this patch.  It passed my first set of regression tests
> > > > > > without producing a bacula-sd.core.
> > > > >
> > > > > The traceback appears at the end of this email.  Any objections to
> > > > > a commit now?
> > > > >
> > > > > $ svn di
> > > > > Index: stored/stored.c
> > > > > ===================================================================
> > > > > --- stored/stored.c     (revision 5534)
> > > > > +++ stored/stored.c     (working copy)
> > > > > @@ -600,10 +600,10 @@
> > > > >     if (debug_level > 10) {
> > > > >        print_memory_pool_stats();
> > > > >     }
> > > > > -   term_reservations_lock();
> > > > >     term_msg();
> > > > >     cleanup_crypto();
> > > > >     free_volume_list();
> > > > > +   term_reservations_lock();
> > > > >     close_memory_pool();
> > > > >
> > > > >     sm_dump(false);                    /* dump orphaned buffers */
> > > > > $
> > > >
> > > > Looks right, but that won't fix the bacula-fd crash you included
> > > > below
> > > >
> > > > :-)
>
> The above code passed regression tests on both FreeBSD 6.2 and 7.x.
>
> I'll commit it soon.
>
> > > You win:
> > >
> > > [EMAIL PROTECTED]:~/src/BaculaRegressionTesting] $ gdb bin/bacula-sd bacula-sd.core
> > > GNU gdb 6.1.1 [FreeBSD]
> > > Copyright 2004 Free Software Foundation, Inc.
> > > GDB is free software, covered by the GNU General Public License, and you are
> > > welcome to change it and/or distribute copies of it under certain conditions.
> > > Type "show copying" to see the conditions.
> > > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > > This GDB was configured as "i386-marcel-freebsd"...
> > > Core was generated by `bacula-sd'.
> > > Program terminated with signal 11, Segmentation fault.
> > > Reading symbols from /lib/libz.so.4...done.
> > > Loaded symbols for /lib/libz.so.4
> > > Reading symbols from /lib/libthr.so.3...done.
> > > Loaded symbols for /lib/libthr.so.3
> > > Reading symbols from /usr/local/lib/libintl.so.8...done.
> > > Loaded symbols for /usr/local/lib/libintl.so.8
> > > Reading symbols from /usr/local/lib/libiconv.so.3...done.
> > > Loaded symbols for /usr/local/lib/libiconv.so.3
> > > Reading symbols from /usr/lib/libssl.so.5...done.
> > > Loaded symbols for /usr/lib/libssl.so.5
> > > Reading symbols from /lib/libcrypto.so.5...done.
> > > Loaded symbols for /lib/libcrypto.so.5
> > > Reading symbols from /usr/lib/libstdc++.so.6...done.
> > > Loaded symbols for /usr/lib/libstdc++.so.6
> > > Reading symbols from /lib/libm.so.5...done.
> > > Loaded symbols for /lib/libm.so.5
> > > Reading symbols from /lib/libgcc_s.so.1...done.
> > > Loaded symbols for /lib/libgcc_s.so.1
> > > Reading symbols from /lib/libc.so.7...done.
> > > Loaded symbols for /lib/libc.so.7
> > > Reading symbols from /libexec/ld-elf.so.1...done.
> > > Loaded symbols for /libexec/ld-elf.so.1
> > > #0  0x0808ab61 in jcr_walk_start () at dlist.h:187
> > > 187     dlist.h: No such file or directory.
> > >         in dlist.h
> > > [New Thread 0x28601900 (LWP 100488)]
> > > [New Thread 0x28601100 (LWP 100346)]
> > > (gdb) backtrace
> > > #0  0x0808ab61 in jcr_walk_start () at dlist.h:187
> > > #1  0x0808aee0 in get_jobid_from_tid (tid=0x28601900) at jcr.c:467
> > > #2  0x0808af43 in get_jobid_from_tid () at jcr.c:460
> > > #3  0x080731d3 in free_volume_list () at reserve.c:514
> > > #4  0x0804c9ac in terminate_stored (sig=15) at stored.c:606
> > > #5  0x08097519 in signal_handler (sig=15) at signal.c:180
> > > #6  0xbfbfffb4 in ?? ()
> > > #7  0x0000000f in ?? ()
> > > #8  0x00000000 in ?? ()
> > > #9  0xbf3f8bc0 in ?? ()
> > > #10 0x00000000 in ?? ()
> > > #11 0x08097260 in init_stack_dump () at signal.c:190
> > > #12 0x281051de in pthread_cond_init () from /lib/libthr.so.3
> > > #13 0x0809b2d6 in workq_server (arg=0x28644100) at workq.c:332
> > > #14 0x280fea7f in pthread_getprio () from /lib/libthr.so.3
> > > #15 0xbf1f6fec in ?? ()
> > > Current language:  auto; currently c++
> > > (gdb)
> > >
> > > Now... you got a fix for that fd core? ;)
> >
> > The above is an SD crash due to the term_reservations_lock() problem.
>
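To illustrate the ordering problem behind the SD patch, here is a minimal sketch. It is not Bacula code: ReservationsLock, free_volumes() and the two shutdown functions are hypothetical stand-ins, and it assumes (as the backtrace and the patch suggest) that free_volume_list() still depends on state that term_reservations_lock() tears down. The model reports a failure where the real daemon would touch a destroyed lock.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the SD reservations lock (not the Bacula type).
struct ReservationsLock {
    bool initialized = true;
    void term() { initialized = false; }          // models term_reservations_lock()
    bool acquire() const { return initialized; }  // fails instead of crashing
};

// Hypothetical stand-in for free_volume_list(): it needs the lock to be valid.
bool free_volumes(const ReservationsLock &lock, std::vector<std::string> &volumes) {
    if (!lock.acquire()) {
        return false;  // the real daemon would be using a destroyed lock here
    }
    volumes.clear();
    return true;
}

// Old order: tear down the lock first, then free the list that depends on it.
bool shutdown_old_order() {
    ReservationsLock lock;
    std::vector<std::string> volumes{"Vol001", "Vol002"};
    lock.term();
    return free_volumes(lock, volumes);
}

// Patched order: free the list while the lock is still valid, then tear it down.
bool shutdown_patched_order() {
    ReservationsLock lock;
    std::vector<std::string> volumes{"Vol001", "Vol002"};
    bool ok = free_volumes(lock, volumes);
    lock.term();
    return ok;
}
```

In this model shutdown_old_order() returns false and shutdown_patched_order() returns true, which is the same dependency the patch respects by moving term_reservations_lock() below free_volume_list().
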
> Again?  Sorry.  Try this FD:
>
> [EMAIL PROTECTED]:~/src/BaculaRegressionTesting] $ gdb bin/bacula-fd bacula-fd.core
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd"...
> Core was generated by `bacula-fd'.
> Program terminated with signal 11, Segmentation fault.
> Reading symbols from /lib/libz.so.4...done.
> Loaded symbols for /lib/libz.so.4
> Reading symbols from /lib/libthr.so.3...done.
> Loaded symbols for /lib/libthr.so.3
> Reading symbols from /usr/local/lib/libintl.so.8...done.
> Loaded symbols for /usr/local/lib/libintl.so.8
> Reading symbols from /usr/local/lib/libiconv.so.3...done.
> Loaded symbols for /usr/local/lib/libiconv.so.3
> Reading symbols from /usr/lib/libssl.so.5...done.
> Loaded symbols for /usr/lib/libssl.so.5
> Reading symbols from /lib/libcrypto.so.5...done.
> Loaded symbols for /lib/libcrypto.so.5
> Reading symbols from /usr/lib/libstdc++.so.6...done.
> Loaded symbols for /usr/lib/libstdc++.so.6
> Reading symbols from /lib/libm.so.5...done.
> Loaded symbols for /lib/libm.so.5
> Reading symbols from /lib/libgcc_s.so.1...done.
> Loaded symbols for /lib/libgcc_s.so.1
> Reading symbols from /lib/libc.so.7...done.
> Loaded symbols for /lib/libc.so.7
> Reading symbols from /libexec/ld-elf.so.1...done.
> Loaded symbols for /libexec/ld-elf.so.1
> #0  get_first_port_host_order (addrs=0x5a5a5a5a) at address_conf.h:64
> 64      address_conf.h: No such file or directory.
>         in address_conf.h
> [New Thread 0x28601200 (LWP 100058)]
> [New Thread 0x28601100 (LWP 100478)]
> (gdb) backtrace
> #0  get_first_port_host_order (addrs=0x5a5a5a5a) at address_conf.h:64
> #1  0x0804cc93 in terminate_filed (sig=15) at filed.c:238
> #2  0x0807dbf9 in signal_handler (sig=15) at signal.c:180
> #3  0xbfbfffb4 in ?? ()
> #4  0x0000000f in ?? ()
> #5  0x00000000 in ?? ()
> #6  0xbf9febc0 in ?? ()
> #7  0x00000000 in ?? ()
> #8  0x0807d940 in init_stack_dump () at signal.c:190
> #9  0x280e41de in pthread_cond_init () from /lib/libthr.so.3
> #10 0x08081588 in watchdog_thread (arg=0x0) at watchdog.c:307
> #11 0x280dda7f in pthread_getprio () from /lib/libthr.so.3
> #12 0x00000000 in ?? ()
> Current language:  auto; currently c++
> (gdb)

Ah, yes, that looks better.  As I mentioned in a previous email, I believe I
have a fix for it.

>
> > By the way, I think there is a problem with your kernel/debugger.  The
> > stack dump after about frame 10 looks to me like it has gone too far,
> > perhaps into another thread or something ...
>
> Could that happen if I ran gdb on an old core?  That is, one which
> was not produced from the current source code?

Uh, yes, that is possible.

Regards,

Kern

_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel