On Wednesday 12 September 2007 22:53, Dan Langille wrote:
> On 12 Sep 2007 at 22:42, Kern Sibbald wrote:
> > On Wednesday 12 September 2007 21:25, Dan Langille wrote:
> > > On 12 Sep 2007 at 20:15, Martin Simmons wrote:
> > > > >>>>> On Wed, 12 Sep 2007 12:00:59 -0400, Dan Langille said:
> > > > >
> > > > > > After I encountered repeated core dumps of bacula-sd, Eric Bollengier
> > > > > > suggested this patch. It passed my first set of regression tests
> > > > > > without producing a bacula-sd.core.
> > > > >
> > > > > The traceback appears at the end of this email. Any objections to
> > > > > a commit now?
> > > > >
> > > > > $ svn di
> > > > > Index: stored/stored.c
> > > > > ===================================================================
> > > > > --- stored/stored.c (revision 5534)
> > > > > +++ stored/stored.c (working copy)
> > > > > @@ -600,10 +600,10 @@
> > > > > if (debug_level > 10) {
> > > > > print_memory_pool_stats();
> > > > > }
> > > > > - term_reservations_lock();
> > > > > term_msg();
> > > > > cleanup_crypto();
> > > > > free_volume_list();
> > > > > + term_reservations_lock();
> > > > > close_memory_pool();
> > > > >
> > > > > sm_dump(false); /* dump orphaned buffers */
> > > > > $
> > > >
> > > > Looks right, but that won't fix the bacula-fd crash you included
> > > > below. :-)
>
> The above code passed regression tests on both FreeBSD 6.2 and 7.x.
>
> I'll commit it soon.
>
> > > You win:
> > >
> > > [EMAIL PROTECTED]:~/src/BaculaRegressionTesting] $ gdb bin/bacula-sd bacula-sd.core
> > > GNU gdb 6.1.1 [FreeBSD]
> > > Copyright 2004 Free Software Foundation, Inc.
> > > GDB is free software, covered by the GNU General Public License, and you are
> > > welcome to change it and/or distribute copies of it under certain conditions.
> > > Type "show copying" to see the conditions.
> > > There is absolutely no warranty for GDB. Type "show warranty" for details.
> > > This GDB was configured as "i386-marcel-freebsd"...
> > > Core was generated by `bacula-sd'.
> > > Program terminated with signal 11, Segmentation fault.
> > > Reading symbols from /lib/libz.so.4...done.
> > > Loaded symbols for /lib/libz.so.4
> > > Reading symbols from /lib/libthr.so.3...done.
> > > Loaded symbols for /lib/libthr.so.3
> > > Reading symbols from /usr/local/lib/libintl.so.8...done.
> > > Loaded symbols for /usr/local/lib/libintl.so.8
> > > Reading symbols from /usr/local/lib/libiconv.so.3...done.
> > > Loaded symbols for /usr/local/lib/libiconv.so.3
> > > Reading symbols from /usr/lib/libssl.so.5...done.
> > > Loaded symbols for /usr/lib/libssl.so.5
> > > Reading symbols from /lib/libcrypto.so.5...done.
> > > Loaded symbols for /lib/libcrypto.so.5
> > > Reading symbols from /usr/lib/libstdc++.so.6...done.
> > > Loaded symbols for /usr/lib/libstdc++.so.6
> > > Reading symbols from /lib/libm.so.5...done.
> > > Loaded symbols for /lib/libm.so.5
> > > Reading symbols from /lib/libgcc_s.so.1...done.
> > > Loaded symbols for /lib/libgcc_s.so.1
> > > Reading symbols from /lib/libc.so.7...done.
> > > Loaded symbols for /lib/libc.so.7
> > > Reading symbols from /libexec/ld-elf.so.1...done.
> > > Loaded symbols for /libexec/ld-elf.so.1
> > > #0 0x0808ab61 in jcr_walk_start () at dlist.h:187
> > > 187 dlist.h: No such file or directory.
> > > in dlist.h
> > > [New Thread 0x28601900 (LWP 100488)]
> > > [New Thread 0x28601100 (LWP 100346)]
> > > (gdb) backtrace
> > > #0 0x0808ab61 in jcr_walk_start () at dlist.h:187
> > > #1 0x0808aee0 in get_jobid_from_tid (tid=0x28601900) at jcr.c:467
> > > #2 0x0808af43 in get_jobid_from_tid () at jcr.c:460
> > > #3 0x080731d3 in free_volume_list () at reserve.c:514
> > > #4 0x0804c9ac in terminate_stored (sig=15) at stored.c:606
> > > #5 0x08097519 in signal_handler (sig=15) at signal.c:180
> > > #6 0xbfbfffb4 in ?? ()
> > > #7 0x0000000f in ?? ()
> > > #8 0x00000000 in ?? ()
> > > #9 0xbf3f8bc0 in ?? ()
> > > #10 0x00000000 in ?? ()
> > > #11 0x08097260 in init_stack_dump () at signal.c:190
> > > #12 0x281051de in pthread_cond_init () from /lib/libthr.so.3
> > > #13 0x0809b2d6 in workq_server (arg=0x28644100) at workq.c:332
> > > #14 0x280fea7f in pthread_getprio () from /lib/libthr.so.3
> > > #15 0xbf1f6fec in ?? ()
> > > Current language: auto; currently c++
> > > (gdb)
> > >
> > > Now... you got a fix for that fd core? ;)
> >
> > The above is an SD crash due to the term_reservations_lock() problem.
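To spell out the ordering problem for the archives, here is a minimal sketch. It is not the stored.c code: the lock object, free_volumes() and the two shutdown functions are made-up names for illustration. The only thing taken from the patch is that the lock has to be torn down after the last routine that still takes it.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins; these are not Bacula's real types or globals.
static std::unique_ptr<std::mutex> reservations_lock;
static std::vector<int> volume_list;

void init_reservations_lock() { reservations_lock.reset(new std::mutex); }
void term_reservations_lock() { reservations_lock.reset(); }

// Cleanup routine that still needs the lock while it empties the list.
void free_volumes() {
    if (!reservations_lock) {
        // Real code would go on to use a destroyed lock here; the sketch
        // reports the mistake instead of crashing.
        throw std::logic_error("lock used after teardown");
    }
    std::lock_guard<std::mutex> guard(*reservations_lock);
    volume_list.clear();
}

// Order before the patch: the lock is torn down first.
// Returns true if the cleanup ran, false if it hit the dead lock.
bool shutdown_old_order() {
    init_reservations_lock();
    volume_list = {1, 2, 3};
    term_reservations_lock();
    try {
        free_volumes();
    } catch (const std::logic_error &) {
        return false;
    }
    return true;
}

// Order after the patch: free the list first, then tear down the lock.
bool shutdown_new_order() {
    init_reservations_lock();
    volume_list = {1, 2, 3};
    try {
        free_volumes();
    } catch (const std::logic_error &) {
        term_reservations_lock();
        return false;
    }
    assert(volume_list.empty());
    term_reservations_lock();
    return true;
}
```

In this sketch shutdown_old_order() fails and shutdown_new_order() succeeds, which is the same effect as moving the term_reservations_lock() call below free_volume_list() in the diff.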
>
> Again? Sorry. Try this FD:
>
> [EMAIL PROTECTED]:~/src/BaculaRegressionTesting] $ gdb bin/bacula-fd bacula-fd.core
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd"...
> Core was generated by `bacula-fd'.
> Program terminated with signal 11, Segmentation fault.
> Reading symbols from /lib/libz.so.4...done.
> Loaded symbols for /lib/libz.so.4
> Reading symbols from /lib/libthr.so.3...done.
> Loaded symbols for /lib/libthr.so.3
> Reading symbols from /usr/local/lib/libintl.so.8...done.
> Loaded symbols for /usr/local/lib/libintl.so.8
> Reading symbols from /usr/local/lib/libiconv.so.3...done.
> Loaded symbols for /usr/local/lib/libiconv.so.3
> Reading symbols from /usr/lib/libssl.so.5...done.
> Loaded symbols for /usr/lib/libssl.so.5
> Reading symbols from /lib/libcrypto.so.5...done.
> Loaded symbols for /lib/libcrypto.so.5
> Reading symbols from /usr/lib/libstdc++.so.6...done.
> Loaded symbols for /usr/lib/libstdc++.so.6
> Reading symbols from /lib/libm.so.5...done.
> Loaded symbols for /lib/libm.so.5
> Reading symbols from /lib/libgcc_s.so.1...done.
> Loaded symbols for /lib/libgcc_s.so.1
> Reading symbols from /lib/libc.so.7...done.
> Loaded symbols for /lib/libc.so.7
> Reading symbols from /libexec/ld-elf.so.1...done.
> Loaded symbols for /libexec/ld-elf.so.1
> #0 get_first_port_host_order (addrs=0x5a5a5a5a) at address_conf.h:64
> 64 address_conf.h: No such file or directory.
> in address_conf.h
> [New Thread 0x28601200 (LWP 100058)]
> [New Thread 0x28601100 (LWP 100478)]
> (gdb) backtrace
> #0 get_first_port_host_order (addrs=0x5a5a5a5a) at address_conf.h:64
> #1 0x0804cc93 in terminate_filed (sig=15) at filed.c:238
> #2 0x0807dbf9 in signal_handler (sig=15) at signal.c:180
> #3 0xbfbfffb4 in ?? ()
> #4 0x0000000f in ?? ()
> #5 0x00000000 in ?? ()
> #6 0xbf9febc0 in ?? ()
> #7 0x00000000 in ?? ()
> #8 0x0807d940 in init_stack_dump () at signal.c:190
> #9 0x280e41de in pthread_cond_init () from /lib/libthr.so.3
> #10 0x08081588 in watchdog_thread (arg=0x0) at watchdog.c:307
> #11 0x280dda7f in pthread_getprio () from /lib/libthr.so.3
> #12 0x00000000 in ?? ()
> Current language: auto; currently c++
> (gdb)
Ah, yes, that looks better. As I mentioned in a previous email, I believe I
have a fix for it.
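
For the record, the addrs=0x5a5a5a5a in frame 0 looks like a pointer read out of memory that had already been freed. As far as I know, FreeBSD's malloc can fill freed memory with 0x5a bytes when its junk option is on; I am assuming that is what happened here. The sketch below only imitates that fill on a buffer it owns. FakeConfig and poison_like_free() are invented names, and this is not the filed.c code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a config resource that holds an address list.
// A 32-bit field mimics a pointer on the i386 machine in the traceback.
struct FakeConfig {
    std::uint32_t addrs;
};

// Imitate an allocator's junk fill on free: overwrite the object with 0x5a.
// (Assumption: 0x5a is the fill byte used on the machine that dumped core.)
void poison_like_free(FakeConfig *cfg) {
    assert(cfg != nullptr);
    std::memset(cfg, 0x5a, sizeof(*cfg));
}

// Read the field after the simulated free, as a late shutdown step might.
std::uint32_t addrs_after_free() {
    FakeConfig cfg;
    cfg.addrs = 0x08123456u;  // arbitrary valid-looking value
    poison_like_free(&cfg);
    return cfg.addrs;
}
```

If that reading is right, the thing to check in terminate_filed() is whether something is freed before its address list is read.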
>
> > By the way, I think there is a problem with your kernel/debugger. The
> > stack frames after about frame 10 look to me like they have gone too far,
> > perhaps into another thread or something ...
>
> Could that happen if I ran gdb on an old core? That is, one which
> was not produced from the current source code?
Uh, yes, that is possible.
Regards,
Kern