On Tue, 2007-08-28 at 07:53 +0200, Kern Sibbald wrote:
> Hello Dirk,
>
> I've looked over the code, and if there is something wrong with it, I am
> certainly missing it. Perhaps someone on the devel list will see something
> that I cannot.
>
> At this point, I'm leaning toward a compiler bug. Could you give me the
> following information?
>
> 1. The version of the compiler and the architecture for each machine where
> you have the failure.
These are Gentoo machines. I got the gcc versions with:
gcc-config -l
myth2 ~ # uname -a
Linux myth2 2.6.19-gentoo-r2 #5 SMP Thu Dec 28 22:24:13 EST 2006 x86_64
AMD Athlon(tm) 64 Processor 3200+ AuthenticAMD GNU/Linux
gcc x86_64-pc-linux-gnu-4.1.1
srvalum3 ~ # uname -a
Linux srvalum3 2.6.20-gentoo #3 SMP Sun Feb 18 12:32:04 EST 2007 x86_64
Dual-Core AMD Opteron(tm) Processor 2210 AuthenticAMD GNU/Linux
gcc x86_64-pc-linux-gnu-4.1.1
> 2. The version of the compiler and the architecture for each machine where
> you do not have the failure.
workplay ~ # uname -a
Linux workplay 2.6.17-gentoo-r5 #4 SMP Thu Apr 19 19:58:11 EDT 2007 i686
AMD Sempron(tm) 2600+ AuthenticAMD GNU/Linux
gcc i686-pc-linux-gnu-3.4.4
>
> Could you give me the complete compile line with all the options for both
> dird/ua_output.c and lib/bsnprintf.c? Either edit the Makefile and remove
> the $(NO_ECHO) in front of the compile rules (the .c.o: and .cc.o: lines), or
> set the environment variable NO_ECHO to the empty string.
Compiling ua_output.c
/usr/bin/x86_64-pc-linux-gnu-g++ -c -I/usr/include/python2.4 -I.
-I.. -O2 -march=athlon64 -pipe -g ua_output.c
Compiling bsnprintf.c
/usr/bin/x86_64-pc-linux-gnu-g++ -c -I/usr/include/python2.4 -I.
-I.. -O2 -march=athlon64 -pipe -g bsnprintf.c
Then, with the Makefile optimization set to -O0:
Compiling bsnprintf.c
/usr/bin/x86_64-pc-linux-gnu-g++ -c -I/usr/include/python2.4 -I.
-I.. -O0 -march=athlon64 -pipe -g bsnprintf.c
Compiling ua_output.c
/usr/bin/x86_64-pc-linux-gnu-g++ -c -I/usr/include/python2.4 -I.
-I.. -O0 -march=athlon64 -pipe -g ua_output.c
>
> Could you set the compile optimization for those two files to -O0 (minus oh
> zero)? Either edit the Makefile or set it on the command line via a
> preceding environment variable setting of CFLAGS. Then test again and see
> if it fails.
Yes, it still fails.
>
> As a separate test, if the above test still fails, could you comment out the
> #define USE_BSNPRINTF 1
> line in src/version.h and then rebuild everything?
Yes, it fails as well.
>
> Another interesting test would be to put:
>
> Dmsg1(000, "fmt=%s\n", fmt);
>
> just after the line "again:" at line 737 in src/dird/ua_output.c as well as:
>
> Dmsg0(000, "goto again\n");
>
> after "msg = realloc_pool_memory(msg, maxlen + maxlen/2);" at line
> 741. Then report what it prints when the seg fault occurs.
srvalum3-dir: ua_output.c:738 fmt=%s
srvalum3-dir: ua_output.c:738 fmt=%s
srvalum3-dir: ua_output.c:738 fmt=%s
srvalum3-dir: ua_output.c:738 fmt=%s
srvalum3-dir: ua_output.c:738 fmt=%s
srvalum3-dir: ua_output.c:743 goto again
srvalum3-dir: ua_output.c:738 fmt=%s
Kaboom! bacula-dir, srvalum3-dir got signal 11 - Segmentation violation.
Attempting traceback.
Kaboom! exepath=/usr/sbin/
Calling: /usr/sbin/btraceback /usr/sbin/bacula-dir 16016
Traceback complete, attempting cleanup ...
And here is the backtrace for thread 2:
Thread 2 (Thread 1098918208 (LWP 16043)):
#0 0x00002af703e5faef in waitpid () from /lib/libpthread.so.0
#1 0x000000000046b9e3 in signal_handler (sig=11) at signal.c:167
#2 <signal handler called>
#3 0x00002af704a06c10 in strlen () from /lib/libc.so.6
#4 0x00002af7049d8b70 in vfprintf () from /lib/libc.so.6
#5 0x00002af7049f9a4a in vsnprintf () from /lib/libc.so.6
#6 0x0000000000452bac in bvsnprintf (str=0x9 <Address 0x9 out of
bounds>, size=<value optimized out>,
format=0x41801e01 "tn", ap=0x7) at bsys.c:292
#7 0x0000000000431f6d in bmsg (ua=0x6e74a8, fmt=0x513af4 "%s",
arg_ptr=0x41801ec0) at ua_output.c:740
#8 0x0000000000432326 in UAContext::send_msg (this=0x9, fmt=0x513af4 "%s")
at ua_output.c:778
#9 0x000000000042e43c in sql_handler (ctx=0x6e74a8, num_field=3,
row=0x6e7738) at ua_dotcmds.c:480
#10 0x00000000004508f2 in db_sql_query (mdb=0x6e7888, query=<value
optimized out>, result_handler=0x42e350 <sql_handler>,
ctx=0x6e74a8) at postgresql.c:320
#11 0x000000000042dd57 in sql_cmd (ua=0x6e74a8, cmd=<value optimized
out>) at ua_dotcmds.c:496
#12 0x000000000042da9b in do_a_dot_command (ua=0x6e74a8,
cmd=0x6f0f70 ".sql query=\"SELECT LogId, Time, LogText FROM Log
WHERE JobId='2674'\"") at ua_dotcmds.c:131
#13 0x000000000043e75f in handle_UA_client_request (arg=<value optimized
out>) at ua_server.c:145
#14 0x0000000000473c1d in workq_server (arg=<value optimized out>) at
workq.c:357
#15 0x00002af703e58135 in start_thread () from /lib/libpthread.so.0
#16 0x00002af704a5335e in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()
This machine has three versions of gcc installed:
srvalum3 bacula # gcc-config -l
[1] x86_64-pc-linux-gnu-3.3.6
[2] x86_64-pc-linux-gnu-4.1.1 *
[3] x86_64-pc-linux-gnu-4.1.2
So, as an experiment, I switched gcc to 4.1.2 and recompiled Bacula, with the
same result.
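
The debug output above shows one "goto again" followed by a crash inside
vsnprintf() on the very next pass, and only the x86_64 builds fail. One
pattern that would fit those symptoms, offered here as an assumption and not
as a confirmed diagnosis, is a va_list being handed to vsnprintf() a second
time after the buffer has been enlarged. C99 leaves a va_list indeterminate
once a v*printf function has consumed it. On i686 the type is usually a plain
pointer passed by value, so reuse tends to work by accident; on x86_64 it is
an array type that the callee updates in place. A minimal sketch of a
grow-and-retry loop that stays safe by using va_copy() follows. The names
format_grow() and vformat_grow() are hypothetical and are not Bacula
functions:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a grow-and-retry formatter such as bmsg().
 * Illustrative only; this is not Bacula's code.
 * Returns a malloc'ed string that the caller frees, or NULL on failure.
 */
char *vformat_grow(size_t initial, const char *fmt, va_list ap)
{
    size_t size = initial > 0 ? initial : 1;
    char *buf = NULL;

    assert(fmt != NULL);
    for (;;) {
        char *tmp = realloc(buf, size);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;

        /* Work on a copy: vsnprintf() leaves the va_list it is given
         * indeterminate, so `ap` must stay untouched for the next pass. */
        va_list ap_copy;
        va_copy(ap_copy, ap);
        int len = vsnprintf(buf, size, fmt, ap_copy);
        va_end(ap_copy);

        if (len < 0) {
            free(buf);
            return NULL;
        }
        if ((size_t)len < size) {
            return buf;              /* the whole string fitted */
        }
        size = (size_t)len + 1;      /* C99 reports the length needed */
    }
}

/* Variadic front end, in the role that send_msg() plays for bmsg(). */
char *format_grow(size_t initial, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    char *result = vformat_grow(initial, fmt, ap);
    va_end(ap);
    return result;
}
```

Calling format_grow(2, "%s-%d", "job", 2674) forces one retry, because the
initial two-byte buffer is too small. Dropping the va_copy() and passing `ap`
straight to vsnprintf() on each pass would be undefined behaviour on the
second pass, which is the kind of thing that could end in strlen() reading a
bad pointer.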
Dirk
>
> Best regards,
>
> Kern
>
> PS: for the list, the problem is clearly (according to the traceback) in
> Thread 2 between stack frame 4 and 5 where the argument "fmt" in stack frame
> 5 should be identical to argument "format" in stack frame 4, but has been
> shifted by 2 bytes!
>
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel