Hi Martin, hi everyone,
thanks for your hints. This is the traceback I get now, and I really don't have a clue what to make of it :(
Any help is highly appreciated.
Created /var/lib/bareos/bareos-sd.core.21385 for doing postmortem debugging
[New LWP 21385]
[New LWP 21386]
[New LWP 21388]
[New LWP 21389]
[New LWP 8746]
[New LWP 19330]
[New LWP 19331]
[New LWP 19342]
[New LWP 19343]
[New LWP 19344]
[New LWP 19345]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: the debug information found in "/lib64/ld-2.23.so" does not match
"/lib64/ld-linux-x86-64.so.2" (CRC mismatch).
Core was generated by `/usr/sbin/bareos-sd'.
#0 0x00007fd4c491574d in poll () at ../sysdeps/unix/syscall-template.S:84
84 ../sysdeps/unix/syscall-template.S: No such file or directory.
[Current thread is 1 (Thread 0x7fd4c5f05740 (LWP 21385))]
$1 = 0x626da0 <my_name> "gimli-sd"
$2 = 0xfc6c68 "bareos-sd"
$3 = 0xfc6ca8 "/usr/sbin/bareos-sd"
$4 = 0x0
$5 = 0x7fd4c53e89a0 "17.2.4 (21 Sep 2017)"
$6 = 0x7fd4c53e8974 "x86_64-pc-linux-gnu"
$7 = 0x7fd4c53e896d "ubuntu"
$8 = 0x7fd4c53e898f "Ubuntu 16.04 LTS"
$9 = "gimli", '\000' <repeats 250 times>
$10 = 0x7fd4c53e8988 "ubuntu Ubuntu 16.04 LTS"
Environment variable "TestName" not defined.
#0 0x00007fd4c491574d in poll () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007fd4c53aaa43 in bnet_thread_server_tcp (addr_list=<optimized out>,
max_clients=<optimized out>, sockfds=<optimized out>,
client_wq=client_wq@entry=0x627160 <socket_workq>, nokeepalive=<optimized out>,
handle_client_request=handle_client_request@entry=0x418960
<handle_connection_request(void*)>) at bnet_server_tcp.c:306
#2 0x0000000000418ce8 in start_socket_server (addrs=<optimized out>) at
socket_server.c:122
#3 0x0000000000408cf8 in main (argc=<optimized out>, argv=<optimized out>) at
stored.c:322
Thread 11 (Thread 0x7fd4b8ff9700 (LWP 19345)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007fd4c53d54e5 in rwl_writelock_p (rwl=rwl@entry=0x7fd4c5a66c00
<reservation_lock>, file=file@entry=0x7fd4c5857e5f "reserve.c",
line=line@entry=303) at rwlock.c:236
#2 0x00007fd4c58418ce in _lock_reservations (file=file@entry=0x7fd4c5857e5f
"reserve.c", line=line@entry=303) at reserve.c:113
#3 0x00007fd4c58442ee in use_device_cmd (jcr=<optimized out>) at reserve.c:303
#4 use_cmd (jcr=<optimized out>) at reserve.c:76
#5 0x000000000041018a in handle_director_connection (dir=dir@entry=0xfec638)
at dir_cmd.c:315
#6 0x00000000004189fa in handle_connection_request (arg=0xfec638) at
socket_server.c:100
#7 0x00007fd4c53e26b5 in workq_server (arg=arg@entry=0x627160 <socket_workq>)
at workq.c:336
#8 0x00007fd4c53c7c7f in lmgr_thread_launcher (x=0xfec848) at lockmgr.c:928
#9 0x00007fd4c51776ba in start_thread (arg=0x7fd4b8ff9700) at
pthread_create.c:333
#10 0x00007fd4c492141d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 10 (Thread 0x7fd4ba7fc700 (LWP 19344)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007fd4c53d54e5 in rwl_writelock_p (rwl=rwl@entry=0x7fd4c5a66c00
<reservation_lock>, file=file@entry=0x7fd4c5857e5f "reserve.c",
line=line@entry=303) at rwlock.c:236
#2 0x00007fd4c58418ce in _lock_reservations (file=file@entry=0x7fd4c5857e5f
"reserve.c", line=line@entry=303) at reserve.c:113
#3 0x00007fd4c58442ee in use_device_cmd (jcr=<optimized out>) at reserve.c:303
#4 use_cmd (jcr=<optimized out>) at reserve.c:76
#5 0x000000000041018a in handle_director_connection (dir=dir@entry=0xff2b58)
at dir_cmd.c:315
#6 0x00000000004189fa in handle_connection_request (arg=0xff2b58) at
socket_server.c:100
#7 0x00007fd4c53e26b5 in workq_server (arg=arg@entry=0x627160 <socket_workq>)
at workq.c:336
#8 0x00007fd4c53c7c7f in lmgr_thread_launcher (x=0xff2d68) at lockmgr.c:928
#9 0x00007fd4c51776ba in start_thread (arg=0x7fd4ba7fc700) at
pthread_create.c:333
#10 0x00007fd4c492141d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 9 (Thread 0x7fd4bb7fe700 (LWP 19343)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007fd4c53d54e5 in rwl_writelock_p (rwl=rwl@entry=0x7fd4c5a66c00
<reservation_lock>, file=file@entry=0x7fd4c5857e5f "reserve.c",
line=line@entry=303) at rwlock.c:236
#2 0x00007fd4c58418ce in _lock_reservations (file=file@entry=0x7fd4c5857e5f
"reserve.c", line=line@entry=303) at reserve.c:113
#3 0x00007fd4c58442ee in use_device_cmd (jcr=<optimized out>) at reserve.c:303
#4 use_cmd (jcr=<optimized out>) at reserve.c:76
#5 0x000000000041018a in handle_director_connection (dir=dir@entry=0xff25b8)
at dir_cmd.c:315
#6 0x00000000004189fa in handle_connection_request (arg=0xff25b8) at
socket_server.c:100
#7 0x00007fd4c53e26b5 in workq_server (arg=arg@entry=0x627160 <socket_workq>)
at workq.c:336
#8 0x00007fd4c53c7c7f in lmgr_thread_launcher (x=0xff27c8) at lockmgr.c:928
#9 0x00007fd4c51776ba in start_thread (arg=0x7fd4bb7fe700) at
pthread_create.c:333
#10 0x00007fd4c492141d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 8 (Thread 0x7fd4bbfff700 (LWP 19342)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007fd4c53d54e5 in rwl_writelock_p (rwl=rwl@entry=0x7fd4c5a66c00
<reservation_lock>, file=file@entry=0x7fd4c5857e5f "reserve.c",
line=line@entry=303) at rwlock.c:236
#2 0x00007fd4c58418ce in _lock_reservations (file=file@entry=0x7fd4c5857e5f
"reserve.c", line=line@entry=303) at reserve.c:113
#3 0x00007fd4c58442ee in use_device_cmd (jcr=<optimized out>) at reserve.c:303
#4 use_cmd (jcr=<optimized out>) at reserve.c:76
#5 0x000000000041018a in handle_director_connection (dir=dir@entry=0xfc7368)
at dir_cmd.c:315
#6 0x00000000004189fa in handle_connection_request (arg=0xfc7368) at
socket_server.c:100
#7 0x00007fd4c53e26b5 in workq_server (arg=arg@entry=0x627160 <socket_workq>)
at workq.c:336
#8 0x00007fd4c53c7c7f in lmgr_thread_launcher (x=0xfc7578) at lockmgr.c:928
#9 0x00007fd4c51776ba in start_thread (arg=0x7fd4bbfff700) at
pthread_create.c:333
#10 0x00007fd4c492141d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 7 (Thread 0x7fd4b97fa700 (LWP 19331)):
#0 0x00007fd4c491127d in read () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007fd4c48945e8 in _IO_new_file_underflow (fp=0x7fd4a800c280) at
fileops.c:592
#2 0x00007fd4c489560e in __GI__IO_default_uflow (fp=0x7fd4a800c280) at
genops.c:413
#3 0x00007fd4c4890108 in _IO_getc (fp=fp@entry=0x7fd4a800c280) at getc.c:38
#4 0x00007fd4c53b6de7 in bfgets (s=s@entry=0x7fd4a80044f8 "Unloading drive 1
into Storage Element 211...", size=size@entry=32000, fd=0x7fd4a800c280) at
bsys.c:807
#5 0x00007fd4c53ac9ee in run_program_full_output (prog=<optimized out>,
wait=wait@entry=300, results=@0x7fd4b97f9970: 0x7fd498001c80 "") at bpipe.c:442
#6 0x00007fd4c582c650 in unload_autochanger (dcr=dcr@entry=0x7fd4a8001f68,
loaded=211, loaded@entry=-1, lock_set=lock_set@entry=false) at autochanger.c:472
#7 0x00007fd4c5842e4d in can_reserve_drive (rctx=..., dcr=0x7fd4a8001f68) at
reserve.c:1198
#8 reserve_device_for_append (rctx=..., dcr=0x7fd4a8001f68) at reserve.c:975
#9 reserve_device (rctx=...) at reserve.c:770
#10 0x00007fd4c58431d2 in search_res_for_device (rctx=...) at reserve.c:617
#11 0x00007fd4c58436b7 in find_suitable_device_for_job
(jcr=jcr@entry=0x7fd4a80032c8, rctx=...) at reserve.c:568
#12 0x00007fd4c584439d in use_device_cmd (jcr=<optimized out>) at reserve.c:320
#13 use_cmd (jcr=<optimized out>) at reserve.c:76
#14 0x000000000041018a in handle_director_connection (dir=dir@entry=0xfcd338)
at dir_cmd.c:315
#15 0x00000000004189fa in handle_connection_request (arg=0xfcd338) at
socket_server.c:100
#16 0x00007fd4c53e26b5 in workq_server (arg=arg@entry=0x627160 <socket_workq>)
at workq.c:336
#17 0x00007fd4c53c7c7f in lmgr_thread_launcher (x=0xfcd508) at lockmgr.c:928
#18 0x00007fd4c51776ba in start_thread (arg=0x7fd4b97fa700) at
pthread_create.c:333
#19 0x00007fd4c492141d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 6 (Thread 0x7fd492ffd700 (LWP 19330)):
#0 0x00007fd4c518051d in read () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007fd4c53b4447 in BSOCK_TCP::read_nbytes (this=0xfec2a8, ptr=<optimized
out>, nbytes=65536) at bsock_tcp.c:978
#2 0x00007fd4c53b3dd6 in BSOCK_TCP::recv (this=0xfec2a8) at bsock_tcp.c:617
#3 0x00007fd4c53a9ced in bget_msg (sock=sock@entry=0xfec2a8) at bget_msg.c:53
#4 0x00000000004095c0 in do_append_data (jcr=jcr@entry=0x7fd484001078,
bs=bs@entry=0xfec2a8, what=what@entry=0x41ec5c "FD") at append.c:209
#5 0x0000000000410934 in append_data_cmd (jcr=0x7fd484001078) at fd_cmds.c:271
#6 0x0000000000410d43 in do_fd_commands (jcr=0x7fd484001078) at fd_cmds.c:227
#7 0x0000000000410f52 in run_job (jcr=jcr@entry=0x7fd484001078) at
fd_cmds.c:183
#8 0x00000000004119f2 in do_job_run (jcr=0x7fd484001078) at job.c:238
#9 0x000000000041018a in handle_director_connection (dir=dir@entry=0xfed348)
at dir_cmd.c:315
#10 0x00000000004189fa in handle_connection_request (arg=0xfed348) at
socket_server.c:100
#11 0x00007fd4c53e26b5 in workq_server (arg=arg@entry=0x627160 <socket_workq>)
at workq.c:336
#12 0x00007fd4c53c7c7f in lmgr_thread_launcher (x=0xfedb98) at lockmgr.c:928
#13 0x00007fd4c51776ba in start_thread (arg=0x7fd492ffd700) at
pthread_create.c:333
#14 0x00007fd4c492141d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 5 (Thread 0x7fd4c25d2700 (LWP 8746)):
#0 0x00007fd4c518051d in read () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007fd4c53b4447 in BSOCK_TCP::read_nbytes (this=0xfec038, ptr=<optimized
out>, nbytes=4) at bsock_tcp.c:978
#2 0x00007fd4c53b3b86 in BSOCK_TCP::recv (this=0xfec038) at bsock_tcp.c:550
#3 0x00007fd4c53a9ced in bget_msg (sock=sock@entry=0xfec038) at bget_msg.c:53
#4 0x00000000004095c0 in do_append_data (jcr=jcr@entry=0x7fd4ac0008e8,
bs=bs@entry=0xfec038, what=what@entry=0x41ec5c "FD") at append.c:209
#5 0x0000000000410934 in append_data_cmd (jcr=0x7fd4ac0008e8) at fd_cmds.c:271
#6 0x0000000000410d43 in do_fd_commands (jcr=0x7fd4ac0008e8) at fd_cmds.c:227
#7 0x0000000000410f52 in run_job (jcr=jcr@entry=0x7fd4ac0008e8) at
fd_cmds.c:183
#8 0x00000000004119f2 in do_job_run (jcr=0x7fd4ac0008e8) at job.c:238
#9 0x000000000041018a in handle_director_connection (dir=dir@entry=0xfedd48)
at dir_cmd.c:315
#10 0x00000000004189fa in handle_connection_request (arg=0xfedd48) at
socket_server.c:100
#11 0x00007fd4c53e26b5 in workq_server (arg=arg@entry=0x627160 <socket_workq>)
at workq.c:336
#12 0x00007fd4c53c7c7f in lmgr_thread_launcher (x=0xfeda18) at lockmgr.c:928
#13 0x00007fd4c51776ba in start_thread (arg=0x7fd4c25d2700) at
pthread_create.c:333
#14 0x00007fd4c492141d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 4 (Thread 0x7fd4c15d0700 (LWP 21389)):
#0 0x00007fd4c5180f7b in __waitpid (pid=pid@entry=19353,
stat_loc=stat_loc@entry=0x7fd4c15cf3ac, options=options@entry=0) at
../sysdeps/unix/sysv/linux/waitpid.c:29
#1 0x00007fd4c53d82f4 in signal_handler (sig=11) at signal.c:240
#2 <signal handler called>
#3 0x0000000000417b4b in update_job_statistics (jcr=0x7fd4a0001598,
now=1520107218) at sd_stats.c:296
#4 0x0000000000418223 in statistics_thread_runner (arg=arg@entry=0x0) at
sd_stats.c:386
#5 0x00007fd4c53c7c7f in lmgr_thread_launcher (x=0xfed898) at lockmgr.c:928
#6 0x00007fd4c51776ba in start_thread (arg=0x7fd4c15d0700) at
pthread_create.c:333
#7 0x00007fd4c492141d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 3 (Thread 0x7fd4c1dd1700 (LWP 21388)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x00007fd4c53c812c in bthread_cond_timedwait_p
(cond=cond@entry=0x7fd4c55ff780 <_ZL5timer>, m=m@entry=0x7fd4c55ff7c0
<_ZL11timer_mutex>, abstime=abstime@entry=0x7fd4c1dd0e10,
file=file@entry=0x7fd4c53ed222 "watchdog.c", line=line@entry=313) at
lockmgr.c:813
#2 0x00007fd4c53e1ca5 in watchdog_thread (arg=arg@entry=0x0) at watchdog.c:313
#3 0x00007fd4c53c7c7f in lmgr_thread_launcher (x=0xfed718) at lockmgr.c:928
#4 0x00007fd4c51776ba in start_thread (arg=0x7fd4c1dd1700) at
pthread_create.c:333
#5 0x00007fd4c492141d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 2 (Thread 0x7fd4c2dd3700 (LWP 21386)):
#0 0x00007fd4c5180c1d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007fd4c53b6253 in bmicrosleep (sec=sec@entry=30, usec=usec@entry=0) at
bsys.c:171
#2 0x00007fd4c53c7bbc in check_deadlock () at lockmgr.c:568
#3 0x00007fd4c51776ba in start_thread (arg=0x7fd4c2dd3700) at
pthread_create.c:333
#4 0x00007fd4c492141d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 1 (Thread 0x7fd4c5f05740 (LWP 21385)):
#0 0x00007fd4c491574d in poll () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007fd4c53aaa43 in bnet_thread_server_tcp (addr_list=<optimized out>,
max_clients=<optimized out>, sockfds=<optimized out>,
client_wq=client_wq@entry=0x627160 <socket_workq>, nokeepalive=<optimized out>,
handle_client_request=handle_client_request@entry=0x418960
<handle_connection_request(void*)>) at bnet_server_tcp.c:306
#2 0x0000000000418ce8 in start_socket_server (addrs=<optimized out>) at
socket_server.c:122
#3 0x0000000000408cf8 in main (argc=<optimized out>, argv=<optimized out>) at
stored.c:322
#0 0x00007fd4c491574d in poll () at ../sysdeps/unix/syscall-template.S:84
84 in ../sysdeps/unix/syscall-template.S
No locals.
#1 0x00007fd4c53aaa43 in bnet_thread_server_tcp (addr_list=<optimized out>,
max_clients=<optimized out>, sockfds=<optimized out>,
client_wq=client_wq@entry=0x627160 <socket_workq>, nokeepalive=<optimized out>,
handle_client_request=handle_client_request@entry=0x418960
<handle_connection_request(void*)>) at bnet_server_tcp.c:306
306 bnet_server_tcp.c: No such file or directory.
cnt = <optimized out>
newsockfd = <optimized out>
status = <optimized out>
clilen = 16
cli_addr = {sa_family = 2, sa_data =
"fҬ\022\372\005\000\000\000\000\000\000\000"}
tlog = <optimized out>
tmax = <optimized out>
value = 1
request = {fd = 15, user = '\000' <repeats 127 times>, daemon = "gimli-sd",
'\000' <repeats 119 times>, pid = "21385\000\000\000\000", client = {{name =
'\000' <repeats 127 times>, addr = '\000' <repeats 127 times>, sin =
0x7fd4c516ebc0, unit = 0x0, request = 0x7ffc3f9df6b0}}, server = {{name =
'\000' <repeats 127 times>, addr = '\000' <repeats 127 times>, sin =
0x7fd4c516eb40, unit = 0x0, request = 0x7ffc3f9df6b0}}, sink = 0x0, hostname =
0x7fd4c4f6baa0 <sock_hostname>, hostaddr = 0x7fd4c4f6ba50 <sock_hostaddr>,
cleanup = 0x0, config = 0x0}
ipaddr = 0x0
next = <optimized out>
to_free = <optimized out>
fd_ptr = <optimized out>
buf =
"172.18.250.5\000\000\000\000\001\000\000\000\000\000\000\000\004\000\000\000\061",
'\000' <repeats 11 times>, "\024\375\235?\374\177\000\000[\000\000\000\004",
'\000' <repeats 19 times>,
"\060\373\235?\374\177\000\000\b{\374\000\000\000\000\000\030\375\235?\374\177\000\000\000\000\000\000\004",
'\000' <repeats 19 times>, "`\373\235?\374\177\000"
nfds = <optimized out>
events = 195
pfds = <optimized out>
allbuf =
"N\003\000\000\000\000\000\000\370\257\202\304\324\177\000\000\360\324\360\305\324\177\000\000\330\373\235?\374\177\000\000\324\373\235?\374\177\000\000\341\377\317\305\324\177\000\000\360\373\235?\374\177\000\000\265,\376\303\324\177\000\000\350\276\374\303\324\177\000\000\330\373\235?\374\177\000\000\060\270\202\r\000\000\000\000\340\n6\000\000\000\000\000\060\000\000\000\000\000\000\000\260\374\235?\374\177\000\000\370\257\202\304\324\177\000\000\200݁\304\324\177\000\000\324\373\235?\374\177\000\000\240\374\235?\374\177\000\000\030\230\360\305\324\177\000\000\000\000\000\000\374\177\000\000\025\000\000\000\000\000\000\000\024\376\317\305\324\177\000\000\000\000\000\000\000\000\000\000w\004\000\000\000\000\000\000\360\324\360\305\324\177\000\000\200"...
#2 0x0000000000418ce8 in start_socket_server (addrs=<optimized out>) at
socket_server.c:122
122 socket_server.c: No such file or directory.
p = <optimized out>
#3 0x0000000000408cf8 in main (argc=<optimized out>, argv=<optimized out>) at
stored.c:322
322 stored.c: No such file or directory.
ch = <optimized out>
no_signals = <optimized out>
test_config = false
export_config = false
export_config_schema = false
thid = 140551770679040
uid = <optimized out>
gid = <optimized out>
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
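
Reading the dump above, the only thread with a "<signal handler called>" frame is Thread 4 (signal_handler (sig=11), i.e. SIGSEGV), and the frame right below that marker is update_job_statistics at sd_stats.c:296. The other threads appear to be just waiting or reading. In case it helps with longer dumps, here is a minimal Python sketch that picks such threads out of a gdb "thread apply all bt" listing. The function name and the regular expressions are my own assumptions about the output format shown here, not part of Bareos or gdb.

```python
import re

# Header line of each thread section, as printed in the dump above.
THREAD_RE = re.compile(r"^Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP (\d+)\)\):")
# A frame line such as "#3 0x0000000000417b4b in update_job_statistics (jcr=...".
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+ in )?(\S+) \(")
SIGNAL_MARKER = "<signal handler called>"


def find_crashing_threads(backtrace):
    """Return (thread number, LWP, function) for every thread that has a
    signal-handler marker; the function is the first frame after the marker."""
    hits = []
    current = None      # (thread number, LWP) of the section being read
    want_frame = False  # True once the marker was seen in this section
    for raw in backtrace.splitlines():
        line = raw.strip()
        header = THREAD_RE.match(line)
        if header:
            current = (int(header.group(1)), int(header.group(2)))
            want_frame = False
            continue
        if SIGNAL_MARKER in line:
            want_frame = current is not None
            continue
        if want_frame:
            frame = FRAME_RE.match(line)
            if frame:
                hits.append((current[0], current[1], frame.group(1)))
                want_frame = False
    return hits


# Excerpt copied from the traceback in this mail (wrapped lines kept as-is).
SAMPLE = """\
Thread 4 (Thread 0x7fd4c15d0700 (LWP 21389)):
#0 0x00007fd4c5180f7b in __waitpid (pid=pid@entry=19353,
stat_loc=stat_loc@entry=0x7fd4c15cf3ac, options=options@entry=0) at
../sysdeps/unix/sysv/linux/waitpid.c:29
#1 0x00007fd4c53d82f4 in signal_handler (sig=11) at signal.c:240
#2 <signal handler called>
#3 0x0000000000417b4b in update_job_statistics (jcr=0x7fd4a0001598,
now=1520107218) at sd_stats.c:296
Thread 3 (Thread 0x7fd4c1dd1700 (LWP 21388)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
"""

if __name__ == "__main__":
    for thread, lwp, func in find_crashing_threads(SAMPLE):
        print(f"Thread {thread} (LWP {lwp}) received a signal in {func}")
```

This only points at where the signal was delivered; it does not explain why update_job_statistics faulted.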
On 01.03.2018 at 11:12, Martin Emrich wrote:
> On Thursday, 1 March 2018 07:06:35 UTC+1, A. Podstawka wrote:
>> Hi All,
>>
>> I have a problem with our Bareos installation.
>> The SD crashes during the run of scheduled backup jobs, and I can't find
>> out why (without your help).
>>
>> It seems that only some of the bareos-sd threads crash while the others
>> keep running flawlessly: running jobs on tape drive 1 crash, but jobs
>> on tape drive 2 still work and run until they are finished.
>> What should I do to get rid of this error?
>>
>> I have the following "Traceback" (mailed with the crash):
>> Any help or suggestions?
> I am no Bareos development expert, but you might install the debug info
> packages for your distribution so that not only memory addresses but also the
> actual code positions where the crash happens are displayed.
>
> I am also currently debugging SD crashes here (but probably unrelated to
> yours, as mine happen related to the Ceph RADOS backend).
>
> On CentOS 7, I installed the glibc debug infos (packages
> glibc-debuginfo-common, glibc-debuginfo), and I also have the Bareos debug
> infos (I built from source, but there is also a package bareos-debuginfo).
>
> On other distributions, there are probably similar packages. With them, the
> backtrace might become clearer, allowing the real experts here to step in ;)
>
> Cheers,
>
> Martin
>
--
Adam Podstawka