Hi,
My bacula-sd is deadlocking during copy jobs.
Version: 5.0.1
Compile options: --with-readline=/usr/include/readline --disable-conio
--with-mysql --enable-smartalloc
Linux version: i686-pc-linux-gnu debian 5.0.4
Every day I copy all jobs made that night from two disk pools to a migrate
pool using copy jobs (pool uncopied).
The migrate pool uses an autochanger with two drives. The bacula-sd
autochanger configuration is shown below.
Both pools load-balance their jobs over the two drives (using
Maximum Concurrent Jobs = 1 on each device in bacula-sd), as expected.
After a while the load on both the dir and the sd drops to 0.
When I run status stor in the console, I see the following (the output stops
at "Used Volume status:" and hangs forever):
*status stor
The defined Storage resources are:
1: Migrate
2: diskbackup
3: diskbackup2
Select Storage resource (1-3): 1
Connecting to Storage daemon Migrate at bacula-sd.solcon.nl:9103
bacula-sd Version: 5.0.1 (24 February 2010) i686-pc-linux-gnu debian 5.0.4
Daemon started 08-Mar-10 10:07, 20 Jobs run since started.
Heap: heap=2,367,488 smbytes=1,724,139 max_bytes=1,985,497 bufs=250 max_bufs=293
Sizes: boffset_t=8 size_t=4 int32_t=4 int64_t=8
Running Jobs:
Reading: Full Copy job D2D2T2 JobId=131743 Volume="disk2-1265"
pool="Disk2-Pool" device="diskbackup2" (/bacula/diskbackup2)
Files=3,337 Bytes=45,779,673 Bytes/sec=140,428
FDSocket closed
====
Jobs waiting to reserve a drive:
====
Terminated Jobs:
JobId Level Files Bytes Status Finished Name
===================================================================
131559 Full 184 70.61 M OK 08-Mar-10 11:20 D2D2T
131561 Full 215 32.03 M OK 08-Mar-10 11:22 D2D2T
131735 Full 233 1.541 G OK 08-Mar-10 11:23 D2D2T2
131563 Full 41 2.123 M OK 08-Mar-10 11:25 D2D2T
131737 Full 118 241.3 M OK 08-Mar-10 11:28 D2D2T2
131565 Full 21,836 239.7 M OK 08-Mar-10 11:30 D2D2T
131739 Full 2,069 596.7 M OK 08-Mar-10 11:32 D2D2T2
131567 Full 122 315.9 M OK 08-Mar-10 11:34 D2D2T
131741 Full 141 2.779 M OK 08-Mar-10 11:35 D2D2T2
131569 Full 187 20.59 M OK 08-Mar-10 11:37 D2D2T
====
Device status:
Autochanger "TandbergT40" with devices:
"Drive-1" (/dev/st0)
"Drive-2" (/dev/st1)
Device "Drive-1" (/dev/st0) is mounted with:
Volume: B4MO03
Pool: Migrate-Pool
Media type: LTO-4
Slot 19 is loaded in drive 0.
Total Bytes=4,032,451,584 Blocks=62,506 Bytes/block=64,513
Positioned at File=8 Block=0
Device "Drive-2" (/dev/st1) is mounted with:
Volume: B4MO01
Pool: Migrate-Pool
Media type: LTO-4
Slot 22 is loaded in drive 1.
Total Bytes=3,954,521,088 Blocks=61,298 Bytes/block=64,513
Positioned at File=17 Block=0
Device "diskbackup" (/bacula/diskbackup) is not open.
Device "diskrestore" (/bacula/diskbackup) is not open.
Device "diskbackup2" (/bacula/diskbackup2) is mounted with:
Volume: disk2-1265
Pool: *unknown*
Media type: File
Total Bytes Read=0 Blocks Read=0 Bytes/block=0
Positioned at File=0 Block=1,575,192,713
====
Used Volume status:
Restarting bacula-sd is the only way to get it working again.
I tried running bacula-sd manually under gdb, but gdb does not show
anything useful, only output like this:
[New Thread 0xb61e0b90 (LWP 8026)]
[New Thread 0xb57dfb90 (LWP 8027)]
[New Thread 0xb4fdeb90 (LWP 8028)]
[New Thread 0xb47ddb90 (LWP 8029)]
[Thread 0xb57dfb90 (LWP 8027) exited]
[Thread 0xb61e0b90 (LWP 8026) exited]
[Thread 0xb47ddb90 (LWP 8029) exited]
[Thread 0xb4fdeb90 (LWP 8028) exited]
[New Thread 0xb4fdeb90 (LWP 8046)]
[Thread 0xb4fdeb90 (LWP 8046) exited]
[New Thread 0xb4fdeb90 (LWP 8047)]
[Thread 0xb4fdeb90 (LWP 8047) exited]
[New Thread 0xb4fdeb90 (LWP 8050)]
[Thread 0xb4fdeb90 (LWP 8050) exited]
[New Thread 0xb4fdeb90 (LWP 8051)]
[New Thread 0xb47ddb90 (LWP 8052)]
[Thread 0xb47ddb90 (LWP 8052) exited]
[New Thread 0xb47ddb90 (LWP 8053)]
[New Thread 0xb61e0b90 (LWP 8062)]
[New Thread 0xb57dfb90 (LWP 8063)]
[Thread 0xb57dfb90 (LWP 8063) exited]
[Thread 0xb61e0b90 (LWP 8062) exited]
[New Thread 0xb57dfb90 (LWP 8064)]
[Thread 0xb57dfb90 (LWP 8064) exited]
[New Thread 0xb57dfb90 (LWP 8076)]
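The thread notifications above can be summarized mechanically. What follows is a minimal sketch, assuming the gdb output has been saved as text. The function name `live_threads` is hypothetical and not part of gdb or Bacula. It only reports which threads were created and never reported as exited. It does not show where those threads are blocked.

```python
import re

# Match gdb notifications such as "[New Thread 0xb61e0b90 (LWP 8026)]"
# and "[Thread 0xb57dfb90 (LWP 8027) exited]".
NEW_RE = re.compile(r"\[New Thread (0x[0-9a-f]+) \(LWP (\d+)\)\]")
EXIT_RE = re.compile(r"\[Thread (0x[0-9a-f]+) \(LWP (\d+)\) exited\]")


def live_threads(log_text):
    """Return LWP ids that were created but never reported as exited."""
    alive = {}  # LWP id -> thread address, kept in creation order
    for line in log_text.splitlines():
        line = line.strip()
        m = NEW_RE.search(line)
        if m:
            alive[int(m.group(2))] = m.group(1)
            continue
        m = EXIT_RE.search(line)
        if m:
            alive.pop(int(m.group(2)), None)
    return list(alive)


# The gdb output quoted in this message.
GDB_LOG = """\
[New Thread 0xb61e0b90 (LWP 8026)]
[New Thread 0xb57dfb90 (LWP 8027)]
[New Thread 0xb4fdeb90 (LWP 8028)]
[New Thread 0xb47ddb90 (LWP 8029)]
[Thread 0xb57dfb90 (LWP 8027) exited]
[Thread 0xb61e0b90 (LWP 8026) exited]
[Thread 0xb47ddb90 (LWP 8029) exited]
[Thread 0xb4fdeb90 (LWP 8028) exited]
[New Thread 0xb4fdeb90 (LWP 8046)]
[Thread 0xb4fdeb90 (LWP 8046) exited]
[New Thread 0xb4fdeb90 (LWP 8047)]
[Thread 0xb4fdeb90 (LWP 8047) exited]
[New Thread 0xb4fdeb90 (LWP 8050)]
[Thread 0xb4fdeb90 (LWP 8050) exited]
[New Thread 0xb4fdeb90 (LWP 8051)]
[New Thread 0xb47ddb90 (LWP 8052)]
[Thread 0xb47ddb90 (LWP 8052) exited]
[New Thread 0xb47ddb90 (LWP 8053)]
[New Thread 0xb61e0b90 (LWP 8062)]
[New Thread 0xb57dfb90 (LWP 8063)]
[Thread 0xb57dfb90 (LWP 8063) exited]
[Thread 0xb61e0b90 (LWP 8062) exited]
[New Thread 0xb57dfb90 (LWP 8064)]
[Thread 0xb57dfb90 (LWP 8064) exited]
[New Thread 0xb57dfb90 (LWP 8076)]
"""

if __name__ == "__main__":
    print(live_threads(GDB_LOG))  # prints [8051, 8053, 8076]
```

For the log quoted here, three threads (LWP 8051, 8053 and 8076) are still alive at the end. Those would be the ones to inspect with a backtrace.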
What could the problem be, and how do I get a good trace using gdb? I tried
the way described in the manual:
http://bacula.org/5.0.x-manuals/en/problems/problems/What_Do_When_Bacula.html#SECTION00640000000000000000
I don't understand this part:
thread apply all bt
Please help me out
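For context, `thread apply all bt` is a gdb command that runs `bt` (backtrace) in every thread of the attached process, so the output shows where each thread is currently stopped. In a deadlock one would typically expect to see several threads waiting inside a lock or condition-variable call. The following is a minimal sketch of collecting that output from an already hung daemon, assuming gdb is installed, the caller may attach to the process, and the binary has debug symbols. The helper names `gdb_backtrace_cmd` and `dump_backtraces` are hypothetical. `-batch`, `-p` and `-ex` are standard gdb command-line options.

```python
import subprocess


def gdb_backtrace_cmd(pid):
    """Build a gdb command line that attaches to `pid`, prints a
    backtrace for every thread, and then exits."""
    return [
        "gdb",
        "-batch",                      # run the -ex commands, then exit
        "-p", str(pid),                # attach to the running process
        "-ex", "thread apply all bt",  # backtrace of every thread
    ]


def dump_backtraces(pid, out_path):
    """Run gdb against `pid` and save its output to `out_path`.

    Hypothetical helper: it only wraps the command built above.
    """
    result = subprocess.run(
        gdb_backtrace_cmd(pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero, e.g. if attaching is denied
    )
    with open(out_path, "w") as fh:
        fh.write(result.stdout)
        fh.write(result.stderr)
    return result.returncode
```

A call such as `dump_backtraces(sd_pid, "bacula-sd.bt")`, where `sd_pid` is the process id of the hung bacula-sd, would be made while "status stor" is hanging, so the saved backtraces reflect the stuck state and not normal operation.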
Thanks and regards,
Jan Jaap
Autochanger part of the bacula-sd configuration:
Autochanger {
  Name = TandbergT40
  Device = Drive-1
  Device = Drive-2
  Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d"
  Changer Device = /dev/sg3
}
Device {
  Name = Drive-1
  Drive Index = 0
  Media Type = LTO-4
  Archive Device = /dev/st0
  AutomaticMount = yes; # when device opened, read it
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes
  Alert Command = "/bin/sh -c '/usr/sbin/smartctl -H -l error %c'"
  Spool Directory = /bacula/spool
  Maximum Concurrent Jobs = 1
}
Device {
  Name = Drive-2
  Drive Index = 1
  Media Type = LTO-4
  Archive Device = /dev/st1
  AutomaticMount = yes; # when device opened, read it
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes
  Alert Command = "/bin/sh -c '/usr/sbin/smartctl -H -l error %c'"
  Spool Directory = /bacula/spool
  Maximum Concurrent Jobs = 1
}
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel