Hello list, i have a problem with btape (bacula-sd too) and writing on a tape. After a while it times out sporadically and i can't find out the reason. The problem is reproducable.
Filling the tape with tar works well, I tested all the commands (rewind, fsr,bsr ...) by hand and there is no problem with that. The mtx-changer script from bacula works well too. My environment: Hardware: - a linux server running Debian Linux 2.6.32-bpo.4-xen-686 #1 SMP Tue Apr 13 16:03:38 UTC 2010 i686 GNU/Linux - a new Tapechanger Quantum Superloader 3 LTO4 with an Ultrium 4 drive (2190) - its coupled with an Adaptec 1045 SAS-adapter Since kernel 2.6.30 the mvsas-module supports the adapter Software: - Bacula 5.0.1-1 installed from the Backports - mt-st 1.1-4 - i recompiled the st driver with debugging: st: Version 20081215, fixed bufsize 32768, s/g segs 256 backup:~# lsscsi [1:0:0:0] tape QUANTUM ULTRIUM 4 2190 /dev/st0 [1:0:0:1] mediumx QUANTUM UHDL 0061 /dev/sch0 backup:~# lsmod ...stripped... scsi_transport_sas 15093 2 mvsas,libsas libata 114448 3 ata_piix,ata_generic,libsas scsi_mod 101065 9 sg,st,osst,ch,sd_mod,mvsas,libsas,scsi_transport_sas,libata backup:~# tapeinfo -f /dev/sg4 Product Type: Tape Drive Vendor ID: 'QUANTUM ' Product ID: 'ULTRIUM 4 ' Revision: '2190' Attached Changer API: No SerialNumber: 'JN0933AMJ50492' MinBlock: 1 MaxBlock: 16777215 SCSI ID: 0 SCSI LUN: 0 Ready: yes BufferedMode: yes Medium Type: Not Loaded Density Code: 0x46 BlockSize: 0 DataCompEnabled: yes DataCompCapable: yes DataDeCompEnabled: yes CompType: 0x1 DeCompType: 0x1 Block Position: 14111 Partition 0 Remaining Kbytes: 853050 Partition 0 Size in Kbytes: 853060 ActivePartition: 0 EarlyWarningSize: 0 backup:~# cat /etc/stinit.def # Defaults, i added read-ahead and scsi2logical {buffer-writes read-ahead can-bsr scsi2logical async-writes} # Entry from EMC NetWorker Technical Note manufacturer=QUANTUM model="ULTRIUM 4" { auto-lock=0 timeout=800 long-timeout=3600 can-partition=0 scsi2logical=1 # This gives the highest performance, but with old versions of st module it can cause problems with spaning tapes async-writes=1 # linux does not have full LTO-4 support to set modeline properly (should be 0x46 for LTO-4), so use device defaults instead mode1 blocksize=0 density=0x00 } In the document there was the option "drive-buffering", but it's not available (and covered with buffer-writes?) backup:~# mt-st -f /dev/nst0 stshowopt The options set: buffer-writes async-writes read-ahead can-bsr scsi2logical backup:~# cat /proc/sys/kernel/hung_task_timeout_secs 300 Storage-Daemon configuration: Autochanger { Name = wechsler Device = band Changer Command = "sudo /etc/bacula/scripts/mtx-changer %c %o %S %a % d" Changer Device = /dev/sg5 } Device { Name = band Drive Index = 0 Media Type = LTO-4 Archive Device = /dev/nst0 Device Type = Tape Automatic Mount = yes; # when device opened, read it AlwaysOpen = yes; RemovableMedia = yes; RandomAccess = no; Maximum File Size = 1GB #2GB #5GB Changer Command = "sudo /etc/bacula/scripts/mtx-changer %c %o %S %a % d" Changer Device = /dev/sg5 AutoChanger = yes Maximum Changer Wait = 300 } Okay, I can run btape test and it times out spouradically command: strace -ftt -ostrace_btape4 btape -c /etc/bacula/bacula-sd.conf band dmesg output since loading the recompiled module, maybe interesting: [59066.912097] st: Version 20081215, fixed bufsize 32768, s/g segs 256 [59066.912504] st 1:0:0:0: Attached scsi tape st0 [59066.912548] st 1:0:0:0: st0: try direct i/o: yes (alignment 4 B) [59066.973150] st0: Block limits 1 - 16777215 bytes. [59066.973818] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8 [59066.973867] st0: Density 46, tape length: 0, drv buffer: 1 [59066.973914] st0: Block size: 0, buffer size: 4096 (1 blocks). [59066.980191] st0: Block limits 1 - 16777215 bytes. [59066.980856] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8 [59066.980905] st0: Density 46, tape length: 0, drv buffer: 1 [59066.980952] st0: Block size: 0, buffer size: 4096 (1 blocks). [59066.981004] st0: Normal timeout set to 800 seconds. [59066.981040] st0: Long timeout set to 3600 seconds. [59066.981079] st0: Mode 0 options: buffer writes: 1, async writes: 1, read ahead: 1 [59066.981138] st0: can bsr: 1, two FMs: 0, fast mteom: 0, auto lock: 0, [59066.981178] st0: defs for wr: 0, no block limits: 0, partitions: 0, s2 log: 0 [59066.981237] st0: sysv: 0 nowait: 0 sili: 0 [59066.981271] st0: debugging: 1 [59066.981303] st0: Mode 0 options: buffer writes: 1, async writes: 1, read ahead: 1 [59066.981363] st0: can bsr: 1, two FMs: 0, fast mteom: 0, auto lock: 0, [59066.981403] st0: defs for wr: 0, no block limits: 0, partitions: 0, s2 log: 1 [59066.981463] st0: sysv: 0 nowait: 0 sili: 0 [59066.981497] st0: debugging: 1 [59066.981528] st0: Default block size set to 0 bytes. [59066.981564] st0: Density default set to 0 [59066.981597] st0: Setting block size to 0 bytes. [59066.981631] st0: Setting density code to 0. [59111.178879] st0: Block limits 1 - 16777215 bytes. [59111.179531] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8 [59111.179580] st0: Density 46, tape length: 0, drv buffer: 1 [59111.179627] st0: Block size: 0, buffer size: 4096 (1 blocks). [59111.179787] st0: Rewinding tape. [59111.190845] st0: Block limits 1 - 16777215 bytes. [59111.191525] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8 [59111.191574] st0: Density 46, tape length: 0, drv buffer: 1 [59111.191621] st0: Block size: 0, buffer size: 4096 (1 blocks). [59111.191753] st0: Locking drive door. [59111.192529] st0: Setting block size to 0 bytes. [59111.193439] st0: Mode 0 options: buffer writes: 1, async writes: 1, read ahead: 1 [59111.193510] st0: can bsr: 1, two FMs: 0, fast mteom: 0, auto lock: 0, [59111.193554] st0: defs for wr: 0, no block limits: 0, partitions: 0, s2 log: 1 [59111.193616] st0: sysv: 0 nowait: 0 sili: 0 [59111.193650] st0: debugging: 1 [59111.195264] st0: Block limits 1 - 16777215 bytes. [59111.195921] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8 [59111.195970] st0: Density 46, tape length: 0, drv buffer: 1 [59111.196723] st0: Block size: 0, buffer size: 4096 (1 blocks). [59111.196857] st0: Rewinding tape. [59111.207772] st0: Block limits 1 - 16777215 bytes. [59111.208423] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8 [59111.208471] st0: Density 46, tape length: 0, drv buffer: 1 [59111.208518] st0: Block size: 0, buffer size: 4096 (1 blocks). [59111.208652] st0: Locking drive door. [59111.209338] st0: Setting block size to 0 bytes. [59111.210314] st0: Mode 0 options: buffer writes: 1, async writes: 1, read ahead: 1 [59111.210375] st0: can bsr: 1, two FMs: 0, fast mteom: 0, auto lock: 0, [59111.210415] st0: defs for wr: 0, no block limits: 0, partitions: 0, s2 log: 1 [59111.210474] st0: sysv: 0 nowait: 0 sili: 0 [59111.210510] st0: debugging: 1 [59140.637500] st0: Rewinding tape. [59140.648091] st0: Setting block size to 0 bytes. [59140.648129] st0: Setting density code to 0. [59150.021243] st0: Writing 1 filemarks. [59159.801495] st0: Writing 1 filemarks. [59162.264608] st0: Rewinding tape. [59182.339981] st0: Error: 8000002, cmd: 8 0 0 fc 0 0 [59182.340021] st0: Sense Key : No Sense [current] [59182.340068] Info fld=0xfc00, FMK [59182.340099] st0: Add. Sense: Filemark detected [59182.340143] st0: Sense: f0 0 80 0 0 fc 0 1e [59182.340177] st0: EOF detected (0 bytes read). [59182.340764] st0: EOF/EOM flag up (2). Bytes 0 [59182.341791] st0: EOF up (2). Left 64512, needed 64512. [59191.023694] st0: Rewinding tape. [59191.025548] st0: Spacing tape forward over 4 blocks. [59191.033038] st0: Spacing tape forward over 195 blocks. [59191.094513] st0: Spacing tape forward over 9798 blocks. [59191.400620] st0: Spacing tape forward over 1 filemarks. [59191.423898] st0: EOF/EOM flag up (2). Bytes 0 [59191.425424] st0: EOF up (2). Left 64512, needed 64512. [59191.426159] st0: Spacing tape forward over 599 blocks. [59191.574577] st0: Spacing tape forward over 9398 blocks. [59191.962161] st0: Rewinding tape. [59196.152816] st0: Writing 1 filemarks. [59200.007350] st0: Writing 1 filemarks. [59200.050531] st0: Writing 1 filemarks. [59200.089635] st0: Unlocking drive door. [59200.090563] st0: Number of r/w requests 40013, dio used in 40013, pages 680221. [59200.092058] st0: Block limits 1 - 16777215 bytes. [59200.092714] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8 [59200.092765] st0: Density 46, tape length: 0, drv buffer: 1 [59200.092802] st0: Block size: 0, buffer size: 4096 (1 blocks). [59200.092933] st0: Rewinding tape. [59204.136462] st0: Block limits 1 - 16777215 bytes. [59204.137134] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8 [59204.137187] st0: Density 46, tape length: 0, drv buffer: 1 [59204.137242] st0: Block size: 0, buffer size: 4096 (1 blocks). [59204.137399] st0: Locking drive door. [59204.138218] st0: Setting block size to 0 bytes. [59204.139185] st0: Mode 0 options: buffer writes: 1, async writes: 1, read ahead: 1 [59204.139247] st0: can bsr: 1, two FMs: 0, fast mteom: 0, auto lock: 0, [59204.139287] st0: defs for wr: 0, no block limits: 0, partitions: 0, s2 log: 1 [59204.139346] st0: sysv: 0 nowait: 0 sili: 0 [59204.139380] st0: debugging: 1 [59204.139652] st0: Rewinding tape. [59204.141046] st0: Spacing tape forward over 8388607 filemarks. [59204.142287] st0: Error: 8000002, cmd: 11 1 7f ff ff 0 [59204.142334] st0: Sense Key : Blank Check [current] [59204.142403] Info fld=0x7ffffc [59204.142433] st0: Add. Sense: End-of-data detected [59204.427601] st0: Writing 1 filemarks. [59206.061925] st0: Rewinding tape. [59218.160250] st0: Error: 8000002, cmd: 8 0 0 fc 0 0 [59218.160299] st0: Sense Key : No Sense [current] [59218.160369] Info fld=0xfc00, FMK [59218.160399] st0: Add. Sense: Filemark detected [59218.160445] st0: Sense: f0 0 80 0 0 fc 0 1e [59218.160479] st0: EOF detected (0 bytes read). [59218.160960] st0: EOF/EOM flag up (2). Bytes 0 [59218.162043] st0: EOF up (2). Left 64512, needed 64512. [59218.183776] st0: Error: 8000002, cmd: 8 0 0 fc 0 0 [59218.183824] st0: Sense Key : No Sense [current] [59218.183892] Info fld=0xfc00, FMK [59218.183923] st0: Add. Sense: Filemark detected [59218.183969] st0: Sense: f0 0 80 0 0 fc 0 1e [59218.184004] st0: EOF detected (0 bytes read). [59218.184800] st0: EOF/EOM flag up (2). Bytes 0 [59218.185862] st0: EOF up (2). Left 64512, needed 64512. [59218.236914] st0: Error: 8000002, cmd: 8 0 0 fc 0 0 [59218.236962] st0: Sense Key : No Sense [current] [59218.237031] Info fld=0xfc00, FMK [59218.237062] st0: Add. Sense: Filemark detected [59218.237109] st0: Sense: f0 0 80 0 0 fc 0 1e [59218.237143] st0: EOF detected (0 bytes read). [59218.237657] st0: EOF/EOM flag up (2). Bytes 0 [59218.238703] st0: EOF up (2). Left 64512, needed 64512. [59223.238547] st0: Error: 8000002, cmd: 8 0 0 fc 0 0 [59223.238596] st0: Sense Key : No Sense [current] [59223.238661] Info fld=0xfc00, FMK [59223.238692] st0: Add. Sense: Filemark detected [59223.238741] st0: Sense: f0 0 80 0 0 fc 0 1e [59223.238775] st0: EOF detected (0 bytes read). [59223.239319] st0: EOF/EOM flag up (2). Bytes 0 [59223.240723] st0: Error: 8000002, cmd: 8 0 0 fc 0 0 [59223.240760] st0: Sense Key : Blank Check [current] [59223.240809] Info fld=0xfc00 [59223.240838] st0: Add. Sense: End-of-data detected [59223.240886] st0: Sense: f0 0 8 0 0 fc 0 1e [59223.240920] st0: Tape error while reading. [59223.240953] st0: Zero returned for first BLANK CHECK after EOF. [59223.241790] st0: Rewinding tape. [59223.243540] st0: Setting block size to 0 bytes. [59223.243576] st0: Setting density code to 0. [59227.330717] st0: Writing 1 filemarks. [59231.179419] st0: Spacing tape backward over 1 filemarks. [59235.158520] st0: Spacing tape backward over 1 blocks. [59235.169386] st0: Rewinding tape. [59239.568865] st0: Writing 1 filemarks. [59243.425475] st0: Writing 1 filemarks. [59243.468538] st0: Writing 1 filemarks. [59243.510459] st0: Writing 1 filemarks. [59243.551497] st0: Writing 1 filemarks. [59243.590430] st0: Rewinding tape. [59251.730032] st0: Spacing tape forward over 1 filemarks. [59255.792254] st0: Spacing tape forward over 2 filemarks. [59255.873336] st0: Rewinding tape. [59260.470836] st0: Spacing tape forward over 4 filemarks. [59260.476113] st0: Spacing tape forward over 1 filemarks. [59287.652420] st0: Rewinding tape. [59300.778049] st0: Writing 1 filemarks. [59772.344077] INFO: task btape:15348 blocked for more than 300 seconds. [59772.344132] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59772.344206] btape D c23e1668 0 15348 15347 0x00000001 [59772.344254] e901d100 00200286 c23e0eec c23e1668 00008020 c144ed20 c144ed20 00000000 [59772.344348] e901d2bc c45bed20 00000000 c10b6d5d 00000000 e98ba540 ed9f204c c2399079 [59772.344441] c2399000 e901d2bc 00000000 c12880e8 00000000 ed9d47c1 c1005df4 00000000 [59772.344533] Call Trace: [59772.344569] [<c10b6d5d>] ? kmem_cache_alloc+0x79/0xe5 [59772.344626] [<c12880e8>] ? _spin_lock_irq+0xb/0x21 [59772.344681] [<ed9d47c1>] ? sas_queuecommand+0x1f2/0x211 [libsas] [59772.344723] [<c1005df4>] ? xen_force_evtchn_callback+0xc/0x10 [59772.344764] [<c1287371>] ? schedule_timeout+0x20/0xb0 [59772.344803] [<c112d1ac>] ? blk_peek_request+0x135/0x143 [59772.344861] [<ed862987>] ? scsi_dispatch_cmd+0x185/0x1e5 [scsi_mod] [59772.344920] [<ed867387>] ? scsi_request_fn+0x3c1/0x47a [scsi_mod] [59772.344961] [<c128727a>] ? wait_for_common+0xa5/0x101 [59772.345000] [<c10349b0>] ? default_wake_function+0x0/0x8 [59772.345040] [<edf02984>] ? st_scsi_execute_end+0x0/0x45 [st] [59772.345080] [<edf03099>] ? st_do_scsi+0x2a2/0x2ca [st] [59772.345118] [<edf07828>] ? st_write+0x4f1/0x926 [st] [59772.345157] [<c1044dc3>] ? ptrace_stop+0xc0/0xf2 [59772.345195] [<edf07337>] ? st_write+0x0/0x926 [st] [59772.345232] [<c10bc156>] ? vfs_write+0x7e/0xd6 [59772.345269] [<c10bc246>] ? sys_write+0x3c/0x63 [59772.345306] [<c1008c3c>] ? syscall_call+0x7/0xb ... repeats several times ... [60130.146514] /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= -5 [60154.292977] /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= -5 [60178.439443] /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= -5 [60202.585918] /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= -5 [60226.732387] /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= -5 [60250.878860] /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= -5 [60275.025323] /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= -5 [60299.171787] /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= -5 [60323.318249] /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= -5 [60347.464715] /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= -5 [60372.344064] INFO: task btape:15348 blocked for more than 300 seconds. [60372.344119] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [60372.344193] btape D c23e1668 0 15348 15347 0x00000001 [60372.344241] e901d100 00200286 c23e0eec c23e1668 00008020 c144ed20 c144ed20 00000000 [60372.344334] e901d2bc c45bed20 00000000 c10b6d5d 00000000 e98ba540 ed9f204c c2399079 [60372.344428] c2399000 e901d2bc 00000000 c12880e8 00000000 ed9d47c1 c1005df4 00000000 [60372.344520] Call Trace: [60372.344556] [<c10b6d5d>] ? kmem_cache_alloc+0x79/0xe5 [60372.344613] [<c12880e8>] ? _spin_lock_irq+0xb/0x21 [60372.344668] [<ed9d47c1>] ? sas_queuecommand+0x1f2/0x211 [libsas] [60372.344710] [<c1005df4>] ? xen_force_evtchn_callback+0xc/0x10 [60372.344750] [<c1287371>] ? schedule_timeout+0x20/0xb0 [60372.344790] [<c112d1ac>] ? blk_peek_request+0x135/0x143 [60372.344848] [<ed862987>] ? scsi_dispatch_cmd+0x185/0x1e5 [scsi_mod] [60372.344907] [<ed867387>] ? scsi_request_fn+0x3c1/0x47a [scsi_mod] [60372.344948] [<c128727a>] ? wait_for_common+0xa5/0x101 [60372.344987] [<c10349b0>] ? default_wake_function+0x0/0x8 [60372.345026] [<edf02984>] ? st_scsi_execute_end+0x0/0x45 [st] [60372.345066] [<edf03099>] ? st_do_scsi+0x2a2/0x2ca [st] [60372.345105] [<edf07828>] ? st_write+0x4f1/0x926 [st] [60372.345143] [<c1044dc3>] ? ptrace_stop+0xc0/0xf2 [60372.345181] [<edf07337>] ? st_write+0x0/0x926 [st] [60372.345219] [<c10bc156>] ? vfs_write+0x7e/0xd6 [60372.345255] [<c10bc246>] ? sys_write+0x3c/0x63 [60372.345292] [<c1008c3c>] ? syscall_call+0x7/0xb the INFO-part just says the process is waiting for something and repeats every 300 seconds. i traced the btape process using strace -ftt. Its very big, so i stripped repeating parts and published it here: http://pastebin.com/87HdsCRG I'm thinking of too short ioctl-timeouts, but I could be wrong. Can you help me or give me some hints? Thanks for working on this, Christoph Lehmann -- Christoph Lehmann Dr. Ralf Schwedler e.K. Amtsgericht Pinneberg, HRA 5609 PI Internet / Sicherheit Bankverbindung: Sparkasse Westholstein Steinbr�ckstrasse 13b BLZ 222 500 20 25524 Itzehoe, Germany Konto-Nr. 11 00 84 323 T +49 4821 40 87 58 International Account Number 0700-SCHWEDLER IBAN DE05 2225 0020 0110 0843 23 F +49 4821 40 87 59 BIC HSHNDEH1IZH ---- ---- www.schwedler.com Steuer-Nr. 18019 34593 lehm...@schwedler.com USt.-Id. DE 214230181
------------------------------------------------------------------------------
_______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users