Hi,
In the meantime I have upgraded to 9.0.5, but the problem is still there.
These are the job status lines sent via email after the job was canceled:
16-Nov 21:05 troll-dir JobId 777520: Start Backup JobId 777520,
Job=Backup-user1.2017-11-16_21.05.01_39
16-Nov 21:05 troll-dir JobId 777520: Using Device "FileStorage" to write.
16-Nov 21:05 troll-dir JobId 777520: Sending Accurate information to the FD.
16-Nov 21:11 troll-sd JobId 777520: Spooling data ...
16-Nov 21:11 user1-fd JobId 777520: /var/lib/nfs/rpc_pipefs is a different
filesystem. Will not descend from / into it.
17-Nov 09:05 troll-sd JobId 777520: Fatal error: append.c:184 Error reading
data header from FD. n=-2 msglen=0 ERR=Interrupted system call
17-Nov 09:05 troll-dir JobId 777520: Fatal error: Max run time exceeded. Job
canceled.
17-Nov 09:05 troll-dir JobId 777520: Bacula troll-dir 9.0.5 (02Nov17):
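As a quick sanity check, the start and cancel timestamps in the log above are
exactly 12 hours apart, which matches the Max Run Time = 12 hours setting from
my earlier mail. A minimal sketch in Python; the helper name `elapsed` is only
for illustration, and the year 2017 is an assumption taken from the job name,
since the log lines themselves carry no year:

```python
from datetime import datetime, timedelta

def elapsed(start, end, year=2017):
    """Time between two Bacula-style log timestamps such as '16-Nov 21:05'.

    The year is not part of the log line, so it is passed in separately.
    Note: %b expects English month abbreviations (C/English locale).
    """
    fmt = "%d-%b %H:%M %Y"
    t0 = datetime.strptime(f"{start} {year}", fmt)
    t1 = datetime.strptime(f"{end} {year}", fmt)
    return t1 - t0

# Job start and "Max run time exceeded" timestamps from the log above.
delta = elapsed("16-Nov 21:05", "17-Nov 09:05")
print(delta)  # 12:00:00
```

So the Director cancels the job right at the configured limit; the open
question is why the Storage daemon does not release the device afterwards.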
The storage status shows that job Backup-user1 is despooling, but the job has
actually been canceled. The despooling never finishes, so the FileStorage
device stays BLOCKED forever (only restarting Bacula helps).
*status storage=File
Connecting to Storage daemon File at troll.obvsg.at:9103
troll-sd Version: 9.0.5 (02 November 2017) x86_64-pc-linux-gnu redhat
Daemon started 15-Nov-17 09:51. Jobs: run=357, running=1.
Heap: heap=307,200 smbytes=1,349,350 max_bytes=14,454,879 bufs=279
max_bufs=1,398
Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0 newbsr=0
Res: ndevices=4 nautochgr=1
Running Jobs:
Writing: Incremental Backup job Backup-user1 JobId=777520 Volume=""
pool="DiskBackup" device="FileStorage" (/data/bacula/files)
spooling=0 despooling=1 despool_wait=0
Files=56,003 Bytes=2,504,128,940 AveBytes/sec=56,423 LastBytes/sec=7,513
FDReadSeqNo=526,592 in_msg=400588 out_msg=6 fd=37
Reading: Full Copy job CopyDiskToTape JobId=777552 Volume=""
pool="DiskBackup" device="FileStorage" (/data/bacula/files) newbsr=0
Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0
FDSocket closed
====
Jobs waiting to reserve a drive:
3602 JobId=777552 File device "FileStorage" (/data/bacula/files) is busy
(already reading/writing). read=0, writers=1 reserved=0
====
Terminated Jobs:
JobId Level Files Bytes Status Finished Name
===================================================================
777670 Full 107 92.16 M OK 17-Nov-17 09:29 CopyDiskToExtClone
777671 Incr 107 92.16 M OK 17-Nov-17 09:29 Backup-idefix
777673 Full 71 3.694 M OK 17-Nov-17 09:29 CopyDiskToExtClone
777675 Incr 71 3.694 M OK 17-Nov-17 09:29 Backup-pcmk1
777677 Full 70 3.611 M OK 17-Nov-17 09:29 CopyDiskToExtClone
777678 Incr 70 3.611 M OK 17-Nov-17 09:29 Backup-pcmk2
777681 Full 104 97.59 M OK 17-Nov-17 09:29 CopyDiskToExtClone
777682 Incr 104 97.59 M OK 17-Nov-17 09:30 Backup-paladin
777551 Full 1,089 336.3 M OK 17-Nov-17 09:30 CopyDiskToExtClone
777685 Incr 1,089 336.3 M OK 17-Nov-17 09:30 Backup-teamwork
====
Device status:
Autochanger "QTM-Scalar" with devices:
"QTM-Drive-0" (/dev/qtm-nst0)
"QTM-Drive-1" (/dev/qtm-nst1)
Device File: "FileStorage" (/data/bacula/files) is not open.
Device is BLOCKED waiting for media.
Available Space=2.058 TB
==
Device File: "FileStorage2" (/data/bacula/files) is not open.
Available Space=2.058 TB
==
Device Tape is "QTM-Drive-0" (/dev/qtm-nst0) mounted with:
Volume: BACU.130
Pool: DiskCopy
Media type: LTO-6
Total Bytes Read=129,024 Blocks Read=2 Bytes/block=64,512
Positioned at File=0 Block=0
Slot 22 is loaded in drive 0.
==
Device Tape is "QTM-Drive-1" (/dev/qtm-nst1) mounted with:
Volume: BACX.105
Pool: ExtClone
Media type: LTO-6
Total Bytes=291,559,389,184 Blocks=1,112,250 Bytes/block=262,134
Positioned at File=93 Block=0
Slot 5 is loaded in drive 1.
==
====
Used Volume status:
Reserved volume: BACU.130 on Tape device "QTM-Drive-0" (/dev/qtm-nst0)
Reader=0 writers=0 reserves=0 volinuse=0
Reserved volume: BACX.105 on Tape device "QTM-Drive-1" (/dev/qtm-nst1)
Reader=0 writers=0 reserves=0 volinuse=0
Volume: Backup-0169 no device. volinuse=0
====
Data spooling: 1 active jobs, 2,508,838,698 bytes; 69 total jobs,
47,483,401,984 max bytes/job.
Attr spooling: 1 active jobs, 2,129,346,316 bytes; 214 total jobs,
2,129,346,316 max bytes.
====
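To spot this state without reading the whole status by hand, something like
the following could be run over the saved output of `status storage`. This is
only a sketch: the function name `find_stuck` and the regular expressions are
my own assumptions based on the output format shown above, not anything
provided by Bacula, and other versions may format the status differently.

```python
import re

# Patterns assumed from the 9.0.5 status output shown above.
DEVICE_RE = re.compile(r'^\s*Device (?:File|Tape)\b[^"]*"([^"]+)"')
JOB_RE = re.compile(r'^\s*(?:Writing|Reading):.*\bJobId=(\d+)')
DESPOOL_RE = re.compile(r'\bdespooling=(\d+)')

def find_stuck(status_text):
    """Return (blocked device names, JobIds reported as despooling)."""
    blocked_devices = []
    despooling_jobs = []
    current_device = None
    current_job = None
    for line in status_text.splitlines():
        m = DEVICE_RE.match(line)
        if m:
            current_device = m.group(1)
            continue
        m = JOB_RE.match(line)
        if m:
            current_job = int(m.group(1))
            continue
        # "Device is BLOCKED ..." refers to the most recent device line.
        if "Device is BLOCKED" in line and current_device:
            blocked_devices.append(current_device)
        # "despooling=1" refers to the most recent running-job line.
        m = DESPOOL_RE.search(line)
        if m and int(m.group(1)) > 0 and current_job is not None:
            despooling_jobs.append(current_job)
    return blocked_devices, despooling_jobs

# Excerpt of the status output above.
SAMPLE = '''Running Jobs:
Writing: Incremental Backup job Backup-user1 JobId=777520 Volume=""
pool="DiskBackup" device="FileStorage" (/data/bacula/files)
spooling=0 despooling=1 despool_wait=0
Reading: Full Copy job CopyDiskToTape JobId=777552 Volume=""
pool="DiskBackup" device="FileStorage" (/data/bacula/files) newbsr=0
Device status:
Device File: "FileStorage" (/data/bacula/files) is not open.
Device is BLOCKED waiting for media.
Device File: "FileStorage2" (/data/bacula/files) is not open.
'''

blocked, jobs = find_stuck(SAMPLE)
print(blocked, jobs)  # ['FileStorage'] [777520]
```

A blocked device together with a despooling job that the Director has already
canceled is the combination I am seeing here.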
Any ideas what is going on here?
Best regards
Ulrich
> On 27 October 2017 at 15:53, Ulrich Leodolter <[email protected]> wrote:
>
>
> Hi,
>
> I have a problem that seems to be triggered when a job hits its run time
> limit.
>
> For our desktop machines we have configured:
>
> Max Run Time = 12 hours
>
> Desktop machines are always somewhat unpredictable, and it happens that we
> reach the 12-hour timeout. Sometimes, though, the storage device is not
> released after the job is canceled.
>
> Device File: "FileStorage" (/data/bacula/files) is not open.
> Device is BLOCKED waiting for media.
> Available Space=2.018 TB
>
> lsof shows the storage daemon has spool files open even though the
> corresponding job was canceled.
>
> Our Bacula server version is 9.0.4, but the problem also happened on 7.x
> releases.
>
> I know this description is somewhat vague, but maybe someone has seen
> something like this?
>
> Maybe I should add that we run Copy jobs into 2 tape pools after all backups
> to disk are finished (or canceled). To allow the Copy jobs to run in parallel,
> we have defined FileStorage2, which points to the same directory
> (/data/bacula/files) as the FileStorage device.
>
> Is it possible that jobs canceled after Max Run Time do not release the File
> storage device?
> Best regards
> Ulrich
>
>
>
>
> Ulrich Leodolter <[email protected]>
> Oesterreichische Bibliothekenverbund und Service GmbH
> Raimundgasse 1/3, A-1020 Wien
> Fax +43 1 4035158-30
> Tel +43 1 4035158-21
> Web https://www.obvsg.at
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Bacula-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/bacula-devel