Ah, so the short block is part of jobid 12065 and therefore the copy of jobid 12068 is very likely to be complete.
The "[SI0202] End of Volume" message means that write() wrote less that expected. The only reasons I can think of are disk space shortage, I/O error, some other process-specific limit (such as controlled by ulimit -f) or buggy filesystem. Have you set some limit on the volume size? It is interesting that those 3 volumes are all just slightly smaller that 653,198,000,000 bytes. __Martin >>>>> On Fri, 22 Oct 2021 13:53:12 +0200, Marko Bartolić said: > > Hi, > > the backup job on 03-Oct definetly filled the volume 10, but not the > whole disk, it still has some free space. > > *list volume > > Automatically selected Catalog: MyCatalog > > Using Catalog "MyCatalog" > > Pool: Default > > +---------+------------+-----------+---------+-----------------+----------+--------------+---------+------+-----------+-----------+---------+----------+---------------------+------------+ > > | mediaid | volumename | volstatus | enabled | volbytes | volfiles | > volretention | recycle | slot | inchanger | mediatype | voltype | volparts | > lastwritten | expiresin | > > +---------+------------+-----------+---------+-----------------+----------+--------------+---------+------+-----------+-----------+---------+----------+---------------------+------------+ > > ... > > | 9 | volume-09 | Full | 1 | 653,197,959,321 | 152 | > 15,552,000 | 1 | 0 | 0 | data1 | 1 | 0 | > 2021-09-06 23:06:04 | 11,615,551 | > | 10 | volume-10 | Full | 1 | 653,197,888,553 | 152 | > 15,552,000 | 1 | 0 | 0 | data1 | 1 | 0 | > 2021-10-03 23:09:09 | 13,944,936 | > | 11 | volume-11 | Full | 1 | 653,197,958,518 | 152 | > 15,552,000 | 1 | 0 | 0 | data1 | 1 | 0 | > 2021-10-05 13:11:17 | 14,081,864 | > > ... > > > > /var/log/messages or dmesg do not show any problems on 03-Oct. > > Yes, that is the complete output for jobid 12068 from bacula.log. But > there was another job running at approximately same time during which > the volume changed from 10 to 11: > > 03-Oct 23:09 backup-sd JobId 12065: [SI0202] End of Volume "volume-10" at > 653197953065 on device "backup-device-data1" (/data1). Write of 64512 bytes > got 42967. > > 03-Oct 23:09 backup-sd JobId 12065: End of medium on Volume "volume-10" > Bytes=653,197,888,553 Blocks=10,125,619 at 03-Oct-2021 23:09. > > 03-Oct 23:09 backup-dir JobId 12065: There are no more Jobs associated with > Volume "volume-11". Marking it purged. > > 03-Oct 23:09 backup-dir JobId 12065: All records pruned from Volume > "volume-11"; marking it "Purged" > > 03-Oct 23:09 backup-dir JobId 12065: Recycled volume "volume-11" > > 03-Oct 23:09 backup-sd JobId 12065: Recycled volume "volume-11" on File > device "backup-device-data1" (/data1), all previous data lost. > > 03-Oct 23:09 backup-sd JobId 12065: New volume "volume-11" mounted on device > "backup-device-data1" (/data1) at 03-Oct-2021 23:09. > > > > Marko Bartolić, Nimium d.o.o., email:<marko.barto...@nimium.hr> > > On 22. 10. 2021. 13:18, Martin Simmons wrote: > > Did the backup job on 03-Oct fill the disk? Did you check for syslog > > messages > > for that date too? > > > > Also, did you include the complete output below from the backup jobid 12068? > > I would expect more information to be printed between 03-Oct 23:06 and > > 03-Oct > > 23:19 about the volumes being used and the reason for changing from > > volume-10 > > to volume-11. > > > > __Martin > > > > > >>>>>> On Thu, 21 Oct 2021 16:27:13 +0200, Marko Bartolić said: > >> Hello, > >> > >> I hope this is the right place to ask. > >> > >> We have Bacula version 9.6.5. on a RHEL 8.2 physical server. We use copy > >> job that runs every 6 months and copies the last full backup of 22 > >> clients to a separate SCSI disk which is then pulled out from the server > >> for offsite backup. So both the backup disks and disk on which the > >> backups are later copied are local, physical disks in the machine. This > >> was working fine every time up to now. During last run of the copy job, > >> we noticed the following error on one of the copy jobs: > >> > >> 21-Oct 04:00 backup-dir JobId 12612: Copying using JobId=12068 > >> Job=websrv.2021-10-03_23.05.00_12 > >> 21-Oct 05:45 backup-dir JobId 12612: Start Copying JobId 12612, > >> Job=Copy-to-offsite-disk.2021-10-21_04.00.01_36 > >> 21-Oct 05:45 backup-dir JobId 12612: Using Device "backup-device-data1" to > >> read. > >> 21-Oct 05:45 backup-sd JobId 12612: Ready to read from volume "volume-10" > >> on File device "backup-device-data1" (/data1). > >> 21-Oct 05:45 backup-sd JobId 12612: Forward spacing Volume "volume-10" to > >> addr=641829209227 > >> *21-Oct 05:46 backup-sd JobId 12612: Warning: block.c:682 [SE0208] Volume > >> data error at 0:0! Short block of 42967 bytes on device > >> "backup-device-data1" (/data1) discarded. ** > >> **21-Oct 05:46 backup-sd JobId 12612: Error: read_records.c:160 > >> block.c:682 [SE0208] Volume data error at 0:0! Short block of 42967 bytes > >> on device "backup-device-data1" (/data1) discarded. * > >> 21-Oct 05:46 backup-sd JobId 12612: End of Volume "volume-10" at > >> addr=653197996032 on device "backup-device-data1" (/data1). > >> 21-Oct 05:46 backup-sd JobId 12612: Ready to read from volume "volume-11" > >> on File device "backup-device-data1" (/data1). > >> 21-Oct 05:46 backup-sd JobId 12612: Forward spacing Volume "volume-11" to > >> addr=64742 > >> 21-Oct 05:50 backup-sd JobId 12612: End of Volume "volume-11" at > >> addr=49666111453 on device "backup-device-data1" (/data1). > >> 21-Oct 05:50 backup-sd JobId 12612: Elapsed time=00:05:02, Transfer > >> rate=16.65 M Bytes/second > >> 21-Oct 05:50 backup-dir JobId 12612: Error: Bacula backup-dir 9.6.5 > >> (11Jun20): > >> Build OS: x86_64-redhat-linux-gnu-bacula redhat (Core) > >> Prev Backup JobId: 12068 > >> Prev Backup Job: websrv.2021-10-03_23.05.00_12 > >> New Backup JobId: 12613 > >> Current JobId: 12612 > >> Current Job: Copy-to-offsite-disk.2021-10-21_04.00.01_36 > >> Backup Level: Full > >> Client: backup-fd > >> FileSet: "backup" 2020-09-17 21:53:08 > >> Read Pool: "Default" (From Command input) > >> Read Storage: "backup-storage-data1" (From Pool resource) > >> Write Pool: "Offsite" (From Command input) > >> Write Storage: "backup-storage-data2" (From Command input) > >> Catalog: "MyCatalog" (From Client resource) > >> Start time: 21-Oct-2021 05:45:32 > >> End time: 21-Oct-2021 05:50:42 > >> Elapsed time: 5 mins 10 secs > >> Priority: 10 > >> * SD Files Written: 200,038 ** > >> ** SD Bytes Written: 5,029,190,722 (5.029 GB) * > >> Rate: 16223.2 KB/s > >> Volume name(s): vol-01 > >> Volume Session Id: 5353 > >> Volume Session Time: 1619097903 > >> Last Volume Bytes: 573,628,844,941 (573.6 GB) > >> SD Errors: 1 > >> SD termination status: OK > >> Termination: *** Copying Error *** > >> > >> > >> > >> Job ends with error, but upon checking the backup size and number of > >> files of the original "websrv" backup job (which was successful), we see > >> that the number of files and bytes written are the same as is in the > >> copy job: > >> > >> 03-Oct 23:06 backup-dir JobId 12068: Start Backup JobId 12068, > >> Job=websrv.2021-10-03_23.05.00_12 > >> 03-Oct 23:06 backup-dir JobId 12068: Using Device "backup-device-data1" to > >> write. > >> 03-Oct 23:06 websrv-fd JobId 12068: shell command: run ClientBeforeJob > >> "/etc/bacula/scripts/mariadb-backup.sh" > >> 03-Oct 23:10 websrv-fd JobId 12068: /srv/cpap is a different > >> filesystem. Will not descend from / into it. > >> 03-Oct 23:19 backup-sd JobId 12068: Elapsed time=00:13:26, Transfer > >> rate=6.239 M Bytes/second > >> 03-Oct 23:19 backup-sd JobId 12068: Sending spooled attrs to the Director. > >> Despooling 53,043,158 bytes ... > >> 03-Oct 23:19 backup-dir JobId 12068: Bacula backup-dir 9.6.5 (11Jun20): > >> Build OS: x86_64-redhat-linux-gnu-bacula redhat (Core) > >> JobId: 12068 > >> Job: websrv.2021-10-03_23.05.00_12 > >> Backup Level: Full > >> Client: "websrv-fd" 9.0.6 (20Nov17) > >> x86_64-redhat-linux-gnu,redhat,Enterprise 8.1 > >> FileSet: "websrv" 2020-09-17 23:05:00 > >> Pool: "Default" (From Job resource) > >> Catalog: "MyCatalog" (From Client resource) > >> Storage: "backup-storage-data1" (From Pool resource) > >> Scheduled time: 03-Oct-2021 23:05:00 > >> Start time: 03-Oct-2021 23:06:04 > >> End time: 03-Oct-2021 23:19:37 > >> Elapsed time: 13 mins 33 secs > >> Priority: 10 > >> *FD Files Written: 200,038 SD Files Written: 200,038 FD Bytes Written: > >> 4,996,564,202 (4.996 GB) SD Bytes Written: 5,029,190,722 (5.029 GB)* > >> Rate: 6145.8 KB/s > >> Software Compression: 40.9% 1.7:1 > >> Comm Line Compression: 16.7% 1.2:1 > >> Snapshot/VSS: no > >> Encryption: no > >> Accurate: no > >> Volume name(s): volume-10|volume-11 > >> Volume Session Id: 4812 > >> Volume Session Time: 1619097903 > >> Last Volume Bytes: 49,996,799,959 (49.99 GB) > >> Non-fatal FD errors: 0 > >> SD Errors: 0 > >> FD termination status: OK > >> SD termination status: OK > >> Termination: Backup OK > >> > >> 03-Oct 23:19 backup-dir JobId 12068: Begin pruning Jobs older than 6 > >> months . > >> 03-Oct 23:19 backup-dir JobId 12068: Pruned 1 Job for client websrv-fd > >> from catalog. > >> 03-Oct 23:19 backup-dir JobId 12068: Begin pruning Files. > >> 03-Oct 23:19 backup-dir JobId 12068: Pruned Files from 1 Jobs for client > >> websrv-fd from catalog. > >> 03-Oct 23:19 backup-dir JobId 12068: End auto prune. > >> > >> > >> I also don't see any errors in messages or dmesg output. > >> > >> From what I gathered from an older thread: > >> https://marc.info/?l=bacula-devel&m=123237957307571&w=2 > >> <https://marc.info/?l=bacula-devel&m=123237957307571&w=2> , this points > >> that copying completed successfully. Also mentioned in the thread is > >> that "Maximium Block Size" could be the problem, but we haven't defined > >> that property. > >> > >> Does anyone know what could cause this? > >> > >> > >> BR > >> > >> -- > >> Marko Bartolić, Nimium d.o.o., email:<marko.barto...@nimium.hr> > >> > _______________________________________________ Bacula-devel mailing list Bacula-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-devel