Hello,
By now I am mostly using our Ceph RGW with the S3 driver as storage. This
works fine, but every now and then requests to the RGW time out.
That is of course our problem and not Bacula's, but because of a behaviour I
don't understand, it causes us more trouble than it should.
When one of these errors happens, it looks like this in the logs:
04-Mai 02:32 mybackup-sd JobId 968544: Error: S3_delete_object ERR=RequestTimeout CURL Effective URL: https://myrgw/mystorage/myvolume-25809/part.10 CURL OS Error: 101 CURL Effective URL: https://myrgw/mystorage/myvolume/part.10 CURL OS Error: 101
04-Mai 02:32 mybackup-sd JobId 968544: Fatal error: label.c:575 Truncate error on Cloud device "mydevice" (/opt/bacula/cloudcache): ERR= S3_delete_object ERR=RequestTimeout CURL Effective URL: https://myrgw/mystorage/myvolume/part.10 CURL OS Error: 101 CURL Effective URL: https://myrgw/mystorage/myvolume/part.10 CURL OS Error: 101
04-Mai 02:32 mybackup-sd JobId 968544: Marking Volume "myvolume" in Error in Catalog.
04-Mai 02:32 mybackup-sd JobId 968544: Fatal error: Job 968544 canceled.
04-Mai 02:32 mybackup-dir JobId 968544: Error: Bacula Enterprise wc-backup2-dir 13.0.2 (18Feb23):
However, when I check the Volume status in the Catalog, I see:
*list volume=myvolume
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+-----------+---------+----------+-----------+
| MediaId | VolumeName | VolStatus | Enabled | VolBytes | VolFiles | VolRetention | Recycle | Slot | InChanger | MediaType | VolType | VolParts | ExpiresIn |
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+-----------+---------+----------+-----------+
|  25,809 | myvolume   | Recycle   |       1 |        1 |        0 |      691,200 |       1 |    0 |         0 | CloudType |      14 |       12 |         0 |
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+-----------+---------+----------+-----------+
So although the SD logs that it is marking the Volume in Error, the Catalog
still shows VolStatus "Recycle". As a result, the Volume is picked for
subsequent Jobs, which then all fail with errors like this:
05-Mai 02:31 mybackup-sd JobId 968789: Fatal error: cloud_dev.c:1322 Unable to download Volume="myvolume" label. S3_get_object ERR=NoSuchKey CURL Effective URL: https://myrgw/mystorage/myvolume/part.1 CURL OS Error: 101 CURL Effective URL: https://myrgw/mystorage/myvolume/part.1 CURL OS Error: 101 BucketName : mystorage RequestId : xxx-default HostId : yyy-default
05-Mai 02:31 mybackup-sd JobId 968789: Marking Volume "myvolume" in Error in Catalog.
Am I wrong to expect the Volume to actually have VolStatus "Error" in the
Catalog, so that other Jobs do not try to use it?
I would be grateful for any help. Once a single request to the RGW times out,
all backups using this storage fail until I manually mark the Volume as
"Error" or truncate the cloud cache for the Volume.
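
As a rough sketch of that manual workaround, the following Python snippet
scans SD log text for the "Marking Volume ... in Error in Catalog" message and
prints one bconsole `update volume=... volstatus=Error` command per affected
Volume. The regular expression is derived only from the log lines quoted
above, the function names are made up for illustration, and the script only
prints the commands rather than running them, so treat it as a starting point
and not as a tested fix.

```python
import re

# Pattern derived from the log excerpt above (an assumption: other Bacula
# versions may word this message differently). It stops at "in Error in"
# so it still matches when the mail client wraps before "Catalog.".
MARKED_RE = re.compile(r'JobId (\d+): Marking Volume "([^"]+)" in Error in')


def volumes_marked_in_error(log_text):
    """Return Volume names the SD reported as marked in Error, in first-seen order."""
    seen = []
    for match in MARKED_RE.finditer(log_text):
        volume = match.group(2)
        if volume not in seen:
            seen.append(volume)
    return seen


def bconsole_commands(volumes):
    """Build the bconsole commands that set each Volume to VolStatus Error."""
    return [f"update volume={name} volstatus=Error" for name in volumes]


# Two of the log lines from this mail, used as sample input.
SAMPLE_LOG = (
    '04-Mai 02:32 mybackup-sd JobId 968544: Marking Volume "myvolume" in Error in\n'
    "Catalog.\n"
    "04-Mai 02:32 mybackup-sd JobId 968544: Fatal error: Job 968544 canceled.\n"
)

if __name__ == "__main__":
    for command in bconsole_commands(volumes_marked_in_error(SAMPLE_LOG)):
        print(command)
```

The printed commands can be reviewed and then pasted into bconsole by hand.
Checking `list volume=...` afterwards shows whether the status actually
changed.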
Regards,
Martin