Hello,

see my answer below ..

On Wed, 2014-04-09 at 21:08 +0200, Kern Sibbald wrote:
> Hello,
>
> See my comments below ...
>
> On 04/09/2014 01:46 PM, Ulrich Leodolter wrote:
> > Hello,
> >
> > I am testing one of the new Bacula 7.x features:
> >
> > *Migration/Copy/VirtualFull Performance Enhancements*
> >
> > The Bacula Storage daemon now permits multiple jobs to simultaneously
> > read the same disk Volume, which gives substantial performance
> > enhancements when running Migration, Copy, or VirtualFull jobs that
> > read disk Volumes. Our testing shows that when running multiple
> > simultaneous jobs, the jobs can finish up to ten times faster with
> > this version of Bacula. This is built-in to the Storage daemon, so it
> > happens automatically and transparently.
> >
> > In our setup we have 2 CopyDiskToTape jobs which write to different
> > pools on tape storage. Our storage is a 2-drive autochanger device.
> >
> > Before the copy jobs are started, each drive has a volume from one of
> > the destination pools mounted.
> >
> > The problem is that both copy jobs only look at drive index 0,
> > and premounted volumes are always swapped (mounted/unmounted) at
> > drive 0.
> >
> > We have a second Bacula installation running 5.2.13 which has more
> > or less the same setup and hardware. On this installation parallel
> > copy jobs can run without swapping volumes on autochanger drive 0.
> > To overcome the exclusive read-lock limitation in this Bacula version
> > we have defined two file storage devices which point to the same
> > location. Our SQL selects the copy jobs in opposite order for the two
> > jobs, so we can minimize the number of conflicts when one file volume
> > is already locked.
>
> Unfortunately, I don't understand well enough what the real problem is.
> It sounds like you are saying there is a problem on one Bacula
> installation and not on the other, which would imply that there is
> something in the conf that is triggering the problem.
>
> I am also not sure what the problem is with swapping volumes. With the
> current algorithm (rather primitive), when no jobs are running Bacula
> will always look at the drives in the order they are in the conf file,
> or perhaps it is in alphabetic order, so it will always look at a
> particular drive first. If that drive is not being used, the job will be
> assigned that drive.
>
> A better behavior might be to search for an empty drive and always start
> with that one, but that will not be an ideal solution, as at some point
> all the drives will have a volume in them, so some volume needs to be
> swapped.
>
> I have been meaning to work on improving the tape usage algorithm, in
> particular putting in a better round robin scheme than currently exists,
> but unfortunately there always seem to be more urgent tasks, and the
> list of things to do is getting larger rather than smaller, so I am
> probably not being very optimistic here.
>
> If there is a definitive bug here rather than an inefficiency, and it can
> be clearly described, I might be able to fix it.
>
> > my question:
> >
> > Has something changed in Bacula 7.x in how Bacula determines
> > whether a volume is already mounted for an autochanger device?
>
> No, nothing has changed. If a volume is premounted and a job wants to
> use it, Bacula should notice that and select that drive, because part of
> the current algorithm is to look at all pre-mounted volumes to see if
> one can be used. If that is not the case, and you are talking about a
> single job (no other jobs contending for the same resources), I would
> like to see a detailed analysis of what is going wrong, because I could
> probably fix it.
>
> > Why does Bacula not use a premounted volume at drive index 1?
>
> Good question. I would need to know the exact conditions before I could
> answer, but most likely Bacula sees that drive index 0 is available and
> takes it, and then asks for a tape and gets a different one. If it
> swaps the Volume from drive 1 onto drive 0 and drive 1 is not being
> used, then the current algorithm is not working correctly, but to fix it
> I would need a reproducible case.
>
> Best regards,
> Kern
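
To make the behaviour discussed above concrete, here is a minimal Python
sketch of a drive selection that first looks for a free drive with a
premounted volume of the wanted pool, then for an empty drive, and only then
falls back to configuration order. This is not Bacula's actual code: the
names Drive, pick_drive and prefer_mounted are hypothetical and only
illustrate the selection order described in the quoted text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Drive:
    index: int
    mounted_pool: Optional[str] = None  # pool of the loaded volume, if any
    busy: bool = False                  # True while another job uses the drive


def pick_drive(drives, wanted_pool, prefer_mounted=True):
    """Return the drive a job should reserve, or None if all are busy."""
    free = [d for d in drives if not d.busy]
    if prefer_mounted:
        # First pass: a free drive that already holds a volume of the pool.
        for d in free:
            if d.mounted_pool == wanted_pool:
                return d
    # Second pass: a free drive with no volume loaded (nothing to unload).
    for d in free:
        if d.mounted_pool is None:
            return d
    # Fallback: first free drive in configuration order; this forces a swap.
    return free[0] if free else None


# The situation from this thread: drive 0 holds a DiskCopy volume,
# drive 1 holds an ExtClone volume, and the job writes to ExtClone.
drives = [Drive(0, "DiskCopy"), Drive(1, "ExtClone")]
print(pick_drive(drives, "ExtClone").index)                        # 1
print(pick_drive(drives, "ExtClone", prefer_mounted=False).index)  # 0
```

With prefer_mounted disabled, the sketch picks drive 0 and would have to
swap the volume, which matches the behaviour reported for 7.0.2.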

It seems there is a bug: Bacula 7.0.x behaves differently from version 5.x.x.

Yesterday I disabled one of my copy jobs, the one which should use drive 1.
The other copy job was scheduled for today at 6:05 and completed normally.
Below you can see the storage status after that first copy job finished.

There is one important detail in the status below: volume BACX.101 from
pool ExtClone is still mounted in QTM-Drive-1. I mounted it manually
yesterday evening.

Today I started the second copy job, which has ExtClone as its destination
pool. Because BACX.101 from the ExtClone pool was already mounted, I expected
Bacula to use it where it was. Instead (see JobId 615860 below), BACX.101
was swapped into QTM-Drive-0.

I am 100% sure this worked without swapping in version 5.2.13.
This test also shows that the problem has nothing to do with parallel
copy jobs.

After upgrading to 7.0.2 I did not change any config settings, and the
config does not include PreferMountedVolumes, so I expect the default
value PreferMountedVolumes=Yes to apply.

I will answer separately regarding the algorithm and try to explain what
I would call a "natural" or clever algorithm. I fully understand that this
is not a trivial task, because changes may break existing installations.

Do you need more information? When I have time I will try to find out
why Bacula 7.0.x behaves differently.

Best regards
Ulrich


*status storage=QTM-Tape
Connecting to Storage daemon QTM-Tape at troll.obvsg.at:9103

troll-sd Version: 7.0.2 (02 April 2014) x86_64-unknown-linux-gnu redhat
Daemon started 03-Apr-14 14:36. Jobs: run=721, running=0.
 Heap: heap=270,336 smbytes=15,158,906 max_bytes=17,978,512 bufs=383 max_bufs=5,779
 Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0

Running Jobs:
No Jobs running.
====

Jobs waiting to reserve a drive:
====

Terminated Jobs:
 JobId  Level    Files      Bytes   Status   Finished        Name
===================================================================
615847  Full     2,658    57.70 M   OK       10-Apr-14 06:14 CopyDiskToTape
615848  Incr     2,658    57.70 M   OK       10-Apr-14 06:14 Backup-troll
615849  Full     1,835    49.44 M   OK       10-Apr-14 06:15 CopyDiskToTape
615850  Incr     1,835    49.44 M   OK       10-Apr-14 06:15 Backup-idefix
615851  Full     3,424    34.59 M   OK       10-Apr-14 06:15 CopyDiskToTape
615852  Incr     3,424    34.59 M   OK       10-Apr-14 06:15 Backup-apollo
615853  Full     2,128    47.67 M   OK       10-Apr-14 06:15 CopyDiskToTape
615854  Incr     2,128    47.67 M   OK       10-Apr-14 06:15 Backup-paladin
615812  Full     3,636    136.4 M   OK       10-Apr-14 06:15 CopyDiskToTape
615855  Incr     3,636    136.4 M   OK       10-Apr-14 06:15 Backup-teamwork
====

Device status:
Autochanger "OVERLAND" with devices:
   Drive-1
   Drive-2
Autochanger "QTM-Scalar" with devices:
   "QTM-Drive-0" (/dev/qtm-nst0)
   "QTM-Drive-1" (/dev/qtm-nst1)

Device "FileStorage" (/disk0/bacula/files) is not open.
==

Device "Drive-1" is not open or does not exist.
==

Device "Drive-2" is not open or does not exist.
==

Device "QTM-Drive-0" (/dev/qtm-nst0) is mounted with:
    Volume:      BACU.113
    Pool:        DiskCopy
    Media type:  LTO-6
    Slot 14 is loaded in drive 0.
    Total Bytes=2,119,786,651,648 Blocks=8,086,384 Bytes/block=262,142
    Positioned at File=264 Block=0
==

Device "QTM-Drive-1" (/dev/qtm-nst1) is mounted with:
    Volume:      BACX.101
    Pool:        ExtClone
    Media type:  LTO-6
    Slot 1 is loaded in drive 1.
    Total Bytes Read=64,512 Blocks Read=1 Bytes/block=64,512
    Positioned at File=0 Block=0
==
====

Used Volume status:
Reserved volume: BACU.113 on tape device "QTM-Drive-0" (/dev/qtm-nst0)
    Reader=0 writers=0 reserves=0 volinuse=0
Reserved volume: BACX.101 on tape device "QTM-Drive-1" (/dev/qtm-nst1)
    Reader=0 writers=0 reserves=0 volinuse=0
====

Data spooling: 0 active jobs, 0 bytes; 142 total jobs, 112,034,563,261 max bytes/job.
Attr spooling: 0 active jobs, 8,716,940,756 bytes; 151 total jobs, 8,716,940,756 max bytes.
====


First log messages of first copy job:

2014-04-10 09:12:37 troll-sd JobId 615860: 3307 Issuing autochanger "unload slot 14, drive 0" command.
2014-04-10 09:14:33 troll-dir JobId 615860: Using Device "QTM-Drive-0" to write.
2014-04-10 09:14:33 troll-sd JobId 615860: 3307 Issuing autochanger "unload slot 1, drive 1" command.
2014-04-10 09:15:26 troll-sd JobId 615860: 3304 Issuing autochanger "load slot 1, drive 0" command.
2014-04-10 09:16:07 troll-sd JobId 615860: 3305 Autochanger "load slot 1, drive 0", status is OK.
2014-04-10 09:16:18 troll-sd JobId 615860: Volume "BACX.101" previously written, moving to end of data.
2014-04-10 09:17:22 troll-sd JobId 615860: Ready to append to end of Volume "BACX.101" at file=239.
2014-04-10 09:17:27 troll-sd JobId 615860: Elapsed time=00:00:05, Transfer rate=45.33 M Bytes/second

_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel