Re: [Bacula-users] Job transfer rate
It’s definitely disk IO on the client OR disk/database IO on the Bacula director. It’s all a symptom of too much VM on not enough hardware.

Jeff.

On Oct 30, 2014, at 2:26 PM, John Lockard jlock...@umich.edu wrote:

> Yes, but which IO? Disk IO on the client? Network IO from the client to the network? Network IO from the network to the Bacula Director? Network IO from the Bacula Director to the Bacula SD? Disk IO on the Bacula SD? Database IO on the Bacula Director?
>
> Seems like you have more work to do than just saying it's the IO. I'm not sure of the tools on Windows to interrogate IO at disk or network, but on Linux/Unix a good place to start is the sar (sysstat) utilities.
>
> -John

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
[Bacula-users] Job transfer rate
Hi,

I have some backups going at 2 MB/s, which for a 380 GB backup is just too slow. I’m trying to find my bottleneck.

Some questions:

- Is the rate of the backup only shown in “messages”, or is it stored in the DB anywhere? Or could I just compute jobbytes / (endtime - starttime) from the jobs table?
- Does Bacula write data to disk via a stream, or via lots of little latency-dependent writes?

My environment looks like this:

- Bacula (and Postgres on the same VM), an MS Small Business Server and 3 or 4 other VMs run on a 6-disk array of 7200 rpm SATA disks (I bet this is already my slow point).
- Bacula stores its backups on an NFS-mounted NAS, about 0.7 ms of ping away.

Tips/suggestions?

Jeff.
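The jobbytes / (endtime - starttime) idea from the message above can be sketched in a few lines of Python. This is only an illustration: it assumes the byte count and the two timestamps have already been read from the catalog, the function names are made up for this example, and "380 gig" and "2 MB/s" are taken as decimal units. The result is an average over the whole job, so it cannot say which stage was slow.

```python
from datetime import datetime


def average_rate_bytes_per_sec(job_bytes: int, start: datetime, end: datetime) -> float:
    """Average throughput of a job: bytes written divided by elapsed seconds."""
    elapsed = (end - start).total_seconds()
    if elapsed <= 0:
        raise ValueError("end time must be after start time")
    return job_bytes / elapsed


def hours_to_transfer(total_bytes: float, rate_bytes_per_sec: float) -> float:
    """Time needed to move total_bytes at a constant rate, in hours."""
    return total_bytes / rate_bytes_per_sec / 3600


# Hypothetical job (not from the thread): 7.2 GB written in exactly one hour.
rate = average_rate_bytes_per_sec(
    7_200_000_000,
    datetime(2014, 10, 30, 1, 0, 0),
    datetime(2014, 10, 30, 2, 0, 0),
)
print(f"{rate / 1e6:.1f} MB/s")  # 2.0 MB/s

# The figures from the message: 380 GB at 2 MB/s (decimal units assumed).
print(f"{hours_to_transfer(380e9, 2e6):.1f} hours")  # 52.8 hours
```

At that rate a full backup takes more than two days, which is why 2 MB/s is a problem for a job of this size.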
Re: [Bacula-users] Job transfer rate
On Thu, Oct 30, 2014 at 9:27 AM, Jeff MacDonald j...@terida.com wrote:

> Hi,
>
> I have some backups going at 2 MB/s, which for a 380 GB backup is just too slow. I’m trying to find my bottleneck.
>
> Some questions:
>
> - Is the rate of the backup only shown in “messages”, or is it stored in the DB anywhere? Or could I just compute jobbytes / (endtime - starttime) from the jobs table?
> - Does Bacula write data to disk via a stream, or via lots of little latency-dependent writes?
>
> My environment looks like this:
>
> - Bacula (and Postgres on the same VM), an MS Small Business Server and 3 or 4 other VMs run on a 6-disk array of 7200 rpm SATA disks (I bet this is already my slow point).
> - Bacula stores its backups on an NFS-mounted NAS, about 0.7 ms of ping away.
>
> Tips/suggestions?

Did you benchmark the client filesystem? Are there loads of small files? Did you try enabling attribute spooling? Did you tune your database?

John
Re: [Bacula-users] Job transfer rate
On 14-10-30 06:27 AM, Jeff MacDonald wrote:

> Hi,
>
> I have some backups going at 2 MB/s, which for a 380 GB backup is just too slow. I’m trying to find my bottleneck.
>
> Some questions:
>
> - Is the rate of the backup only shown in “messages”, or is it stored in the DB anywhere? Or could I just compute jobbytes / (endtime - starttime) from the jobs table?
> - Does Bacula write data to disk via a stream, or via lots of little latency-dependent writes?
>
> My environment looks like this:
>
> - Bacula (and Postgres on the same VM), an MS Small Business Server and 3 or 4 other VMs run on a 6-disk array of 7200 rpm SATA disks (I bet this is already my slow point).
> - Bacula stores its backups on an NFS-mounted NAS, about 0.7 ms of ping away.
>
> Tips/suggestions?
>
> Jeff.

What is the content of your backups? Some things (e.g. thousands of tiny files) will cause a lot of seeks on the machine being backed up. If you aren't using attribute spooling, then each backed-up file also causes a record to be inserted into the database, which may take time depending on your DB environment.

The 'suggestions' for tuning will be different if you are backing up a few dozen 10 GB files versus a million 10 KB files.

Bryn
Re: [Bacula-users] Job transfer rate
>> Tips/suggestions?
>>
>> Jeff.
>
> What is the content of your backups? Some things (e.g. thousands of tiny files) will cause a lot of seeks on the machine being backed up. If you aren't using attribute spooling, then each backed-up file also causes a record to be inserted into the database, which may take time depending on your DB environment.
>
> The 'suggestions' for tuning will be different if you are backing up a few dozen 10 GB files versus a million 10 KB files.
>
> Bryn

It’s mostly a Windows OS, with all its sundry smaller files and a few larger database dumps, etc.

I guess what I have to ascertain is whether the slow part is getting the data FROM the servers or putting the data TO the storage. I’m not sure which of the two the rate in the job report reflects, or whether the rate somehow encompasses both.

jeff
Re: [Bacula-users] Job transfer rate
On 14-10-30 07:50 AM, Jeff MacDonald wrote:

>>> Tips/suggestions?
>>>
>>> Jeff.
>>
>> What is the content of your backups? Some things (e.g. thousands of tiny files) will cause a lot of seeks on the machine being backed up. If you aren't using attribute spooling, then each backed-up file also causes a record to be inserted into the database, which may take time depending on your DB environment.
>>
>> The 'suggestions' for tuning will be different if you are backing up a few dozen 10 GB files versus a million 10 KB files.
>
> It’s mostly a Windows OS, with all its sundry smaller files and a few larger database dumps, etc.
>
> I guess what I have to ascertain is whether the slow part is getting the data FROM the servers or putting the data TO the storage. I’m not sure which of the two the rate in the job report reflects, or whether the rate somehow encompasses both.
>
> jeff

The job report rate will be the final average rate of the job; it doesn't know or specify the difference between the 'input' rate and the 'output' rate.

Yep, you're going to need to do some investigation on the storage side of the VM you are backing up, the director itself, the storage daemon itself (though I'm guessing it is on the same system as the director for you) and the final storage.

Also, it's not quite clear from your description: is the final storage on a different NAS altogether from your VMs? (Hoping so!) What virtualization platform are you running?

Finally, the question about attribute spooling is a big one: if you are backing up a lot of small files and you do not have attribute spooling turned on, you will have abysmal performance, especially if the director is running on the same disks that you are backing up. Database writes are (almost) always synchronous writes, meaning the system will stop and wait for the storage layer to say yes, the data is ACTUALLY committed to disk, before proceeding. If you are seeking all over backing up a bunch of small files, then trying to do a whole ton of tiny DB writes at the same time to the same spindles, your hard drive heads are going to be flying around like crazy.

An array of 7200 RPM disks in any sort of parity RAID configuration will not be able to handle more than 50-90 random IOPS (I/O operations per second) at best in real life, with a DB write or a file read counting as one IOP. If you are backing up lots of small files randomly distributed around the storage, you are quite likely hitting an IOP wall: one IOP to read the file and one IOP to write the DB record means not more than 25-45 files per second. With 4 KB files, that is 100-180 KB/sec and a completely maxed-out storage layer.

Even WITH attribute spooling enabled you are still going to be in a less-than-ideal position, since the spooled attributes still need to be written to the same spindles with the hardware configuration you've described.

Bryn
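The IOP-wall arithmetic in the message above can be reproduced with a short Python sketch. The function name and the `ops_per_file` parameter are invented for this illustration, and 4 KB is taken as 4000 bytes so the output matches the round figures quoted; it is a back-of-the-envelope model, not a measurement.

```python
def small_file_throughput(random_iops: float, file_size_bytes: int, ops_per_file: int = 2):
    """Return (files per second, bytes per second) for an IOPS-bound backup.

    ops_per_file=2 models one random read for the file plus one
    synchronous catalog write landing on the same disks.
    """
    files_per_sec = random_iops / ops_per_file
    return files_per_sec, files_per_sec * file_size_bytes


# The 50-90 random IOPS range from the message, with 4 KB files
# (4000 bytes assumed, to match the rounded numbers in the text).
for iops in (50, 90):
    files, rate = small_file_throughput(iops, 4_000)
    print(f"{iops} IOPS -> {files:.0f} files/s, {rate / 1000:.0f} KB/s")
# 50 IOPS -> 25 files/s, 100 KB/s
# 90 IOPS -> 45 files/s, 180 KB/s
```

Calling it with `ops_per_file=1` gives a rough upper bound for the case where the catalog write no longer hits the same spindles. That is an extrapolation from the message, not something it states.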
Re: [Bacula-users] Job transfer rate
On Oct 30, 2014, at 12:17 PM, Bryn Hughes li...@nashira.ca wrote:

> The job report rate will be the final average rate of the job; it doesn't know or specify the difference between the 'input' rate and the 'output' rate.
>
> Yep, you're going to need to do some investigation on the storage side of the VM you are backing up, the director itself, the storage daemon itself (though I'm guessing it is on the same system as the director for you) and the final storage.
>
> Also, it's not quite clear from your description: is the final storage on a different NAS altogether from your VMs? (Hoping so!) What virtualization platform are you running?
>
> Finally, the question about attribute spooling is a big one: if you are backing up a lot of small files and you do not have attribute spooling turned on, you will have abysmal performance, especially if the director is running on the same disks that you are backing up. Database writes are (almost) always synchronous writes, meaning the system will stop and wait for the storage layer to say yes, the data is ACTUALLY committed to disk, before proceeding. If you are seeking all over backing up a bunch of small files, then trying to do a whole ton of tiny DB writes at the same time to the same spindles, your hard drive heads are going to be flying around like crazy.
>
> An array of 7200 RPM disks in any sort of parity RAID configuration will not be able to handle more than 50-90 random IOPS (I/O operations per second) at best in real life, with a DB write or a file read counting as one IOP. If you are backing up lots of small files randomly distributed around the storage, you are quite likely hitting an IOP wall: one IOP to read the file and one IOP to write the DB record means not more than 25-45 files per second. With 4 KB files, that is 100-180 KB/sec and a completely maxed-out storage layer.
>
> Even WITH attribute spooling enabled you are still going to be in a less-than-ideal position, since the spooled attributes still need to be written to the same spindles with the hardware configuration you've described.
>
> Bryn

This was really helpful and basically answered all of my questions without my having to investigate the actual setup very much.

I’m using VMware for my virtualization platform. Bacula and its Postgres live on the same disks that they are backing up (which is local storage), and data is sent off to a remote NAS via GigE. My guess is that it’s an IOP wall like you mentioned. It’s running a bunch of VMs that are under heavy usage by the staff.

This is making a stronger and stronger argument for me to recommend a dedicated Bacula appliance: 16 GB of RAM, 4 cores, 1 TB of 7200 rpm disk for Postgres, and a tape drive :)

jeff.
Re: [Bacula-users] Job transfer rate
> This is making a stronger and stronger argument for me to recommend a dedicated Bacula appliance: 16 GB of RAM, 4 cores, 1 TB of 7200 rpm disk for Postgres, and a tape drive :)

Maybe an enterprise SSD for Postgres.

John
Re: [Bacula-users] Job transfer rate
On Oct 30, 2014, at 12:36 PM, John Drescher dresche...@gmail.com wrote:

>> This is making a stronger and stronger argument for me to recommend a dedicated Bacula appliance: 16 GB of RAM, 4 cores, 1 TB of 7200 rpm disk for Postgres, and a tape drive :)
>
> Maybe an enterprise SSD for Postgres.
>
> John

Agreed, they’re not even that much of a $$ hit.
Re: [Bacula-users] Job transfer rate
On 14-10-30 08:27 AM, Jeff MacDonald wrote:

> This was really helpful and basically answered all of my questions without my having to investigate the actual setup very much.
>
> I’m using VMware for my virtualization platform. Bacula and its Postgres live on the same disks that they are backing up (which is local storage), and data is sent off to a remote NAS via GigE. My guess is that it’s an IOP wall like you mentioned. It’s running a bunch of VMs that are under heavy usage by the staff.
>
> This is making a stronger and stronger argument for me to recommend a dedicated Bacula appliance: 16 GB of RAM, 4 cores, 1 TB of 7200 rpm disk for Postgres, and a tape drive :)
>
> jeff.

Just be aware that you might not see a dramatic increase in speed just from moving Bacula itself!

If you are using VMware with VMDK files on a VMFS volume, you need to be aware that any IO by a guest requires a reservation of the entire VMFS volume. Locking happens at the SCSI layer: if one guest wants to read one byte of data, nobody else can do anything until its IO operation is complete. Remembering that you are probably only going to get around 75 IOPS, you can see how a VMFS volume with more than a handful of virtual machines on it can very quickly end up performing very poorly, especially with spinning rust underneath it.

A good RAID card with a LOT of cache memory can help with overall system performance, but backups by definition are going to be touching lots of areas of data that aren't likely to be in cache.

What I'm getting at is that you might actually need to focus your efforts and dollars on the storage underneath your VMs before you do too much with your backup system. A great big nice happy dedicated Bacula server would be nice, but if the VMs are still IOP constrained, ESPECIALLY if they are actively in use while being backed up, you probably won't see that much of an improvement.

An easy way to validate this would be to ensure you have attribute spooling turned on and to set up the attribute spooling to write to your NAS rather than to local storage. That will get the VM storage infrastructure out of your backup pathway.

Bryn
Re: [Bacula-users] Job transfer rate
> Just be aware that you might not see a dramatic increase in speed just from moving Bacula itself!
>
> If you are using VMware with VMDK files on a VMFS volume, you need to be aware that any IO by a guest requires a reservation of the entire VMFS volume. Locking happens at the SCSI layer: if one guest wants to read one byte of data, nobody else can do anything until its IO operation is complete. Remembering that you are probably only going to get around 75 IOPS, you can see how a VMFS volume with more than a handful of virtual machines on it can very quickly end up performing very poorly, especially with spinning rust underneath it.
>
> A good RAID card with a LOT of cache memory can help with overall system performance, but backups by definition are going to be touching lots of areas of data that aren't likely to be in cache.
>
> What I'm getting at is that you might actually need to focus your efforts and dollars on the storage underneath your VMs before you do too much with your backup system. A great big nice happy dedicated Bacula server would be nice, but if the VMs are still IOP constrained, ESPECIALLY if they are actively in use while being backed up, you probably won't see that much of an improvement.
>
> An easy way to validate this would be to ensure you have attribute spooling turned on and to set up the attribute spooling to write to your NAS rather than to local storage. That will get the VM storage infrastructure out of your backup pathway.
>
> Bryn

This has been a fantastic education. Thanks. I’ll recommend to the client that their IO is slow... and I’ll get told “Oh! It seems fine to us!” :)

I googled and found documentation about turning on Data Spooling, but not about independently turning on Attribute Spooling. Could you point me at that, please? (I know, I know... I’ll keep looking :) )

Jeff.
Re: [Bacula-users] Job transfer rate
Yes, but which IO? Disk IO on the client? Network IO from the client to the network? Network IO from the network to the Bacula Director? Network IO from the Bacula Director to the Bacula SD? Disk IO on the Bacula SD? Database IO on the Bacula Director?

Seems like you have more work to do than just saying it's the IO. I'm not sure of the tools on Windows to interrogate IO at disk or network, but on Linux/Unix a good place to start is the sar (sysstat) utilities.

-John

On Thu, Oct 30, 2014 at 12:20 PM, Jeff MacDonald j...@terida.com wrote:

> This has been a fantastic education. Thanks. I’ll recommend to the client that their IO is slow... and I’ll get told “Oh! It seems fine to us!” :)
>
> I googled and found documentation about turning on Data Spooling, but not about independently turning on Attribute Spooling. Could you point me at that, please? (I know, I know... I’ll keep looking :) )
>
> Jeff.

--
John M. Lockard | U of Michigan - School of Information
Unix Sys Admin | 105 South State St. | 4325 North Quad
jlock...@umich.edu | Ann Arbor, MI 48109-1285
www.umich.edu/~jlockard | 734-936-7255 | 734-764-2475 FAX
- The University of Michigan will never ask you for your password -
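To make the "which IO?" question concrete for the disk case, here is a rough Python sketch of the kind of figure sar-style tools report on Linux. It parses two snapshots in the `/proc/diskstats` layout (device name in the third column, reads completed in the fourth, writes completed in the eighth) and turns the difference into operations per second. The function names and both sample snapshots are invented for illustration; on a real host you would read `/proc/diskstats` twice, a known number of seconds apart.

```python
from typing import Dict, Tuple

Counters = Dict[str, Tuple[int, int]]


def parse_diskstats(text: str) -> Counters:
    """Map device name -> (reads completed, writes completed)."""
    counters: Counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        counters[fields[2]] = (int(fields[3]), int(fields[7]))
    return counters


def iops_between(before: Counters, after: Counters, interval_sec: float) -> Dict[str, float]:
    """Completed I/O operations per second for each device seen in both samples."""
    result: Dict[str, float] = {}
    for dev, (reads_after, writes_after) in after.items():
        if dev in before:
            reads_before, writes_before = before[dev]
            ops = (reads_after - reads_before) + (writes_after - writes_before)
            result[dev] = ops / interval_sec
    return result


# Made-up snapshots in /proc/diskstats layout, taken 10 seconds apart.
SAMPLE_BEFORE = "   8       0 sda 1000 0 8000 500 2000 0 16000 900 0 700 1400"
SAMPLE_AFTER = "   8       0 sda 1300 0 10400 650 2450 0 19600 1100 0 900 1750"

rates = iops_between(parse_diskstats(SAMPLE_BEFORE), parse_diskstats(SAMPLE_AFTER), 10)
for dev, value in rates.items():
    print(f"{dev}: {value:.1f} IOPS")  # sda: 75.0 IOPS
```

Running the same measurement on the client, the director and the SD while a job is active would show which of the disk paths listed above is saturated. Network IO needs a separate tool.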