Re: [Bacula-users] Disk setup for best performance on backups
Sorry for the late reply, guys; I've been busy these last few days. I really appreciate all the responses to this thread. I'll read them all carefully and take them into consideration for my deployment.

Hope you all have a nice weekend!

On Wed, Jan 22, 2020 at 11:55 AM Josh Fisher wrote:
>
> On 1/22/2020 10:52 AM, dmaziuk via Bacula-users wrote:
> > On 1/22/2020 2:19 AM, Radosław Korzeniewski wrote:
> >
> >> Unless you are using BEE GED or other similar functionality you should
> >> never use the SSD in your backup solution as it will be a pure waste of
> >> money.
> >
> > I'm running a bunch of jobs in parallel and spooling them on an ssd.
> > Works pretty well for the money, but you need to work out how to size it.
>
> It makes sense to put Bacula's work directory and any spooling to a
> separate disk of some kind. It prevents writes for spooling, logging,
> etc. from interfering with the sequential nature of the volume data writes.
>
> Another thing that makes SSD worth the money is the very fast random
> read/write speeds and IOPS allows running a local DB server for the
> catalog. This is particularly helpful for those of us contending with 1G
> networks, and adding a SSD is far cheaper than upgrading the network to
> 10 G.
>
> > Dima

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
Re: [Bacula-users] Disk setup for best performance on backups
On 1/22/2020 10:52 AM, dmaziuk via Bacula-users wrote:
> On 1/22/2020 2:19 AM, Radosław Korzeniewski wrote:
>
>> Unless you are using BEE GED or other similar functionality you should
>> never use the SSD in your backup solution as it will be a pure waste of
>> money.
>
> I'm running a bunch of jobs in parallel and spooling them on an ssd.
> Works pretty well for the money, but you need to work out how to size it.

It makes sense to put Bacula's work directory and any spooling on a separate disk of some kind. This prevents writes for spooling, logging, etc. from interfering with the sequential nature of the volume data writes.

Another thing that makes an SSD worth the money is that its very fast random read/write speeds and high IOPS allow running a local DB server for the catalog. This is particularly helpful for those of us contending with 1G networks, and adding an SSD is far cheaper than upgrading the network to 10G.

> Dima
Re: [Bacula-users] Disk setup for best performance on backups
On 1/22/2020 2:19 AM, Radosław Korzeniewski wrote:

> Unless you are using BEE GED or other similar functionality you should
> never use the SSD in your backup solution as it will be a pure waste of
> money.

I'm running a bunch of jobs in parallel and spooling them on an ssd. Works pretty well for the money, but you need to work out how to size it.

Dima
Re: [Bacula-users] Disk setup for best performance on backups
Hello,

Mon, 20 Jan 2020 at 16:57, Jason Voorhees wrote:

> - Does it matter a lot choosing XFS instead of ext4 as filesystem?

IMVHO, yes. :)

> - How can I know the amount of IOPS needed for my local disk?

You can calculate the value from the required throughput and the expected block size. But it is very tricky, because during backup Bacula generates a sequential stream of 64k blocks, which the underlying VFS/FS/OS can consolidate for the best performance on a particular disk device.

> - What does Bacula need most: high IOPS or throughput (MB/s)?

For a standard Bacula backup it is throughput, as described above, but for Bacula Enterprise with the GED plugin it would be IOPS in some components.

> - Based on the previous question, should I choose SSD over HDD disks?

Unless you are using BEE GED or other similar functionality you should never use the SSD in your backup solution as it will be a pure waste of money.

> - Is it worth using RAID1 or RAID10 for improving performance?

A simple RAID1 will never improve performance during writes, never ever. RAID10 could, or even should, improve performance, but its "redundancy factor" will always be 2x.

> By the way, I pretend to use an external DB (Amazon RDS) for my
> Catalog, so my Storage daemon wouldn't share the same underlying
> storage.

Bacula supports SQL_ASCII database encoding only.

> I hope someone can share some ideas about disk performance. I didn't find enough info about this topic on Internet. Thanks in advance

There are a number of white papers about designing disk-based backups for Bacula on the website: https://www.bacula.org/white-papers/

Did you check them?

best regards
--
Radosław Korzeniewski
rados...@korzeniewski.net
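The throughput-and-block-size calculation mentioned in this reply can be sketched roughly as follows. This is only an illustration, not anything Bacula itself provides: the function name `required_iops` is hypothetical, the 64 KiB default comes from the block size quoted above, and the 1 MiB figure in the usage lines is an assumed size for I/Os after the OS has merged adjacent blocks. It treats every block as a separate I/O, so it gives an upper bound; as noted above, the OS may consolidate sequential blocks and the real figure can be much lower.

```python
def required_iops(throughput_mb_s: float, block_size_kib: float = 64.0) -> float:
    """Upper-bound IOPS needed to sustain a given sequential throughput.

    Assumes each block is issued to the disk as its own I/O (no merging).
    throughput_mb_s: target throughput in MB/s (1 MB = 1,000,000 bytes).
    block_size_kib:  size of each I/O in KiB (64 KiB is the block size
                     quoted in the thread; other values are assumptions).
    """
    bytes_per_second = throughput_mb_s * 1_000_000
    bytes_per_io = block_size_kib * 1024
    return bytes_per_second / bytes_per_io


if __name__ == "__main__":
    # 125 MB/s written as unmerged 64 KiB blocks.
    print(round(required_iops(125), 1))
    # Same throughput if the OS merges blocks into 1 MiB I/Os (assumed size).
    print(round(required_iops(125, block_size_kib=1024), 1))
```

The gap between the two results is why the estimate is described as tricky: the IOPS actually seen by the device depends on how much merging happens below Bacula.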
Re: [Bacula-users] Disk setup for best performance on backups
On 1/20/2020 10:56 AM, Jason Voorhees wrote:
> Hello guys:
>
> I'm planning a Bacula deployment on AWS in the following weeks. I have some doubts about disk performance for Disk based backups.

See https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/disk-performance.html

> Based on the idea that Bacula writes data on big files (i.e. 100 GB each volume), what technical considerations should I have for the underlying storage device?
>
> These are some of my questions I have around it:
>
> - Does it matter a lot choosing XFS instead of ext4 as filesystem?

XFS

> - How can I know the amount of IOPS needed for my local disk?

It depends on several things, such as how much data there is and how long it is acceptable for the backups to run. It also depends greatly on the speed of the clients being backed up. You can run concurrent jobs, but the result also depends on network speed. For example, if your VPC is on a 1 Gbps network, then the maximum network throughput, regardless of concurrency, is 125 MB/s.

> - What does Bacula need most: high IOPS or throughput (MB/s)?
> - Based on the previous question, should I choose SSD over HDD disks?
> - Is it worth using RAID1 or RAID10 for improving performance?
>
> I was planning to use HDD disks which offers high throughput (500 MB/s) and up to 500 IOPS per disk (these are "st1" EBS volumes).
>
> By the way, I pretend to use an external DB (Amazon RDS) for my Catalog, so my Storage daemon wouldn't share the same underlying storage.

I would not think RAIDing the EBS volumes would be needed. The backing store for those volumes is already RAID. Also, SSD for the volume files would be very expensive and not needed unless your VPC is on a 10 Gbps network and your client VPC instances can exceed the 500 MB/s throughput of the HDD.

It would be better to add a separate, smaller SSD virtual disk to use for the Bacula work directory and for spooling attributes if using an RDS DB. This would prevent writes to the work directory and attribute spooling from affecting sequential writes to the volume files. The volume files are effectively sequential writes, so they really should be on a separate disk that is not being written to by anything other than the Bacula SD.

> I hope someone can share some ideas about disk performance. I didn't find enough info about this topic on Internet. Thanks in advance
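To make the 1 Gbps = 125 MB/s ceiling in this reply concrete, here is a small sketch of the arithmetic. The function names and the 1 TB data size in the usage lines are hypothetical; the 500 MB/s disk figure is the st1 throughput quoted in the thread. It ignores protocol overhead, compression, and client speed, so treat the results as best-case estimates only.

```python
def link_ceiling_mb_s(link_gbps: float) -> float:
    """Theoretical maximum payload rate of a link in MB/s (8 bits per byte)."""
    return link_gbps * 1000 / 8


def backup_window_hours(data_gb: float, link_gbps: float, disk_mb_s: float) -> float:
    """Best-case hours needed to move data_gb through the slower of the
    network link and the storage disk."""
    rate_mb_s = min(link_ceiling_mb_s(link_gbps), disk_mb_s)
    seconds = data_gb * 1000 / rate_mb_s
    return seconds / 3600


if __name__ == "__main__":
    # 1 Gbps link: the network, not the 500 MB/s disk, is the limit.
    print(link_ceiling_mb_s(1))
    print(round(backup_window_hours(1000, link_gbps=1, disk_mb_s=500), 2))
    # 10 Gbps link: now the 500 MB/s disk becomes the limit.
    print(round(backup_window_hours(1000, link_gbps=10, disk_mb_s=500), 2))
```

Taking the minimum of the two rates reflects the point made above: faster storage only helps once the network can deliver data faster than the disk can absorb it.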
Re: [Bacula-users] Disk setup for best performance on backups
On 2020-01-20 10:56, Jason Voorhees wrote:
>
> - Does it matter a lot choosing XFS instead of ext4 as filesystem?

It is worth noting that even Red Hat, long the champion of ext*, has officially abandoned it and will do no further development on it. (And frankly, it didn't come a day too soon.) I haven't run an ext* filesystem in years; every Unix filesystem I have is either XFS or ZFS.

> - How can I know the amount of IOPS needed for my local disk?
> - What does Bacula need most: high IOPS or throughput (MB/s)?

Well, to a certain extent it's the same thing, at the storage level. But consider that you are going to be doing mostly long streaming reads and writes (a use case which the design of XFS was specifically optimized for when SGI developed it). That said, for Bacula storage purposes, what you care about is throughput and stability above pure IOPS.

> - Is it worth using RAID1 or RAID10 for improving performance?

On AWS it's all RAID under the hood anyway. RAIDing virtual RAIDs probably doesn't gain you much. That said, in the database and backup world it's generally considered that RAID10 is best for performance.

--
Phil Stracchino
Babylon Communications
ph...@caerllewys.net
p...@co.ordinate.org
Landline: +1.603.293.8485
Mobile: +1.603.998.6958
Re: [Bacula-users] Disk setup for best performance on backups
Hi

> I'm planning a Bacula deployment on AWS in the following weeks. I have some
> doubts about disk performance for Disk based backups.

I use tapes, so you should take my response with a grain of salt. A question, though: how does one protect the backups from being damaged if the system is compromised and there is no air gap? If you look at Riviera Beach, for example, they did have backups, but ransomware encrypted them too.

> - Does it matter a lot choosing XFS instead of ext4 as filesystem?

It really doesn't make a big difference which filesystem you choose if your files are large. But if you have a lot of tiny files, use XFS, as it allocates metadata space dynamically.

> - How can I know the amount of IOPS needed for my local disk?

You need to test; every setup has different IOPS requirements. But anyway, if you want a generic answer, it depends on how many concurrent jobs you are willing to run. And what do you mean by local? On the Bacula storage? On the Bacula client? How many Bacula clients do you have? Better define these details to get a better answer from the rest of the team.

> - What does Bacula need most: high IOPS or throughput (MB/s)?

Again, it depends. In general, I would say full backups need throughput and differentials/incrementals need more IOPS. But for the storage, I guess IOPS matters irrespective of the type of job if you are running many concurrent jobs?

> - Based on the previous question, should I choose SSD over HDD disks?

Use HDD; SSDs are too expensive for backups, in my humble opinion.

> - Is it worth using RAID1 or RAID10 for improving performance?

It wouldn't make a difference in my opinion; Bacula-level details would make more of a difference.

> By the way, I pretend to use an external DB (Amazon RDS) for my Catalog, so my
> Storage daemon wouldn't share the same underlying storage.

If you are using MySQL and you have a lot of files, make sure to have a big temp space. During pruning, MySQL generates some really odd queries that can fill up most ramdisks. Use SHOW VARIABLES LIKE 'tmpdir'; to figure out what you are currently using.

> I hope someone can share some ideas about disk performance.

You really seem intent on optimizing your disk performance. Consider dm-cache then. Pretty effective in my experience.

Regards,
William
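Once the temp directory is known from the SHOW VARIABLES query mentioned above, its free space can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `tmpdir_has_headroom`, the `/tmp` path, and the 20 GiB threshold are all hypothetical placeholders, and how much space pruning actually needs has to be measured on the real catalog.

```python
import shutil


def tmpdir_has_headroom(tmpdir: str, needed_gib: float) -> bool:
    """Return True if the filesystem holding tmpdir has at least
    needed_gib GiB free. The threshold is the caller's own estimate."""
    usage = shutil.disk_usage(tmpdir)  # named tuple: total, used, free (bytes)
    free_gib = usage.free / 2**30
    return free_gib >= needed_gib


if __name__ == "__main__":
    # "/tmp" stands in for whatever path SHOW VARIABLES LIKE 'tmpdir' reports;
    # 20 GiB is an arbitrary example threshold, not a recommendation.
    print(tmpdir_has_headroom("/tmp", needed_gib=20))
```

This has to run on the host where the database server writes its temp files, so it applies to a self-managed MySQL server rather than to a managed service where the filesystem is not accessible.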
[Bacula-users] Disk setup for best performance on backups
Hello guys:

I'm planning a Bacula deployment on AWS in the following weeks. I have some doubts about disk performance for disk-based backups.

Based on the idea that Bacula writes data to big files (e.g. 100 GB per volume), what technical considerations should I have for the underlying storage device?

These are some of the questions I have around it:

- Does it matter a lot choosing XFS instead of ext4 as the filesystem?
- How can I know the amount of IOPS needed for my local disk?
- What does Bacula need most: high IOPS or throughput (MB/s)?
- Based on the previous question, should I choose SSD over HDD disks?
- Is it worth using RAID1 or RAID10 for improving performance?

I was planning to use HDD disks, which offer high throughput (500 MB/s) and up to 500 IOPS per disk (these are "st1" EBS volumes).

By the way, I intend to use an external DB (Amazon RDS) for my Catalog, so my Storage daemon wouldn't share the same underlying storage.

I hope someone can share some ideas about disk performance. I didn't find enough info about this topic on the Internet.

Thanks in advance