Re: [Bacula-users] Disk setup for best performance on backups

2020-01-24 Thread Jason Voorhees
Sorry for the late reply, guys; I've been busy these last few days.

I really appreciate all the responses to this thread. I'll read all of
them carefully and take them into consideration for my
deployment.

Hope you all have a nice weekend!

On Wed, Jan 22, 2020 at 11:55 AM Josh Fisher wrote:
>
>
> On 1/22/2020 10:52 AM, dmaziuk via Bacula-users wrote:
> > On 1/22/2020 2:19 AM, Radosław Korzeniewski wrote:
> >
> >> Unless you are using BEE GED or other similar functionality you should
> >> never use the SSD in your backup solution as it will be a pure waste of
> >> money.
> >
> > I'm running a bunch of jobs in parallel and spooling them on an ssd.
> > Works pretty well for the money, but you need to work out how to size it.
>
>
> It makes sense to put Bacula's work directory and any spooling to a
> separate disk of some kind. It prevents writes for spooling, logging,
> etc. from interfering with the sequential nature of the volume data writes.
>
> Another thing that makes an SSD worth the money is that its very fast
> random read/write speeds and IOPS allow running a local DB server for the
> catalog. This is particularly helpful for those of us contending with 1G
> networks, and adding an SSD is far cheaper than upgrading the network to
> 10G.
>
>
> >
> > Dima
> >
> >
> > ___
> > Bacula-users mailing list
> > Bacula-users@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/bacula-users




Re: [Bacula-users] Disk setup for best performance on backups

2020-01-22 Thread Josh Fisher


On 1/22/2020 10:52 AM, dmaziuk via Bacula-users wrote:
> On 1/22/2020 2:19 AM, Radosław Korzeniewski wrote:
>
>> Unless you are using BEE GED or other similar functionality you should
>> never use the SSD in your backup solution as it will be a pure waste of
>> money.
>
> I'm running a bunch of jobs in parallel and spooling them on an ssd.
> Works pretty well for the money, but you need to work out how to size it.

It makes sense to put Bacula's work directory and any spooling on a
separate disk of some kind. It prevents writes for spooling, logging,
etc. from interfering with the sequential nature of the volume data writes.

Another thing that makes an SSD worth the money is that its very fast
random read/write speeds and IOPS allow running a local DB server for the
catalog. This is particularly helpful for those of us contending with 1G
networks, and adding an SSD is far cheaper than upgrading the network to
10G.

> Dima




Re: [Bacula-users] Disk setup for best performance on backups

2020-01-22 Thread dmaziuk via Bacula-users

On 1/22/2020 2:19 AM, Radosław Korzeniewski wrote:

> Unless you are using BEE GED or other similar functionality you should
> never use the SSD in your backup solution as it will be a pure waste of
> money.

I'm running a bunch of jobs in parallel and spooling them on an ssd.
Works pretty well for the money, but you need to work out how to size it.

Dima




Re: [Bacula-users] Disk setup for best performance on backups

2020-01-22 Thread Radosław Korzeniewski
Hello,

On Mon, 20 Jan 2020 at 16:57, Jason Voorhees wrote:

> - Does it matter a lot choosing XFS instead of ext4 as filesystem?
>

IMVHO, yes. :)


> - How can I know the amount of IOPS needed for my local disk?
>

You can calculate the value from the required throughput and the expected
block size.
But it is very tricky, as Bacula generates a sequential stream of 64k blocks
during backup, which the underlying VFS/FS/OS can consolidate for best
performance on a particular disk device.
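
As a rough illustration of the calculation described above: IOPS is simply
throughput divided by the size of each I/O. The sketch below is only an
assumption-laden example, not anything from the Bacula documentation. The
function name is made up, the 125 MB/s target is an arbitrary figure, and the
1 MiB case stands in for whatever merged I/O size the OS actually ends up
issuing.

```python
def required_iops(throughput_mb_s: float, io_size_kib: float) -> float:
    """IOPS needed to sustain a throughput (decimal MB/s) at a fixed I/O size (KiB)."""
    bytes_per_second = throughput_mb_s * 1_000_000
    bytes_per_io = io_size_kib * 1024
    return bytes_per_second / bytes_per_io


# Worst case: every 64 KiB block reaches the disk as its own I/O.
print(round(required_iops(125, 64)))    # 1907

# Assumed best case: the OS merges the stream into 1 MiB I/Os.
print(round(required_iops(125, 1024)))  # 119
```

The large gap between the two results is why the consolidation done by the
VFS/FS/OS makes the real IOPS requirement hard to predict.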


> - What does Bacula need most: high IOPS or throughput (MB/s)?
>

For standard Bacula backup it is throughput as described above but for
Bacula Enterprise with GED plugin it would be IOPS in some components.


> - Based on the previous question, should I choose SSD over HDD disks?
>

Unless you are using BEE GED or other similar functionality you should
never use the SSD in your backup solution as it will be a pure waste of
money.


> - Is it worth using RAID1 or RAID10 for improving performance?
>

A simple RAID1 will never improve performance during writes, never ever.
RAID10 could or even should improve performance but its "redundancy factor"
will always be 2x.
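
To make the 2x "redundancy factor" concrete, here is a minimal sketch of the
usable capacity of a RAID10 array, assuming plain two-way mirrors and equally
sized disks. The function name and the disk sizes are hypothetical, chosen
only for the example.

```python
def raid10_usable_tb(disk_count: int, disk_size_tb: float) -> float:
    """Usable capacity of a RAID10 array built from two-way mirrors."""
    if disk_count < 4 or disk_count % 2:
        raise ValueError("RAID10 needs an even number of disks, at least 4")
    # Every block is stored twice, so half of the raw capacity is usable.
    return disk_count * disk_size_tb / 2


print(raid10_usable_tb(4, 2.0))   # 4.0 TB usable out of 8 TB raw
```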


> By the way, I intend to use an external DB (Amazon RDS) for my
> Catalog, so my Storage daemon wouldn't share the same underlying
> storage.
>

Bacula supports only the SQL_ASCII database encoding.


> I hope someone can share some ideas about disk performance.
>
> I didn't find enough info about this topic on the Internet. Thanks in advance
>

There are a number of white papers about designing disk-based backups for
Bacula on the website: https://www.bacula.org/white-papers/
Did you check them?

best regards
-- 
Radosław Korzeniewski
rados...@korzeniewski.net


Re: [Bacula-users] Disk setup for best performance on backups

2020-01-21 Thread Josh Fisher


On 1/20/2020 10:56 AM, Jason Voorhees wrote:
> Hello guys:
>
> I'm planning a Bacula deployment on AWS in the coming weeks. I have
> some questions about disk performance for disk-based backups.

See
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/disk-performance.html

> Based on the idea that Bacula writes data to big files (e.g. 100 GB
> per volume), what technical considerations should I have for the
> underlying storage device? These are some of the questions I have
> about it:
>
> - Does it matter a lot choosing XFS instead of ext4 as filesystem?

XFS

> - How can I know the amount of IOPS needed for my local disk?

It depends on several things, like how much data there is and how long it
is acceptable for the backups to run. It also depends greatly on the speed
of the clients being backed up. You can run concurrent jobs, but it also
depends on network speed. For example, if your VPC is on a 1 Gbps
network, then max network throughput, regardless of concurrency, is 125
MB/s.
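
The 125 MB/s figure is just the 1 Gbps line rate divided by 8 bits per byte.
A minimal sketch of that arithmetic, and of what it implies for the backup
window, might look like this. The function names and the 1 TB data size are
made up for illustration, and protocol overhead is ignored, so real
throughput will be somewhat lower.

```python
def max_throughput_mb_s(link_gbps: float) -> float:
    """Upper bound on throughput in MB/s for a link speed in Gbps (decimal units)."""
    return link_gbps * 1000 / 8


def backup_hours(data_gb: float, link_gbps: float) -> float:
    """Best-case hours to move data_gb over the link, ignoring all overhead."""
    seconds = data_gb * 1000 / max_throughput_mb_s(link_gbps)
    return seconds / 3600


print(max_throughput_mb_s(1))           # 125.0
print(round(backup_hours(1000, 1), 2))  # 2.22 hours for 1 TB at line rate
```

Running more jobs concurrently does not raise this ceiling; it only helps
reach it when individual clients are slower than the link.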



> - What does Bacula need most: high IOPS or throughput (MB/s)?
> - Based on the previous question, should I choose SSD over HDD disks?
> - Is it worth using RAID1 or RAID10 for improving performance?
>
> I was planning to use HDDs, which offer high throughput (500
> MB/s) and up to 500 IOPS per disk (these are "st1" EBS volumes).
>
> By the way, I intend to use an external DB (Amazon RDS) for my
> Catalog, so my Storage daemon wouldn't share the same underlying
> storage.

I would not think RAIDing the EBS volumes would be needed. The backing
store for those volumes is already RAID. Also, SSD for the volume files
would be very expensive and not needed unless your VPC is on a 10 Gbps
network and your client VPC instances can exceed the 500 MB/s throughput
of the HDD. It would be better to add a separate, smaller SSD virtual
disk to use for the Bacula work directory and for spooling attributes if
using an RDS DB. This would prevent writes to the work directory and
attribute spooling from affecting sequential writes to the volume files.
The volume files are effectively sequential writes, so they really
should be on a separate disk that is not being written to by anything
other than the Bacula SD.

> I hope someone can share some ideas about disk performance.
>
> I didn't find enough info about this topic on the Internet. Thanks in advance



Re: [Bacula-users] Disk setup for best performance on backups

2020-01-20 Thread Phil Stracchino
On 2020-01-20 10:56, Jason Voorhees wrote:
> 
> - Does it matter a lot choosing XFS instead of ext4 as filesystem?

It is worth noting that even Red Hat, long the champion of ext*, has
officially abandoned it and will do no further development on it.  (And
frankly, it didn't come a day too soon.)  I haven't run an ext*
filesystem in years; every Unix filesystem I have is either XFS or ZFS.

> - How can I know the amount of IOPS needed for my local disk?
> - What does Bacula need most: high IOPS or throughput (MB/s)?

Well, to a certain extent it's the same thing, at the storage level.
But consider that you are going to be doing mostly long streaming reads
and writes (a use case which the design of XFS was specifically
optimized for when SGI developed it).

That said, for Bacula storage purposes, what you care about is
throughput and stability above pure IOPS.

> - Is it worth using RAID1 or RAID10 for improving performance?

On AWS it's all RAID under the hood anyway.  RAIDing virtual RAIDs
probably doesn't gain you much.

That said, in the database and backup world it's generally considered
that RAID10 is best for performance.



-- 
  Phil Stracchino
  Babylon Communications
  ph...@caerllewys.net
  p...@co.ordinate.org
  Landline: +1.603.293.8485
  Mobile:   +1.603.998.6958




Re: [Bacula-users] Disk setup for best performance on backups

2020-01-20 Thread William Muriithi
Hi

> I'm planning a Bacula deployment on AWS in the coming weeks. I have some
> questions about disk performance for disk-based backups.

I use tapes, so you should take my response with a grain of salt. A question,
though: how does one protect the backups from being damaged if one is
compromised and there is no air gap? If you look at Riviera Beach, for
example, they did have backups, but ransomware encrypted them too.

> - Does it matter a lot choosing XFS instead of ext4 as filesystem?

It really doesn't make a big difference which filesystem you choose if your
files are large. But if you have a lot of tiny files, use XFS, as it
allocates metadata space dynamically.

> - How can I know the amount of IOPS needed for my local disk?

You need to test; every setup has different IOPS requirements. But anyway, if
you want a generic answer, it depends on how many concurrent jobs you are
willing to run. And what do you mean by local? On the Bacula storage? On
the Bacula client? How many Bacula clients do you have? Better define these
details to get a better answer from the rest of the team.

> - What does Bacula need most: high IOPS or throughput (MB/s)?

Again, it depends. In general, I would say full backups need throughput and
differentials/incrementals need more IOPS. But for the storage, I guess IOPS,
irrespective of the type of jobs, if you are running many concurrent jobs?

> - Based on the previous question, should I choose SSD over HDD disks?

Use HDD; SSDs are too expensive for backups, in my humble opinion.

> - Is it worth using RAID1 or RAID10 for improving performance?

It wouldn't make a difference in my opinion; Bacula-level details would make
more of a difference.

> By the way, I intend to use an external DB (Amazon RDS) for my Catalog, so my
> Storage daemon wouldn't share the same underlying storage.

If you are using MySQL and you have a lot of files, make sure to have a big
temp space. During pruning, MySQL generates some really odd queries that can
fill up most ramdisks. Use SHOW VARIABLES LIKE 'tmpdir'; to find out what you
are currently using.

> I hope someone can share some ideas about disk performance.

You really seem intent on optimizing your disk performance. Consider dm-cache
then. Pretty effective in my experience.

Regards,
William




[Bacula-users] Disk setup for best performance on backups

2020-01-20 Thread Jason Voorhees
Hello guys:

I'm planning a Bacula deployment on AWS in the coming weeks. I have
some questions about disk performance for disk-based backups.

Based on the idea that Bacula writes data to big files (e.g. 100 GB
per volume), what technical considerations should I have for the
underlying storage device? These are some of the questions I have
about it:

- Does it matter a lot choosing XFS instead of ext4 as filesystem?
- How can I know the amount of IOPS needed for my local disk?
- What does Bacula need most: high IOPS or throughput (MB/s)?
- Based on the previous question, should I choose SSD over HDD disks?
- Is it worth using RAID1 or RAID10 for improving performance?

I was planning to use HDDs, which offer high throughput (500
MB/s) and up to 500 IOPS per disk (these are "st1" EBS volumes).

By the way, I intend to use an external DB (Amazon RDS) for my
Catalog, so my Storage daemon wouldn't share the same underlying
storage.

I hope someone can share some ideas about disk performance.

I didn't find enough info about this topic on the Internet. Thanks in advance

