Re: [Bacula-users] Virtual tapes or virtual disks

2022-01-26 Thread Josip Deanovic

On 2022-01-26 20:13, dmitri maziuk wrote:

On 2022-01-26 12:57 PM, Josip Deanovic wrote:


The number of files per directory is far bigger and is unlikely to
get reached, especially not for this use case.


The limit is one thing, the scaling is another. I agree: 40TB of 10GB
files is not enough to see the slow-down on any modern system, you'd
need an order of magnitude more files to get there. Still it's
something to be aware of when deciding on volume size.


40 TB is 40960 GB, which would give 4096 files of 10 GB each.
An order of magnitude more would be 40960 files, which is still nothing.
Right now on my laptop I have 291794 files and 34481 directories
and that's only under /usr.
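
As a quick check of the arithmetic above, here is a minimal sketch. The function name is made up for illustration, and binary units (1 TB = 1024 GB) are assumed, matching the figures in this message.

```python
# Rough count of fixed-size file volumes that fit on an array.
# Assumption: binary units, i.e. 1 TB = 1024 GB.

def volume_count(array_tb: int, volume_gb: int) -> int:
    """Return how many whole volumes of volume_gb fit into array_tb."""
    return (array_tb * 1024) // volume_gb


count = volume_count(40, 10)
print(count)       # number of 10 GB volumes on a 40 TB array
print(count * 10)  # an order of magnitude more files
```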

I had systems with hundreds of millions of files on UFS2 (FreeBSD)
and systems with billions of files on ext3 (Linux) and that was like
15 years ago.

As far as I can remember there were no issues with read/write
performance related to the number of files. The issue was backup
which would take a lot of time to traverse the whole file system.
This is a problem common to all hierarchical databases without
some kind of indexing employed to deal with the issue.
As long as the full path of a file is known, I don't think the
read/write performance of a file would change noticeably with
the increase of number of files on the file system.

Modern file systems are using directory indexing so even
searching through a file system doesn't take too long but
it's common sense that the time needed to perform a lookup
would increase (not necessarily linearly) with the number of
files on the file system.

In any case, Bacula knows the path names of the file volumes
and doesn't need to search the file system. I can't imagine
the setup where the number of files on the local file system
containing Bacula file volumes would pose a problem.


Regards!

--
Josip Deanovic


___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Virtual tapes or virtual disks

2022-01-26 Thread dmitri maziuk

On 2022-01-26 12:57 PM, Josip Deanovic wrote:


The number of files per directory is far bigger and is unlikely to
get reached, especially not for this use case.


The limit is one thing, the scaling is another. I agree: 40TB of 10GB 
files is not enough to see the slow-down on any modern system, you'd 
need an order of magnitude more files to get there. Still it's something 
to be aware of when deciding on volume size.


Dima




Re: [Bacula-users] Virtual tapes or virtual disks

2022-01-26 Thread Thomas Lohman


I have a RAID5 array of about 40TB in size. A separate RAID 
controller card handles the disks. I'm planning to use the normal ext4 
file system. It's standard and well known, though most probably not the 
fastest. That will not have any great impact, as there is a 4TB 
NVMe SSD drive, which takes the edge off the slow physical disk 
performance.



Hi,

I'd recommend if you're going to use RAID that you at least use a RAID-6 
configuration.  You don't want to risk losing all your backups if you 
have a drive fail and then during the rebuilding of the RAID-5, you 
happen to have another drive failure/error.


cheers,

--tom






Re: [Bacula-users] Virtual tapes or virtual disks

2022-01-26 Thread Josip Deanovic

On 2022-01-26 18:42, dmitri maziuk wrote:

If you use actual disks as "magazines" with vchanger, you need to
pre-label the volumes. If you use just one big filesystem, you can let
bacula do it for you (last I looked that functionality didn't work w/
autochangers).

If you use disk "magazines" you also need to consider the whole-disk
failure. If you use one big filesystem, use RAID (of course) to guard
against those. But then you should look at the number of file volumes:
some filesystems handle large numbers of directory entries better than
others and you may want to balance the volume file size vs the number
of directory entries.


Regarding the number of directory entries...
It is common to see a file system limit of 32000 subdirectories per
directory.
The limit on the number of files per directory is far higher and is
unlikely to be reached, especially not for this use case.


Regards!

--
Josip Deanovic




Re: [Bacula-users] Virtual tapes or virtual disks

2022-01-26 Thread dmitri maziuk

On 2022-01-26 11:59 AM, Peter Milesson via Bacula-users wrote:



...
I have a RAID5 array of about 40TB in size. A separate RAID 
controller card handles the disks. I'm planning to use the normal ext4 
file system. It's standard and well known, though most probably not the 
fastest. That will not have any great impact, as there is a 4TB NVMe SSD 
drive, which takes the edge off the slow physical disk performance.


Yeah, we gave up on hardware RAID controllers long ago, but YMMV.

As for SSDs, if you spool the jobs you can run them in parallel to 
spool->volume stream. You'd have to look at the numbers for your setup 
but generally despooling off the SSD over the bus runs just fine while 
clients are spooling to it over the network.


Dima





Re: [Bacula-users] Virtual tapes or virtual disks

2022-01-26 Thread Josip Deanovic

On 2022-01-26 18:06, Peter Milesson via Bacula-users wrote:

I'm used to fixed volume sizes from the tape drives, I feel
comfortable with it, and I do not need to relearn a lot to configure
the Bacula system. The only thing I haven't found out is how to
preallocate the number of volumes needed. Maybe there is no need, if
the volumes are created automagically. Most of the RAID array will be
used by Bacula, just leaving a couple of percent as free space. When
using mhvtl, I started a script with the tape size and number of tapes
I wanted, and the corresponding tape directories and volumes were
created on the fly.

Thanks Josip!


You are welcome.
I would like to point out that different requirements people may
have will dictate different approaches.

Regarding preallocation of the volumes, if there is a way to do it
I am not aware of it.

However, if you define maximum volume size and the maximum number
of volumes in the pool, you should be able to calculate the space
needed. Just leave some free space, e.g. twice the size of a volume, and you
should be good. Later, when you use all the volumes you will see
if there is enough space to create yet another volume.
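
The calculation described above can be sketched as follows. This is only an illustration: the function names and the two-volume headroom default are assumptions taken from the rule of thumb in this message, and the inputs are meant to mirror what you would set in the pool as the maximum volume size and the maximum number of volumes.

```python
# Sizing a pool of fixed-size file volumes, assuming all sizes are in GB.
# headroom_volumes is the free space to keep, expressed in volume sizes.

def pool_space_gb(max_volume_gb: int, max_volumes: int,
                  headroom_volumes: int = 2) -> int:
    """Disk space to reserve for the pool, including the free headroom."""
    return max_volume_gb * (max_volumes + headroom_volumes)


def max_volumes_for(array_gb: int, max_volume_gb: int,
                    headroom_volumes: int = 2) -> int:
    """Largest volume count that still leaves the headroom free."""
    return array_gb // max_volume_gb - headroom_volumes


volumes = max_volumes_for(40960, 10)
print(volumes)                     # volumes that fit on a 40960 GB array
print(pool_space_gb(10, volumes))  # space used once the pool is full, plus headroom
```

Working backwards from the array size gives a volume count you can enter in the pool definition; if you want to keep a couple of percent free instead, raise the headroom accordingly.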

You can choose to label volumes yourself or leave that to Bacula.
It's up to you.

If you intend to recycle your volumes automatically, make sure that
your retention periods are short enough to expire before all the
volumes are used. Otherwise Bacula will not be able to perform backups.
The alternative would be to force the recycling of the oldest volume,
but this doesn't happen by default; the option must be explicitly
turned on.
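
A rough way to check the retention point above is sketched here. It is a deliberate simplification, not how Bacula itself decides anything: it assumes a constant amount of backup data written per day and that a volume becomes reusable as soon as the retention period has passed. All names and figures are hypothetical.

```python
import math

# Simplified model: every day daily_backup_gb is written, and data must be
# kept for retention_days before its volumes can be recycled.

def volumes_in_use(daily_backup_gb: float, retention_days: int,
                   volume_gb: float) -> int:
    """Volumes holding data that is still inside the retention period."""
    return math.ceil(daily_backup_gb * retention_days / volume_gb)


def pool_is_sufficient(daily_backup_gb: float, retention_days: int,
                       volume_gb: float, max_volumes: int) -> bool:
    """True if volumes expire before the pool runs out of volumes."""
    needed = volumes_in_use(daily_backup_gb, retention_days, volume_gb)
    return needed <= max_volumes


# Hypothetical figures: 500 GB per day, 10 GB volumes, 4094 volumes in the pool.
print(pool_is_sufficient(500, 60, 10, 4094))  # 60-day retention
print(pool_is_sufficient(500, 90, 10, 4094))  # 90-day retention
```

Real schedules mix full and incremental jobs, so the daily figure should be an average over a complete backup cycle.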


Regards!

--
Josip Deanovic




Re: [Bacula-users] Virtual tapes or virtual disks

2022-01-26 Thread Peter Milesson via Bacula-users




On 26.01.2022 18:42, dmitri maziuk wrote:

On 2022-01-26 11:06 AM, Peter Milesson via Bacula-users wrote:



...
Your explanation of the reasoning for using smaller file 
volumes is much appreciated. 

...

The only thing I haven't found out is how to preallocate the number 
of volumes needed. Maybe there is no need, if the volumes are created 
automagically. Most of the RAID array will be used by Bacula, just 
leaving a couple of percent as free space.


If you use actual disks as "magazines" with vchanger, you need to 
pre-label the volumes. If you use just one big filesystem, you can let 
bacula do it for you (last I looked that functionality didn't work w/ 
autochangers).


If you use disk "magazines" you also need to consider the whole-disk 
failure. If you use one big filesystem, use RAID (of course) to guard 
against those. But then you should look at the number of file volumes: 
some filesystems handle large numbers of directory entries better than 
others and you may want to balance the volume file size vs the number 
of directory entries.


For single filesystem, I suggest using ZFS instead of a traditional 
RAID if you can: you can later grow it on-line by replacing disks w/ 
bigger ones when (not if) you need to.


Dima


Thanks for your input Dima.

I have a RAID5 array of about 40TB in size. A separate RAID 
controller card handles the disks. I'm planning to use the normal ext4 
file system. It's standard and well known, though most probably not the 
fastest. That will not have any great impact, as there is a 4TB NVMe SSD 
drive, which takes the edge off the slow physical disk performance.


Best regards,

Peter






Re: [Bacula-users] Virtual tapes or virtual disks

2022-01-26 Thread dmitri maziuk

On 2022-01-26 11:06 AM, Peter Milesson via Bacula-users wrote:



...
Your explanation of the reasoning for using smaller file volumes 
is much appreciated. 

...

The only thing I haven't found out is how to preallocate the 
number of volumes needed. Maybe there is no need, if the volumes are 
created automagically. Most of the RAID array will be used by Bacula, 
just leaving a couple of percent as free space.


If you use actual disks as "magazines" with vchanger, you need to 
pre-label the volumes. If you use just one big filesystem, you can let 
bacula do it for you (last I looked that functionality didn't work w/ 
autochangers).


If you use disk "magazines" you also need to consider the whole-disk 
failure. If you use one big filesystem, use RAID (of course) to guard 
against those. But then you should look at the number of file volumes: 
some filesystems handle large numbers of directory entries better than 
others and you may want to balance the volume file size vs the number of 
directory entries.


For single filesystem, I suggest using ZFS instead of a traditional RAID 
if you can: you can later grow it on-line by replacing disks w/ bigger 
ones when (not if) you need to.


Dima




Re: [Bacula-users] Virtual tapes or virtual disks

2022-01-26 Thread Peter Milesson via Bacula-users



On 26.01.2022 0:02, Josip Deanovic wrote:

On 23.01.2022 11:37, Radosław Korzeniewski wrote:

Hello,

On Fri, 21 Jan 2022 at 14:22, Peter Milesson via Bacula-users wrote:


    If somebody has experience with disk-based, multi-volume Bacula
    backup, I would be grateful for some information (tips, what to
    expect, pitfalls, etc.).


The best approach IMVHO (and not only mine) is to configure one job = one 
volume. You get no real benefit from limiting the size of a single 
volume.
In the single volume = single job configuration you can set up job 
retention very easily, as purging a volume will purge a single job only.
There is no need to wait for a particular volume to fill up before 
retention starts. Purging a volume affects a single job only. And finally 
you end up with far fewer volumes than when limiting the volume 
size to e.g. 10G.



There are many different approaches which can fit different requirements.
I don't see the benefit of having a single job per volume as Bacula
is tracking media, files, jobs and everything else.
That's why Bacula has a catalog which allows the backup system
to determine the location and state of volumes, jobs, files, etc.

To logically separate backup data I use pools and leave the rest
to Bacula.

When Bacula needs a particular file volume, if it's available Bacula
will simply use it and if it's not, or if we are using a tape volume
which is currently not in the tape drive/library, Bacula will ask
for the volume by name.

The number of smaller file volumes (e.g. 10GB) is not an issue as
Bacula is handling them correctly and automatically (provided that
Bacula is correctly configured, of course).


I'll go through a few examples where smaller file volumes (e.g. 10GB)
could prove useful:

1. If the catalog database gets corrupted or completely lost,
   the small size makes it easier and faster to handle the volumes
   and determine which ones contain the database backup.
   That makes the process of importing the data into a new
   catalog database using a tool such as bscan easier.

2. Similar to 1), it is easier to manage small file volumes and
   extract particular jobs from a volume using the bextract tool.

3. If the space is an issue (as it usually is), bigger volumes
   tend to eat more space which cannot be reused (volume
   cannot be recycled) as long as the volume contains a single
   job we want to preserve.

4. Although I don't like that approach, sometimes people choose
   to sync or copy whole file volumes to a secondary location
   using the usual tools such as rsync, cp and similar.
   In such case it is better to keep file volumes small.

5. When recycling a file volume, it takes longer to wipe a
   bigger file volume. A smaller volume takes less time to
   recycle, leaving more time windows in which other tasks can
   benefit from I/O performance. With large file volumes, all
   other tasks would have to fight for the opportunity to access
   the file system, and that gets more obvious when a slow
   network file system is being used.

6. In case of any kind of corruption of a file volume due to
   file system corruption or damage in transport, it is
   likely that less data will be lost with a smaller
   file volume. And again, it's easier to handle a smaller file
   volume when trying to recover pieces of data.


Regards!


Great Post!

Your explanation of the reasoning for using smaller file volumes 
is much appreciated. The truth is, most files are fairly small, 
particularly files created by office users. They range from a few kilobytes 
up to some tens of megabytes. Videos can be huge, but I guess most 
companies handle instruction videos and similar, not full-blown 
movies. This type of content very seldom exceeds 1 GB. So a 10 GB volume 
limit seems to be a good balance.


I'm used to fixed volume sizes from the tape drives, I feel comfortable 
with it, and I do not need to relearn a lot to configure the Bacula 
system. The only thing I haven't found out is how to preallocate the 
number of volumes needed. Maybe there is no need, if the volumes are 
created automagically. Most of the RAID array will be used by Bacula, 
just leaving a couple of percent as free space. When using mhvtl, I 
started a script with the tape size and number of tapes I wanted, and 
the corresponding tape directories and volumes were created on the fly.


Thanks Josip!

Best regards,

Peter




[Bacula-users] upgrade from 9.4.2 to 11.0.5

2022-01-26 Thread Lionel PLASSE
Hello,

Simple question,

On Debian 11, for an upgrade from v9.4.2 (MySQL + Baculum) to v11, are there any 
difficulties in doing it from source? I have always waited for the Debian repository 
version, and they are still on 9.4.2... Should I upgrade to 9.6 first? 

Has anybody already done this successfully? 

I think I must upgrade the FD client on Windows systems too?


