If you are writing to tape, spooling is a big help, provided you can afford the space to spool entire jobs. We spool the data to disk before writing it to LTO-6 tape.

Also, note that there are two different types of spooling: data and attributes. Data spooling is, as you described, spooling to disk/SSD before writing to tape. Attribute spooling covers the metadata inserts into your database: it stores those up and then does them all at once.

We've found that spooling attributes is almost always good, so we set that as the default.
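
To illustrate the idea of storing the inserts up and doing them all at once, here is a toy sketch in Python with SQLite. It is not how Bareos implements attribute spooling; the `file_attrs` table, the `insert_spooled` function and the sample rows are all made up for the example.

```python
import sqlite3


def insert_spooled(conn, attributes):
    """Insert all buffered attribute rows in a single transaction."""
    # "with conn" commits once at the end (or rolls back on error),
    # instead of paying for one commit per file.
    with conn:
        conn.executemany(
            "INSERT INTO file_attrs (path, size) VALUES (?, ?)",
            attributes,
        )


# Hypothetical in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_attrs (path TEXT, size INTEGER)")

# Attributes collected while the job runs, written only at the end.
spool = [("/etc/hosts", 158), ("/etc/fstab", 465)]
insert_spooled(conn, spool)

count = conn.execute("SELECT COUNT(*) FROM file_attrs").fetchone()[0]
```

The point is only that the database sees one batch at the end of the job rather than a steady trickle of small inserts while data is being written.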

For spooling on disk-to-disk backups, I think it depends more on how many backups you have running concurrently. We don't use spooling for disk-based backups.

On our tape backups, which are virtual full backups, we use spooling to increase throughput. We run 5 parallel virtual-full backups that are spooling to disk, but only one de-spools at a time to tape. This tends to keep the tape drive streaming more often than not, and the parallel virtual-full backups let the whole thing finish faster.
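
A rough back-of-the-envelope model of why the parallel jobs help, as a Python sketch. It assumes that a job cannot spool while its own data is being despooled, that only one job despools at a time, and that the spool disk is never the bottleneck. The 100 MB/s per-job spool rate is a made-up figure; 160 MB/s is the despool rate quoted in the original post below.

```python
def tape_utilization(jobs, spool_mb_s, despool_mb_s):
    """Upper bound on the fraction of time the tape drive is writing.

    Each job alternates between spooling and despooling the same amount
    of data, so on its own it keeps the drive busy for
    despool_time / (spool_time + despool_time) of the time.
    """
    spool_time = 1.0 / spool_mb_s      # seconds to spool 1 MB
    despool_time = 1.0 / despool_mb_s  # seconds to despool 1 MB
    per_job = despool_time / (spool_time + despool_time)
    # The drive cannot be more than 100% busy, however many jobs queue up.
    return min(1.0, jobs * per_job)


single = tape_utilization(jobs=1, spool_mb_s=100, despool_mb_s=160)
parallel = tape_utilization(jobs=5, spool_mb_s=100, despool_mb_s=160)
```

With those numbers a single job keeps the drive writing roughly 38% of the time, while five jobs are enough to reach the upper bound of 100% in this simple model. Real runs will fall short of that because the jobs do not line up perfectly.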

We aren't letting the tape drive do any compression; with today's CPUs, we let Bareos do that work.

On 05/11/18 15:16, DB10 wrote:
We're just starting out with Bareos, coming from Legato (sheesh $$$). I’ve been 
running this in parallel with our old system for a couple of weeks and thus 
far I’m happy with how my tests are going, and it dumps all over the old kit, 
but…

I thought I would post to solicit views/thoughts on improving performance/reducing 
the time taken for some big multi-TB jobs we have.

Currently, we back up everything to disk, full, incremental etc.

Set up is
10 Core Xeon, 2.4GHz
128GB RAM,
2 x SATA RAID1 OS drives,
36 x 10TB SAS-3 RAID10 "storage",
2 x SATA SSD RAID1 "cache",
2 x LTO7 drives in an Overland NEO80 Autochanger,
OS = CentOS 7,
Bareos = 17.2.4 installed via repo,
and using postgres DB.
2 x 10GbE LAN

Disk backups over LAN run plenty quick enough. No issue there.

Periodically, to free up disk, we run migrate jobs using SQL select to migrate 
full sets over to tape. Data spooling is turned on and we spool to the RAID-1 SSD, 
which then despools to LTO7. These jobs are typically >6TB, with one in 
particular nearly hitting the 20TB mark.

Now, the first question I have, is there any point in spooling from disk to 
disk, then despooling to tape?

I ask because, the array backing the full volumes is capable of upwards of 
600MB/s reads (according to bonnie++) , so read performance shouldn't be an 
issue. The SSD used for the spooling does ~450MB/s reads so is actually slower 
than the main array (SAS-3, RAID10, lots of spindles, yay!). So is there any 
point in wasting the time spooling?

We spool at a max size of 200G, and get ~160MB/s off to LTO using this method. I know 
LTO7 claims it'll do 300MB/s, but I suspect that is with compression and all zeros. I was 
hoping for a bit (or a lot!) more from our setup. Is 160MB/s comparable with what others 
are seeing with LTO7 and "real" data?

Our full backup pool consists of 100GB volumes, so it spools 2 at a time to 
SSD, then to tape. *most* of the full jobs aren't interleaved with other hosts, 
and certainly the big ones aren’t, so the volumes should contain only files 
from the given job.

So in this scenario, would turning off spooling slow things down or improve 
things?

Another part to this is that the defaults for our full jobs use a fileset with:

Compression = LZ4

Will the migration job use this (Compression = None in the migration job), or 
will it uncompress as it spools?

If it keeps the compression, would disabling compression on the LTO drives 
speed things up a bit?

Sorry for the long post and TIA

Dan

