Good day,

When benchmarking my backup with and without spooling, it seems that with
spooling the jobs take longer than without, even though the utilization of
the individual resources improves.
My thinking is that this is because Bareos first spools data, then
despools it, then spools again and so on - causing 100% network usage and
100% tape usage in alternation, but never at the same time.
My tape bandwidth and network bandwidth are about the same, around
1 Gbit/s, so this alternation roughly halves the effective throughput:
every byte spends one interval crossing the network and a separate,
non-overlapping interval going to tape.

Instead, I would like to propose a change where the spooling process
creates two spool files, Spool-A and Spool-B, each bound to 50% of
Maximum Spool Size.
When Spool-A fills up, Bareos-SD starts despooling that file while
continuing to spool data into Spool-B.
If Spool-B fills up while Spool-A is still despooling, spooling is paused
until Spool-A is drained. At that moment Spool-B is despooled and Spool-A
takes new data.

This should make sure that the slowest resource is utilized 100% of the
time during the backup job, which should be a significant improvement if
you are running a single large job.
If you are running multiple jobs in parallel, this change will do little
to help, as those jobs already spool and despool to different spool files
and possibly multiple drives.

Thoughts?

I am considering writing a proof-of-concept by modifying spool.cc and,
among other things, moving the intermediate call to DespoolData (
https://github.com/bareos/bareos/blob/7cd54133cd9a4f206259b3612f3ad6ab7add9743/core/src/stored/spool.cc#L547-L550)
into a background thread. Since I haven't touched the Bareos code base
before, I would be grateful for any tips or hints on whether this would be
an acceptable approach for potential upstream inclusion later on.

Regards,

-- 
You received this message because you are subscribed to the Google Groups 
"bareos-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/bareos-users/CADiuDARs3CTzJ0RXgUyet0%3DWVj59Wgz4gonD7ti_R_C6AxnEJA%40mail.gmail.com.
