Hello Jesper,

I would like to help with testing (especially on Windows), and my company can
provide a small contribution if the patch is accepted.

Regards,

On Feb 10, 2019, at 2:57 PM, jes...@krogh.cc wrote:
>Hi
>
>This has been a nice-to-have for years. We end up breaking up
>volumes artificially to support Bacula because of the single-threaded
>nature of the filedaemon. We have LTO6, spooling and a 10Gbit network.
>When a full backup ends up spanning 3 weeks of run time, it gets very,
>very painful. (Example below)
>
>10-Feb 07:22 bacula-dir JobId 201646: Error: Bacula bacula-dir 7.0.5 (28Jul14):
>  Build OS:               x86_64-pc-linux-gnu ubuntu 16.04
>  JobId:                  201646
>  Job:                    Abe_Daily_RTP.2019-02-01_21.03.30_01
>  Backup Level:           Full (upgraded from Incremental)
>  Client:                 "abe-fd" 7.0.5 (28Jul14) x86_64-pc-linux-gnu,ubuntu,16.04
>  FileSet:                "Abe Set RTP" 2019-01-16 21:03:01
>  Pool:                   "Full-Pool" (From Job FullPool override)
>  Catalog:                "MyCatalog" (From Client resource)
>  Storage:                "LTO-5" (From Job resource)
>  Scheduled time:         01-Feb-2019 21:03:30
>  Start time:             02-Feb-2019 05:38:30
>  End time:               10-Feb-2019 07:22:30
>  Elapsed time:           8 days 1 hour 44 mins
>  Priority:               10
>  FD Files Written:       3,096,049
>  SD Files Written:       0
>  FD Bytes Written:       3,222,203,306,821 (3.222 TB)
>  SD Bytes Written:       0 (0 B)
>  Rate:                   4620.0 KB/s
>  Software Compression:   None
>  VSS:                    no
>  Encryption:             no
>  Accurate:               no
>  Volume name(s):
>005641L5|005746L5|006211L5|006143L5|006125L5|006217L5|006221L5|005100L5|006158L5|006135L5|006175L5|006240L5|005291L5|006297L5|007543L6|007125L6|007180L6|007105L6|005538L5|005050L5|006254L5
>  Volume Session Id:      3874
>  Volume Session Time:    1544207587
>  Last Volume Bytes:      1,964,015,354,880 (1.964 TB)
>  Non-fatal FD errors:    1
>  SD Errors:              0
>  FD termination status:  Error
>  SD termination status:  Running
>  Termination:            *** Backup Error ***
>
>Average filesize is 1MB here...
>
>Underlying disk/filesystems are typically composed of 12/24/36 or more
>spinning disks. Disk systems today really need parallelism to perform.
>
>When dealing with large files, kernel readahead makes things work nicely,
>but when someone dumps 100,000 2KB files it slows down to single-disk
>IOPS speed.
>
>True single job parallelism would of course be awesome - multiple
>spools, multiple drives, multiple streams over a single fileset.
>But that is also very complex.
>
>I have two “suggestions” for less intrusive benefits.
>
>1) When reading a catalog, loop over all files and issue a
>posix_fadvise WILLNEED on the first 1MB of each file.
>
>I have prototyped this outside Bacula and it seems to work very
>nicely; it should be a small, non-intrusive patch. It will allow the IO
>stack to issue reads for the smaller files concurrently, caching them in
>memory. I have inspected the source code and cannot find any sign that
>this is already in place.
>
>2) Thread out the filedaemon
>Implement an X MB buffer in the filedaemon. It could be 16 slots of
>max 5MB; for files smaller than 5MB this serves as a staging area
>for the thread, handing the data over to the master process.
>Yes, this can be tuned in a lot of ways, but most of us with large
>filesystems would easily sacrifice 5-10GB of memory on the server just
>for speeding up this stuff.
>
>This is more intrusive but can be isolated fully to the filedaemon.
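
Suggestion 1 above could look roughly like the sketch below. It is written in
Python for brevity and is not Bacula code; the helper name prefetch_hint and
the PREFETCH_BYTES constant are made up for illustration. It assumes a
platform where os.posix_fadvise is available (Linux and most Unix systems)
and simply does nothing elsewhere.

```python
import os

PREFETCH_BYTES = 1024 * 1024  # first 1MB of each file, as in the suggestion


def prefetch_hint(paths, length=PREFETCH_BYTES):
    """Hint the kernel to start reading the head of each file.

    Returns the number of files for which the hint was issued.
    Hypothetical helper, not part of Bacula.
    """
    if not hasattr(os, "posix_fadvise"):
        return 0  # platform without posix_fadvise: nothing to do
    hinted = 0
    for path in paths:
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue  # vanished or unreadable file: skip it
        try:
            # WILLNEED is advisory and returns without waiting for the
            # data, so the IO stack can work on many files at once.
            os.posix_fadvise(fd, 0, length, os.POSIX_FADV_WILLNEED)
            hinted += 1
        except OSError:
            pass  # the hint is optional; ignore filesystems that reject it
        finally:
            os.close(fd)
    return hinted
```

The backup loop would call this on the next batch of file names before it
starts reading them one by one, so the sequential reader mostly finds the
small files already in the page cache.
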
>
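
Suggestion 2 above could be sketched as follows, again in Python and not as
Bacula code. The names staged_files and _stage, the worker count of 4 and the
use of a thread pool are assumptions for illustration; only the 16 slots of
max 5MB come from the mail. Worker threads read small files into memory while
the single consumer still receives files in their original order.

```python
import os
from collections import deque
from concurrent.futures import ThreadPoolExecutor

SLOTS = 16                    # number of staging slots, as in the suggestion
SLOT_BYTES = 5 * 1024 * 1024  # max 5MB per slot


def _stage(path, slot_bytes):
    """Worker: return the file contents if they fit in one slot, else None."""
    try:
        if os.path.getsize(path) > slot_bytes:
            return None  # too large to stage; consumer streams it itself
        with open(path, "rb") as f:
            return f.read(slot_bytes)
    except OSError:
        return None  # unreadable here; consumer decides how to report it


def staged_files(paths, slots=SLOTS, slot_bytes=SLOT_BYTES, workers=4):
    """Yield (path, data) in input order with at most `slots` files in flight."""
    pending = deque()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for path in paths:
            if len(pending) >= slots:
                # All slots busy: hand the oldest file to the consumer first.
                done_path, future = pending.popleft()
                yield done_path, future.result()
            pending.append((path, pool.submit(_stage, path, slot_bytes)))
        while pending:
            done_path, future = pending.popleft()
            yield done_path, future.result()
```

With these defaults the staging area is bounded at 16 x 5MB = 80MB. Files
that come back with data set to None are the ones the consumer has to read
itself, which keeps the large-file path unchanged.
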
>If someone is willing to help some of this along the way, please let me
>know and let's see if we can make ends meet.
>
>Potentially others would like to co-fund here? I find it unlikely
>that we are the only ones with the need.
>
>Basics of our installation: ~10PB on tape, 0.5PB live data under
>backup, Quantum Scalar i6000 library with 6xLTO6 and 1100 slots.
>
>Our current Bacula catalog has survived since 2006-ish and 5 LTO
>generations, which is pretty impressive in itself.
>
>Jesper
>
>_______________________________________________
>Bacula-devel mailing list
>Bacula-devel@lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/bacula-devel
