Hello,

On Wed, 20 Feb 2019 at 13:29, Josh Fisher <jfis...@pvct.com> wrote:

> Note that posix_fadvise() only affects caching and read-ahead at the OS
> level. While the use of posix_fadvise() may indeed improve i/o performance
> for particular use cases, it is not parallelism and does not cause multiple
> user-space threads to be executed in parallel. I believe that Kern is
> referring to a multi-threaded approach in the bacula-fd, where multiple
> threads are executing in parallel to read and process files.
>
> Also, I believe that bacula-fd already does make use of posix_fadvise().
>

Yes, I mentioned it in my previous email.

> I would think that a reader-writer approach would be possible. A single
> writer thread would perform all i/o with the SD while multiple reader
> threads would read and process single files at a time. A single management
> thread would manage the list of files to be backed up and spawn reader
> threads to process them. This could improve FD performance, particularly
> when compression and/or encryption is being used.
>

This topic has many branches and levels of detail, which causes a lot of misunderstanding, e.g.:
- concurrent data scan (finding what to backup)
- concurrent data read at directory (or filesystem) level
- concurrent data read at file level
- concurrent data read at block level
- concurrent data processing (e.g. compression, see *1 below)
- asynchronous IO for data read (single thread)
- multiple network streams to single storage
- single network stream to multiple storages = multiple network streams
- multiple network streams to multiple storages
- support for high latency networks - single thread
- support for high latency networks - multiple threads
- automatic concurrency scaling (e.g. by the number of available CPUs or by system utilization)
- manual concurrency scaling

*1) You cannot run encryption concurrently (threaded) with the CBC encryption mode used by Bacula, because each block depends on the previous ciphertext block. You could switch to CTR, but the required code does not exist in Bacula, AFAIR.

> I am not sure this approach is always a good thing.
> It depends on the client hardware. When backing up weak clients using
> compression or encryption, it would bring them to their knees, although a
> mechanism to limit the number of reader threads that may be spawned would
> fix that. Also, with weak clients, the real problem is slow disks on the
> clients, and no amount of parallelism will fix that.
>

We have the following chain:
1. find data
2. read data
3. process data
4. send data over the network
5. receive and process data
6. write data
where every single point can slow down the whole process. Some points can be optimized; others are fixed by hardware limitations. The optimization could be achieved with concurrent (threaded) execution or with some clever tricks, e.g. instead of finding files on NFS, ask the network storage to prepare the list of files for you. This is how the BEE Incremental Accelerator works.

best regards
-- 
Radosław Korzeniewski
rados...@korzeniewski.net
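
The reader-writer approach discussed in this thread (a manager handing files to a capped number of reader threads, with one writer doing all output) can be sketched roughly as below. This is only an illustration in Python, not Bacula code: the names `backup`, `process_file`, `writer` and `max_readers` are hypothetical, zlib stands in for the "process data" step, and a plain list stands in for the connection to the SD.

```python
import queue
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor

_DONE = object()  # sentinel telling the writer that no more items will arrive


def process_file(path, out_queue):
    """Reader task: read one file, compress it, hand the result to the writer."""
    with open(path, "rb") as fh:
        data = fh.read()
    out_queue.put((path, zlib.compress(data)))


def writer(out_queue, sink):
    """Single writer thread: the only place that touches the output."""
    while True:
        item = out_queue.get()
        if item is _DONE:
            break
        sink.append(item)  # assumption: stand-in for sending data to the SD


def backup(paths, max_readers=2):
    """Manager: dispatches files to at most max_readers reader threads."""
    # Bounded queue: readers block when the writer falls behind.
    out_queue = queue.Queue(maxsize=max_readers * 2)
    sink = []
    writer_thread = threading.Thread(target=writer, args=(out_queue, sink))
    writer_thread.start()
    try:
        with ThreadPoolExecutor(max_workers=max_readers) as pool:
            futures = [pool.submit(process_file, p, out_queue) for p in paths]
            for future in futures:
                future.result()  # re-raise any error from a reader
    finally:
        out_queue.put(_DONE)
        writer_thread.join()
    return sink
```

Calling `backup(list_of_paths, max_readers=1)` degrades to sequential reading, which is the kind of limit a weak client would need. Results arrive at the writer in completion order, not in the order of `paths`.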
_______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users