On Fri, 2011-01-28 at 11:49 +0100, Sebastian J. Bronner wrote:
> Hi Eric,
>
> thanks for your input.
>
> Patching block-fuse didn't work because the file handle passed to the
> application by the kernel was created without O_DIRECT in every case,
> despite the inclusion of O_DIRECT in the proper block-fuse function.
>
> Patching rdiff-backup to support O_DIRECT would also not help us: we
> discovered that it is not possible to open files on a FUSE volume
> with O_DIRECT. The attempt is rejected with "Invalid argument"
> (EINVAL).
>
> However, inspired by the specs you mentioned in your setup, we finally
> found a mention of the NFS mount flag '-o noac'. Since we are backing
> up from RAID-1 to NFS, this was interesting.
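[The EINVAL rejection described above is easy to probe from userspace
without patching anything. A minimal Python sketch -- the helper name is
ours, and the FUSE path in the example is hypothetical:]

```python
import errno
import os

def try_open_direct(path):
    """Attempt to open `path` with O_DIRECT.

    Returns 0 on success, or the errno of the failure. On many FUSE
    filesystems the open is refused with errno.EINVAL, matching the
    "Invalid argument" error reported above.
    """
    try:
        fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    except OSError as e:
        return e.errno
    os.close(fd)
    return 0

# Example (hypothetical mount point):
#   print(errno.errorcode.get(try_open_direct("/mnt/blockfuse/vm.img"), "OK"))
```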
Interesting! Yes, that would do the trick!

It may be useful to patch blockfuse with a rate-limit option, thus
slowing the read speed out of blockfuse and preventing write-side
saturation.

-Eric

> Principally, this flag is meant to disable attribute caching, but it
> has the nice side effect of throttling the read speed of the copy
> process to match its write speed when writing to NFS. This way, the
> output cache (buffers) doesn't fill up with unwritten data, so the
> system never reaches a point where it has to flush to disk before it
> can free buffers and allocate RAM to another process.
>
> This is exactly what we need, and the procedure seems to be working
> for us now without degrading server performance.
>
> Thanks again.
>
> Cheers,
> Sebastian
>
>
> On 27.01.2011 21:10, Eric Wheeler wrote:
> >> Hi Eric,
> >
> > Hi Sebastian,
> >
> > I'm cc'ing the rdiff-backup-users list too; they may have some
> > insight as well.
> >
> >> on LVM snapshots and came across your blog and your articles in
> >> that regard:
> >>
> >> http://www.globallinuxsecurity.pro/blog.php?q=rdiff-backup-lvm-snapshot
> >>
> >> I'm very impressed both with your rdiff-backup patch and the
> >> block-fuse application.
> >
> > I'm glad you find it useful! Unfortunately, I have found that the
> > sparse-destination patch for rdiff-backup is sometimes slow. I'm
> > running without sparse files until I can figure out a faster way to
> > detect blocks of 0-bytes. If you or someone on the list knows Python
> > better than I do, please take a look!
> >
> >> Since you mentioned that you use this combination to back up images
> >> of up to 350GB, I am interested to find out whether you have
> >> encountered problems with I/O wait.
> >
> > I'm using blockfuse+rdiff-backup after business hours, so if the VM
> > slows down, nobody (or very few people) notice. The server runs
> > 4x 1TB drives in RAID-10, and block I/O peaks at ~225MB/sec. That
> > 350GB volume was recently extended to 600GB.
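[On the zero-block detection question quoted above: scanning a block
byte-by-byte in Python is slow, but delegating the comparison to
C-level bytes operations is much faster. A sketch -- the function name
and the 64KB comparison-buffer size are our own choices, not anything
from the rdiff-backup patch:]

```python
# Preallocated all-zero buffer, sized to the largest block we expect.
_ZEROS = bytes(64 * 1024)

def is_zero_block(block):
    """Return True if `block` consists entirely of NUL bytes.

    A single equality test against a preallocated zero buffer runs as
    one C-level memcmp, instead of a Python-level loop over each byte.
    """
    n = len(block)
    if n <= len(_ZEROS):
        return block == _ZEROS[:n]
    # Oversized blocks: count NUL bytes in C as a fallback.
    return block.count(0) == n
```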
> >
> >> There is a Linux kernel bug that causes I/O wait to skyrocket when
> >> copying large files, especially when those files are larger than
> >> the available memory.
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=12309
> >
> > Good to know; I was unaware of this bug. See comment #128: it looks
> > like ext4 works a little better for writing, possibly because of
> > delayed allocation ("delalloc"). Since I'm using ext4 as my backup
> > destination filesystem, this could be the reason I am not
> > experiencing the same issue. I suppose it could also be my RAID
> > controller (LSI 9240) buffering the I/O overhead away from the host
> > CPU.
> >
> > What disk hardware are you using for source and destination?
> >
> >> In our case, a quad-core server with 8GB of RAM, running
> >> rdiff-backup on a block-fuse directory, is basically made
> >> unavailable by the symptoms I described above. All the virtual
> >> machines on it become unreachable.
> >
> > I have a feeling this is due to backup-destination contention rather
> > than backup-source contention. BlockFuse mmaps the source device,
> > and I'm not certain whether mmap'ed I/O is cached. To guarantee that
> > you bypass the source's disk cache, you could patch blockfuse to use
> > direct I/O (O_DIRECT), or back up from a "/dev/raw/rawX" device.
> > (Bypassing the disk cache is important for backups, because backup
> > reads tend to be read-once; thrashing the cache evicts the "good
> > stuff" already in it.)
> >
> > For large files, rdiff-backup may benefit from writing with the
> > O_DIRECT flag (a hint from comment #128). Again, this would help
> > bypass the disk cache.
> >
> > I'm backing up local-to-local; the source is a RAID-10 array, and
> > the destination is a slow 5400rpm 2TB single disk used as tertiary
> > storage. Do you back up local-to-local, or over a network?
> >
> >> If you have any experience with this in your backup scenarios, I
> >> would love to hear back from you.
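[A softer alternative to O_DIRECT for the read-once pattern discussed
above -- one that imposes no alignment requirements and works even
where O_DIRECT is refused -- is posix_fadvise(POSIX_FADV_DONTNEED),
which drops just-read pages from the page cache. A sketch, assuming
Linux and Python 3.3+; the function name is ours, not an rdiff-backup
or blockfuse API:]

```python
import os

def copy_dropping_cache(src_path, dst_path, bufsize=1 << 20):
    """Copy a file while advising the kernel that the source pages
    will not be re-read, so a read-once backup stream does not evict
    the "good stuff" already in the page cache."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        offset = 0
        while True:
            chunk = src.read(bufsize)
            if not chunk:
                break
            dst.write(chunk)
            # Drop the pages we just read from the page cache.
            os.posix_fadvise(src.fileno(), offset, len(chunk),
                             os.POSIX_FADV_DONTNEED)
            offset += len(chunk)
```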
> >
> > So far it works great on my side. I'm deploying this to back up LVM
> > snapshots of Windows VMs under KVM in about 2 weeks on different
> > hardware. I might have better insight then, if I run into new
> > issues.
> >
> >>
> >> Cheers,
> >> Sebastian
> >

_______________________________________________
rdiff-backup-users mailing list at rdiff-backup-users@nongnu.org
http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki