Hi Eric, thanks for your input.
Patching block-fuse didn't work: the file handle the kernel passes to
the application was created without O_DIRECT in every case, even though
we added O_DIRECT in the appropriate block-fuse function. Patching
rdiff-backup to support O_DIRECT wouldn't help us either, because we
discovered that it is not possible to open files on a FUSE volume with
O_DIRECT at all; the attempt is rejected with "Invalid argument".

However, inspired by the specs you mentioned for your setup, we finally
came across the NFS mount option 'noac' (passed to mount as '-o noac').
Since we are backing up from RAID-1 to NFS, this looked interesting.
The option is primarily meant to disable attribute caching, but it has
the nice side effect of throttling the read speed of the copy process
to match its write speed when the destination is NFS. That way the
output cache (buffers) doesn't fill up with unwritten data, and the
system never reaches the point where it has to flush dirty buffers to
disk before it can free RAM for another process. This is exactly what
we need, and the procedure has been working for us without degrading
server performance.
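For anyone who wants to reproduce the O_DIRECT behaviour, here is a
minimal sketch in Python (the language rdiff-backup is written in). The
path is only a placeholder for some file inside a block-fuse mount; on
our systems the open fails exactly as described above:

    import errno
    import os

    # Hypothetical path to an image exported by block-fuse.
    path = "/mnt/blockfuse/vm-root.img"

    try:
        # Ask the kernel for an uncached (direct I/O) read-only handle.
        fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
        os.close(fd)
        print("O_DIRECT open succeeded")
    except OSError as e:
        if e.errno == errno.EINVAL:
            # The "Invalid argument" rejection we ran into on the FUSE volume.
            print("O_DIRECT open rejected with EINVAL")
        else:
            raise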
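And for reference, this is roughly how the NFS destination is mounted
now. Server, export path and mount point below are placeholders rather
than our real names; the only part that matters is the 'noac' option:

    # one-off mount
    mount -t nfs -o noac backupserver:/export/backups /mnt/backup

    # or permanently via /etc/fstab
    backupserver:/export/backups  /mnt/backup  nfs  noac  0  0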
Thanks again.

Cheers,
Sebastian

On 27.01.2011 21:10, Eric Wheeler wrote:
>> Hi Eric,
>
> Hi Sebastian,
>
> I'm cc'ing the rdiff-backup-users list too, they may have some insight
> as well.
>
>> on LVM snapshots and came across your blog and your articles in that regard:
>>
>> http://www.globallinuxsecurity.pro/blog.php?q=rdiff-backup-lvm-snapshot
>>
>> I'm very impressed both with your rdiff-backup patch and the block-fuse
>> application.
>
> I'm glad you will find it useful! Unfortunately, I have found the
> sparse-destination patch for rdiff-backup is sometimes slow. I'm
> running without sparse files until I can figure out a faster way to
> detect blocks of 0-bytes. If you or someone on the list knows python
> better than I, please take a look!
>
>> Since you mentioned that you use this combination to backup up images up
>> to 350GB, I am interested to find out whether you have encountered
>> problems with I/O-Wait.
>
> I'm using blockfuse+rdiff-backup after business hours, so if the VM
> slows down, nobody (or very few) notice. The server runs 4x 1TB drives
> in RAID-10, and block-IO peaks at ~225MB/sec. That 350GB volume was
> recently extended to 600GB.
>
>> There is a Linux Kernel bug that causes I/O-Wait to skyrocket when
>> copying large files, especially when those files are larger than the
>> available memory.
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=12309
>
> Good to know, I was unaware of this bug. See comment#128, it looks like
> using ext4 works a little better for writing, possibly because of
> delayed allocation ("delalloc"). Since I'm using ext4 as my destination
> backup filesystem, this could be the reason I am not experiencing the
> same issue. I suppose it could be my RAID controller (LSI 9240)
> buffering the IO overhead from the host CPU, too.
>
> What disk hardware are you using for source and destination?
>
>> In our case, a quad-core server running rdiff-backup on a block-fuse
>> directory, having 8GB ram, is basically made unavailable by the symptoms
>> I described above. All the virtual machines on it become unreachable.
>
> I have a feeling that this is due to backup-destination contention
> rather than backup-source contention. BlockFuse mmaps the source
> device, and I'm not certain if mmap'ed IO is cached or not. To
> guarantee you are missing the source's disk cache, you could patch
> blockfuse to use direct-IO (O_DIRECT), or backup from a "/dev/raw/rawX"
> device. (Missing disk cache is important for backups, because backups
> tend to be read-once. Thus, thrashing the cache effects the "good
> stuff" in the cache.)
>
> For large files, rdiff-backup may benefit from writing with the O_DIRECT
> flag (a hint from comment#128). Again, this would help miss the disk
> cache.
>
> I'm backing up local-to-local; the source is a RAID-10 array, and the
> destination is a slow 5400rpm 2TB single-disk as tertiary storage. Do
> you backup local-to-local, or over a network?
>
>> If you have any experience with this in your backup scenarios, I would
>> love to hear back from you.
>
> So far it works great on my side. I'm deploying this to backup LVM
> snapshots of Windows VMs under KVM in about 2 weeks on different
> hardware. I might have better insight then if I run into new issues.
>
>>
>> Cheers,
>> Sebastian
>

-- 
*Sebastian J. Bronner*
Administrator

D9T GmbH - Magirusstr. 39/1 - D-89077 Ulm
Tel: +49 731 1411 696-0 - Fax: +49 731 3799-220
Geschäftsführer: Daniel Kraft
Sitz und Register: Ulm, HRB 722416
Ust.IdNr: DE 260484638
http://d9t.de - D9T High Performance Hosting
i...@d9t.de