dan wrote:
> just some more info: I ran the same directory creation test on a live 
> ZFS filesystem on a Nexenta install. I got very similar performance 
> differences between ext3 and ZFS as I did in VMware, except ext3 went 
> from 45 seconds to 15 seconds and ZFS went from about 3 seconds to what 
> felt instant. I tried to make another observation here in regards to 
> delayed writes. On ext3, when the directory script finished, the hard 
> disk light went out immediately (as expected), but the script on the ZFS 
> volume kept the light on for about 4 seconds before it went out. I 
> repeated the script 3 times and got pretty much the same result each 
> time. ZFS is definitely doing a delayed write and reporting faster 
> performance than reality, BUT I still wall-clock it at about 6 seconds 
> versus the 15 seconds for ext3.

You're missing the point here.

Creating 10000 directories *once* is a completely irrelevant test for 
checking BackupPC performance on a given filesystem.

With real BackupPC usage you simply won't have enough memory to keep all 
changes in RAM indefinitely.
Sooner or later there will be so many changes that the system will 
*have to* commit them to disk - and that is where the advantage of one 
filesystem over another would show: how well it prevents fragmentation, 
whether it tries to place related inodes/blocks close to each other, etc.
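
(Just to illustrate the "have to commit" part on Linux: the kernel's 
dirty-writeback thresholds decide when cached changes must start going to 
disk. This is only a pointer at where to look, not a tuning 
recommendation, and the exact defaults depend on your kernel version:

# sysctl vm.dirty_background_ratio vm.dirty_ratio vm.dirty_expire_centisecs

dirty_background_ratio is the percentage of memory at which background 
writeback kicks in, dirty_ratio is the point where writing processes are 
themselves forced to flush, and dirty_expire_centisecs limits how long 
dirty data may sit in memory before it must be written out.)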



Heck, it's not only the filesystem but also the I/O scheduler that can 
make the results *very* different - at least on Linux; I'm not familiar 
enough with the I/O scheduler algorithms and tuning in FreeBSD or Solaris 
to discuss them there.

By default, Linux uses the CFQ I/O scheduler, which tries to be fair to 
all processes. For a typical desktop, and for some server scenarios, it 
usually works fine.

With BackupPC, though, using a fair scheduler doesn't make much sense: 
we're not interested in latency, but in overall throughput, so the 
anticipatory scheduler is the better choice here (at least for me).
It is completely fine if 4 BackupPC processes read and write within one 
small area of the disk while a 5th process has to wait a bit longer to 
access a completely different area. Such an approach reduces seeks and 
improves overall performance, even though some processes may occasionally 
wait longer for I/O.
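
If you want to experiment with this yourself, on Linux the scheduler can 
be checked and changed per block device at runtime through sysfs (sda 
below is just an example device; the scheduler in brackets is the active 
one, and the available list depends on your kernel config):

# cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]
# echo anticipatory > /sys/block/sda/queue/scheduler

A default for all devices can also be set at boot time with the elevator= 
kernel parameter (e.g. elevator=as for anticipatory).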



And there is definitely something wrong with the tests you're doing - for 
me, creating 10000 directories on ext3 takes about 4 seconds (plus about 
2 more seconds to sync):

# sync ; bash mkdirs.sh ; time sync
0.00user 2.69system 0:04.00elapsed 67%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+211minor)pagefaults 0swaps
0.00user 0.01system 0:02.25elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+198minor)pagefaults 0swaps



And creating 10000 hardlinks to a file takes under 3 seconds on ext3, 
including sync:

# sync ; time perl mklink ; time sync
0.01user 2.51system 0:02.57elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+532minor)pagefaults 0swaps
0.00user 0.01system 0:00.16elapsed 6%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+198minor)pagefaults 0swaps



And guess what: the filesystem is accessed over the network (iSCSI), 
hence slower; the system is a Xen guest (one of many on that host); and 
it runs on a not-so-fast 2.93 GHz Celeron CPU with just 256 kB of cache.

Again, such a test is completely irrelevant for BackupPC.


At the bottom of this post I pasted the simple scripts which create the 
directories and links - but their results are completely irrelevant for 
BackupPC, unless you modify them a bit and run them thousands of times 
over a couple of hours (perhaps adding file removals, writing content 
into the files, etc.); a rough sketch of such a variant follows below. 
I also have Nexenta with ZFS at home; it might be interesting to test it 
there.
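
Just to sketch what I mean (purely illustrative - the path, counts and 
sizes are made up, adjust them to your setup), something along these 
lines would exercise the filesystem much more like BackupPC does:

#!/bin/bash
# Illustrative long-running test: repeatedly create directories,
# write small files, hardlink them, and remove part of the data again,
# so the filesystem has to keep allocating and freeing blocks.

DIR="/mnt/iscsi_backup/test"            # adjust to your test filesystem
cd "$DIR" || exit 1

for run in $(seq 1 1000); do            # iteration count is arbitrary
        mkdir "run_$run" || exit 1
        for i in $(seq 1 100); do
                dd if=/dev/zero of="run_$run/file_$i" bs=4k count=4 2>/dev/null
                ln "run_$run/file_$i" "run_$run/link_$i"
        done
        # every second iteration, remove the previous run's data
        if [ $((run % 2)) -eq 0 ]; then
                rm -rf "run_$((run - 1))"
        fi
        sync
done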


> I would expect that this quote from Sun's website is the explanation:
> 
>     *Blazing performance*
>     ZFS is based on a transactional object model that removes most of
>     the traditional constraints on the order of issuing I/Os, which
>     results in huge performance gains.
> 
> there is the I/O caching: ZFS caches chunks of I/O and then reorders 
> them into larger, more sequential writes. 

On Linux, that's the I/O scheduler's domain - and when low read latency 
is needed, such behaviour may not always be desired.



Scripts:


# cat mklink.pl
#!/usr/bin/perl
# Create one empty file named "0", then 10000 hardlinks to it.

use strict;
use Fcntl;

my $dir = "/mnt/iscsi_backup/test";

chdir $dir or die "cannot chdir to $dir: $!";

# create the empty source file
sysopen (TESTFILE, '0', O_WRONLY|O_CREAT, 0644) or die "cannot create file: $!";
close (TESTFILE);

# hardlink it 10000 times
for ( my $i = 1; $i <= 10000; $i++ ) {
         link '0', $i or die "cannot create link $i: $!";
}



# cat mkdirs.sh
#!/bin/bash
# Create 10000 directories (named 1..10000) under $DIR and time the mkdir.

DIR="/mnt/iscsi_backup/test"

cd "$DIR" || exit 1

seq 1 10000 | xargs time mkdir



-- 
Tomasz Chmielewski
http://wpkg.org
