A benchmark that can be compared consistently across installations needs a
scripted set of operations run against a common dataset. As a convenient
source of files to copy, I chose my openafs .git directory tree.
Using robocopy from a Windows client to a Mac Mini running OS X 10.6.8
Server and OpenAFS 1.6.2, with iSCSI RAID6 EXT4 storage for the vice
partition, I obtain the following transfer rates:
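For reference, the general shape of such a scripted benchmark can be sketched in a few lines. This is a stand-in illustration using Python's shutil rather than robocopy; the temporary tree it builds and the file names are made up for the example:

```python
# Minimal scripted copy benchmark: copy a fixed tree and report throughput.
# (Illustrative stand-in for the robocopy runs below; not the actual harness.)
import os
import shutil
import tempfile
import time


def dir_bytes(path):
    """Total size in bytes of all regular files under path."""
    return sum(os.path.getsize(os.path.join(root, name))
               for root, _, names in os.walk(path) for name in names)


def copy_benchmark(src, dst):
    """Copy src tree to dst, returning (bytes copied, seconds, bytes/sec)."""
    start = time.monotonic()
    shutil.copytree(src, dst)
    elapsed = time.monotonic() - start
    nbytes = dir_bytes(dst)
    rate = nbytes / elapsed if elapsed > 0 else float("inf")
    return nbytes, elapsed, rate


# Build a small throwaway dataset and benchmark copying it.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.makedirs(src)
    for i in range(50):
        with open(os.path.join(src, f"f{i}.bin"), "wb") as f:
            f.write(os.urandom(4096))
    nbytes, elapsed, rate = copy_benchmark(src, os.path.join(tmp, "dst"))
    print(f"copied {nbytes} bytes in {elapsed:.3f}s ({rate:.0f} B/s)")
```

The same script run against the same dataset on different installations yields numbers that can actually be compared.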
Copy to AFS:
                Total    Copied   Skipped  Mismatch    FAILED    Extras
     Dirs :       274       273         1         0         0         0
    Files :      2720      2720         0         0         0         0
    Bytes :   77.53 m   77.53 m         0         0         0         0
    Times :   0:00:51   0:00:38                       0:00:00   0:00:13
    Speed :             2127901 Bytes/sec.
    Speed :             121.759 MegaBytes/min.
Copy from AFS (cached):
                Total    Copied   Skipped  Mismatch    FAILED    Extras
     Dirs :       274       273         1         0         0         0
    Files :      2720      2720         0         0         0         0
    Bytes :   77.53 m   77.53 m         0         0         0         0
    Times :   0:00:25   0:00:19                       0:00:00   0:00:05
    Speed :             4092669 Bytes/sec.
    Speed :             234.184 MegaBytes/min.
Copy from AFS (not cached):
                Total    Copied   Skipped  Mismatch    FAILED    Extras
     Dirs :       274       273         1         0         0         0
    Files :      2720      2720         0         0         0         0
    Bytes :   77.53 m   77.53 m         0         0         0         0
    Times :   0:00:34   0:00:28                       0:00:00   0:00:05
    Speed :             2875507 Bytes/sec.
    Speed :             164.537 MegaBytes/min.
The data set contains one 64 MB file, one 5 MB file, and the remaining
~8 MB spread across 2718 smaller files in 273 directories.
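As a sanity check on the robocopy summaries above, the MegaBytes/min figure is just the Bytes/sec figure converted with 2**20-byte megabytes; the reported values agree to within robocopy's rounding. A small sketch (Python used here only for illustration):

```python
# Check robocopy's unit conversion: Bytes/sec -> "MegaBytes"/min,
# where robocopy's MegaBytes are mebibytes (2**20 bytes).

def bytes_per_sec_to_mb_per_min(bps):
    return bps * 60 / 2**20

# (Bytes/sec, reported MegaBytes/min) from the three runs above.
runs = [(2127901, 121.759), (4092669, 234.184), (2875507, 164.537)]

for bps, reported in runs:
    computed = bytes_per_sec_to_mb_per_min(bps)
    print(f"{bps} Bytes/sec -> {computed:.3f} MB/min (reported: {reported})")
```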
The time to copy a single 200MB file is as follows:
Write to AFS:
timer on
Timer 1 on: 13:17:04
copy d:\random.200mb
D:\random.200mb => \\afs\yfs\project\test\test1\random.200mb
1 file copied
echo 18.8501413761 MB/secs
18.8501413761 MB/secs
timer off
Timer 1 off: 13:17:14 Elapsed: 0:00:10.61
Read from AFS (no flush):
timer on
Timer 1 on: 13:17:14
copy random.200mb d:\random.200mb
\\afs\yfs\project\test\test1\random.200mb => D:\random.200mb
1 file copied
echo 416.6666666667 MB/secs
416.6666666667 MB/secs
timer off
Timer 1 off: 13:17:15 Elapsed: 0:00:00.48
Read from AFS (after flush):
timer on
Timer 1 on: 13:17:15
copy random.200mb d:\random.200mb
\\afs\yfs\project\test\test1\random.200mb => D:\random.200mb
1 file copied
echo 24.1837968561 MB/secs
24.1837968561 MB/secs
timer off
Timer 1 off: 13:17:23 Elapsed: 0:00:08.27
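The MB/secs figures echoed in the transcripts above are simply the 200 MB file size divided by the elapsed wall-clock time from the timer. A quick sketch reproducing them:

```python
# Reproduce the MB/secs figures above: 200 MB divided by elapsed seconds.

runs = {
    "write to AFS":          10.61,  # Elapsed: 0:00:10.61
    "read (no flush)":        0.48,  # Elapsed: 0:00:00.48
    "read (after flush)":     8.27,  # Elapsed: 0:00:08.27
}

for name, elapsed in runs.items():
    print(f"{name}: {200 / elapsed:.10f} MB/secs")
```

The cached read (no flush) is served from the client's local cache, which is why it is roughly 17x faster than the read that has to come over the wire after a flush.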
As others have mentioned:
1. When there are large numbers of small files, the cost of the RPC
   overhead, including network latency, far surpasses the cost of
   sending the file data.
2. Data encryption comes at a performance cost (all of the numbers
   above were taken with encryption enabled).
3. Virtualization of I/O (network and disk) is expensive. If the VM's
   I/O is competing with the hypervisor, other VMs, and the host system
   for disk and network cycles, there will be large increases in the
   latency associated with each request.
4. The vice partition file system makes a difference.
5. File system journaling makes a difference.
6. The RAID configuration makes a difference.
7. There are many other potential variables.
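A rough way to quantify point 1 is a first-order model in which each file pays a fixed number of RPC round trips on top of its share of the data transfer time. The round-trip time, RPCs-per-file count, and bandwidth below are illustrative assumptions, not measurements from the runs above:

```python
# First-order model for point 1: per-file RPC overhead vs. data transfer.
# All constants below are assumptions chosen for illustration only.

RTT = 0.001             # assumed seconds per RPC round trip (LAN-ish)
RPCS_PER_FILE = 4       # assumed RPCs to create/write/update/close a file
BANDWIDTH = 10 * 2**20  # assumed effective bytes/sec on the wire


def copy_costs(n_files, total_bytes):
    """Return (rpc overhead seconds, data transfer seconds)."""
    rpc_cost = n_files * RPCS_PER_FILE * RTT
    data_cost = total_bytes / BANDWIDTH
    return rpc_cost, data_cost


# ~8 MiB spread across 2718 small files: RPC overhead dominates.
rpc_small, data_small = copy_costs(2718, 8 * 2**20)
# One 64 MiB file: data transfer dominates.
rpc_big, data_big = copy_costs(1, 64 * 2**20)

print(f"2718 small files: rpc {rpc_small:.2f}s vs data {data_small:.2f}s")
print(f"one 64 MiB file:  rpc {rpc_big:.3f}s vs data {data_big:.2f}s")
```

Under these assumptions the many-small-files case spends most of its time on round trips, while the single large file spends almost all of it moving data, which matches the shape of the robocopy results above.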
I hope these numbers are helpful.
Jeffrey Altman
On 4/8/2013 12:11 PM, [email protected] wrote:
>
> Hi all,
>
> thank you for your responses. Before going through them in detail, I would
> just like to do a reality check. What kind of performance figures should
> one expect from an average working AFS network (LAN/WAN)? Say, if you
> duplicated roughly the 2000-file/300 MB directory in your setup, what
> kind of rates would you get? That is, is 500-1000 KB/s a reasonable
> starting point for optimization, rather than an order of magnitude or
> two higher?
>
> I actually created a simple genetic-algorithm program to optimize
> fileserver parameters. Even though I didn't run it to full length, I'm
> pretty confident I couldn't achieve much higher rates just by tuning
> the fileserver parameters.
>
> I'm trying to get good performance for generic, mixed use; one user
> may work with large individual files, another with a large number of
> small files. It shouldn't be optimized for either extreme.
>
> br, jukka
>
>
>
> _______________________________________________
> OpenAFS-info mailing list
> [email protected]
> https://lists.openafs.org/mailman/listinfo/openafs-info
>
