On 2011-05-19 at 13:25, Andrew Deason <[email protected]> said:
> On Tue, 17 May 2011 23:14:03 +0100
> Hugo Monteiro <[email protected]> wrote:
> > - Low performance and high discrepancy between test results
> > Only a few transfers reached 30MB/s between the server and a client
> > sitting on the same network, connected via GB ethernet. Most of the
> > time the transfer rate is around 20MB/s, dropping to 13 or 14MB/s in
> > some cases.
> The client and server configs would help. I'm not used to looking at
> single-client performance, but... assuming you're using a disk cache,
> keep in mind the data is written twice: once to the cache and once on
> the server. So, especially when you're running the client and server on
> the same machine, there's no way you're going to reach the theoretical
> 110M/s of the disk.
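For anyone wanting to reproduce these single-stream numbers, a minimal dd-based throughput check might look like the sketch below. The paths and sizes are placeholders (nothing from this thread); point the function at a file in /afs versus one on local disk to see the double-write effect.

```shell
#!/bin/sh
# Rough single-stream write throughput check, in MB/s, for a given path.
# With a disk cache, a write into /afs also lands in the local cache
# partition, so expect well under the raw speed of either disk.
write_mbps() {
    target=$1
    mb=${2:-256}                        # megabytes to write (default 256)
    start=$(date +%s)
    dd if=/dev/zero of="$target" bs=1M count="$mb" conv=fsync 2>/dev/null
    end=$(date +%s)
    elapsed=$((end - start))
    [ "$elapsed" -eq 0 ] && elapsed=1   # round sub-second writes up to 1s
    rm -f "$target"
    echo $((mb / elapsed))
}

# Example (placeholder paths):
#   write_mbps /afs/example.org/user/scratch/bench
#   write_mbps /tmp/bench
```

Second-granularity timing is crude, but it is enough to tell 20MB/s from 80MB/s; for finer numbers, use dd's own transfer-rate summary.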
You can certainly get close if your disk for the disk cache is fast
enough. I've seen close to 80MB/s with 15K SAS under ideal conditions.
Re: client and server on the same machine - I've seen that actually result
in lower performance. When you take the physical network out of the mix,
Rx starts limiting you as a function of CPU usage, it seems.
> You may want to see what you get with memcache (or, if you want to try a
> 1.6 client, cache bypass) and a higher chunksize. Just running dd on a
> box I have, running a 1.4 afsd with -memcache -chunksize 24 made it jump
> from the low 20s to high 40s/low 50s (M/s), after starting with the
> defaults for a 100M disk cache.
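For reference, the client change described there amounts to an afsd invocation along these lines. Note that -chunksize takes the log2 of the chunk size in bytes, so 24 means 16MiB chunks; the -blocks value (memory cache size in 1KiB units) below is a placeholder, not a figure from this thread.

```shell
# Sketch only: 1.4-era afsd with a memory cache and 16MiB chunks.
# -blocks 262144 = 256MiB memcache; size this to your machine's RAM.
/usr/vice/etc/afsd -memcache -chunksize 24 -blocks 262144
```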
Just to add some more data points...
I recently saw peaks of 90M/s for memcache for single client writes. Reads
from memcache can be as fast as your memory is, so upwards of a couple
GB/s.
In general, 1.6 memcache > 1.4 memcache > 1.6 diskcache > 1.4 diskcache.
1.6 disk cache uses a LOT less CPU than 1.4 disk cache, however. Nice for
processes that need IO and CPU at the same time on a machine that might
already be lacking CPU.
Options I used to get those numbers with 1.6.0pre5:
Client:
-dynroot -fakestat -afsdb -nosettime -stat 48000 -daemons 12 -volumes 512
-memcache -blocks 655360 -chunksize 19
Server:
-p 128 -busyat 600 -rxpck 4096 -s 10000 -l 1200 -cb 1000000 -b 240 -vc 1200
-abortthreshold 0 -udpsize 1048576
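In case it helps anyone trying these: the server flags above go on the fileserver command line in BosConfig, usually set up via bos. A hedged sketch, with the machine name and binary paths as placeholders and the flags copied from the list above:

```shell
# Sketch only: create an fs instance carrying the quoted fileserver flags.
# Hostname is a placeholder; paths assume a conventional transarc layout.
bos create server.example.org fs fs \
    "/usr/afs/bin/fileserver -p 128 -busyat 600 -rxpck 4096 -s 10000 \
     -l 1200 -cb 1000000 -b 240 -vc 1200 -abortthreshold 0 \
     -udpsize 1048576" \
    /usr/afs/bin/volserver /usr/afs/bin/salvager -localauth
```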
Server in this case is a very new 16-core Opteron box with 32GB of RAM (it
runs multiple fileserver instances under Solaris zones). Client is a
relatively new 8-core Opteron box with 64GB of memory.
Also in general, client performance seems to get worse the more CPUs you
have. Our 48-core boxes tend to get lower numbers than our smaller 16 and
8 core boxes. I haven't done too many comparison tests to really quantify
how much of a difference that makes, though.
Cache bypass definitely makes things faster for data that isn't cached,
though I will withhold performance numbers for that, as I was testing
bypass inside an ESX VM (one of our webservers). Within the same machine,
it got numbers similar to disk cache after the files had been cached
(where the disk cache is a raw FC LUN).
Under normal conditions with fairly modern hardware, you
should expect 50M/s with some simple tuning (-chunksize mostly, and
-memcache if your machine has the memory to spare).
I haven't done any testing for the multi-client case, as that's slightly
more difficult to properly test while holding everything else constant. By
multi-client, I mean multiple actual cache managers involved as well as
multiple users behind the same cache manager.
--andy
_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info