On Apr 24, 2010, at 3:20 PM, Simon Wilkinson wrote:


On 24 Apr 2010, at 19:00, Holger Rauch wrote:

> (Both dd commands were run directly on the file server host in order
> to rule out possible network latency problems as a cause for the bad
> performance).

So, it's very important to realise that this is a terrible comparison. Whilst both filesystems ultimately result in the data hitting an ext3 filesystem, OpenAFS is doing significantly more work.

With the ext3 case, dd calls the write() syscall, the data gets copied from user into kernel space, the kernel marks the page as dirty, and at a moment of its choosing flushes that page to disk. Job done.
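That local-disk path is essentially all a minimal dd reimplementation does. A sketch in Python (the block size, count, and file name here are arbitrary illustrations, not the original poster's actual dd invocation):

```python
import os

def dd_write(path, block_size=1024 * 1024, count=16):
    """Mimic `dd if=/dev/zero of=path bs=1M count=16`: one write()
    per block; the kernel buffers the dirty pages and flushes later."""
    block = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    written = 0
    try:
        for _ in range(count):
            written += os.write(fd, block)  # copy user -> kernel page cache
    finally:
        os.close(fd)  # on ext3 this is cheap; nothing here forces a disk flush
    return written

print(dd_write("/tmp/ddtest.bin"))  # 16777216
```

Each iteration is one user-to-kernel copy; everything after that is the kernel's business.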

With OpenAFS, dd calls write(), and the data gets copied into the kernel. The OpenAFS kernel module then copies the page to the local disk cache (by calling ext3's write) and returns control to the user. When dd completes, it calls close(). The kernel module then loads the data back from the disk cache (by calling ext3's read, if the data has been paged out) and converts it into a set of arguments to an RPC call. It works out where to deliver that RPC, possibly by making network calls to the vlserver. The RPC is then checksummed, encrypted, split up into appropriately addressed UDP packets, and passed to the kernel's networking stack. The networking drivers then route the packets (hopefully round the loopback interface, but that does depend) and deliver them to the fileserver. The fileserver runs in userspace, so the data gets copied out of the kernel into the fileserver's buffers. The fileserver then decrypts the data, decodes the RPC arguments to get the data being written, works out which file it corresponds to, and whether it needs to notify anyone that that file has changed. It then calls ext3's write() syscall, which copies the data from user into kernel space; the kernel marks that page as dirty and eventually flushes it to disk. Finally, we're done.
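One consequence of that design is visible from user space: on an AFS client, a large chunk of the cost of a big sequential write tends to show up in close(), not in write(). A rough probe (the path is a placeholder; on a purely local filesystem both phases will simply be fast):

```python
import os, time

def timed_write(path, block_size=1024 * 1024, count=64):
    """Time the write() loop and the close() separately."""
    block = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    t0 = time.monotonic()
    for _ in range(count):
        os.write(fd, block)   # on AFS: lands in the local disk cache
    t1 = time.monotonic()
    os.close(fd)              # on AFS: remaining data is shipped to the fileserver here
    t2 = time.monotonic()
    return t1 - t0, t2 - t1

write_s, close_s = timed_write("/tmp/afs_probe.bin")
print(f"write(): {write_s:.3f}s  close(): {close_s:.3f}s")
```

Run against a file in AFS versus one in /tmp, the split between the two numbers tells you where the time is going.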

With a level playing field, a directly connected local disk will always be faster than a network filesystem - there's simply less work to be done when throwing data straight onto a local disk than there is when you send it across the network.

> Any ideas as to where that bad performance might come from? (I do have
> encryption enabled, but since it's only plain DES encryption on
> current machines that most likely can't be an explanation for the
> performance problem).

The encryption isn't DES, it's fcrypt. And encryption does have a significant effect on performance (not only the cost of doing the encryption itself, but also the number of additional copies that it adds into the data path).

It's worth taking a look at the configuration of your fileserver. There's been a lot written here in the past about the ideal settings for the fileserver - I'll leave you to Google over the list, but it's worth noting that the out-of-the-box configuration is not likely to result in good performance.

As a data point, using your test across a network, my home directory is seeing write speeds of 70 MB/s from an OpenAFS 1.4.11 client. We're currently running our fileservers with "-L -p 128 -rxpck 400 -busyat 600 -s 1200 -l 1200 -cb 100000 -b 240 -vc 1200"

As another data point, on our fileservers we use:

-p 128 -b 480 -l 2400 -s 2400 -vc 2400 -cb 64000 -rxpck 800 -udpsize 1048576 -busyat 1200 -hr 1 -vattachpar 4 -implicit all -nojumbo
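For reference, flags like these normally live in the fileserver's bnode entry in the server's BosConfig; a sketch of what that looks like, using Rich's flag set (binary paths are the conventional /usr/afs/bin locations and may differ on your installation):

```
bnode fs fs 1
parm /usr/afs/bin/fileserver -p 128 -b 480 -l 2400 -s 2400 -vc 2400 -cb 64000 -rxpck 800 -udpsize 1048576 -busyat 1200 -hr 1 -vattachpar 4 -implicit all -nojumbo
parm /usr/afs/bin/volserver
parm /usr/afs/bin/salvager
end
```

After editing, the fs instance needs a restart for the new flags to take effect.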

But I suspect the real culprit is your client (or server) network link being set to 100 Mb/s.
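The arithmetic behind that suspicion is simple: a 100 Mb/s link tops out at 12.5 MB/s of raw throughput, before any UDP/Rx protocol overhead, which would cap any network filesystem far below local-disk speeds:

```python
LINK_MBIT = 100                 # negotiated link speed, megabits per second
raw_mb_per_s = LINK_MBIT / 8    # 8 bits per byte
print(raw_mb_per_s)             # 12.5 MB/s ceiling, before protocol overhead
```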

Rich


All that said, we are trying to improve the performance of OpenAFS. There are numerous changes in the 1.5 tree, particularly for Linux, that help speed it up. I'd also encourage you to look at more meaningful benchmarks - in particular those which mirror the kind of use you'll actually be putting the filesystem to.
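As one example of a more workload-shaped benchmark than a single streaming dd, a small-file loop like the sketch below (file count and sizes are arbitrary choices) exercises the per-file open/close and metadata paths that home-directory and build workloads actually hit:

```python
import os, time, tempfile

def small_file_benchmark(directory, files=200, size=16 * 1024):
    """Create, write, and close many small files, then stat and unlink
    them -- closer to real use than one big sequential write."""
    block = b"x" * size
    t0 = time.monotonic()
    for i in range(files):
        with open(os.path.join(directory, f"bench_{i}.dat"), "wb") as f:
            f.write(block)      # on AFS, each close() is a store to the fileserver
    for i in range(files):
        path = os.path.join(directory, f"bench_{i}.dat")
        os.stat(path)
        os.unlink(path)
    return time.monotonic() - t0

with tempfile.TemporaryDirectory() as d:
    elapsed = small_file_benchmark(d)
    print(f"{200 * 16 * 1024 / elapsed / 1e6:.1f} MB/s effective")
```

Point the directory at a path in AFS to measure the filesystem you actually care about, rather than a temporary directory.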

Cheers,

Simon.

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
