On 24 Apr 2010, at 19:00, Holger Rauch wrote:

> (Both dd commands were run directly on the file server host in order
> to rule out possible network latency problems as a cause for the bad
> performance).
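The original dd invocations weren't quoted, but the comparison presumably looked something like the sketch below. Paths and sizes are assumptions (both targets are under /tmp here so the sketch can run anywhere; in the real test one target would sit on the ext3 partition and the other under the /afs mount):

```shell
# Hypothetical reconstruction of the benchmark; the real AFS target
# would be something like /afs/<cell>/user/holger/test.bin.
dd if=/dev/zero of=/tmp/dd-ext3.bin bs=1M count=16 2>/dev/null
dd if=/dev/zero of=/tmp/dd-afs.bin  bs=1M count=16 2>/dev/null

# Note that the local run's reported rate largely reflects the page
# cache, not the disk: write() returns once the pages are dirty.
# conv=fsync (GNU dd) forces the flush into the timed window, which
# gives a more honest local figure to compare against:
dd if=/dev/zero of=/tmp/dd-ext3-sync.bin bs=1M count=16 conv=fsync 2>/dev/null

ls -l /tmp/dd-ext3.bin /tmp/dd-afs.bin /tmp/dd-ext3-sync.bin
```

Even with fsync forced, the two numbers measure very different code paths, as the reply below explains.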
So, it's very important to realise that this is a terrible comparison. Whilst both filesystems ultimately result in the data hitting an ext3 filesystem, OpenAFS is doing significantly more work.

In the ext3 case, dd calls the write() syscall, the data gets copied from user into kernel space, the kernel marks the page as dirty, and at a moment of its choosing flushes that page to disk. Job done.

With OpenAFS, dd calls write(), and the data gets copied into the kernel. The OpenAFS kernel module then copies the page to the local disk cache (by calling ext3's write), and returns control to the user. When dd completes, it calls close(). The kernel module then loads the data back from the disk cache (by calling ext3's read, if the data has been paged out), and converts it into a set of arguments to an RPC call. It figures out where to deliver that RPC, possibly by making network calls to the vlserver. The RPC is then checksummed, encrypted, split up into appropriately addressed UDP packets, and passed to the kernel's networking stack. The networking drivers then route the packets (hopefully round the loopback interface, but that does depend) and deliver them to the fileserver. The fileserver runs in userspace, so the data gets copied out of the kernel into the fileserver's buffers. The fileserver then decrypts the data, decodes the RPC arguments to get the data being written, works out which file it corresponds to, and whether it needs to notify anyone that that file has changed. It then calls ext3's write() syscall, which copies the data from user into kernel space, the kernel marks that page as dirty, and eventually flushes it to disk. Finally, we're done.

Even on a level playing field, a directly connected local disk will always be faster than a network filesystem - there's simply less work to be done when throwing data straight onto a local disk than there is when you send it across the network.

> Any ideas as to where that bad performance might come from?
> (I do have encryption enabled, but since it's only plain DES encryption
> on current machines that most likely can't be an explanation for the
> performance problem).

The encryption isn't DES, it's fcrypt. And encryption does have a significant effect on performance - not only the cost of doing the encryption itself, but also the number of additional copies that it adds to the data path.

It's worth taking a look at the configuration of your fileserver. There's been a lot written here in the past about the ideal settings for the fileserver - I'll leave you to Google over the list, but it's worth noting that the out-of-the-box configuration is not likely to result in good performance.

As a datapoint, using your test across a network, my home directory is seeing write speeds of 70 MB/s from an OpenAFS 1.4.11 client. We're currently running our fileservers with "-L -p 128 -rxpck 400 -busyat 600 -s 1200 -l 1200 -cb 100000 -b 240 -vc 1200".

All that said, we are trying to improve the performance of OpenAFS. There are numerous changes in the 1.5 tree, particularly for Linux, that help speed it up.

I'd also encourage you to look at more meaningful benchmarks - in particular, those which mirror the kind of use you'll actually be putting the filesystem to.

Cheers,

Simon.

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
