Hi Harald!

On 15 Sep 2005, at 12:18, Harald Barth wrote:

There are some "workarounds" to this problem. First, we could abandon
the current zero-copy semantics and just do very large reads and
writes to the disk, and then do memcpy's in userspace.  For fast
machines, this will almost certainly beat the current algorithm for
raw throughput.  But, it's certainly not what I'd call an elegant
solution.


Yes, the data would go diskIO->kernel->userspace->kernel->net.
On the diskIO side it will be in big chunks. In the net side
it will be in MTU or MTU*4 size chunks. Bad?

This I don't understand: right now we have readv(small segments)->buffer->sendmsg(small segments), where the term 'zero-copy' indicates that the buffer is somehow special. My question is: why can't this be replaced by read(big segment)->buffer->sendmsg(small segments)? AFAIK readv() is implemented in terms of read() in the kernel for almost all filesystems, so switching to one big read() should really only have the effect of making the disk transfer more efficient. The msg headers interspersed with the data have to come from userspace in any case, right?
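
A minimal sketch of the read(big segment)->buffer->sendmsg(small segments) pattern asked about above. This is not OpenAFS/Rx code: the function name send_big_read, the 1024-byte segment size and the 4-byte sequence-number header are all hypothetical, chosen only to show one large read() feeding several sendmsg() calls whose headers come from userspace.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define SEG_SIZE 1024   /* hypothetical payload bytes per packet */

/*
 * Fill buf with ONE read() call, then send the data in SEG_SIZE pieces.
 * Each packet is a 4-byte sequence number (stand-in for a real header)
 * followed by a slice of buf. The slices are referenced through iovecs,
 * so no extra memcpy happens in userspace.
 * Returns the number of packets sent, or -1 on error.
 */
static int send_big_read(int file_fd, int sock_fd, char *buf, size_t buf_len)
{
    assert(buf != NULL && buf_len > 0);

    ssize_t got = read(file_fd, buf, buf_len);   /* the single big read */
    if (got < 0)
        return -1;

    int packets = 0;
    for (size_t off = 0; off < (size_t)got; off += SEG_SIZE) {
        uint32_t seq = (uint32_t)packets;        /* header built in userspace */
        size_t len = (size_t)got - off;
        if (len > SEG_SIZE)
            len = SEG_SIZE;

        struct iovec iov[2];
        iov[0].iov_base = &seq;                  /* header */
        iov[0].iov_len  = sizeof seq;
        iov[1].iov_base = buf + off;             /* slice of the big buffer */
        iov[1].iov_len  = len;

        struct msghdr msg;
        memset(&msg, 0, sizeof msg);
        msg.msg_iov    = iov;
        msg.msg_iovlen = 2;

        if (sendmsg(sock_fd, &msg, 0) < 0)
            return -1;
        packets++;
    }
    return packets;
}
```

With 2500 bytes available and a 4096-byte buffer, this sends three packets: two with 1024 bytes of payload and one with 452, each preceded by the 4-byte header.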


Second, we could use iovecs for the extension headers. Unfortunately,
most OS's limit us to 16 iovecs, so this would cut our max jumbogram
size nearly in half.
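
The "nearly in half" estimate can be checked with a little arithmetic. The sketch below rests on an assumed layout that is not confirmed anywhere in this thread: one iovec is always spent on the leading packet header, and, if extension headers get their own iovecs, every data segment after the first costs two iovecs (extension header plus data) instead of one. The function names are hypothetical. sysconf(_SC_IOV_MAX) is the POSIX way to ask what the limit actually is on a given system.

```c
#include <unistd.h>

/*
 * How many data segments fit into one sendmsg() call, given iov_limit
 * iovecs? ASSUMED layout (not taken from the Rx sources):
 *   - 1 iovec for the leading packet header, always
 *   - without separate extension headers: 1 iovec per data segment
 *   - with separate extension headers: the first data segment costs
 *     1 iovec, every further one costs 2 (extension header + data)
 */
static int max_data_segments(int iov_limit, int separate_ext_headers)
{
    int avail = iov_limit - 1;          /* minus the leading header */
    if (avail <= 0)
        return 0;
    if (!separate_ext_headers)
        return avail;
    return 1 + (avail - 1) / 2;
}

/* Ask the OS for its iovec limit; sysconf() returns -1 if it is indeterminate. */
static long query_iov_limit(void)
{
    return sysconf(_SC_IOV_MAX);
}
```

Under these assumptions a limit of 16 iovecs gives 15 data segments without separate extension headers and 8 with them.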


What impact would that have? Measurements? Speculations? If half the
jumbogram size does not kill us, it sounds like an alternative worth
testing.

Well, testing is always a good idea. The problem is that while I have the hardware setup, I lack the knowledge of OpenAFS internals needed to produce a patch.

Ciao,
                    Roland

--
TU Muenchen, Physik-Department E18, James-Franck-Str. 85747 Garching
Telefon 089/289-12592; Telefax 089/289-12570
--
A mouse is a device used to point at
the xterm you want to type in.
Kim Alm on a.s.r.
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GS/CS/M/MU d-(++) s:+ a-> C+++ UL++++ P-(+) L+++ E(+) W+ !N K- w--- M + !V Y+
PGP++ t+(++) 5 R+ tv-- b+ DI++ e+++>++++ h---- y+++
------END GEEK CODE BLOCK------

