Avi Kivity wrote:
> I don't see an immediate need to put the host-side driver in the kernel,
> but I don't want to embed the host fd (which is an implementation
> detail) into the host/guest ABI. There may not even be a host fd.

Your point is taken. Embedding the host fd would also punch a hole in
the security barrier between guest kernel and host userspace, a barrier
our usage scenario of multiple guests per uid requires.
>> System call latency adds to the in-kernel approach here.
>
> I don't understand this.

What I meant to state was: if the host side of the block driver runs in
userspace, we pay extra latency to leave the kernel system call context,
compute on behalf of the user process, and then do another system call
(to drive the IO). This extra overhead does not arise when the guest's
IO requests are handled in the host kernel.

> The bio layer already has scatter/gather (basically, a biovec), but the
> aio api (which you copy) doesn't. The basic request should be a bio,
> not a bio page.

With our block driver it is: we submit an entire bio, which may contain
multiple biovecs, in a single hypercall (rough sketch in the P.S. below).

> Right. But the ABI needs to support barriers regardless of host kernel
> support. When unavailable, barriers can be emulated by waiting for the
> request queue to flush itself. If we do implement the host side in the
> kernel, then barriers become available.

Agreed (see the second sketch in the P.S. for the drain-based emulation).

> I/O may be slow, but you can have a lot more disks than cpus.
>
> For example, if an I/O takes 1ms, and you have 100 disks, then you can
> issue 100K IOPS. With one hypercall per request, that's ~50% of a cpu
> (at about 5us per hypercall that goes all the way to userspace). That's
> not counting the overhead of calling io_submit().

Even if a hypercall round trip takes as long as 5us, and even if each
biovec carries only 512 bytes (we use a 4k block size), I don't see how
this becomes a performance problem: with linear reads/writes you get
200,000 hypercalls per second at 128 kbyte per hypercall, which is
25.6 GByte per second per CPU. With random reads (worst case: 512 bytes
per hypercall) you still get ~100 MByte per second per CPU. There are
tighter bottlenecks in the IO hardware, afaics (arithmetic spelled out
in the P.S.).

so long,
Carsten
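
P.S.: Some rough, untested sketches of the points above. First,
submitting a whole bio (all of its biovecs) in one hypercall, using the
stock 2.6 bio_for_each_segment() iterator. The hc_iovec layout and
hypercall_block_io() are hypothetical, invented here for illustration;
they are not our actual guest/host interface:

#include <linux/bio.h>
#include <linux/mm.h>
#include <linux/types.h>

/* hypothetical guest/host descriptor for one biovec */
struct hc_iovec {
	__u64 pfn;	/* guest page frame number of the segment */
	__u32 offset;	/* byte offset within that page */
	__u32 len;	/* segment length in bytes */
};

/* hypothetical hypercall wrapper, for illustration only */
extern int hypercall_block_io(sector_t sector, int rw,
			      struct hc_iovec *iov, int count);

static int submit_bio_in_one_hypercall(struct bio *bio)
{
	/* a real driver would preallocate this, not burn stack space */
	struct hc_iovec iov[BIO_MAX_PAGES];
	struct bio_vec *bvec;
	int i, n = 0;

	/* describe every biovec of the bio to the host */
	bio_for_each_segment(bvec, bio, i) {
		iov[n].pfn    = page_to_pfn(bvec->bv_page);
		iov[n].offset = bvec->bv_offset;
		iov[n].len    = bvec->bv_len;
		n++;
	}

	/* one guest exit for the whole request, not one per page */
	return hypercall_block_io(bio->bi_sector, bio_data_dir(bio),
				  iov, n);
}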
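
Second, barrier emulation on a host backend without native barrier
support: the barrier request is simply held back until everything issued
before it has completed. Plain userspace pthreads here, all names made
up for illustration:

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t drained = PTHREAD_COND_INITIALIZER;
static int in_flight;	/* requests issued but not yet completed */

static void request_issued(void)
{
	pthread_mutex_lock(&lock);
	in_flight++;
	pthread_mutex_unlock(&lock);
}

static void request_completed(void)
{
	pthread_mutex_lock(&lock);
	if (--in_flight == 0)
		pthread_cond_broadcast(&drained);
	pthread_mutex_unlock(&lock);
}

static void submit_barrier(void)
{
	pthread_mutex_lock(&lock);
	/* wait for the request queue to flush itself */
	while (in_flight > 0)
		pthread_cond_wait(&drained, &lock);
	pthread_mutex_unlock(&lock);
	/* now the barrier request itself may be issued */
}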
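
And the throughput arithmetic as a trivial program (the 5us round trip
is the assumed figure from this thread, not a measurement):

#include <stdio.h>

int main(void)
{
	const double roundtrip = 5e-6;		/* assumed 5us per hypercall */
	const double calls = 1.0 / roundtrip;	/* 200,000 hypercalls/s */

	/* linear IO: 128 kbyte payload per hypercall */
	printf("linear: %.1f GByte/s\n", calls * 128e3 / 1e9);
	/* random read worst case: one 512-byte sector per hypercall */
	printf("random: %.1f MByte/s\n", calls * 512.0 / 1e6);
	return 0;
}

which prints 25.6 GByte/s and 102.4 MByte/s respectively.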