Anthony Liguori wrote:
> Avi Kivity wrote:
>> Marcelo Tosatti wrote:
>>> Add three PCI bridges to support 128 slots.
>>>
>>> Changes since v1:
>>> - Remove I/O address range "support" (so standard PCI I/O space is
>>>   used).
>>> - Verify that there's no special quirks for 82801 PCI bridge.
>>> - Introduce separate flat IRQ mapping function for non-SPARC targets.
>>>
>> I've cooled off on the 128 slot stuff, mainly because most real hosts
>> don't have them. An unusual configuration will likely lead to
>> problems, as most guest OSes and workloads will not have been tested
>> thoroughly with them.
>>
>> - it requires a large number of interrupts, which are difficult to
>>   provide, and which it is hard to ensure all OSes support. MSI is
>>   relatively new.
>> - if only a few interrupts are available, then each interrupt
>>   requires scanning a large number of queues
>>
>> If we are to do this, then we need better tests than "80 disks show up".
>>
>> The alternative approach of having the virtio block device control up
>> to 16 disks allows having those 80 disks with just 5 slots (and 5
>> interrupts). This is similar to the way traditional SCSI controllers
>> behave, and so should not surprise the guest OS.
>
> If you have a single virtio-blk device that shows up as 8 functions,
> we could achieve the same thing. We can cheat with the interrupt
> handlers to avoid cache line bouncing too.
You can't cheat on all guests, and even on Linux, it's better to keep
doing what real hardware does than go off on a tangent that no one else
uses.

You'll have to cheat on ->kick(), too. Virtio needs one exit per
O(queue depth) requests. With one spindle per ring, it doesn't make
sense to have a queue depth > 4 (or latency goes to hell), so you have
many exits.

> Plus, we can use PCI hotplug so we don't have to reinvent a new
> hotplug mechanism.

You can plug disks into a Fibre Channel mesh, so presumably that works
on real hardware somehow.

> I'm inclined to think that ring sharing isn't as useful as it seems as
> long as we don't have indirect scatter gather lists.

I agree, but I think that indirect sg is very important for storage:

- a long sg list is cheap from the disk's point of view (the seeks are
  what's expensive)
- it is important to keep the queue depth meaningful and small
  (O(spindles * 3)), as it drastically affects latency

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.

_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
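[As an illustration of the "indirect scatter gather lists" idea discussed
above: this is roughly the mechanism that virtio later standardized as
indirect descriptors. The layout and flag names below follow what became
<linux/virtio_ring.h>; they are shown for context and are not part of the
patch under discussion.]

```c
#include <stdint.h>

/* Sketch of a virtio ring descriptor.  Each ring slot is one of these;
 * with the INDIRECT flag, addr points at an out-of-ring table of further
 * descriptors, so one slot can carry a long sg list. */
#define VRING_DESC_F_NEXT     1  /* buffer continues in desc[next] */
#define VRING_DESC_F_WRITE    2  /* device writes into this buffer */
#define VRING_DESC_F_INDIRECT 4  /* addr points at a descriptor table */

struct vring_desc {
    uint64_t addr;   /* guest-physical address of buffer (or table) */
    uint32_t len;    /* length in bytes (table size if indirect) */
    uint16_t flags;  /* VRING_DESC_F_* */
    uint16_t next;   /* chaining index, valid when F_NEXT is set */
};
```

This is why indirect sg lets the queue depth stay small, as argued above:
the ring depth bounds the number of in-flight requests (which is what
hurts latency), while each request can still describe many segments,
which is cheap for the disk.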