On Mon, 2009-05-04 at 10:09 +0800, Sheng Yang wrote:
> On Monday 04 May 2009 08:53:07 Nicholas A. Bellinger wrote:
> > On Sat, 2009-05-02 at 18:22 +0800, Sheng Yang wrote:
> > > On Thu, Apr 30, 2009 at 01:22:54PM -0700, Nicholas A. Bellinger wrote:
> > > > Greetings KVM folks,
> > > >
> > > > I am wondering if any information exists for doing SR-IOV on the new VT-d
> > > > capable chipsets with KVM..? From what I understand the patches for
> > > > doing this with KVM are floating around, but I have been unable to find
> > > > any user-level docs for actually making it all go against upstream
> > > > v2.6.30-rc3 code..
> > > >
> > > > So far I have been doing IOV testing with Xen 3.3 and 3.4.0-pre, and I
> > > > am really hoping to be able to jump to KVM for single-function and
> > > > then multi-function SR-IOV. I know that the VM migration stuff for IOV
> > > > in Xen is up and running, and I assume it is being worked in for KVM
> > > > instance migration as well..? This part is less important (at least
> > > > for me :-) than getting a stable SR-IOV setup running under the KVM
> > > > hypervisor.. Does anyone have any pointers for this..?
> > > >
> > > > Any comments or suggestions are appreciated!
> > >
> > > Hi Nicholas
> > >
> > > The patches are not floating around now. As you know, SR-IOV for Linux
> > > has been in 2.6.30, so you can use upstream KVM and qemu-kvm (or the
> > > recently released kvm-85) with 2.6.30-rc3 as the host kernel. Some time
> > > ago there were several SR-IOV related patches for qemu-kvm, and they
> > > have all been checked in now.
> > >
> > > For KVM no extra documentation is necessary, because you can simply
> > > assign a VF to a guest like any other device. How to create a VF is
> > > specific to each device driver. So just creating a VF and then assigning
> > > it to the KVM guest is fine.
> >
> > Greetings Sheng,
> >
> > So, I have been trying the latest kvm-85 release on a v2.6.30-rc3
> > checkout from linux-2.6.git on a CentOS 5u3 x86_64 install on Intel
> > IOH-5520 based dual socket Nehalem board. I have enabled DMAR and
> > Interrupt Remapping on my KVM host using v2.6.30-rc3 and from what I can
> > tell, the KVM_CAP_* defines from libkvm are enabled when building kvm-85
> > after './configure --kerneldir=/usr/src/linux-2.6.git' and the PCI
> > passthrough code is being enabled in kvm-85/qemu/hw/device-assignment.c
> > AFAICT..
> >
> > From there, I use the freshly installed qemu-x86_64-system binary to
> > start a Debian 5 x86_64 HVM (that previously had been moving network
> > packets under Xen for PCIe passthrough). I see the MSI-X interrupt
> > remapping working on the KVM host for the passed -pcidevice, and the
> > MMIO mappings from the qemu build that I also saw while using
> > Xen/qemu-dm built with PCI passthrough are there as well..
> >
>
> Hi Nicholas
>
> > But while the KVM guest is booting, I see the following exception(s)
> > from qemu-x86_64-system for one of the VFs for a multi-function PCIe
> > device:
> >
> > BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
>
> This one is mostly harmless.
> >

Ok, good to know.. :-)

> > I try with one of the on-board e1000e ports (02:00.0) and I see the same
> > exception along with some MSI-X exceptions from qemu-x86_64-system in
> > KVM guest.. However, I am still able to see the e1000e and the other
> > vxge multi-function device with lspci, but I am unable to dhcp or ping
> > with the e1000e and VF from multi-function device fails to register the
> > MSI-X interrupt in the guest..
>
> Did you see the interrupt in the guest and host side?

Ok, I am restarting the e1000e test with a fresh Fedora 11 install and
KVM host kernel 2.6.29.1-111.fc11.x86_64. After unbinding the e1000e
single-function device at 02:00.0 and attaching it to pci-stub with:

    echo "8086 10d3" > /sys/bus/pci/drivers/pci-stub/new_id
    echo 0000:02:00.0 > /sys/bus/pci/devices/0000:02:00.0/driver/unbind
    echo 0000:02:00.0 > /sys/bus/pci/drivers/pci-stub/bind

I see the following in the KVM host kernel ring buffer:

    e1000e 0000:02:00.0: PCI INT A disabled
    pci-stub 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
    pci-stub 0000:02:00.0: irq 58 for MSI/MSI-X

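For anyone repeating the pci-stub steps above, here is a minimal sketch that wraps the same three sysfs writes in one shell function. The function name stub_bind and the SYSFS_ROOT override are hypothetical helpers made up for illustration (SYSFS_ROOT defaults to /sys and only exists so the function can be tried against a scratch directory). It also rejects the short 02:00.0 form, since the bind file wants the full domain:bus:dev.fn address.

```shell
#!/bin/sh
# Sketch only: hand a PCI device over to pci-stub with the same three
# sysfs writes shown above. SYSFS_ROOT is a hypothetical override for
# dry runs; on a real host it is left unset and /sys is used.
stub_bind() {
    ids=$1      # "vendor device" pair, e.g. "8086 10d3"
    bdf=$2      # full address with PCI domain, e.g. 0000:02:00.0
    root=${SYSFS_ROOT:-/sys}

    # Refuse short addresses such as 02:00.0 (no domain prefix).
    case $bdf in
        ????:??:??.?) ;;
        *) echo "stub_bind: need domain:bus:dev.fn, got '$bdf'" >&2
           return 1 ;;
    esac

    # 1. Teach pci-stub about this vendor/device ID.
    echo "$ids" > "$root/bus/pci/drivers/pci-stub/new_id" || return 1

    # 2. Detach the current driver, if one is bound.
    if [ -d "$root/bus/pci/devices/$bdf/driver" ]; then
        echo "$bdf" > "$root/bus/pci/devices/$bdf/driver/unbind" || return 1
    fi

    # 3. Bind the device to pci-stub.
    echo "$bdf" > "$root/bus/pci/drivers/pci-stub/bind"
}
```

On the host this would be run as root, for example: stub_bind "8086 10d3" 0000:02:00.0
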
> I think you can try the on-board e1000e for MSI-X first. And please ensure
> the corresponding driver has been loaded correctly.

<nod>..

> And what do you mean by "some MSI-X exceptions"? Better with
> the log.

Ok, with the Fedora 11 installed qemu-kvm, I see the expected
kvm_destroy_phys_mem() statements:

    #kvm-host qemu-kvm -m 2048 -smp 8 -pcidevice host=02:00.0 lenny64guest1-orig.img
    BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
    BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)

However, I still see the following in the KVM guest kernel ring buffer
running v2.6.30-rc in the HVM guest:

    [ 5.523790] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
    [ 5.524582] e1000e 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
    [ 5.525710] e1000e 0000:00:05.0: setting latency timer to 64
    [ 5.526048] 0000:00:05.0: 0000:00:05.0: Failed to initialize MSI-X interrupts. Falling back to MSI interrupts.
    [ 5.527200] 0000:00:05.0: 0000:00:05.0: Failed to initialize MSI interrupts. Falling back to legacy interrupts.
    [ 5.829988] 0000:00:05.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:e0:81:c0:90:b2
    [ 5.830672] 0000:00:05.0: eth0: Intel(R) PRO/1000 Network Connection
    [ 5.831240] 0000:00:05.0: eth0: MAC: 3, PHY: 8, PBA No: ffffff-0ff

While doing dhcp, the e1000e throws a netdev watchdog transmit timeout..

Here is what lspci -v -s 00:05.0 looks like:

    00:05.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
        Subsystem: Intel Corporation Device 0000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 10
        Region 0: Memory at f2020000 (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at c220 [size=32]
        Region 3: Memory at f2040000 (32-bit, non-prefetchable) [size=16K]
        Kernel driver in use: e1000e
        Kernel modules: e1000e

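One thing that may be worth checking in the lspci output above: the Status line shows Cap-, which lspci prints when the device's capability-list bit is clear, and no "Capabilities:" lines are listed. Whether that is what makes the guest driver fall back from MSI-X is only a guess on my part and is not confirmed anywhere in this thread. A small sketch for testing the same thing on other guests follows; check_msi_caps is a hypothetical helper name, and it only greps lspci text fed to it on stdin.

```shell
#!/bin/sh
# Sketch only: read "lspci -vv" style text on stdin and report whether
# a capability list and an MSI-X capability are visible in it.
check_msi_caps() {
    out=$(cat)

    # "Status: Cap-" means the capability-list status bit is clear.
    if printf '%s\n' "$out" | grep -q 'Status: Cap-'; then
        echo "capability list: absent"
    else
        echo "capability list: present or not shown"
    fi

    # lspci lists each capability on a "Capabilities:" line.
    if printf '%s\n' "$out" | grep -q 'Capabilities:.*MSI-X'; then
        echo "MSI-X: advertised"
    else
        echo "MSI-X: not advertised"
    fi
}
```

Inside the guest this would be used as: lspci -vv -s 00:05.0 | check_msi_caps (run as root, since lspci hides capability details from unprivileged users).
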
I am going to double check my v2.6.30-rc3 KVM guest kernel build for the
PCI options. Is there anything special I need to enable other than PCI
express support in the v2.6.30-rc3 guest under the PCI Bus options in
the kernel config..? DMAR and Interrupt Remapping should be DISABLED in
the guest HVM kernels, right..?

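For the guest kernel config double check, a minimal sketch of a helper that reports the state of a single option in a kernel config file. The helper name config_state is made up for illustration, and the option names in the usage line (CONFIG_PCI_MSI, CONFIG_DMAR, CONFIG_INTR_REMAP) are my assumption of what the relevant symbols are called in a kernel of this vintage. It does not answer which of them the guest actually needs.

```shell
#!/bin/sh
# Sketch only: print whether a kernel config option is enabled (=y or
# =m), explicitly disabled, or missing from the given config file.
config_state() {
    opt=$1      # option name, e.g. CONFIG_PCI_MSI
    file=$2     # path to a kernel .config style file

    if grep -q "^$opt=[ym]" "$file"; then
        echo "$opt: enabled"
    elif grep -q "^# $opt is not set" "$file"; then
        echo "$opt: disabled"
    else
        echo "$opt: not present"
    fi
}
```

In the guest, assuming the distribution installs the config under /boot: for o in CONFIG_PCI_MSI CONFIG_DMAR CONFIG_INTR_REMAP; do config_state "$o" "/boot/config-$(uname -r)"; done
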
Also, just an observation: I noticed that in Xen HVM with SR-IOV
passthrough the PCIe devices appear as 05:00.0 after an 'xm pci-attach'
call. Is there a reason that SR-IOV with KVM attaches said passthrough
devices under the 00.* PCI bus instead of its own $NEXT_BUS_ID.00.0
value under the KVM guest..? Does this have any effect on
functionality..?

Many thanks for your most valuable of time,

--nab

> >
> > Soooo, I enabled the debugging code in kvm-85/qemu/hw/device-assignment.c
> > and see the PAGE aligned MMIO memory for the passed PCIe device is being
> > released during the BUG exceptions above.. Is there something else I
> > should be looking at..?
>
> That part of memory should be released in order to trap MMIO accesses to the
> MSI-X table.
>
> > I have pci-stub enabled, and I unbind 02:00.0
> > from /sys/bus/pci/drivers/e1000e/unbind successfully (just like with Xen
> > and pciback), but I am unable to do the 'echo -n 02:00.0 >
> > /sys/bus/pci/drivers/pci-stub/bind' (it returns write error, no such
> > device, with no dmesg output) on the KVM host running v2.6.30-rc3. Is
> > this supposed to happen on v2.6.30-rc3 with pci-stub..?
>
> Maybe you need "echo 0000:02:00.0 > /sys/bus/pci/drivers/pci-stub/bind"?
>
> > I am also using
> > the kvm-85 source dist kvm_intel.ko and kvm.ko kernel modules. Is
> > there something I am missing when building kvm-85 for SR-IOV passthrough..?
>
> I think the first thing is to confirm that device assignment works in your
> environment, using the on-board card. You can also refer to
> http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM
>
> And you can post the debug_device_assignment=1 log, the qemu log, and the
> tail of dmesg as well.
>
> Thanks!
>