Re: PCI capabilities support for assigned devices

2010-03-12 Thread Sebastian Hetze
On Wed, Mar 10, 2010 at 02:52:30PM +0100, Sebastian Hetze wrote:
> Hi *,
>
> in qemu-kvm/hw/device-assignment.c assigned_device_pci_cap_init()
> apparently only PCI_CAP_ID_MSI and PCI_CAP_ID_MSIX are exposed
> to the guest.
>
> Linux Broadcom bnx2 and tg3 drivers expect PCI_CAP_ID_PM to be present.
>
> Are there any plans to implement this and possibly other PCI capability
> features for assigned devices?
>
> If not, is there a list of network cards known to work with PCI
> assignment in KVM?
 

Answering my own mail, I can contribute experience with direct
assignment of two network cards.
I was able to test Intel 82571EB (e1000e) and 82575EB (igb). Both drivers work
with IOMMU-assigned devices in an SMP guest with 4 CPUs running the
2.6.31-16-generic-pae Ubuntu kernel.

However, the e1000e driver hangs reproducibly under NFS load after 10 to 20
minutes:

[ 1312.989127] :00:05.0: eth0: Detected Tx Unit Hang:
[ 1312.989130] TDH b8
[ 1312.989131] TDT f4
[ 1312.989132] next_to_use f4
[ 1312.989133] next_to_clean b5
[ 1312.989134] buffer_info[next_to_clean]:
[ 1312.989135] time_stamp 3d3dc
[ 1312.989136] next_to_watch b8
[ 1312.989137] jiffies 3dd3f
[ 1312.989138] next_to_watch.status 0
[ 1313.988199] ------------[ cut here ]------------
[ 1313.988237] WARNING: at /build/buildd/linux-2.6.31/net/sched/sch_generic.c:246 dev_watchdog+0x1f6/0x210()
[ 1313.988247] Hardware name:
[ 1313.988249] NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
[ 1313.988251] Modules linked in: ppdev lp parport autofs4 video output nfsd 
exportfs nfs lockd nfs_acl auth_rpcgss sunrpc e100 via_rhine 3c59x 8139too mii 
snd_ens1370 gameport snd_rawmidi snd_seq_device snd_pcm snd_timer snd soundcore 
snd_page_alloc pcspkr psmouse virtio_net i2c_piix4 serio_raw joydev virtio_pci 
virtio_ring virtio usbhid floppy fbcon tileblit font bitblit softcursor e1000e
[ 1313.988294] Pid: 0, comm: swapper Not tainted 2.6.31-16-generic-pae #53-Ubuntu
[ 1313.988297] Call Trace:
[ 1313.988323] [c0146a1d] warn_slowpath_common+0x6d/0xa0
[ 1313.988327] [c04b6746] ? dev_watchdog+0x1f6/0x210
[ 1313.988331] [c04b6746] ? dev_watchdog+0x1f6/0x210
[ 1313.988334] [c0146a96] warn_slowpath_fmt+0x26/0x30
[ 1313.988337] [c04b6746] dev_watchdog+0x1f6/0x210
[ 1313.988350] [c0159efb] ? insert_work+0x5b/0xa0
[ 1313.988361] [c0128098] ? default_spin_lock_flags+0x8/0x10
[ 1313.988375] [c0576bda] ? _spin_lock_irqsave+0x2a/0x40
[ 1313.988379] [c015a291] ? __queue_work+0x31/0x40
[ 1313.988383] [c01520d7] run_timer_softirq+0x117/0x200
[ 1313.988390] [c016c045] ? tick_dev_program_event+0x45/0xe0
[ 1313.988393] [c04b6550] ? dev_watchdog+0x0/0x210
[ 1313.988399] [c014cc40] __do_softirq+0x90/0x1a0
[ 1313.988402] [c0161e83] ? hrtimer_interrupt+0x183/0x210
[ 1313.988406] [c016c488] ? tick_do_update_jiffies64+0x118/0x160
[ 1313.988410] [c014cd8d] do_softirq+0x3d/0x40
[ 1313.988414] [c014cecd] irq_exit+0x5d/0x70
[ 1313.988425] [c011ddc7] smp_apic_timer_interrupt+0x57/0x90
[ 1313.988433] [c0103d71] apic_timer_interrupt+0x31/0x40
[ 1313.988437] [c0127365] ? native_safe_halt+0x5/0x10
[ 1313.988441] [c010a5b6] default_idle+0x46/0xd0
[ 1313.988444] [c010202c] cpu_idle+0x8c/0xd0
[ 1313.988454] [c0572498] start_secondary+0xc6/0xc8
[ 1313.988457] ---[ end trace c5b28b21ada19e1d ]---
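
The ring indices in the "Detected Tx Unit Hang" dump above can be turned into descriptor counts with a little modular arithmetic. The following is a minimal sketch, not driver code: the ring size of 256 is an assumption (it is not stated in the log), and `ring_distance` is a hypothetical helper. It reads TDH/TDT as the hardware head and tail of the transmit ring, and next_to_clean/next_to_use as the driver's own view of that ring.

```python
# Values copied from the "Detected Tx Unit Hang" dump (hexadecimal in the log).
dump = {
    "TDH": 0xB8,            # hardware head index
    "TDT": 0xF4,            # hardware tail index
    "next_to_use": 0xF4,    # driver: next descriptor to fill
    "next_to_clean": 0xB5,  # driver: next descriptor to reclaim
    "time_stamp": 0x3D3DC,  # jiffies value stored with the stuck buffer
    "jiffies": 0x3DD3F,     # jiffies value when the hang was reported
}

# ASSUMPTION: the ring size is not part of the log; 256 is used for illustration.
RING_SIZE = 256


def ring_distance(start, end, ring_size):
    """Number of ring slots from start up to (not including) end, with wrap-around."""
    return (end - start) % ring_size


# Descriptors handed to the NIC that it has not yet advanced past.
hw_pending = ring_distance(dump["TDH"], dump["TDT"], RING_SIZE)

# Descriptors the driver has queued but not yet reclaimed.
sw_uncleaned = ring_distance(dump["next_to_clean"], dump["next_to_use"], RING_SIZE)

# Age of the oldest unreclaimed buffer, in jiffies.
stalled_jiffies = dump["jiffies"] - dump["time_stamp"]

print(hw_pending, sw_uncleaned, stalled_jiffies)
```

With these numbers, 60 descriptors sit between TDH and TDT, 63 are waiting to be reclaimed by the driver, and the oldest buffer is 2403 jiffies old. Converting that age to seconds requires the guest kernel's HZ value, which the log does not show.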

I also tried the 1.1.2-NAPI driver from Intel without success.

The 82575EB device works fine.

To get an impression of the performance improvement you
gain from a directly assigned NIC, take a look at the
following numbers taken from sysstat on a real-life
production system. Both tables show the same guest
on two consecutive days with similar everyday workloads.

virtio-net
00:00:01 IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
06:05:01 eth0     577,01    582,57    141,83    211,79      0,00      0,00      0,00
06:15:01 eth0     169,70    181,93     70,04     90,49      0,00      0,00      0,00
06:25:01 eth0     135,22    138,81     60,94     56,01      0,00      0,00      0,00
06:35:01 eth0     787,44    879,40    183,05    498,02      0,00      0,00      0,00
06:45:01 eth0    1430,22   1660,49    323,80   1163,14      0,00      0,00      0,00
06:55:01 eth0    1524,15   1730,98    400,70   1112,32      0,00      0,00      0,00
07:05:02 eth0    1300,95   1414,48    307,40    741,41      0,00      0,00      0,00
07:15:01 eth0     380,77    435,28    141,65    289,76      0,00      0,00      0,00
07:25:01 eth0     312,16    365,33    112,46    230,35      0,00      0,00      0,00
07:35:01 eth0     758,51    801,66    169,99    375,84      0,00      0,00      0,00
07:45:01 eth0    1685,90   1922,92    301,87   1408,62      0,00      0,00      0,00
07:55:02 eth0    2531,08   3205,48    579,60   3033,27      0,00      0,00      0,00
08:05:02 eth0    2011,90   2180,31    426,22   1041,64      0,00      0,00      0,00
08:15:01 eth0    1054,10   1252,39    267,30    648,87      0,00      0,00      0,00
08:25:02 eth0     613,17    761,34    170,96    551,79      0,00      0,00      0,00
08:35:02 eth0     858,47    921,50    205,33    440,02      0,00      0,00      0,00
08:45:02 eth0    1426,27   1465,96    336,28    635,85      0,00      0,00      0,00
08:55:02 eth0    1539,78   1600,87    361,98    716,52      0,00      0,00      0,00
09:05:01 

PCI capabilities support for assigned devices

2010-03-10 Thread Sebastian Hetze
Hi *,

in qemu-kvm/hw/device-assignment.c assigned_device_pci_cap_init()
apparently only PCI_CAP_ID_MSI and PCI_CAP_ID_MSIX are exposed
to the guest.

Linux Broadcom bnx2 and tg3 drivers expect PCI_CAP_ID_PM to be present.

Are there any plans to implement this and possibly other PCI capability
features for assigned devices?

If not, is there a list of network cards known to work with PCI
assignment in KVM?
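
To illustrate what a driver sees when it looks for PCI_CAP_ID_PM, here is a minimal sketch of walking the capability list in a raw PCI configuration-space image. The register offsets and capability IDs are the standard PCI values. `list_capabilities` and `demo_config` are hypothetical helpers written for this example, and the synthetic images are made up for illustration; they are not dumps from a real device or from qemu-kvm.

```python
# Standard PCI configuration-space offsets and capability IDs.
PCI_STATUS = 0x06
PCI_STATUS_CAP_LIST = 0x10   # status bit: capability list present
PCI_CAPABILITY_LIST = 0x34   # offset of the first capability pointer
PCI_CAP_ID_PM = 0x01
PCI_CAP_ID_MSI = 0x05
PCI_CAP_ID_MSIX = 0x11


def list_capabilities(config):
    """Return the capability IDs found by following the linked list in config."""
    status = int.from_bytes(config[PCI_STATUS:PCI_STATUS + 2], "little")
    if not status & PCI_STATUS_CAP_LIST:
        return []
    caps = []
    seen = set()  # guards against a malformed, looping list
    pos = config[PCI_CAPABILITY_LIST] & 0xFC  # low two bits are reserved
    while pos >= 0x40 and pos + 1 < len(config) and pos not in seen:
        seen.add(pos)
        caps.append(config[pos])          # byte 0: capability ID
        pos = config[pos + 1] & 0xFC      # byte 1: pointer to next capability
    return caps


def demo_config(cap_ids):
    """Build a synthetic 256-byte config image containing the given capabilities."""
    cfg = bytearray(256)
    cfg[PCI_STATUS] = PCI_STATUS_CAP_LIST
    pos = 0x40
    cfg[PCI_CAPABILITY_LIST] = pos if cap_ids else 0
    for i, cap_id in enumerate(cap_ids):
        cfg[pos] = cap_id
        cfg[pos + 1] = pos + 0x10 if i + 1 < len(cap_ids) else 0
        pos += 0x10
    return bytes(cfg)


# Hypothetical guest view: only MSI and MSI-X exposed, as described above.
guest_view = demo_config([PCI_CAP_ID_MSI, PCI_CAP_ID_MSIX])
print(PCI_CAP_ID_PM in list_capabilities(guest_view))
```

On a Linux guest the same function could be applied to the bytes read from the device's `config` file under `/sys/bus/pci/devices/`; reading beyond the first 64 bytes typically requires root.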

Best Regards,

  Sebastian