> -----Original Message-----
> From: jacob jacob [mailto:opstk...@gmail.com]
> Sent: Wednesday, March 18, 2015 11:26 AM
> To: e1000-devel@lists.sourceforge.net
> Subject: [E1000-devel] Fwd: PCI passthrough of 40G ethernet interface
> (Openstack/KVM)
> 
> Hi,
> 
> Seeing failures when trying to do PCI passthrough of Intel XL710 40G
> interface to KVM vm.
>     0a:00.1 Ethernet controller: Intel Corporation Ethernet Controller
> XL710 for 40GbE QSFP+ (rev 01)
> 
> From dmesg on host:
> 
> [80326.559674] kvm: zapping shadow pages for mmio generation wraparound
> [80327.271191] kvm [175994]: vcpu0 unhandled rdmsr: 0x1c9
> [80327.271689] kvm [175994]: vcpu0 unhandled rdmsr: 0x1a6
> [80327.272201] kvm [175994]: vcpu0 unhandled rdmsr: 0x1a7
> [80327.272681] kvm [175994]: vcpu0 unhandled rdmsr: 0x3f6
> [80327.376186] kvm [175994]: vcpu0 unhandled rdmsr: 0x606
> 
> The PCI device is still available in the VM, but data transfer fails.
> 
> With the i40e driver, the data transfer fails.
>  Relevant dmesg output:
>  [   11.544088] i40e 0000:00:05.0 eth1: NIC Link is Up 40 Gbps Full
> Duplex, Flow Control: None
> [   11.689178] i40e 0000:00:06.0 eth2: NIC Link is Up 40 Gbps Full
> Duplex, Flow Control: None
> [   16.704071] ------------[ cut here ]------------
> [   16.705053] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:303
> dev_watchdog+0x23e/0x250()
> [   16.705053] NETDEV WATCHDOG: eth1 (i40e): transmit queue 1 timed out
> [   16.705053] Modules linked in: cirrus ttm drm_kms_helper i40e drm
> ppdev serio_raw i2c_piix4 virtio_net parport_pc ptp virtio_balloon
> crct10dif_pclmul pps_core parport pvpanic crc32_pclmul
> ghash_clmulni_intel virtio_blk crc32c_intel virtio_pci virtio_ring
> virtio ata_generic pata_acpi
> [   16.705053] CPU: 1 PID: 0 Comm: swapper/1 Not tainted
> 3.18.7-200.fc21.x86_64 #1
> [   16.705053] Hardware name: Fedora Project OpenStack Nova, BIOS
> 1.7.5-20140709_153950- 04/01/2014
> [   16.705053]  0000000000000000 2e5932b294d0c473 ffff88043fc83d48
> ffffffff8175e686
> [   16.705053]  0000000000000000 ffff88043fc83da0 ffff88043fc83d88
> ffffffff810991d1
> [   16.705053]  ffff88042958f5c0 0000000000000001 ffff88042865f000
> 0000000000000001
> [   16.705053] Call Trace:
> [   16.705053]  <IRQ>  [<ffffffff8175e686>] dump_stack+0x46/0x58
> [   16.705053]  [<ffffffff810991d1>] warn_slowpath_common+0x81/0xa0
> [   16.705053]  [<ffffffff81099245>] warn_slowpath_fmt+0x55/0x70
> [   16.705053]  [<ffffffff8166e62e>] dev_watchdog+0x23e/0x250
> [   16.705053]  [<ffffffff8166e3f0>] ? dev_graft_qdisc+0x80/0x80
> [   16.705053]  [<ffffffff810fd52a>] call_timer_fn+0x3a/0x120
> [   16.705053]  [<ffffffff8166e3f0>] ? dev_graft_qdisc+0x80/0x80
> [   16.705053]  [<ffffffff810ff692>] run_timer_softirq+0x212/0x2f0
> [   16.705053]  [<ffffffff8109d7a4>] __do_softirq+0x124/0x2d0
> [   16.705053]  [<ffffffff8109db75>] irq_exit+0x125/0x130
> [   16.705053]  [<ffffffff817681d8>] smp_apic_timer_interrupt+0x48/0x60
> [   16.705053]  [<ffffffff817662bd>] apic_timer_interrupt+0x6d/0x80
> [   16.705053]  <EOI>  [<ffffffff811005c8>] ? hrtimer_start+0x18/0x20
> [   16.705053]  [<ffffffff8105ca96>] ? native_safe_halt+0x6/0x10
> [   16.705053]  [<ffffffff810f81d3>] ? rcu_eqs_enter+0xa3/0xb0
> [   16.705053]  [<ffffffff8101ec7f>] default_idle+0x1f/0xc0
> [   16.705053]  [<ffffffff8101f64f>] arch_cpu_idle+0xf/0x20
> [   16.705053]  [<ffffffff810dad35>] cpu_startup_entry+0x3c5/0x410
> [   16.705053]  [<ffffffff8104a2af>] start_secondary+0x1af/0x1f0
> [   16.705053] ---[ end trace 7bda53aeda558267 ]---
> [   16.705053] i40e 0000:00:05.0 eth1: tx_timeout recovery level 1
> [   16.705053] i40e 0000:00:05.0: i40e_vsi_control_tx: VSI seid 519 Tx
> ring 0 disable timeout
> [   16.744198] i40e 0000:00:05.0: i40e_vsi_control_tx: VSI seid 520 Tx
> ring 64 disable timeout
> [   16.779322] i40e 0000:00:05.0: i40e_ptp_init: added PHC on eth1
> [   16.791819] i40e 0000:00:05.0: PF 40 attempted to control timestamp
> mode on port 1, which is owned by PF 1
> [   16.933869] i40e 0000:00:05.0 eth1: NIC Link is Up 40 Gbps Full
> Duplex, Flow Control: None
> [   18.853624] SELinux: initialized (dev tmpfs, type tmpfs), uses
> transition SIDs
> [   22.720083] i40e 0000:00:05.0 eth1: tx_timeout recovery level 2
> [   22.826993] i40e 0000:00:05.0: i40e_vsi_control_tx: VSI seid 519 Tx
> ring 0 disable timeout
> [   22.935288] i40e 0000:00:05.0: i40e_vsi_control_tx: VSI seid 520 Tx
> ring 64 disable timeout
> [   23.669555] i40e 0000:00:05.0: i40e_ptp_init: added PHC on eth1
> [   23.682067] i40e 0000:00:05.0: PF 40 attempted to control timestamp
> mode on port 1, which is owned by PF 1
> [   23.722423] i40e 0000:00:05.0 eth1: NIC Link is Up 40 Gbps Full
> Duplex, Flow Control: None
> [   23.800206] i40e 0000:00:06.0: i40e_ptp_init: added PHC on eth2
> [   23.813804] i40e 0000:00:06.0: PF 48 attempted to control timestamp
> mode on port 0, which is owned by PF 0
> [   23.855275] i40e 0000:00:06.0 eth2: NIC Link is Up 40 Gbps Full
> Duplex, Flow Control: None
> [   38.720091] i40e 0000:00:05.0 eth1: tx_timeout recovery level 3
> [   38.725844] random: nonblocking pool is initialized
> [   38.729874] i40e 0000:00:06.0: HMC error interrupt
> [   38.733425] i40e 0000:00:06.0: i40e_vsi_control_tx: VSI seid 518 Tx
> ring 0 disable timeout
> [   38.738886] i40e 0000:00:06.0: i40e_vsi_control_tx: VSI seid 521 Tx
> ring 64 disable timeout
> [   39.689569] i40e 0000:00:06.0: i40e_ptp_init: added PHC on eth2
> [   39.704197] i40e 0000:00:06.0: PF 48 attempted to control timestamp
> mode on port 0, which is owned by PF 0
> [   39.746879] i40e 0000:00:06.0 eth2: NIC Link is Down
> [   39.838356] i40e 0000:00:05.0: i40e_ptp_init: added PHC on eth1
> [   39.851788] i40e 0000:00:05.0: PF 40 attempted to control timestamp
> mode on port 1, which is owned by PF 1
> [   39.892822] i40e 0000:00:05.0 eth1: NIC Link is Down
> [   43.011610] i40e 0000:00:06.0 eth2: NIC Link is Up 40 Gbps Full
> Duplex, Flow Control: None
> [   43.059976] i40e 0000:00:05.0 eth1: NIC Link is Up 40 Gbps Full
> Duplex, Flow Control: None
> 
> 
> Would appreciate any information on how to debug this issue further
> and if the "unhandled rdmsr" logs from KVM indicate some issues with
> the device passthrough.
> 
> Thanks
> Jacob

I have no insight into the "unhandled rdmsr" messages.

As for the driver, we can't do much debugging without version information for
the driver and the NIC - the best way to get this is with "ethtool -i".  If
this is the same setup as in your previous thread on another forum, then I
believe you're using a NIC with the e800013fd firmware from late last summer,
and that you saw these issues with both the 1.2.9-k and the 1.2.37 versions of
the driver.  I suggest that the next step be to update the NIC firmware, as
there are performance and stability updates available that deal with similar
issues.  Please see the Intel Networking support webpage at
https://downloadcenter.intel.com/download/24769 and look for the
NVMUpdatePackage.zip.
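
To make the version information easy to paste into a reply, the "ethtool -i"
output could be collected with something like the sketch below.  It assumes
ethtool is installed in the guest and that its output is the usual set of
"key: value" lines (driver, version, firmware-version, bus-info, ...).  The
function names are hypothetical and the interface name is only an example
taken from the dmesg output above.

```python
import subprocess


def parse_ethtool_info(text):
    """Turn `ethtool -i` style "key: value" lines into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as the PCI
        # address in bus-info (0000:00:05.0) stay intact.
        key, sep, value = line.partition(":")
        if sep:  # ignore any line that has no colon
            info[key.strip()] = value.strip()
    return info


def read_driver_info(iface):
    """Run `ethtool -i <iface>` and return the parsed fields."""
    result = subprocess.run(
        ["ethtool", "-i", iface],
        capture_output=True, text=True, check=True,
    )
    return parse_ethtool_info(result.stdout)


# Example use inside the VM (interface name is an assumption):
#   info = read_driver_info("eth1")
#   print(info.get("version"), info.get("firmware-version"))
```

The "version" and "firmware-version" fields are the two values most useful to
include when reporting a problem like this one.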

sln





