Hey Victor,

I’m afraid I’m far from an ESX expert. :(  I would suggest contacting VMware; 
they will talk with our ESX driver team if they believe the issue is with the 
driver.  It looks to me like the hypervisor is disagreeing with the driver 
about there being packets to transmit.  In most cases the driver knows better, 
as it has access to the state of the HW.  In the Linux driver, which shares a 
code base with the ESX driver, we see this issue under extreme flow control.  
Most of the time this is caused by a faulty switch, but it comes down to the 
OS thinking that we should be transmitting while the driver cannot, because it 
is constantly stopped by pause frames.  The network stack isn’t aware of that.

You could check for that particular case by looking at the ethtool statistics.  
I’m not saying this is the cause of your error; it is just an example of an 
issue we have seen in Linux.  That is why I think talking to VMware would be 
the way to go.
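
As a rough illustration of that check, here is a minimal Python sketch that 
filters the text printed by `ethtool -S <interface>` for flow-control counters.  
The counter names in the sample (for example rx_flow_control_xoff) and all of 
the sample values are assumptions made up for illustration; the exact names 
depend on the driver and its version.

```python
def flow_control_counters(ethtool_stats_text):
    """Return {name: value} for counters whose name mentions flow control.

    The input is the text printed by `ethtool -S <interface>`, which lists
    one "name: value" pair per line after a header line.
    """
    counters = {}
    for line in ethtool_stats_text.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        # Skip the header and any line that is not "name: <integer>".
        if not sep or not value.isdigit():
            continue
        if "flow_control" in name:
            counters[name] = int(value)
    return counters


# Hypothetical sample output; names and values are illustrative only.
SAMPLE = """\
NIC statistics:
     rx_packets: 1200
     tx_packets: 0
     tx_flow_control_xon: 0
     rx_flow_control_xon: 3
     tx_flow_control_xoff: 0
     rx_flow_control_xoff: 98765
"""

if __name__ == "__main__":
    for name, value in flow_control_counters(SAMPLE).items():
        print(f"{name}: {value}")
```

Taking two snapshots a few seconds apart and comparing them is usually more 
telling than a single reading: a received-XOFF counter that keeps climbing 
would suggest the link partner is sending pause frames.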

Thanks,
-Don Skidmore <donald.c.skidm...@intel.com>

From: Victor Detoni [mailto:victordet...@gmail.com]
Sent: Monday, October 03, 2016 6:29 PM
To: Skidmore, Donald C <donald.c.skidm...@intel.com>
Cc: e1000-devel@lists.sourceforge.net
Subject: Re: [E1000-devel] Ubuntu ixgbe - Fake Tx hang detected with timeout

Hello Don,

I tried Ubuntu 16.04.1 on bare metal and it worked with no errors. I ran some 
performance tests and the results were OK.

Do you have any tips for using it with ESX?

On Tue, Sep 27, 2016 at 2:27 PM, Skidmore, Donald C 
<donald.c.skidm...@intel.com> wrote:
Hey Victor,

Could you try it with ESX out of the picture, with the ixgbe driver and Ubuntu 
on bare metal?  This should help isolate whether it is an ESX issue or a Linux 
one.

From looking at the log, the OS thinks the driver is hung but the driver 
disagrees.  The other thing you could check is whether you are seeing excessive 
pause frames.  But ESX is really the wild card here; I would like to remove it 
from the environment and see if you can recreate the problem.

Thanks,
-Don

From: Victor Detoni [mailto:victordet...@gmail.com]
Sent: Tuesday, September 27, 2016 10:16 AM
To: Skidmore, Donald C <donald.c.skidm...@intel.com>
Cc: e1000-devel@lists.sourceforge.net
Subject: Re: [E1000-devel] Ubuntu ixgbe - Fake Tx hang detected with timeout

Don,

I'm using ESX with an Ubuntu VM and a PCI device with passthrough enabled. In 
other words, I'm accessing the ixgbe driver/NIC directly, not the vmxnet3 or 
e1000 drivers from ESX (not a paravirtualized NIC).

On Tue, Sep 27, 2016 at 1:32 PM, Skidmore, Donald C 
<donald.c.skidm...@intel.com> wrote:
Hey Victor,

Just to be clear, you're talking about the ESX driver, correct?  Excuse my 
ignorance, but I work primarily with the Linux drivers so I wanted to be sure.  
If it is ESX I can route your question to that team, as I don't believe they 
regularly monitor this forum.

Thanks,
-Don Skidmore <donald.c.skidm...@intel.com>


> -----Original Message-----
> From: Victor Detoni [mailto:victordet...@gmail.com]
> Sent: Monday, September 26, 2016 6:19 PM
> To: e1000-devel@lists.sourceforge.net
> Subject: [E1000-devel] Ubuntu ixgbe - Fake Tx hang detected with timeout
>
> Hello,
>
> I'm trying to use the ixgbe driver for an Intel X520-DA2 NIC (82599EB) on
> Ubuntu 16.04.1 LTS with kernel 4.4.0-38-generic, but my interface can't
> reply to an ARP packet. It only sends packets.
>
> I've already updated the ixgbe driver to 4.3.15 and VMware ESXi to the latest
> update (6.0.0 update 2), but the error messages are the same:
>
> ethtool -i ens192f0
> driver: ixgbe
> version: 4.3.15
> firmware-version: 0x18b30001
> expansion-rom-version:
> bus-info: 0000:0b:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: no
>
> dmesg msg:
>
> [ 2667.286143] ixgbe 0000:0b:00.0: registered PHC device on ens192f0
> [ 2667.388529] IPv6: ADDRCONF(NETDEV_UP): ens192f0: link is not ready
> [ 2667.452260] ixgbe 0000:0b:00.0 ens192f0: detected SFP+: 3
> [ 2667.691979] ixgbe 0000:0b:00.0 ens192f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
> [ 2667.692268] IPv6: ADDRCONF(NETDEV_CHANGE): ens192f0: link becomes ready
> [ 2672.847889] ------------[ cut here ]------------
> [ 2672.847927] WARNING: CPU: 2 PID: 0 at /build/linux-R0TiM8/linux-4.4.0/net/sched/sch_generic.c:306 dev_watchdog+0x237/0x240()
> [ 2672.847930] NETDEV WATCHDOG: ens192f0 (ixgbe): transmit queue 2 timed out
> [ 2672.847932] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables x_tables vmw_vsock_vmci_transport vsock binfmt_misc coretemp ppdev vmw_balloon joydev input_leds serio_raw vmw_vmci parport_pc parport shpchp 8250_fintek i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear vmwgfx ttm drm_kms_helper syscopyarea sysfillrect crct10dif_pclmul sysimgblt fb_sys_fops ixgbe(OE) dca crc32_pclmul vxlan ip6_udp_tunnel udp_tunnel drm ptp aesni_intel aes_x86_64 lrw gf128mul mptspi glue_helper ablk_helper mptscsih cryptd psmouse mptbase pps_core vmxnet3
> [ 2672.848047]  scsi_transport_spi pata_acpi fjes
> [ 2672.848053] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G           OE 4.4.0-38-generic #57-Ubuntu
> [ 2672.848055] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015
> [ 2672.848056]  0000000000000286 1bdc84bfd903c2b5 ffff88023fc83d98 ffffffff813f1b73
> [ 2672.848059]  ffff88023fc83de0 ffffffff81d675e0 ffff88023fc83dd0 ffffffff810811c2
> [ 2672.848066]  0000000000000002 ffff8802309a6980 0000000000000002 ffff880230ac0000
> [ 2672.848068] Call Trace:
> [ 2672.848070]  <IRQ>  [<ffffffff813f1b73>] dump_stack+0x63/0x90
> [ 2672.848110]  [<ffffffff810811c2>] warn_slowpath_common+0x82/0xc0
> [ 2672.848113]  [<ffffffff8108125c>] warn_slowpath_fmt+0x5c/0x80
> [ 2672.848116]  [<ffffffff8174c037>] dev_watchdog+0x237/0x240
> [ 2672.848119]  [<ffffffff8174be00>] ? qdisc_rcu_free+0x40/0x40
> [ 2672.848131]  [<ffffffff810ec6f5>] call_timer_fn+0x35/0x120
> [ 2672.848134]  [<ffffffff8174be00>] ? qdisc_rcu_free+0x40/0x40
> [ 2672.848136]  [<ffffffff810ed0aa>] run_timer_softirq+0x23a/0x2f0
> [ 2672.848139]  [<ffffffff81085c21>] __do_softirq+0x101/0x290
> [ 2672.848142]  [<ffffffff81085f23>] irq_exit+0xa3/0xb0
> [ 2672.848158]  [<ffffffff818331a2>] smp_apic_timer_interrupt+0x42/0x50
> [ 2672.848161]  [<ffffffff81831462>] apic_timer_interrupt+0x82/0x90
> [ 2672.848162]  <EOI>  [<ffffffff810645d6>] ? native_safe_halt+0x6/0x10
> [ 2672.848182]  [<ffffffff81038d7e>] default_idle+0x1e/0xe0
> [ 2672.848184]  [<ffffffff8103958f>] arch_cpu_idle+0xf/0x20
> [ 2672.848192]  [<ffffffff810c407a>] default_idle_call+0x2a/0x40
> [ 2672.848195]  [<ffffffff810c43e1>] cpu_startup_entry+0x2f1/0x350
> [ 2672.848202]  [<ffffffff810516e4>] start_secondary+0x154/0x190
> [ 2672.848209] ---[ end trace 045ad90bda4eb823 ]---
> [ 2672.848223] ixgbe 0000:0b:00.0 ens192f0: Fake Tx hang detected with timeout of 5 seconds
> [ 2682.863866] ixgbe 0000:0b:00.0 ens192f0: Fake Tx hang detected with timeout of 10 seconds
> [ 2702.863890] ixgbe 0000:0b:00.0 ens192f0: Fake Tx hang detected with timeout of 20 seconds
> [ 2742.927868] ixgbe 0000:0b:00.0 ens192f0: Fake Tx hang detected with timeout of 40 seconds
> [ 2822.927879] ixgbe 0000:0b:00.0 ens192f0: Fake Tx hang detected with timeout of 80 seconds
> [ 2903.055898] ixgbe 0000:0b:00.0 ens192f0: Fake Tx hang detected with timeout of 80 seconds
> [ 2982.927867] ixgbe 0000:0b:00.0 ens192f0: Fake Tx hang detected with timeout of 80 seconds
> [ 3063.055905] ixgbe 0000:0b:00.0 ens192f0: Fake Tx hang detected with timeout of 80 seconds
> [ 3142.927891] ixgbe 0000:0b:00.0 ens192f0: Fake Tx hang detected with timeout of 80 seconds
>
> Does anyone know what's happening? Any tips?


_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit 
http://communities.intel.com/community/wired
