Re: [e1000-devel] No versions for igb driver available on "modinfo igb" for SLE15-SP4
On 1/11/2024 4:21 AM, kumar.mo...@swisscom.com wrote:
> Hi,
>
> Unlike the previous releases for the drivers, we don’t see the column for
> version anymore for the Intel(R) Gigabit Ethernet Network Driver in SUSE
> Linux (SLE15-SP4) when doing modinfo igb.
>
> Is this something expected and if yes, is there any other way to get the igb
> driver versions? For older SUSE installations we have, we see 5.6.0-k or
> something like that for igb driver versions.

Hi Mohit,

The Intel out-of-tree (OOT) drivers (like the one you download from SourceForge) carry a version number, but upstream the kernel community removed driver version numbers, so the version of an in-kernel driver is effectively the version of the kernel it was released with. If there is some reason you think you need a driver version, please let us know.

The kernel community removed the driver versions from upstream (and therefore from consumers of upstream, like the SLES distro you mention) because the version numbers were misleading, wrong, or not kept up to date. Basically, comparing an in-kernel driver to an OOT driver by version number is not a good idea: the drivers are not the same, they're two different products released at different times, with differing functionality.

If you need the specific upstream commit that igb was updated to in the SLE15 SP4 release, please contact SUSE.

Hope this helps!
Jesse

___
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel Ethernet, visit https://community.intel.com/t5/Ethernet-Products/bd-p/ethernet-products
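Since in-kernel drivers no longer carry their own version string, the kernel release is the closest equivalent. The following is only a minimal shell sketch of that fallback, not an official tool: the function name `driver_version_label` is hypothetical, and it assumes `modinfo -F version` prints nothing when the module has no version field.

```shell
# Hypothetical helper: prefer the module's own version string (OOT drivers),
# otherwise report the kernel release the in-kernel driver shipped with.
driver_version_label() {
    mod_version=$1      # e.g. output of: modinfo -F version igb
    kernel_release=$2   # e.g. output of: uname -r
    if [ -n "$mod_version" ]; then
        printf '%s\n' "$mod_version"
    else
        printf 'in-kernel (kernel %s)\n' "$kernel_release"
    fi
}

# Usage on a live system (not run here):
#   driver_version_label "$(modinfo -F version igb 2>/dev/null)" "$(uname -r)"
```

Keeping the two inputs as arguments makes the logic easy to check without the module being loaded.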
Re: [e1000-devel] Intel E810 100Gb goes down sporadically
On 12/3/2023 1:26 AM, Assaf Albo via E1000-devel wrote:
> Hello guys,
>
> We are having constant network issues in production in that the link goes
> down, waits *exactly* 7-8 seconds, and goes up again.
> This can happen zero to a few times a day on all our servers; they are not
> in the same location and are connected to different network devices.
>
> Each server runs as a KVM virtual machine with 60 CPUs (Pinning) and 224Gi
> (Huge pages) - overall performance is excellent.
> The NIC is PCI passed through to the KVM machine AS IS.
> OS Rocky Linux 8.5, kernel 4.18.0-348.23.1.el8_5.x86_64 with Intel ice
> 1.9.11 built and installed using rpm.
> We have a traffic generator between two servers (our app: client+server)
> that is reaching 94Gb and can replicate this issue.
>
> The dmesg once the issue occur:
> Nov 28 16:01:27 SERVER kernel: ice :00:06.0 eth0: NIC Link is Down
> Nov 28 16:01:35 SERVER kernel: ice :00:06.0 eth0: NIC Link is up 100
> Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: RS-FEC, Autoneg
> Advertised: Off, Autoneg Negotiated: False, Flow Control: None

Hi Assaf, sorry to hear you're having problems.

With respect to the link down events, we need to determine whether it is a local or a remote down. Please gather the 'ethtool -S eth0' statistics from a system that has had some problems and send them to the list as text, along with 'ethtool -m eth0'.

The passthrough device shouldn't be any problem, but if you're passing the device through to a VM I do recommend that you try to match the destination PCIe function number to the originating one, to prevent odd issues. For example, if your host device is 01:00.1, then (I'm not sure you can do this) I'd hope the VM device is 00:06.1, and not 00:06.0.

So with that in mind I'd ask: do you ever see the link down issues on systems where 3b:00.0 (ice PF PCIe in host) is mapped to 00:06.0 (ice PF in VM)?

Please include output from 'devlink dev info', and if you know it, what switch you're connected to.
Also, do you see any stats or events on the switch side when link is lost?

- Jesse
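To quantify the outages across many servers, the down time can be computed from the log lines themselves. This is a rough sketch only: the helper name `link_down_durations` is hypothetical, and it assumes syslog-style lines like the ones quoted in this thread, where the third field is HH:MM:SS and a down/up pair does not cross midnight.

```shell
# Hypothetical helper: read syslog-style lines on stdin and print how long
# each "NIC Link is Down" event lasted, in seconds.
link_down_durations() {
    awk '
        # Convert HH:MM:SS to seconds since midnight.
        function secs(t,   p) { split(t, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }
        /NIC Link is Down/ { down = secs($3); have_down = 1; next }
        /NIC Link is [Uu]p/ && have_down { print secs($3) - down; have_down = 0 }
    '
}

# Usage on a live system (not run here):
#   journalctl -k | link_down_durations
```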
Re: [e1000-devel] igb driver compilation error on Ubuntu
On 10/30/2023 3:27 AM, adelio ALVES wrote:

Thanks for your report! Something happened to the content of your message when I released it to the mailing list.

Please use the driver included in your kernel (igb.ko.xz or the like) and let us know if you have any problems. Was there a reason you wanted to run the out-of-tree igb-5.7.2 driver? Kernel version 5.15.0-97-generic should already have a working igb driver.

Thanks,
Jesse
Re: [e1000-devel] Establish uni-directional ethernet link
On 7/28/2023 6:26 AM, Alireza Sadeghpour wrote:
> Hi, I am trying to establish a uni-directional Ethernet link where a
> singular fiber is used to transmit data to the receiver where both sides
> use ixgbe as driver. The Rx of the transmit side and the Tx of the receive
> side are not physically connected, like in a Data diode scenario. The
> problem is, as soon as I detach the tx line from one side, both side link
> status goes DOWN. is it possible to mask link status in the ixgbe driver to
> force it to be UP state in both side?

Yes, there is a force-link-up bit, called AUTOC.FLU. You may have to set some other fields in AUTOC to force link speed, etc.

I'm pretty sure this will work as I've done it in the past, but your mileage may vary and this is way outside normal for the Linux driver, so I can't help you much beyond this email.

If you still need help after trying the above, I recommend you contact Intel Support.

Jesse
Re: [e1000-devel] Issue with Intel Corporation 82546EB dual port card on Ubuntu 22.04
On 5/11/2023 9:54 PM, Igor Cicimov wrote:
> Hi,
>
> I have a problem with my 8086:1010 Intel Corporation 82546EB Gigabit
> Ethernet Controller (Copper) dual port ethernet card and Ubuntu 22.04.2 LTS
> using e1000 driver:

This card is from 2003! :-) Nice that it's still running!

Did you file a bug with Canonical against Ubuntu or ask for help over there yet?

> that I have configured in LACP bond0:
>
> # cat /proc/net/bonding/bond0
> Ethernet Channel Bonding Driver: v5.15.0-69-generic
>
> Bonding Mode: IEEE 802.3ad Dynamic link aggregation
> Transmit Hash Policy: layer2+3 (2)
> MII Status: down
> MII Polling Interval (ms): 100
> Up Delay (ms): 100
> Down Delay (ms): 100
> Peer Notification Delay (ms): 0
>
> 802.3ad info
> LACP active: on
> LACP rate: fast
> Min links: 0
> Aggregator selection policy (ad_select): stable
> System priority: 65535
> System MAC address: MAC_BOND0
> bond bond0 has no active aggregator

Did you try bonding without MII link monitoring? I'm wondering if you're getting caught up in the ethtool transition to netlink for some reason.
>
> Slave Interface: eth1
> MII Status: down
> Speed: 1000 Mbps
> Duplex: full
> Link Failure Count: 0
> Permanent HW addr: MAC_ETH1
> Slave queue ID: 0
> Aggregator ID: 1
> Actor Churn State: churned
> Partner Churn State: churned
> Actor Churned Count: 1
> Partner Churned Count: 1
> details actor lacp pdu:
> system priority: 65535
> system mac address: MAC_BOND0
> port key: 0
> port priority: 255
> port number: 1
> port state: 71
> details partner lacp pdu:
> system priority: 65535
> system mac address: 00:00:00:00:00:00
> oper key: 1
> port priority: 255
> port number: 1
> port state: 1
>
> Slave Interface: eth2
> MII Status: down
> Speed: 1000 Mbps
> Duplex: full
> Link Failure Count: 0
> Permanent HW addr: MAC_ETH2
> Slave queue ID: 0
> Aggregator ID: 2
> Actor Churn State: churned
> Partner Churn State: churned
> Actor Churned Count: 1
> Partner Churned Count: 1
> details actor lacp pdu:
> system priority: 65535
> system mac address: MAC_BOND0
> port key: 0
> port priority: 255
> port number: 2
> port state: 71
> details partner lacp pdu:
> system priority: 65535
> system mac address: 00:00:00:00:00:00
> oper key: 1
> port priority: 255
> port number: 1
> port state: 1
>
> that is in state down of course since both interfaces have MII Status:
> down. The dmesg shows:
>
> # dmesg | grep -E "bond0|eth[1|2]"
> [ 42.999281] e1000 :01:0a.0 eth1: (PCI:33MHz:32-bit) MAC_ETH1
> [ 42.999292] e1000 :01:0a.0 eth1: Intel(R) PRO/1000 Network Connection
> [ 43.323358] e1000 :01:0a.1 eth2: (PCI:33MHz:32-bit) MAC_ETH2
> [ 43.323366] e1000 :01:0a.1 eth2: Intel(R) PRO/1000 Network Connection
> [ 65.617020] bonding: bond0 is being created...
> [ 65.787883] 8021q: adding VLAN 0 to HW filter on device eth1
> [ 67.790638] 8021q: adding VLAN 0 to HW filter on device eth2
> [ 70.094511] 8021q: adding VLAN 0 to HW filter on device bond0
> [ 70.558364] 8021q: adding VLAN 0 to HW filter on device eth1
> [ 70.558675] bond0: (slave eth1): Enslaving as a backup interface with a
> down link
> [ 70.560050] 8021q: adding VLAN 0 to HW filter on device eth2
> [ 70.560354] bond0: (slave eth2): Enslaving as a backup interface with a
> down link
>
> So both eth1 and eth2 are UP and recognised, ethtool says "Link detected:
> yes" but their links are DOWN. I have a confusing port type of FIBRE
> reported by ethtool (capabilities reported by lshw are capabilities: pm
> pcix msi cap_list rom ethernet physical fibre 1000bt-fd autonegotiation).
> It is weird and I suspect some hardware or firmware issue. Any ideas are
> welcome.

You didn't post your bonding options or your bonding config file. Did you try the use_carrier=1 option? It's the default, but you're not setting it to zero, are you?

>
> P.S: It is not the switch or the switch ports and it is not the cables
> already tested that. The same setup, switch+cables+card was working fine up
> to Ubuntu 18.04

The "Supported ports: [ FIBRE ]" thing is strange, but it really shouldn't matter.
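For readers wanting to answer the bonding-options question quickly, the active settings can be read back from sysfs. A minimal sketch, assuming the bonding driver exposes `mode`, `miimon` and `use_carrier` under `/sys/class/net/<bond>/bonding/` as it does on kernels of this era; the function name `bond_link_options` and its second argument (an alternative sysfs root, useful for testing) are hypothetical.

```shell
# Hypothetical helper: print selected bonding options from sysfs.
# $1 = bond name, $2 = sysfs net root (defaults to /sys/class/net).
bond_link_options() {
    bond=$1
    root=${2:-/sys/class/net}
    for opt in mode miimon use_carrier; do
        file="$root/$bond/bonding/$opt"
        # Skip options this kernel does not expose.
        [ -r "$file" ] && printf '%s=%s\n' "$opt" "$(cat "$file")"
    done
    return 0
}

# Usage on a live system (not run here):
#   bond_link_options bond0
```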
Re: [e1000-devel] error in e1000e driver under ubuntu20.04 HWE with kernel 5.15.0-67
On 3/27/2023 12:32 AM, ST Cai wrote:
> OS: Ubuntu Server 20.04 HWE
> Kernel: 5.15.0-67-generic
> Adapter: Intel(R) I219-V; Vendor: 0x8086; Product: 0x15be
> Drivers: e1000e-3.8.4/e1000e-3.8.4
>
> make install Error:
>
> ethtool.c:2838:19: error: initialization of ‘int (*)(struct net_device *,
> struct ethtool_coalesce *, struct kernel_ethtool_coalesce *, struct
> netlink_ext_ack *)’ from incompatible pointer type ‘int (*)(struct
> net_device *, struct ethtool_coalesce *)’
> [-Werror=incompatible-pointer-types]
>
> ethtool.c:2838:19: note: (near initialization for
> ‘e1000_ethtool_ops.get_coalesce’)
> /home/egw/e1000e-3.8.7/src/ethtool.c:2839:19: error: initialization of ‘int
> (*)(struct net_device *, struct ethtool_coalesce *, struct
> kernel_ethtool_coalesce *, struct netlink_ext_ack *)’ from incompatible
> pointer type ‘int (*)(struct net_device *, struct ethtool_coalesce *)’
> [-Werror=incompatible-pointer-types]
>
> How to solve?

The e1000e driver included with the kernel you already have should work fine. Please try it and let us know.

The e1000e driver from this site is not being actively maintained and was last released in 2020.
Re: [e1000-devel] missing symbols ice and i40e for irdma modules
On 3/10/2023 12:25 AM, Dmitry Kravkov wrote:
> after loading ice 1.11.14 module which compiled for 6.1.8
> irdma modules is not able to load due to missing symbols:
> [1000969.082365] irdma: Unknown symbol ice_del_rdma_qset (err -2)
> [1000969.082599] irdma: Unknown symbol ice_add_rdma_qset (err -2)
> [1000969.082738] irdma: Unknown symbol ice_rdma_update_vsi_filter (err -2)
> [1000969.082856] irdma: Unknown symbol ice_rdma_request_reset (err -2)
> [1000969.082869] irdma: Unknown symbol ice_get_qos_params (err -2)
>
> similar happens for i40e 2.22.18

We recommend you only ever run matched sets of drivers. Does installing the OOT (out-of-tree) irdma driver work?

https://www.intel.com/content/www/us/en/download/19632/linux-rdma-driver-for-the-e810-and-x722-intel-ethernet-controllers.html
Re: [e1000-devel] [ice] Intel E810-C one queue after reboot
On 1/21/2023 6:17 AM, Highload Admin wrote:
> Hello!

FYI, general spam filter guidance is to ignore mails from admin@ addresses, so our list rejects your subscription. I'd change it but we get a lot of mails from bogus admin@ accounts.

> I have a problem with ice driver (I tried 1.10.1.2 and 1.10.1.2.2 versions).
> After reboot I see only one queue:
>
> # ethtool -l enp129s0
> Channel parameters for enp129s0:
> Pre-set maximums:
> RX: 1
> TX: 1
> Other: 0
> Combined: 1
> Current hardware settings:
> RX: 0
> TX: 0
> Other: 0
> Combined: 1

The above looks like "safe mode", which is a backup mode for the driver when something goes wrong during driver load. There should be messages in dmesg.

> After reloading driver ice - all fine, 128 queues
>
> # ifdown enp129s0; rmmod ice; modprobe ice; ifup enp129s0;
> # ethtool -l enp129s0

This is likely because you didn't do "make install" when originally building/installing the drivers, or it didn't manage to modify your initramfs (which also contains drivers). The result is that one file (an older driver) is loaded at boot, while after boot, when you use rmmod/modprobe, the driver is loaded from your filesystem.

> Channel parameters for enp129s0:
> Pre-set maximums:
> RX: 128
> TX: 128
> Other: 1
> Combined: 128
> Current hardware settings:
> RX: 0
> TX: 0
> Other: 1
> Combined: 128

This is good news: it means that if you get the driver installed correctly into your initramfs/initrd, things will be fine.

> Hardware:
> Supermicro H12DSU-iN motherboard
> AMD EPYC 7742 64-Core Processor
>
> Software:
> OS Debian 10.13
> Linux 4.19.0-23-amd64
>
> # ethtool -i enp129s0
> driver: ice
> version: 1.10.1.2
> firmware-version: 3.20 0x8000d855 1.3146.0
> expansion-rom-version:
> bus-info: :81:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
>
> Howto resolve my problem?

Please try 'sudo make install' from the ice-1.10.1.2/src directory.
You can check that your problem is as I stated by using 'sudo lsinitrd | grep /ice.ko'.

If you continue to have issues, please get back to us.
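One caveat for anyone repeating this check: lsinitrd ships with dracut, while Debian's initramfs-tools provides lsinitramfs instead, so the tool name depends on the distribution. A minimal sketch that picks whichever is installed; the helper name `first_available` is hypothetical, and the initrd path in the usage comment is the Debian default rather than something stated in this thread.

```shell
# Hypothetical helper: print the first command from the argument list that
# exists on this system, or return non-zero if none does.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Usage on a live system (not run here):
#   lister=$(first_available lsinitrd lsinitramfs) &&
#       sudo "$lister" "/boot/initrd.img-$(uname -r)" | grep '/ice\.ko'
```

If the module listed there is older than the one under /lib/modules, regenerating the image (for example with update-initramfs -u on Debian, or dracut -f on dracut-based systems) is the usual next step.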
Re: [e1000-devel] i40e: Intel XL710: Linux Debian 11: rx/tx-vlan-offload seems to be broken since 2.18.9
On 12/30/2022 11:15 AM, Andrey Kulikov wrote:
> Hello,
>
> I've got an Intel Fortville XL710-based Ethernet controller with 4 x 10GbE
> SFP+ ports.
> Platform is based on Intel Xeon CPU E5-2697 v4
> Platform running Debian 11.6, kernel 5.10.0
>
> # uname -a
> Linux 5.10.0-20-amd64 #1 SMP Debian 5.10.158-2 (2022-12-13) x86_64 GNU/Linux
>
> Current i40e driver: 2.22.8 (out-of tree, self-built from sources).

Are you loading the 8021q.ko module?

> Setup:
> Two identical platforms, with absolutely identical hardware and software.
> Connected directly with LC-LC SR patchcord using Intel 10G SFP+ transceivers
> (FTLX8571D3BCV-IT, if it makes any difference).
>
> Issue:
> When I configure VLAN on HW interface - it doesn't work.
> When pinging via VLAN the other side just does not see anything (tcpdump
> shows nothing).
> At the same time, if I ping on HW interfaces directly - it does work
> perfectly well.
> But it was working with i40e driver 2.18.9 half of a year ago, with Debian
> 11.4(? here I could be wrong) kernel.
>
> Relevant fragment from my /etc/network/interfaces on one side:
>
> auto enp132s0f0
> iface enp132s0f0 inet static
>     address 192.168.33.2
>     netmask 255.255.255.0
>     mtu 1500
>
> auto enp132s0f0.545
> iface enp132s0f0.545 inet static
>     address 192.168.44.2
>     netmask 255.255.255.0
>     mtu 1500

Once the network setup is done, what does 'ip link' show?

> The other side looks identical except IP-addresses (they both end with '1').
>
> Workaround: disable tx-vlan-offload and rx-vlan-offload:
> ethtool -K enp132s0f0 tx-vlan-offload off rx-vlan-offload off
>
> Checked with CISCO NEXUS 7000 and NEXUS 9000 as remote counterparts - they
> behave identically to described above.

Based on what you said I doubt it's switches or cables, but something is up with your config.

> Current XL710 firmware is 8.15.
> But I've got adapters with firmware 7.something - there is no difference in
> behavior.
>
> Does it ring a bell?

I don't recall hearing reports of other issues.

> Does it have something to do with the i40e driver?
> Is any further information required?
When pinging, it would be useful to see which 'ethtool -S' counters change, like:

ethtool -S enp132s0f0
ping -c2 192.168.33.1
ethtool -S enp132s0f0

'arp -an' output would be useful as well.
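Comparing two 'ethtool -S' dumps by eye is tedious, so the before/after step can be sketched as a small helper. This is an illustration only: the function name `stats_delta` and the snapshot file names are hypothetical, and it assumes the usual "name: value" layout of 'ethtool -S' output.

```shell
# Hypothetical helper: print counters whose value differs between two saved
# 'ethtool -S' snapshots, as "name: before -> after".
stats_delta() {
    awk -F': *' '
        NR == FNR { before[$1] = $2; next }      # first file: remember values
        ($1 in before) && before[$1] != $2 {     # second file: report changes
            name = $1; sub(/^ +/, "", name)      # drop the leading indentation
            print name ": " before[$1] " -> " $2
        }
    ' "$1" "$2"
}

# Usage on a live system (not run here):
#   ethtool -S enp132s0f0 > before.txt
#   ping -c2 192.168.33.1
#   ethtool -S enp132s0f0 > after.txt
#   stats_delta before.txt after.txt
```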
Re: [E1000-devel] Driver issue 5.4.0-1064 Kernel
On 11/3/2022 9:05 AM, Trey Hughes via E1000-devel wrote:
> Good morning,
>
> I'm having an issue with installing the e1000e driver version 3.8.4 on
> Ubuntu 18.04 with Kernel 5.4.0-1064. When I go to make install per the
> readme instructions, I get an error stating the UTS_UBUNTU_RELEASE_ABI is
> too large. When I look at the code, it seems to check whether the release
> ABI number is greater than 255, and if so it errors out on the install. Is
> there a way around this or is there another driver I should be using for
> this kernel? Any help would be greatly appreciated.
>
> Thank you!
> Trey Hughes

Your kernel should already include a working e1000e driver. If the in-kernel driver is not working then you should follow up with the Ubuntu bug tracker (but this is a pretty old release now).

Also, it would be useful to know what hardware you're running, like output from lspci -nn.
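To see why this kernel trips the check described in the report, it helps to pull the ABI number out of the release string: for 5.4.0-1064 it is 1064, well above 255. The sketch below is illustrative; the helper names are hypothetical, and the 255 limit is taken from the reporter's description of the build error rather than verified here.

```shell
# Hypothetical helper: extract the Ubuntu ABI number from a kernel release
# string such as "5.4.0-1064" or "5.15.0-67-generic" (field after first dash).
ubuntu_abi() {
    rest=${1#*-}                   # drop the "5.4.0-" prefix
    printf '%s\n' "${rest%%-*}"    # drop a "-flavour" suffix, if any
}

# Succeeds when the ABI number is within the limit described in the report.
abi_within_limit() {
    [ "$(ubuntu_abi "$1")" -le 255 ]
}

# Usage on a live system (not run here):
#   abi_within_limit "$(uname -r)" || echo "ABI too large for the OOT build check"
```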
Re: [E1000-devel] Citrix Hypervisor (XenServer) - very poor performance with X710
On 3/14/2022 8:28 AM, Kevin Bowling wrote:
> Fortville (700) has always been a bit of a disaster
> (https://cdrdv2.intel.com/v1/dl/getContent/331430?explicitVersion=true).
> I'd see if you can press your Intel reps into getting you the 550s or the
> 800-series NICs given the unnecessary troubles; it's a much nicer design.
>
> It's surprising they are shipping new cards with that old of a firmware; you
> should be on 8.50 for the driver you are running
> (https://www.intel.com/content/www/us/en/download/18635/non-volatile-memory-nvm-update-utility-for-intel-ethernet-adapters-700-series-linux.html).
> Doing the FW update is worth a shot but most issues I've seen have been
> driver related and you are running a pretty recent driver.
>
> Regards,
> Kevin
>
> On Mon, Mar 14, 2022 at 7:44 AM Matthew Weiner wrote:
> > I'm at my wits end with this, Citrix is stumped, Dell is stumped, and with
> > the supply chain issues the way they are we can't just yank these out in
> > favor of X550s.
> >
> > The problem is we have a group of Dell R740s with X710 dual-port NICs and
> > the performance is, in a word, awful. Like 5-6 megabit upload and 250
> > megabit download awful. However, identical server hardware with any other
> > card, be it a Broadcom or an Intel X550T, no issues. We can get line rate
> > all day long.
> >
> > The latest attempt was swapping the X710 for a newer X710-T2L-t, which
> > performed maybe 5-10 percent better. We've tried three different driver
> > revisions, firmware, BIOS, all the available Hypervisor updates, it still
> > performs the same.
> >
> > The servers in question have X550s on the motherboard mezzanine card which
> > perform fine, and a single dual-port X710 in the PCIe riser. The X710 is
> > set up with an LACP pair trunked with three VLANs tagged across it. In this
> > pool we also have servers with X550s on the PCIe cards, and Broadcoms. All
> > those with an identical configuration perform without issue, it's only the
> > X710s that show this problem.

Hi Matt, sorry to hear about this problem.
Let's poke a bit (please be patient with me) and see if we can help you.

Have you followed steps like those described here?
https://www.thomas-krenn.com/en/wiki/Intel_Ethernet_700_Series_LACP_Configuration

There are definitely known problems with LACP mode and the driver's default settings, so you can try the workaround described there and see if it helps. If it does help, there are ways to have the settings applied by ethtool as the system comes up.

Please let us know how it goes. It would be helpful to know what kernel you're running, just for good measure.

PS. You may want to subscribe to e1000-devel, as it is currently holding your messages because you're not a subscriber, and they have to be manually released.
Re: [E1000-devel] errors on link xl 710
On 11/22/2021 9:43 PM, Jakub Osuch wrote:
> I have errors on link between nexus N3K-C3064PQ-10GX and LREC9902BF-2QSFP+.
> driver: i40e
> version: 2.17.4
> Take a look: shorturl.at/crCNQ

Hi Jakub, please file a bug at https://sourceforge.net/p/e1000/bugs/ which will allow you to attach relevant information. We're a little wary of clicking on random links, you can hopefully appreciate why.

Thanks
Re: [E1000-devel] [Intel-wired-lan] Not able to create VFs on PF passthrough of ethernet interface to VM
On Mon, 13 May 2019 07:36:41 + Periyasamy wrote:
> Hi,
>
> I’m trying to achieve PF passthrough of 40/10G ethernet interface (i40e) into
> guest VM running on qemu/kvm hypervisor and then create VFs on the PF inside
> the VM.
> This is to have a flexibility and better manageability of VFs inside the VM
> (for example, kubernetes worker node) itself and not on the host.
>
>
> The ethernet PCI device is seen inside the VM and bound to i40e driver. But I
> don’t see an option to create VFs. i.e. sriov_numvfs file is not seen under
> /sys/devices/pci:00/:00:02.1/:02:00.0 directory.

Hi Periyasamy,

The PCI space itself is not passed through; it is completely fake and generated by QEMU. Do you know if anyone has ever gotten what you're trying to do to work?

I don't think you can do what you're trying to do, using a VM to spawn SR-IOV devices; at least I've not heard of it working.

Basically you have a scoping problem. At its core, the PCI space is owned by the host, not the VM, and the hardware is literally in the host PCI device space no matter where you pass it to. The hardware actually creates (starts decoding addresses and PCI space for) the new PCI devices when you enable the device via sriov_numvfs. Those devices will appear in space reserved by the host for SR-IOV devices to "appear" in, but there is no guarantee that memory range will be passed through to the VF, and again all the VM PCI devices have "fake" PCI config space, so without some daemon monitoring and adding the devices via virsh or something, I doubt the VM would ever even see them.
> Host versions:
> OS: Ubuntu 16.04.5 LTS, Kernel: 4.15.0-48-generic, libvirt: 4.0.0, qemu:
> 2.11.1
> i40e version: 2.1.14-k, firmware-version: 6.01 0x800034a3 1.1747.0
>
> Guest versions:
> OS: CentOS 7 (Core) Kernel: 3.10.0-862.14.4.el7.x86_64
> i40e version: 2.1.14-k, firmware-version: 6.01 0x800034a3 1.1747.0
>
> The VM libvirt xml configuration [1], PF configuration at host [2], PF
> configuration at VM [3] are attached.
> The lspci output line nos. 63-75 related to SRIOV Capabilities in host [2]
> are missing in VM which looks bit weird.

As per above, the PCI config space is completely virtualized by QEMU.

Hope this helps!
Jesse
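A quick way to confirm the difference between host and guest is to look for the SR-IOV attributes in the device's sysfs directory: where the capability is exposed, sriov_totalvfs is present next to sriov_numvfs. A minimal sketch; the function name `sriov_capacity` is hypothetical, and the directory argument is whatever path the device has on the system being checked.

```shell
# Hypothetical helper: report how many VFs a PCI device can create, reading
# sriov_totalvfs from a device directory (e.g. /sys/class/net/<if>/device).
sriov_capacity() {
    dev_dir=$1
    if [ -r "$dev_dir/sriov_totalvfs" ]; then
        printf 'SR-IOV capable, up to %s VFs\n' "$(cat "$dev_dir/sriov_totalvfs")"
    else
        printf 'no SR-IOV capability exposed\n'
    fi
}

# Usage on a live system (not run here; the interface name is a placeholder):
#   sriov_capacity /sys/class/net/eth0/device
```

Run on the host and again inside the guest, this would be expected to show the capability only on the host, matching the missing lspci lines described above.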
Re: [E1000-devel] jitter / latency reduction
On Mon, 6 Mar 2017 08:09:42 -0800 Mahmood Qazen wrote:
> greetings Leonardo
> this is the slide / pdf I found and towards the end it asks if we
> could help.
> enjoy
> Mahmood

Hi developers, thanks for your interest, we’d love to have help, but the good/bad news is that this is implemented already upstream, and known as busy_poll support in the kernel.

Also, most if not all of the active, heavily used drivers now support the “built-in” model that busy poll has migrated to. This allows most if not all drivers with NAPI support (normal) in the kernel to have busy_poll support if it is enabled at runtime.

I believe there is currently some work to do still to get epoll working correctly, and there probably is room for refactoring/improvement to solve some of the issues with scaling.

There is also a paper being presented next week at the NetDevConf.org conference about Busy Polling, by Eric Dumazet from Google, and videos will be posted eventually.

Please see (in the Linux kernel source) Documentation/sysctl/net.txt

busy_read

Low latency busy poll timeout for socket reads. (needs CONFIG_NET_RX_BUSY_POLL)
Approximate time in us to busy loop waiting for packets on the device queue.
This sets the default value of the SO_BUSY_POLL socket option.
Can be set or overridden per socket by setting socket option SO_BUSY_POLL,
which is the preferred method of enabling. If you need to enable the feature
globally via sysctl, a value of 50 is recommended.
Will increase power usage.
Default: 0 (off)

busy_poll

Low latency busy poll timeout for poll and select. (needs CONFIG_NET_RX_BUSY_POLL)
Approximate time in us to busy loop waiting for events.
Recommended value depends on the number of sockets you poll on.
For several sockets 50, for several hundreds 100.
For more than that you probably want to use epoll.
Note that only sockets with SO_BUSY_POLL set will be busy polled,
so you want to either selectively set SO_BUSY_POLL on those sockets or set
sysctl.net.busy_read globally.
Will increase power usage.
Default: 0 (off)
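To check whether the two knobs quoted above are enabled on a given machine, their current values can be read from procfs. This is a sketch under stated assumptions: the sysctls are net.core.busy_read and net.core.busy_poll (files under /proc/sys/net/core), the function name `busy_poll_status` is hypothetical, and its optional argument exists only so the logic can be exercised against a test directory.

```shell
# Hypothetical helper: report the state of the global busy-poll knobs.
# $1 = directory holding the knob files (defaults to /proc/sys/net/core).
busy_poll_status() {
    dir=${1:-/proc/sys/net/core}
    for knob in busy_read busy_poll; do
        if [ -r "$dir/$knob" ]; then
            value=$(cat "$dir/$knob")
            # 0 means off; any positive value is the busy-loop time in us.
            if [ "$value" -gt 0 ]; then state="on, ${value} us"; else state=off; fi
            printf '%s: %s\n' "$knob" "$state"
        else
            printf '%s: unavailable (CONFIG_NET_RX_BUSY_POLL not set?)\n' "$knob"
        fi
    done
}

# Usage on a live system (not run here); enabling globally needs root:
#   busy_poll_status
#   sudo sysctl -w net.core.busy_read=50
```

As the quoted documentation says, setting SO_BUSY_POLL per socket is the preferred method; the global sysctl is the fallback.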
Re: [E1000-devel] ixgbe port missing, "PCI INT B: failed to register GSI"
On Tue, 6 Dec 2016 17:24:18 -0800 Ben Greear wrote:
> On 12/06/2016 05:15 PM, Fujinaka, Todd wrote:
> > Attachments don't work here. You'll have to file a bug on sourceforge, or
> > file an IPS for factory support (and tell me the number so it doesn't sit).
> >
>
> Ok, here it is inline then. lspci -vvv output is at the end of the dmesg
> output.

Ben, please see if you can enable 64 bit BARs (in the BIOS); the issue might go away.

You could also try enabling the pci= option that allows the kernel to remap BARs, but that may not even be necessary if 64 bit works.

The reason I suggest the above is that I saw some BAR mapping errors in your dmesg (which I believe is why you can't get MSI-X resources).

Sorry I didn't have more time to be specific, but I wanted to at least get this out to you now.
Re: [E1000-devel] i40e: xl710 chipset in 4.8 kernel
On Sat, 19 Nov 2016 23:05:57 +0300 Yavuz Selim Komur wrote:
> Hi,
>
> i40e drops all UDP traffic when upgrade to 4.8 from 4.7 linux kernel.
>
> all DHCP, DNS traffic stop. i40e not forwards any UDP.
>
> is this possible

Hi Yavuz, please be more specific, as there may be some reason for your issue, but we can't tell from the data you provided.

Please send the output of 'ethtool -i', the dmesg from boot after a few minutes of "no UDP", and a dump of the 'ethtool -S' output.

Are you plugged into a switch?

You should probably file a bug at https://e1000.sf.net/bugs so that you can attach files, as they won't be delivered well by the list.

Also, are you using a distro based kernel, and what is the exact kernel version you're using (output of uname -a)?

Thanks!
Re: [E1000-devel] [Intel-wired-lan] i40e card Tx resets
On Thu, 17 Mar 2016 14:56:14 -0400 Sowmini Varadhan wrote:
> On (03/17/16 10:20), zhuyj wrote:
> > 1. modprobe NET_PKTGEN
> >
> > 2. download the tar file and uncompress to any directory.
> > This tar file is from kernel. It is in samples/pktgen/
> >
> > 3. cd pktgen
> >
> > 4. pktgen_sample02_multiqueue.sh -i ethx -s size -t cpu_number
>
> Indeed, I see the same thing as you, and it was very easy to
> reproduce. It was very interesting that the problem can happen with
> as few as 3 threads, at which point I see the TX hang at exactly
> -s 12305

Okay, sorry I hadn't jumped into this thread yet. I can unequivocally tell you that what Sowmini saw with the MDD with stack based RDS-STRESS testing is *NOT* the same as what you're seeing while using pktgen with invalid huge skb->data buffers.

We can ask on netdev if the driver should defend against this kind of input to hard_start_xmit (transmit routine), but the driver doesn't check the maximum length of the skb to see if it is invalid, because the stack can never build (only pktgen can) these invalid SKBs.

The issue is that pktgen builds skb->data with a contiguous buffer of whatever size transmit requested (regardless of MTU) and then sends it straight to the transmit routine, with no segmentation flags and no MSS set. This causes the driver to build a transmit descriptor with an invalid length, which the hardware then "ASSERTS" on by issuing an MDD interrupt and freezing the badly behaving queue.

> I see:
> i40e :82:00.0: TX driver issue detected, PF reset issued
> i40e :82:00.0 eth2: VSI_seid 390, Hung TX queue 0, tx_pending: 492,
> NTC:0x140, HWB: 0x140, NTU: 0x12c, TAIL: 0x12c
>
> I think the common factor in both our test cases is that we have some
> kernel thread that can efficiently send packets without any context
> switches.

You've found a red herring (mistakenly connected two separate events), so I think you can stop going down this path (pktgen).

> Has anyone here seen this before?
> I'll see if I can find some cycles
> to figure this out, if not, maybe its worth bringing up on netdev,
> to see if others have seen this, and to draw some patterns.

We don't need to bring it up on netdev. We have a way to troubleshoot MDDs that I can send to you, if you want to do the work. Otherwise we need to have some time to reproduce here.

> > If size is set to a big number, the similar defect will occur.
> > Adjust this size to a appropriate number, my defect will not occur.
> >
> > In the test, I found some types igb nic, such as i210, will work
> > well no matter the size is a big number.
> > some nic, such as 82580, it will not work well if the size is too big.

This is mostly a combination of driver implementation and how the hardware handles a descriptor that is too large. The driver *could* check to make sure the skb->data is never too large, but in that same vein, we *could* fix pktgen to never send a frame greater than MTU down to the driver.

> >
> > As such, I think my problem results from the hardware and the big
> > size triggers this problem.
> >
> > I hope this can help us all.

Unfortunately Zhu's problem with pktgen is not a reproducer of Sowmini's problem. In the case of pktgen, it is a "don't do that, because it hurts" kind of bug. In the case of rds-stress, we need to reproduce it here and figure out what hardware constraint the driver is violating during setup of the transmit.
Re: [E1000-devel] 82574l GB network controller for WinCE 5.0
On Tue, 25 Aug 2015 11:33:40 +0300 Eli Kedem eli.ke...@ardix.co.il wrote:
> I am developing an NDIS (v5.1) miniport device driver for the 82574l GB network controller for WinCE 5.0.
> The main problem is that the BSP is very deficient and KITL does not work and the board does not have JTAG interface either. I have no way to debug the driver only by using RETAILMSG output to the hyperterminal.
> I managed to initialize the damn thing but I must have missed something because the miniport upper edge functions to handle interrupts like MiniportISR and MiniportHandleInterrupt are not called by the upper NDIS drivers, and I have no way to figure out why.
> I appreciate if someone has developed the same driver for WinCE/WinXp/w7 and can send me the source code.

Hi Eli,

I think you have the wrong list, as we cover open source Linux issues here. I would be a bit surprised if Intel doesn't already have a driver to do what you want, but you should contact your local Intel Field Application Engineer to check both a) whether a driver already exists, and b) whether they can support you with driver design. I suggest you start with your local sales office.
Re: [E1000-devel] i40e - Unqualified modules was detected
On Fri, 21 Aug 2015 20:44:06 + John McDowall jmcdow...@paloaltonetworks.com wrote:
> Hi,
>
> I am trying to get a dual 10G X7100 interface up and on boot I am seeing the following message:
>
> i40e :04:00.0 p1p1: the driver failed to link because an unqualified module was detected.
> [5.092641] IPv6: ADDRCONF(NETDEV_UP): p1p1: link is not ready
> [7.875628] i40e :04:00.1 p1p2: the driver failed to link because an unqualified module was detected.

This is typically because you don't have an Intel validated module plugged in. What kind of media do you have plugged into p1p1/2?

> [7.875703] IPv6: ADDRCONF(NETDEV_UP): p1p2: link is not ready
>
> My system is a Dell R610, running CENTOS 7.0
>
> I have upgraded the drivers and the flash:

Thanks for doing that first, it helps. The output of "ethtool p1p1" would be useful. Also, if you're using a fiber module, please send a picture of the module label; if you're using a direct attach cable, a picture of the cable end label or the packaging it came with.

> [root@localhost ~]# uname -a
> Linux localhost.localdomain 3.10.0-229.11.1.el7.x86_64 #1 SMP Thu Aug 6 01:06:18 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> [root@localhost ~]# ethtool -i p1p1
> driver: i40e
> version: 1.3.38
> firmware-version: 4.53 0x80001dc0 0.0.0
> bus-info: :04:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
> [root@localhost ~]#
>
> Any ideas of what could be wrong?
>
> Regards
> John McDowall
Re: [E1000-devel] [net-next 8/9] i40e/i40evf: Bump i40e/i40evf version
On Wed, 1 Apr 2015 11:52:37 +0200 Ronald van der Pol ronald.vander...@rvdp.org wrote:
> On Thu, Mar 26, 2015 at 23:01:46 -0700, Jeff Kirsher wrote:
> > netdev is not the right mailing list for this question. Adding e1000-devel mailing list...
>
> Sorry about that.
>
> I still have problems with getting the Intel X4DACBL3 QSFP to 4x SFP breakout cable working with the i40e. I have also upgraded the NVM, but it did not help. Below is the modprobe output.

I think the piece you're missing is that you need to run the QCU (QSFP+ Configuration Utility) to switch the port from 40G to 4x10 mode. At that point the interface will show up as four physical functions 81:0.0 - 81:0.3

> [root@boron src]# ethtool -i ens5f0
> driver: i40e
> version: 1.2.37
> firmware-version: f4.33.31377 a1.2 n4.42 e1932

This is the right NVM to run QCU on top of.

> Apr 1 12:21:27 boron kernel: i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 1.2.37

This is a good driver to be running. :-)

> PS I have a 3rd party QSFP-QSFP DAC cable + Intel X4DACBL3 inserted. Might the 3rd party DAC cable confuse the driver? I need to travel to get physical access to the server, so I cannot easily pull the cable.

I don't know, try QCU first. The adapter and driver can't automatically switch to 4x10 mode.

> PPS I understand:
> - 3rd party optics are not supported
> - max of 4 mac addresses, so 4x10 is OK, 4x10 + 1x40 is not OK
> Is this correct?

Right, AFAIK.
Re: [E1000-devel] i40e: trivial fixes
On Tue, 2 Dec 2014 16:01:07 +0300 Dan Carpenter dan.carpen...@oracle.com wrote:
> Hello Jesse Brandeburg,
>
> The patch 895106a577c4: i40e: trivial fixes from Nov 26, 2013, leads to the following static checker warning:
>
> drivers/net/ethernet/intel/i40e/i40e_hmc.c:107 i40e_add_sd_table_entry() error: potentially using uninitialized 'ret_code'.

Thanks Dan for the report, we are looking into it. Appreciate the feedback!
Re: [E1000-devel] [patch net-next] i40e: remove dead fdb code
On Thu, 20 Nov 2014 14:10:29 +0100 Jiri Pirko j...@resnulli.us wrote:
> This code is not used now and also it contains some weird ifdefs. So remove it for now. It can be added when needed.

First, thanks for looking at our code.

But NAK: the code just needs to have the #ifdefs removed. In addition, the fdb_del and fdb_dump functions are unnecessary and were submitted by mistake.

I will draft up a patch today and send it (Jeff Kirsher can take it through his i40e tree, if that's okay with DaveM).

Thanks,
Jesse
Re: [E1000-devel] [net-next v5 8/8] i40e: include i40e in kernel proper
On Fri, 6 Sep 2013 14:01:41 -0400 David Miller da...@davemloft.net wrote:
> Please rename this Kbuild file to the normal Makefile instead of trying to be different from every single other driver in the networking for the sake of an issue that is your, and your problem alone.

Thanks Dave, will do, I'm preparing the patch now.

> You guys should really be grateful that anyone at all not being paid to do so is reviewing such a huge body of code for you, rather than complaining that all the issues weren't discovered the first time this series was posted.

We *are* really grateful for all the effort of any/all reviewers. I would like to personally thank you Dave, Joe Perches, Ben Hutchings, and Stephen Hemminger for the non-trivial amount of time spent on reviewing this patch set.
Re: [E1000-devel] [PATCH net-next] drivers:net: Convert dma_alloc_coherent(...__GFP_ZERO) to dma_zalloc_coherent
On Mon, 26 Aug 2013 22:45:23 -0700 Joe Perches j...@perches.com wrote:
> __GFP_ZERO is an uncommon flag and perhaps is better not used. static inline dma_zalloc_coherent exists so convert the uses of dma_alloc_coherent with __GFP_ZERO to the more common kernel style with zalloc.
>
> Remove memset from the static inline dma_zalloc_coherent and add just one use of __GFP_ZERO instead.
>
> Trivially reduces the size of the existing uses of dma_zalloc_coherent.
>
> Realign arguments as appropriate.
>
> Signed-off-by: Joe Perches j...@perches.com

e1000 and ixgb bits:
Acked-by: Jesse Brandeburg jesse.brandeb...@intel.com
Re: [E1000-devel] e1000e on thinkpad x60: interrupt problem
On Tue, 9 Jul 2013 22:48:54 +0200 Pavel Machek pa...@ucw.cz wrote:
> Yeah, of course you need to ask e1000e if it generated the interrupt. That part works. The part that actually generates the interrupt does not. Take a look at original mail...
>
> packet comes
> e1000e sets E1000_ICR_INT_ASSERTED bit
> e1000e tries to generate an interrupt and fails
> 50msec passes

^^ that's the ASPM timeout length.

> AHCI generates interrupt
> all the handlers are called
> AHCI processes its interrupt, handles disk read
> e1000_intr notices E1000_ICR_INT_ASSERTED bit, delivers the packet.
>
> Network still works, only slowly. Ping goes lower when I use the disk. That matches what I see. Do you have other explanation?

Regardless of what others are saying, I believe you have an issue with ASPM being enabled. All the discussion about shared interrupts is just a distraction. This issue would still occur (and just be worse) without a shared interrupt.

You already mentioned that a kernel hack to disable ASPM fixes it, but you can just boot with a kernel option to turn off ASPM:

    pcie_aspm=off

There are known issues with ASPM on this part, and it definitely needs to be off. If your BIOS has the option to turn it off, that is the best way to disable it; the second choice is to turn it off using the kernel option.

Hope this helps,
Jesse
Re: [E1000-devel] [RFC 0/7] Fixing dma mask setting in various network drivers
On Tue, 11 Jun 2013 00:08:49 +0100 Russell King - ARM Linux li...@arm.linux.org.uk wrote:
> While looking at the way coherent DMA masks are handled (and the fact many drivers write directly to the mask) I stumbled across this set of oddities in various network drivers, which looks like it's been cut'n'pasted.
>
> I haven't yet tested these patches in any way, which is one reason I'm sending them out as an RFC. The other reason is to find out if other people agree that these are indeed fixes.
>
>  drivers/net/ethernet/brocade/bna/bnad.c           |  7 +++
>  drivers/net/ethernet/intel/e1000e/netdev.c        | 11 +--
>  drivers/net/ethernet/intel/igb/igb_main.c         | 11 +--
>  drivers/net/ethernet/intel/igbvf/netdev.c         | 11 +--
>  drivers/net/ethernet/intel/ixgb/ixgb_main.c       |  9 -
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     | 11 +--
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 11 +--
>  7 files changed, 32 insertions(+), 39 deletions(-)

Thanks Russell,

The Intel driver changes seem valid (we are testing them now). According to DMA-API-HOWTO, setting the coherent mask will always succeed if setting the regular mask succeeded, so the code can be further simplified to basically match the example in DMA-API-HOWTO.

This is my proposed change to the Intel drivers. Comments?

+	if (!dma_set_mask(&pdev->dev, DMA_BIT_MASK(64))) {
+		pci_using_dac = true;
+		/* coherent mask for the same size will always succeed if
+		 * dma_set_mask does
+		 */
+		dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64));
+	} else if (!dma_set_mask(&pdev->dev, DMA_BIT_MASK(32))) {
+		pci_using_dac = false;
+		dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
+	} else {
+		dev_err(&pdev->dev, "%s: DMA configuration failed: %d\n",
+			__func__, err);
+		err = -EIO;
+		goto err_dma;
 	}
Re: [E1000-devel] [RFC 0/7] Fixing dma mask setting in various network drivers
On Tue, 11 Jun 2013 13:35:05 -0700 Russell King - ARM Linux li...@arm.linux.org.uk wrote:
> As part of my review of all this stuff, I'm wondering whether a helper to set both masks makes sense. Something like:
>
> static inline int dma_set_masks(struct device *dev, u64 mask)

It doesn't need to be inline; it is never called in a hot path.

> {
> 	int ret = dma_set_mask(dev, mask);
> 	if (ret == 0)
> 		dma_set_coherent_mask(dev, mask);
> 	return ret;
> }

dma_set_masks() is a little too close to dma_set_mask() though; how about dma_set_mask_and_coherent(...)?

> such a function looks like it would be usable for 20 odd drivers currently. The plus point is that it may help to prevent this kind of issue in the future...
>
> Thoughts?

I really like the idea of consolidating this in the kernel with a global helper.
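The combined helper and the 64-bit-then-32-bit fallback discussed in this thread can be modeled in user space. The sketch below is not kernel code: `struct fake_dev`, `fake_set_mask()` and `setup_dma()` are hypothetical stand-ins, and the rule that a mask fails when it exceeds `platform_limit` is a simplifying assumption about how dma_set_mask() behaves.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MASK_64 (~0ULL)        /* all 64 address bits usable */
#define MASK_32 0xffffffffULL  /* low 32 address bits only */

/* Hypothetical device model; platform_limit is an assumption of the sketch. */
struct fake_dev {
	uint64_t platform_limit; /* widest mask the platform accepts */
	uint64_t dma_mask;
	uint64_t coherent_mask;
};

/* Stand-in for dma_set_mask(): 0 on success, negative errno on failure. */
static int fake_set_mask(struct fake_dev *dev, uint64_t mask)
{
	if (mask > dev->platform_limit)
		return -EIO;
	dev->dma_mask = mask;
	return 0;
}

/* Set both masks together, in the spirit of the helper proposed above. */
static int fake_set_mask_and_coherent(struct fake_dev *dev, uint64_t mask)
{
	int ret = fake_set_mask(dev, mask);

	if (ret == 0)
		dev->coherent_mask = mask; /* same-size coherent mask follows */
	return ret;
}

/* Probe-time fallback: try 64-bit first, then 32-bit, else fail. */
static int setup_dma(struct fake_dev *dev, bool *using_dac)
{
	if (fake_set_mask_and_coherent(dev, MASK_64) == 0) {
		*using_dac = true;
		return 0;
	}
	if (fake_set_mask_and_coherent(dev, MASK_32) == 0) {
		*using_dac = false;
		return 0;
	}
	return -EIO;
}
```

The point of the single helper is that a caller can no longer set the streaming mask and forget the coherent one, which is the class of mistake the RFC series set out to fix.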
Re: [E1000-devel] Memory Corruption with e1000
On Thu, 6 Jun 2013 09:38:50 -0700 Peter LaDow pet...@gocougs.wsu.edu wrote:
> On Thu, Jun 6, 2013 at 12:30 AM, Peter P Waskiewicz Jr peter.p.waskiewicz...@intel.com wrote:
> > What about the pre-emption behavior of the kernel? Namely Processor type and Features - Preemption Model. Are you using no preemption, or forced preemption?
>
> Ok. I've done testing. Yes, we were building with PREEMPT_FULL. I've done some further testing and can re-create the problem on vanilla, non-preempt kernels. See below.
>
> # uname -a
> Linux (none) 3.0.80-rt108 #2 Thu Jun 6 16:09:35 UTC 2013 ppc GNU/Linux
>
> And I still get the slab corruption leading up to the kernel panic:
>
> Slab corruption: size-2048 start=ee2b2070, len=2048
> Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
> Last user: [c0208514](skb_release_data+0xb4/0xc8)
> 020: 6b 6b ff ff ff ff ff ff 00 0d ed 47 d9 87 81 00

That is quite clearly a broadcast; it seems to me it may be a VLAN packet (0x8100), maybe to VLAN 0xf2? This means that the receive unit of the e1000 is not being stopped completely (or is restarted by something), while the memory of the DMA buffer (the 2kB allocation) is being freed and then still DMA'd to.

> 030: 00 f2 08 06 00 01 08 00 06 04 00 01 00 0d ed 47
> 040: d9 87 0a f1 0a ea 00 00 00 00 00 00 0a f1 0a ea
> 050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 060: 00 00 09 81 d2 0f 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> Next obj: start=ee2b2888, len=2048
> Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
> Last user: [c0209b8c](__netdev_alloc_skb+0x28/0x60)
> 000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> Slab corruption: size-2048 start=ed401480, len=2048
> Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
> Last user: [c0208514](skb_release_data+0xb4/0xc8)
> 020: 6b 6b ff ff ff ff ff ff e0 db 55 e4 ce f9 08 00
> 030: 45 00 01 3e 3e 1a 00 00 80 11 ca c0 0a ca 0d 42

Same thing here, but this is an IP packet. This is clearly a network adapter putting frames into memory that has been freed. I will see if someone here can reproduce this issue, but it seems quite clear what is happening; we just need to figure out why.

> 040: 0a ca 0d ff 00 8a 00 8a 01 2a a5 96 11 0e af 81
> 050: 0a ca 0d 42 00 8a 01 14 00 00 20 45 42 45 4f 45
> 060: 45 46 43 45 4c 45 50 45 44 45 49 45 4f 45 43 43
> 070: 41 43 41 43 41 43 41 43 41 41 41 00 20 46 44 45
> Prev obj: start=ed400c68, len=2048
> Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
> Last user: [c0209b8c](__netdev_alloc_skb+0x28/0x60)
> 000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> Unable to handle kernel paging request for data at address 0x20454c45
> Faulting instruction address: 0xc0062498
> Oops: Kernel access of bad area, sig: 11 [#1]
> SEL35xx Platform
> Modules linked in:
> NIP: c0062498 LR: c02084d8 CTR: c000cbbc
> REGS: ee85bc60 TRAP: 0300 Not tainted (3.0.80-rt108)
> MSR: 9032 EE,ME,IR,DR CR: 24008248 XER:
> DAR: 20454c45, DSISR: 2000
> TASK = ef3e5830[4616] 'ifconfig' THREAD: ee85a000
> GPR00: ee85bd10 ef3e5830 20454c45 2d746baa 05f2 0002
> GPR08: c03b14e4 ed7471a8 ee85bcd0 5c26 10087a48 bfe0e41c 10064ae4
> GPR16: 10064bc0 bfe0e40c bfe0e3f4 0228 8914 c019a488
> GPR24: c019a9cc ed70f4b0 005c ed70f340 ef063120 0001 ee62bd30
> NIP [c0062498] put_page+0x0/0x34
> LR [c02084d8] skb_release_data+0x78/0xc8
> Call Trace:
> [ee85bd20] [c020810c] __kfree_skb+0x18/0xbc
> [ee85bd30] [c0195734] e1000_clean_rx_ring+0x10c/0x1a4
> [ee85bd60] [c01957f4] e1000_clean_all_rx_rings+0x28/0x54
> [ee85bd70] [c0198d40] e1000_close+0x30/0xb4
> [ee85bd90] [c0212408] __dev_close_many+0xa0/0xe0
> [ee85bda0] [c02141a0] __dev_close+0x2c/0x4c
> [ee85bdc0] [c0210a58] __dev_change_flags+0xb8/0x140
> [ee85bde0] [c0212324] dev_change_flags+0x1c/0x60
> [ee85be00] [c0267594] devinet_ioctl+0x2a4/0x700
> [ee85be60] [c026839c] inet_ioctl+0xc8/0xfc
> [ee85be70] [c02006d4] sock_ioctl+0x260/0x2a0
> [ee85be90] [c009145c] vfs_ioctl+0x2c/0x58
> [ee85bea0] [c0091bc8] do_vfs_ioctl+0x610/0x698
> [ee85bf10] [c0091ca8] sys_ioctl+0x58/0x88
> [ee85bf40] [c000e674] ret_from_syscall+0x0/0x38
> --- Exception: c01 at 0xff35a3c
> LR = 0xff359a0
> Instruction dump:
> 419e0018 3c80c006 38630180 38842abc 38a0 4bfffe65 80010014 bbc10008
> 38210010 7c0803a6 4e800020 4b54 8003 7c691b78 700bc000 41a20008
> Kernel panic - not syncing: Fatal exception
> Call Trace:
> [ee85bb90] [c0007b80] show_stack+0x58/0x154 (unreliable)
> [ee85bbd0] [c001c3a8] panic+0xa8/0x1cc
> [ee85bc20] [c000b1f0] die+0x178/0x19c
> [ee85bc40] [c0011a44] bad_page_fault+0xe8/0xfc
> [ee85bc50] [c000eb14] handle_page_fault+0x7c/0x80
> --- Exception: 300 at put_page+0x0/0x34
> LR = skb_release_data+0x78/0xc8
> [ee85bd10] [] (null) (unreliable)
> [ee85bd20] [c020810c] __kfree_skb+0x18/0xbc
> [ee85bd30] [c0195734] e1000_clean_rx_ring+0x10c/0x1a4
> [ee85bd60] [c01957f4] e1000_clean_all_rx_rings+0x28/0x54
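The reading of the first slab dump in this thread (a broadcast frame carrying a VLAN tag) can be checked by decoding the bytes that follow the two 0x6b bytes, which are the slab allocator's poison pattern for freed memory. The decoder below is a small user-space sketch; `struct l2_info`, `decode_l2()` and `first_dump` are names invented for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct l2_info {
	bool broadcast;     /* destination is ff:ff:ff:ff:ff:ff */
	bool tagged;        /* 802.1Q tag present */
	uint16_t vlan_id;   /* low 12 bits of the tag, 0 if untagged */
	uint16_t ethertype; /* payload type after any tag */
};

/* Read a big-endian 16-bit value. */
static uint16_t be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Decode the L2 header; returns false if the buffer is too short. */
static bool decode_l2(const uint8_t *f, size_t len, struct l2_info *out)
{
	static const uint8_t bcast[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

	if (len < 14)
		return false;
	memset(out, 0, sizeof(*out));
	out->broadcast = memcmp(f, bcast, 6) == 0;
	out->ethertype = be16(f + 12);
	if (out->ethertype == 0x8100) { /* 802.1Q TPID */
		if (len < 18)
			return false;
		out->tagged = true;
		out->vlan_id = be16(f + 14) & 0x0fff;
		out->ethertype = be16(f + 16);
	}
	return true;
}

/* Bytes from the first dump, starting after the two 0x6b poison bytes. */
static const uint8_t first_dump[] = {
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, /* destination */
	0x00, 0x0d, 0xed, 0x47, 0xd9, 0x87, /* source */
	0x81, 0x00, 0x00, 0xf2,             /* 802.1Q tag */
	0x08, 0x06,                         /* inner ethertype */
};
```

Decoding `first_dump` gives a broadcast destination, VLAN ID 0xf2 (242) and inner ethertype 0x0806 (ARP). The second dump in the thread starts with an untagged 0x0800 ethertype, matching the "IP packet" remark.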
Re: [E1000-devel] Higher throughput at 100Mbps than 1Gbps
On Tue, 21 May 2013 19:24:24 +0100 Sam Crawford samcrawf...@gmail.com wrote:
> To be clear, this doesn't just affect this one hosting provider - it seems to be common to all of our boxes. The issue only occurs when the sender is connected at 1Gbps, the RTT is reasonably high ( ~60ms), and we use TCP.
>
> By posting here I'm certainly not trying to suggest that the e1000e driver is at fault... I'm just running out of ideas and could really use some expert suggestions on where to look next!

I think you're overwhelming some intermediate buffers with send data before they can drain, due to the burst send nature of TCP when combined with TSO. This is akin to bufferbloat.

Try turning off TSO using ethtool. This will restore the native feedback mechanisms of TCP. You may also want to reduce or eliminate the send side qdisc queueing (the default is 1000, but you probably need a lot less), but I don't think it will help as much.

    ethtool -K ethx tso off gso off

You may even want to turn GRO off at both ends, as GRO will be messing with your feedback as well.

    ethtool -K ethx gro off

I'm a bit surprised that this issue isn't being understood natively by the Linux stack. That said, GRO and TSO are really focused on LAN traffic, not WAN.

Jesse
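A quick calculation illustrates why the burst argument in this reply is plausible. The sketch below computes how long one burst takes to leave the sender at a given link rate; the 64 KiB burst size is an assumption chosen as a typical upper bound for a single TSO send, not a figure from the thread, and `wire_time_us()` is a name made up for the example.

```c
#include <stdint.h>

/* Microseconds needed to serialize `bytes` at `rate_bps` bits per second. */
static uint64_t wire_time_us(uint64_t bytes, uint64_t rate_bps)
{
	return bytes * 8u * 1000000u / rate_bps;
}
```

Under that assumption, `wire_time_us(65536, 1000000000)` is about 524 microseconds, while the same burst at 100 Mb/s takes about 5242 microseconds. A downstream buffer therefore sees the data arrive roughly ten times faster from the 1 Gb/s sender, which leaves it less time to drain between bursts.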
Re: [E1000-devel] [PATCH v2 net-next 0/4] net: low latency Ethernet device polling
On Mon, May 20, 2013 at 1:09 PM, Jeff Kirsher jeffrey.t.kirs...@intel.com wrote:
> On Sun, 2013-05-19 at 22:20 +0300, Eliezer Tamir wrote:
> > On 19/05/2013 22:06, Or Gerlitz wrote:
> > > On Sun, May 19, 2013 at 1:25 PM, Eliezer Tamir eliezer.ta...@linux.intel.com wrote:
> > > > This is an updated version of the code we posted on February.
> > >
> > > Last time you've placed a copy of the patchset in the rfc branch of git://github.com/jbrandeb/lls.git - can you repost there V2 too?

The latest set (the v3 changes) was posted to the rfcv2 branch on git://github.com/jbrandeb/lls.git
Re: [E1000-devel] pci probe of 82574 fails
On Tue, 29 Jan 2013 13:49:17 -0800 akepner akep...@riverbed.com wrote:
> On Fri, Jan 25, 2013 at 08:25:17PM +, Ronciak, John wrote:
> > This could be BIOS configuration as well. Check the BIOS version as Tushar says but also look at how you have the device/slot configured in the BIOS.
>
> The probe of :07:00.0 failed with error -2 is seen with only a few systems. All of the systems I've checked (working, and non-working) are using the same BIOS version, configured identically, and have hardware of the same type.
>
> I instrumented the e1000e driver a bit and verified that e1000_get_hw_semaphore_82574() call is where the error is coming from.
>
> We have some evidence that power cycling the system may cause the bug to go away, but it's hard to be very confident of that, since we've seen the bug so few times.

If the semaphore acquisition fails, you might be able to just force it to 0, then reacquire it normally. This is something we might be able to consider adding to our code to work around these strange cases. On the flip side, whatever is acquiring it and not releasing it may still continue to mess things up.
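The "force it to 0, then reacquire" idea from this reply can be sketched against a simulated register. Everything here is hypothetical: `struct fake_reg`, `OWN_BIT` and both functions are a toy model for illustration and do not reflect the real 82574 register layout or the e1000e driver code.

```c
#include <stdbool.h>
#include <stdint.h>

#define OWN_BIT 0x1u /* hypothetical ownership bit, not the real layout */

/* Hypothetical register model holding the ownership bit. */
struct fake_reg {
	uint32_t val;
};

/* Try to take ownership; succeeds only if the bit was clear. */
static bool try_acquire(struct fake_reg *r)
{
	if (r->val & OWN_BIT)
		return false;
	r->val |= OWN_BIT;
	return true;
}

/* Acquire with bounded retries; on timeout force the bit to 0, retry once. */
static int acquire_with_recovery(struct fake_reg *r, int retries, bool *forced)
{
	*forced = false;
	for (int i = 0; i < retries; i++)
		if (try_acquire(r))
			return 0;
	r->val &= ~OWN_BIT; /* force release of the stale owner */
	*forced = true;
	return try_acquire(r) ? 0 : -2;
}
```

As the reply itself cautions, a forced release only helps when the previous owner is truly gone; if something keeps taking the semaphore without releasing it, the failure would simply return.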
Re: [E1000-devel] e1000e: ethtool -t fails when i/f is up
On Fri, 16 Nov 2012 14:58:13 -0800 akepner akep...@riverbed.com wrote:
> With e1000e (versions 1.2.20, and 2.1.4) we've noticed that the ethtool selftest fails with a miscompare when the interface is up, but succeeds when it's down:

Which hardware are you using?

    # lspci -nn

What shows from:

    # ethtool lan0_0

Is the cable plugged in while doing this? Any output from dmesg? What about if you up the message level?

    # ethtool -s lan0_0 msglvl 0x
    # ethtool -t lan0_0

This may output too much info, not sure. I just know that the maintainers of the driver will want this info.
Re: [E1000-devel] [PATCH net] e1000e: Change wthresh to 1 to avoid possible Tx stalls.
> > Jesse did not share any performance numbers with me, I am sure he can give some background tomorrow when he is back online. I am working on an alternative patch now and should have something to share tomorrow.
>
> Please allow me to ask if there's any progress here? I've tried 3.5.4 a couple of days ago on a SuperMicro X8SIE-LN4 (82574L) and could still observe severe latency (up to 3000ms) spikes. Applying Hiroakis suggested patch did fix this for me as well.
>
> [please note as well that I didn't had this issue in any 3.4.x kernel before - so +1 for fixing the regression]

I'm not sure what went wrong internally here that this hasn't been fixed, and I'm personally embarrassed. I am working on it until I have a patch/solution.

I am currently trying to reproduce the issue, but I am in some weird "how to use BQL" limbo; the lack of documentation on user-side usage of BQL is slowing me down. Hints or clues are appreciated (I'm trying to follow the repro steps mentioned in some related threads).
Re: [E1000-devel] Intel 82546GB chip does not work with OpenVSwitch
On Fri, 7 Sep 2012 12:37:04 +0200 Timm Essigke timm.essi...@uni-bayreuth.de wrote:
> I hope you can understand the cause of the problem from the ethregs output included in the files.
>
> Thank you very much!

It looks like the attachment(s) either weren't included or didn't make it through the list filters. Can you upload to pastebin or email them to me directly?

A better option at this point may be to file a bug at our e1000.sf.net bug tracker, where the attachments can be added. I started making the ticket for you but realized that you probably won't be able to attach files unless you're the owner.
Re: [E1000-devel] ioatdma 0000:00:0a.1: Channel halted, chanerr = 2
On Mon, 30 Apr 2012 22:31:26 + John Adams john.ad...@avid.com wrote: Dear e1000-devel, I'm wondering what kernel versions people are happily using in production with the ixgbe driver? I'm having network stability and performance issues with a 2.6.32-131 modified Red Hat el6 on a quad core Xeon Jasper Forest cpu. My nic is X520/82599 dual port. I wonder if this could be an ixgbe or ioatdma problem. Ixgbe is not mentioned in my stack traces. Hoping for advice. I could try a later kernel, especially one recommended by a happy ixgbe user. if you're having issues you could blacklist ioatdma. It is really not necessary, unless you were really benefiting from dca, which is unlikely. Someone should check if there are any bugzillas at redhat for ioatdma Any comment is much appreciated. Here's what I see. (just one cpu for brevity). This has been reported when using an old version of ixgbe as well as 3.9.15-NAPI. ioatdma :00:0a.1: Channel halted, chanerr = 2 ioatdma :00:0a.1: Channel halted, chanerr = 2 ioatdma :00:0a.1: Channel halted, chanerr = 2 ioatdma :00:0a.1: Channel halted, chanerr = 2 ioatdma :00:0a.1: Channel halted, chanerr = 2 ioatdma :00:0a.1: ioat2_timer_event: Channel halted (2) BUG: scheduling while atomic: process_name/6888/0x1301 Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler sunrpc tcp_htcp sr_mod cdrom raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx dm_mod ses enclosure sg i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support e1000e ioatdma ixgbe(U) dca pm8001(U) libsas scsi_transport_sas ext3 jbd mbcache sd_mod crc_t10dif usb_storage pata_acpi ata_generic ata_piix [last unloaded: scsi_wait_scan] Pid: 6888, comm: process_name Not tainted 2.6.32-foo-0 #7 Call Trace: IRQ [8104dab6] ? __schedule_bug+0x66/0x70 [81477502] ? thread_return+0x5db/0x779 [8104f05d] ? scheduler_tick+0xdd/0x280 [810128e9] ? read_tsc+0x9/0x20 [81090d03] ? ktime_get+0x63/0xe0 [81029a2d] ? lapic_next_event+0x1d/0x30 [a01c558c] ? 
ioat2_timer_event+0x25c/0x270 [ioatdma] [8105748a] ? __cond_resched+0x2a/0x40 [a01c558c] ? ioat2_timer_event+0x25c/0x270 [ioatdma] [814777f0] ? _cond_resched+0x30/0x40 [8100df96] ? is_valid_bugaddr+0x16/0x40 [8124e4df] ? report_bug+0x1f/0xc0 [8100f2af] ? die+0x7f/0x90 [8147a184] ? do_trap+0xc4/0x160 [a01c5330] ? ioat2_timer_event+0x0/0x270 [ioatdma] [a01c5330] ? ioat2_timer_event+0x0/0x270 [ioatdma] [8100ce55] ? do_invalid_op+0x95/0xb0 [a01c558c] ? ioat2_timer_event+0x25c/0x270 [ioatdma] [8105ff11] ? vprintk+0x1d1/0x4f0 [81028e89] ? native_send_call_func_single_ipi+0x39/0x40 [8109c081] ? generic_exec_single+0xb1/0xc0 [8100befb] ? invalid_op+0x1b/0x20 [a01c5330] ? ioat2_timer_event+0x0/0x270 [ioatdma] [a01c558c] ? ioat2_timer_event+0x25c/0x270 [ioatdma] [a01c5579] ? ioat2_timer_event+0x249/0x270 [ioatdma] [810128e9] ? read_tsc+0x9/0x20 [81071ea7] ? run_timer_softirq+0x197/0x340 [810676a1] ? __do_softirq+0xc1/0x1d0 [8100c26c] ? call_softirq+0x1c/0x30 EOI [8100dea5] ? do_softirq+0x65/0xa0 [81067fe8] ? local_bh_enable_ip+0x98/0xa0 [814798fb] ? _spin_unlock_bh+0x1b/0x20 [a01c486f] ? ioat2_cleanup_tasklet+0x8f/0xa0 [ioatdma] [a01c3743] ? ioat2_is_complete+0x83/0xd0 [ioatdma] [8141c38f] ? tcp_recvmsg+0x75f/0xe90 [81476f75] ? thread_return+0x4e/0x779 [8143c55c] ? inet_recvmsg+0x5c/0x90 [813d53b3] ? sock_recvmsg+0x133/0x160 [81086100] ? autoremove_wake_function+0x0/0x40 [8109810e] ? futex_wake+0x10e/0x120 [8109a071] ? do_futex+0x121/0xb00 [8104ed13] ? perf_event_task_sched_out+0x33/0x80 [81168779] ? fget_light+0x9/0x90 [813d570e] ? sys_recvfrom+0xee/0x180 [810097ac] ? __switch_to+0x1ac/0x320 [81476f75] ? thread_return+0x4e/0x779 [8109aacb] ? sys_futex+0x7b/0x170 [8100c5d5] ? math_state_restore+0x45/0x60 [8100b132] ? system_call_fastpath+0x16/0x1b [ cut here ] kernel BUG at drivers/dma/ioat/dma_v2.c:315! In my sources that line is in ioat2_timer_event and it looks like it thinks a setup problem happened elsewhere. 
> 	/* when halted due to errors check for channel
> 	 * programming errors before advancing the completion state
> 	 */
> 	if (is_ioat_halted(status)) {
> 		u32 chanerr;
>
> 		chanerr = readl(chan->reg_base + IOAT_CHANERR_OFFSET);
> 		dev_err(to_dev(chan), "%s: Channel halted (%x)\n",
> 			__func__, chanerr);
> 		if (test_bit(IOAT_RUN, &chan->state))
> 			BUG_ON(is_ioat_bug(chanerr));
> 		else /* we never got off the ground */
> 			return;
> 	}
>
> Thanks much,

Re: [E1000-devel] [PATCH RFC 0/2] e1000e: 82574 also needs ASPM L1 completely disabled
On Mon, 23 Apr 2012 22:29:36 +0100 Chris Boot bo...@bootc.net wrote:

> Please note I haven't as yet tested this code at all, but I do know
> that disabling ASPM L1 on these NICs (using setpci) fixes the hangs
> that I have been seeing on my Supermicro servers with X9SCL-F boards.
> I hope to get the chance to install an updated kernel on my two
> affected servers later this week.
>
> Chris Boot (2):
>   e1000e: Disable ASPM L1 on 82574
>   e1000e: Remove special case for 82573/82574 ASPM L1 disablement

Thanks Chris, we are going to take a look over the patches and Jeff Kirsher should apply them to our internal testing tree. Please let us know the results of your testing; we will let you know if we see any issues as well.

Jesse

___
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired

Re: [E1000-devel] [PATCH] e1000e: MSI interrupt test failed, using legacy interrupt
On Thu, 19 Apr 2012 10:59:47 -0700 Prasanna Panchamukhi ppanchamu...@riverbed.com wrote:

> On 04/19/2012 08:54 AM, Allan, Bruce W wrote:
>> We have not seen a report of this issue before. Please provide
>> details on the NIC or LOM and system/chipset on which the problem
>> occurs and how the additional 50ms was determined.
>
> This has been seen mostly with the Intel 82571 dual-port Gigabit
> Ethernet controller (MAC+PHY) on add-on NICs. Even 80ms works, but to
> be safer I increased it to 100ms. This issue has been seen when
> multiple PCI-E add-on NICs with dual ports are inserted.

In what system? The reason we are asking is that just increasing a delay like this often will not solve all bugs in this path, and without a root cause it is difficult to justify the patch.

Re: [E1000-devel] can't enable Flow Control for e1000e / 82572EI
On Tue, 10 Apr 2012 14:54:02 +0200 Marko Kobal marko.ko...@arctur.si wrote:

> Hi,
>
> I have an Intel Corporation 82572EI Gigabit Ethernet Controller
> (Copper) (rev 06) in my CentOS 5.7 (2.6.18-274.7.1.el5 x86_64) box.
> I have installed the latest drivers (e1000e-1.10.6.tar.gz) but can't
> enable Flow Control. It seems like Flow Control is enabled when I
> load the driver, but soon after that it is automatically disabled (?):
>
> load driver:
> # rmmod e1000e
> # modprobe e1000e
> # ethtool -a eth0
> Pause parameters for eth0:
> Autonegotiate: on
> RX: on
> TX: on
>
> after 1 second:
> # ethtool -a eth0
> Pause parameters for eth0:
> Autonegotiate: on
> RX: on
> TX: on
>
> after 5 seconds:

BTW this is after the typical 4 second autonegotiate link up for gigabit.

> # ethtool -a eth0
> Pause parameters for eth0:
> Autonegotiate: on
> RX: off
> TX: off
>
> (nothing in /var/log/messages)
>
> If I try to set it via ethtool
> # ethtool -A eth0 rx on tx on

I think our README covers this, but you need to do:

# ethtool -A eth0 autoneg off rx on tx on

Your switch or link partner is advertising that it doesn't support flow control, so we are honoring that and turning it off. You can override as per the above, but you are probably not going to get the behavior you want unless you have a network (subnet and switch) capable of flow control, and have it turned on in your managed switch.

> # ethtool -a eth0
> Pause parameters for eth0:
> Autonegotiate: on
> RX: off
> TX: off
>
> (I see e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None)
>
> I've even tried to force it as a parameter in /etc/modprobe.conf:
> options e1000e FlowControl=3
> but I get
> e1000e: Unknown parameter 'FlowControl'
> ...

The kernel driver you're using doesn't support the FlowControl parameter, and we generally expect ethtool to be used.

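The behaviour described in the reply above (pause ends up off because the link partner does not advertise it) follows the usual autonegotiation pause resolution. The sketch below is a minimal model of that resolution table as commonly described for IEEE 802.3 autonegotiation; it is not taken from the e1000e source, and the function name and argument names are made up for illustration.

```python
def resolve_pause(local_pause, local_asym, partner_pause, partner_asym):
    """Return (rx_pause, tx_pause) for the local port.

    Inputs are the PAUSE and ASM_DIR bits each side advertises.
    This is a model of the standard resolution table, not driver code.
    """
    # Both sides advertise symmetric pause: pause in both directions.
    if local_pause and partner_pause:
        return (True, True)
    # Asymmetric cases need ASM_DIR from both sides.
    if local_asym and partner_asym:
        if local_pause:
            # Local PAUSE+ASM_DIR, partner ASM_DIR only: we honor received pause.
            return (True, False)
        if partner_pause:
            # Local ASM_DIR only, partner PAUSE+ASM_DIR: we may send pause.
            return (False, True)
    # Anything else resolves to no flow control.
    return (False, False)


# The NIC advertises everything, the switch advertises nothing:
print(resolve_pause(True, True, False, False))
```

With autonegotiation of pause left on, the partner's advertisement decides the outcome, which is why the reply suggests `autoneg off` to force the setting locally.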
Re: [E1000-devel] Intel 82574L rx_short_length_errors
On Tue, 3 Apr 2012 17:49:25 +0300 Aleksey Chudov aleksey.chu...@gmail.com wrote:

> I have a few identical low-end servers with the following integrated NICs:
>
> 02:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
> 03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
>
> On all servers the counters rx_errors, rx_length_errors and
> rx_short_length_errors are constantly increasing.
>
> # ethtool -S eth0
> NIC statistics:
>      rx_packets: 441142143
>      tx_packets: 640189607
>      rx_bytes: 51863921636
>      tx_bytes: 754569587969
>      rx_broadcast: 158
>      tx_broadcast: 1
>      rx_errors: 13
>      rx_length_errors: 13
>      rx_short_length_errors: 13
>      tx_tcp_seg_good: 106335289
>      rx_long_byte_count: 51863921636
>      rx_csum_offload_good: 441123474
>      rx_csum_offload_errors: 4958
>
> I tried the following settings 82574L: ...
>
> It seems that the number of errors does not depend on configuration
> or driver version or the amount of traffic.

All the troubleshooting items listed above are typically for different issues than the one you are having.

> Then I tried to insert in one server an additional NIC 82576
> connected to the same switch through the same patch cable and the
> errors completely disappeared.

That indicates you might be having some issue at the physical layer (the PHY) on the part. What kind of switch, and what does 'ethtool eth0' report for the negotiated settings?

> Does anyone have any idea why there are errors with 82574L NIC ?

Your error rate is extremely small, and in my experience most internet-connected machines have some amount of bogus input. That said, you didn't have issues with the 82576, which has a completely different PHY. The rx_short_length errors are extremely unusual and are another strong indication of a physical layer problem (bad cable, something else gone wrong with the LAN controller PHY, possibly negotiation of tx or rx power having issues). Is the cable run very long?

<snip>

Please give us the output of ethtool -e from one of your 82574Ls having issues.

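To put a number on "your error rate is extremely small" in the reply above, here is a small worked calculation using only the rx_errors and rx_packets values quoted from the ethtool -S output. The helper name is made up for this sketch.

```python
def rx_error_rate(errors, packets):
    """Fraction of received packets that were counted as errors."""
    return errors / packets


rx_errors = 13            # rx_errors from the quoted ethtool -S output
rx_packets = 441142143    # rx_packets from the same output

rate = rx_error_rate(rx_errors, rx_packets)
packets_per_error = rx_packets // rx_errors

print(f"error rate: {rate:.2e}")
print(f"roughly one error per {packets_per_error} packets")
```

That works out to roughly three errors per hundred million received packets, which supports the point that the absolute rate is tiny even though the error type is unusual.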
Re: [E1000-devel] Fw: Enable VF only on 2nd 82576 port
Try

modprobe igb max_vfs=-1,1

--
Spelling via autocorrect, please fogrive me

On Mar 30, 2012, at 3:10 AM, Jemma Jones jemmajone...@yahoo.co.uk wrote:

> If I load the driver with
>
> modprobe igb max_vfs=0,1
>
> which would mean 0 VFs on PF 0 and 1 VF on PF 1, then I get an error
> saying "0,1 invalid for parameter max_vfs". So it's not working this
> way.
>
> From: Wyborny, Carolyn carolyn.wybo...@intel.com
> To: Jemma Jones jemmajone...@yahoo.co.uk; e1000-devel@lists.sourceforge.net e1000-devel@lists.sourceforge.net
> Sent: Thursday, 22 March 2012, 15:38
> Subject: RE: [E1000-devel] Enable VF only on 2nd 82576 port
>
>> -----Original Message-----
>> From: Jemma Jones [mailto:jemmajone...@yahoo.co.uk]
>> Sent: Thursday, March 22, 2012 8:26 AM
>> To: e1000-devel@lists.sourceforge.net
>> Subject: [E1000-devel] Enable VF only on 2nd 82576 port
>>
>> Hi,
>>
>> I've got an 82576 card with 2 ports. They show up on my system as 2
>> physical functions:
>>
>> 04:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
>> 04:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
>>
>> Now I would like to enable 1 VF on the 2nd port of the device. When
>> I load the driver with
>>
>> modprobe igb max_vfs=1
>>
>> then I get 1 VF at the first port:
>>
>> 04:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
>>
>> How can I load the driver to get 1 VF at the 2nd port?
>>
>> Cheers,
>> Jemma
>
> You need to enter one parameter for each port:
>
> modprobe igb max_vfs=1,1
>
> Thanks,
> Carolyn
>
> Carolyn Wyborny
> Linux Development
> LAN Access Division
> Intel Corporation

Re: [E1000-devel] Intel e1000e crashes on high throughput
Please use e1000-devel@lists.sourceforge.net on all future replies.

On Sun, 2012-03-04 at 23:42 -0500, Marcelo Pereira wrote:

> Hello,
>
> I have been struggling to use a NIC Intel e1000e, without success,
> for days!!
>
> I'm using the latest version of the driver (1.9.5), the kernel on the
> server is 2.6.18.
>
> It goes up and works pretty well, until I need to do some heavy
> procedure (DRBD sync process, for example). The ethtool output
> doesn't say anything weird. However, all of a sudden, I have a
> gazillion errors on the interface:

This is an error pattern we have likely seen before, but we need more info before we can make suggestions. You didn't mention any of the regular details we need:

lspci -vvv
ethtool -e eth2
dmidecode
full dmesg from boot

All these things should be attached to a new bug at https://sourceforge.net/tracker/?group_id=42302&atid=447449

> # ifconfig eth2
> eth2      Link encap:Ethernet HWaddr 68:05:CA:01:F6:FF
>           inet addr:192.168.69.1 Bcast:192.168.69.255 Mask:255.255.255.0
>           inet6 addr: fe80::6a05:caff:fe01:f6ff/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>           RX packets:38562086 errors:9251359553430 dropped:1541893258905 overruns:0 frame:6167573035620
>           TX packets:141830787 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:3429005890 (3.1 GiB) TX bytes:212307672425 (197.7 GiB)
>           Interrupt:177 Memory:c6fe-c700
>
> # ethtool -S eth2
> NIC statistics:
>      rx_packets: 206196987401
>      tx_packets: 206300243141
>      rx_bytes: 209741279230
>      tx_bytes: 418498371679
>      rx_broadcast: 206158430780
>      tx_broadcast: 206158430233
>      rx_multicast: 206158430196
>      tx_multicast: 206158430206
>      rx_errors: 1236950580960
>      tx_errors: 0
>      tx_dropped: 0
>      multicast: 206158430196
>      collisions: 0
>      rx_length_errors: 412316860320
>      rx_over_errors: 0
>      rx_crc_errors: 206158430160
>      rx_frame_errors: 206158430160
>      rx_no_buffer_count: 206158430160
>      rx_missed_errors: 206158430160
>      tx_aborted_errors: 0
>      tx_carrier_errors: 0
>      tx_fifo_errors: 0
>      tx_heartbeat_errors: 0
>      tx_window_errors: 0
>      tx_abort_late_coll: 0
>      tx_deferred_ok: 0
>      tx_single_coll_ok: 0
>      tx_multi_coll_ok: 0
>      tx_timeout_count: 0
>      tx_restart_queue: 0
>      rx_long_length_errors: 206158430160
>      rx_short_length_errors: 206158430160
>      rx_align_errors: 206158430160
>      tx_tcp_seg_good: 206171367854
>      tx_tcp_seg_failed: 206158430160
>      rx_flow_control_xon: 206158430160
>      rx_flow_control_xoff: 206158430160
>      tx_flow_control_xon: 206158430160
>      tx_flow_control_xoff: 206158430160
>      rx_long_byte_count: 209741279230
>      rx_csum_offload_good: 38561480
>      rx_csum_offload_errors: 0
>      rx_header_split: 0
>      alloc_rx_buff_failed: 0
>      tx_smbus: 206158430160
>      rx_smbus: 206158430160
>      dropped_smbus: 206158430160
>      rx_dma_failed: 0
>      tx_dma_failed: 0
>
> Just for the record, here is the ethtool output a couple of seconds
> before the crash:
>
> # ethtool -S eth2
> NIC statistics:
>      rx_packets: 568137905
>      tx_packets: 154624696
>      rx_bytes: 849530810286
>      tx_bytes: 14357782180
>      rx_broadcast: 5193
>      tx_broadcast: 387
>      rx_multicast: 283
>      tx_multicast: 102
>      rx_errors: 0
>      tx_errors: 0
>      tx_dropped: 0
>      multicast: 283
>      collisions: 0
>      rx_length_errors: 0
>      rx_over_errors: 0
>      rx_crc_errors: 0
>      rx_frame_errors: 0
>      rx_no_buffer_count: 0
>      rx_missed_errors: 0
>      tx_aborted_errors: 0
>      tx_carrier_errors: 0
>      tx_fifo_errors: 0
>      tx_heartbeat_errors: 0
>      tx_window_errors: 0
>      tx_abort_late_coll: 0
>      tx_deferred_ok: 0
>      tx_single_coll_ok: 0
>      tx_multi_coll_ok: 0
>      tx_timeout_count: 0
>      tx_restart_queue: 0
>      rx_long_length_errors: 0
>      rx_short_length_errors: 0
>      rx_align_errors: 0
>      tx_tcp_seg_good: 119928
>      tx_tcp_seg_failed: 0
>      rx_flow_control_xon: 0
>      rx_flow_control_xoff: 0
>      tx_flow_control_xon: 0
>      tx_flow_control_xoff: 0
>      rx_long_byte_count: 849530810286
>      rx_csum_offload_good: 568133159
>      rx_csum_offload_errors: 0
>      rx_header_split: 0
>      alloc_rx_buff_failed: 0
>      tx_smbus: 0
>      rx_smbus: 0
>      dropped_smbus: 0
>      rx_dma_failed: 0
>      tx_dma_failed: 0
>
> I have already tried to turn auto negotiate on and off.
>
> I have set it up to use Flow Control (ethtool -A eth2 rx on tx on).
> The dmesg output says:
>
> e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
>
> Several tests (more than 40min each). Several reboots (sometimes it
> crashes so badly that it freezes the server, and I have to
> hard-reboot).
>
> Nothing can help me out with these NICs. I have two identical
> servers, and I just need them to communicate to
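One thing that can be checked directly from the numbers quoted above is that many of the post-crash counters share the same value, and that value is an exact multiple of 0xFFFFFFFF. The sketch below only does that arithmetic on a few of the quoted counters. Reading it as "32-bit register reads returned all ones and were accumulated" is an assumption on my part and is not stated anywhere in the thread; the helper name is made up.

```python
ALL_ONES_32 = 0xFFFFFFFF


def all_ones_multiple(value):
    """Return k if value == k * 0xFFFFFFFF for some k > 0, else None."""
    if value > 0 and value % ALL_ONES_32 == 0:
        return value // ALL_ONES_32
    return None


# A few counters copied from the post-crash ethtool -S output.
crash_sample = {
    "rx_crc_errors": 206158430160,
    "rx_length_errors": 412316860320,
    "rx_errors": 1236950580960,
    "rx_packets": 206196987401,
}

for name, value in crash_sample.items():
    print(name, all_ones_multiple(value))
```

Counters such as rx_packets, which were already non-zero before the event, do not come out as exact multiples, which is consistent with the same large constant having been added on top of a previously sane value.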
Re: [E1000-devel] rx_csum_offload_errors counter questions
On Fri, 2012-02-10 at 19:25 +0700, Bokhan Artem wrote:

> Any thoughts? Maybe somebody can point to a description?
>
> On 09.02.2012 16:49, Bokhan Artem wrote:
>> Hello.
>>
>> I have several questions about the rx_csum_offload_errors counter
>> for the igb and ixgbe drivers:
>>
>> What type of errors does the rx_csum_offload_errors counter consist of?

rx_csum_offload_errors counts the number of error packets that the device allows all the way to the receive routine. L2 errors will be dropped in the hardware by the receive filter, and counted in the IXGBE_CRCERRS register, which is consolidated in the rx_errors counter in ifconfig.

>> Does it count L2 or L3 errors?

This counter is mostly for L3 or L4 errors (IP checksum, TCP checksum).

>> Does the driver pass packets with bad csums to the OS?

Yes, we don't mark the checksum as offloaded, and hand the packet to the stack to recheck/account for/drop.

>> Does the driver count packets with bad csums which will then be routed?

It is unlikely that the packet will be routed, as it will likely be dropped by the upper layers.

Sorry for the delay!

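As an illustration of the kind of L3 recheck the stack performs on a packet the hardware did not mark as checksum-verified, here is a minimal sketch of the standard Internet checksum applied to an IPv4 header. It is a user-space model only, not the kernel's or the driver's code; the function names and the sample header bytes are chosen for this example.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header_ok(header: bytes) -> bool:
    """A header that includes a correct checksum field sums to zero."""
    return internet_checksum(header) == 0


# Sample 20-byte IPv4 header with a valid checksum field (0xb861).
good = bytes.fromhex("4500 0073 0000 4000 4011 b861 c0a8 0001 c0a8 00c7")

# Corrupt one byte (the TTL) to model a packet that would fail the recheck.
bad = bytearray(good)
bad[8] ^= 0x01

print(ipv4_header_ok(good), ipv4_header_ok(bytes(bad)))
```

A packet failing this kind of check is the sort the reply says would be counted and then dropped by the upper layers rather than routed.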
Re: [E1000-devel] High number of rx_missed_errors when chaning from 1.0.2-k2 to 1.2.20-k2/1.5.1-k
On Thu, 2012-01-26 at 02:06 -0800, Carsten Aulbert wrote:

> with 1.0.2-k2 and default options (except crcstripping=0) we get
> close to 120 MB/s and no dropped packets. Rebooting the system to a
> kernel with a newer driver yields only 150-250 kB/s throughput and a
> packet drop rate close to 20%.

Hi Carsten, it sounds to me like this might be related to ASPM. Can you try the boot option pcie_aspm=off?

Before you do that, please capture the output of lspci -vvv and attach it to the bug (or send it here, I suppose). Also include ethtool -e ethX output as an attachment; I'm interested to see some settings in your EEPROM.

> I'm attaching quite a number of files to this post, but would like to
> learn how to find out what's wrong and how to fix it. This error
> seemed to be popping up here and there on this list and elsewhere,
> but so far I've yet to find a definite answer ...

As John said, rx_missed with no rx_no_buffer_count means that you're dropping packets in hardware, which typically means that something is going wrong at the bus level or the PCIe transaction level that ends up delaying packets, due to long memory latencies or something like that (these are just typical problems; I'm not saying it is exactly your issue). ASPM is one of those causes; there can be others.

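The rule of thumb in the reply above (rx_missed_errors growing while rx_no_buffer_count stays at zero points at the bus or PCIe level) can be written down as a small check over `ethtool -S` style text. This is only a sketch of that one heuristic: the function names are invented, and the sample counter values are made up for the example rather than taken from the thread.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines from ethtool -S style output into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        if sep and value.strip().isdigit():
            stats[name.strip()] = int(value)
    return stats


def missed_without_no_buffer(stats):
    """True when packets are missed in hardware with no buffer shortage counted."""
    missed = stats.get("rx_missed_errors", 0)
    no_buffer = stats.get("rx_no_buffer_count", 0)
    return missed > 0 and no_buffer == 0


# Made-up sample values, for illustration only.
sample = """NIC statistics:
     rx_packets: 1000000
     rx_no_buffer_count: 0
     rx_missed_errors: 200000
"""

stats = parse_ethtool_stats(sample)
print(missed_without_no_buffer(stats))
```

When the check is true, the reply's suggestion is to look at causes such as ASPM first; when rx_no_buffer_count is also rising, the thread does not say what to conclude, so the sketch deliberately returns False there.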
Re: [E1000-devel] ixgbe: Unsupported SFP+ modules on 10Gbit/s X520-DA2 NIC?
On Wed, 18 Jan 2012 03:30:58 -0800 Jesper Dangaard Brouer h...@comx.dk wrote:

> I just bought three 10Gbit/s X520-DA2 NICs (82599 based) for
> production usage, but I cannot get them to accept any of my 10Gbit/s
> SFP+ modules (4 different tried). According to the documentation I
> can find, the X520-DA2 NIC should support fiber optic SFP+ modules.

Hi Jesper,

For X520 adapters, the documentation [1] states which SFP+ modules are and are not supported. Direct attach cables are also supported.

[1] http://www.intel.com/support/network/adapter/pro100/sb/CS-030612.htm

> The SFP+ modules do work in another 82599 based NIC in the same
> machine (engineering sample from PJ).

Sorry, can't help you with that one; those samples are different hardware.

Jesse

Re: [E1000-devel] [PATCH net-next 2/2] igb: offer a PTP Hardware Clock instead of the timecompare method
On Mon, 2011-12-12 at 19:00 -0800, Richard Cochran wrote: This commit removes the legacy timecompare code from the igb driver and offers a tunable PHC instead. Signed-off-by: Richard Cochran richardcoch...@gmail.com Richard, first, thanks for this work, I have some feedback and request you make a V2. - /* -* The timestamp latches on lowest register read. For the 82580 -* the lowest register is SYSTIMR instead of SYSTIML. However we never -* adjusted TIMINCA so SYSTIMR will just read as all 0s so ignore it. -*/ Please keep this comment in your new igb_82580_systim_read, it explains a bit of *why* we are doing something. There were a lot of explanatory comments that you removed, please audit the - lines of your patch and add back the comments that are appropriate in your new code. - if (hw-mac.type = e1000_82580) { - stamp = rd32(E1000_SYSTIMR) 8; - shift = IGB_82580_TSYNC_SHIFT; - } - - stamp |= (u64)rd32(E1000_SYSTIML) shift; - stamp |= (u64)rd32(E1000_SYSTIMH) (shift + 32); - return stamp; -} - /** * igb_get_hw_dev - return device * used by hardware layer to print debugging information @@ -2080,7 +2052,7 @@ static int __devinit igb_probe(struct pci_dev *pdev, #endif /* do hw tstamp init after resetting */ - igb_init_hw_timer(adapter); + igb_ptp_init(adapter); dev_info(pdev-dev, Intel(R) Gigabit Ethernet Network Connection\n); /* print bus type/speed/width info */ @@ -2150,6 +2122,8 @@ static void __devexit igb_remove(struct pci_dev *pdev) struct igb_adapter *adapter = netdev_priv(netdev); struct e1000_hw *hw = adapter-hw; + igb_ptp_remove(adapter); + /* * The watchdog timer may be rescheduled, so explicitly * disable watchdog from being rescheduled. @@ -2269,112 +2243,6 @@ out: } /** - * igb_init_hw_timer - Initialize hardware timer used with IEEE 1588 timestamp - * @adapter: board private structure to initialize - * - * igb_init_hw_timer initializes the function pointer and values for the hw - * timer found in hardware. 
- **/ -static void igb_init_hw_timer(struct igb_adapter *adapter) -{ - struct e1000_hw *hw = adapter-hw; - - switch (hw-mac.type) { - case e1000_i350: - case e1000_82580: - memset(adapter-cycles, 0, sizeof(adapter-cycles)); - adapter-cycles.read = igb_read_clock; - adapter-cycles.mask = CLOCKSOURCE_MASK(64); - adapter-cycles.mult = 1; - /* -* The 82580 timesync updates the system timer every 8ns by 8ns -* and the value cannot be shifted. Instead we need to shift -* the registers to generate a 64bit timer value. As a result -* SYSTIMR/L/H, TXSTMPL/H, RXSTMPL/H all have to be shifted by -* 24 in order to generate a larger value for synchronization. -*/ - adapter-cycles.shift = IGB_82580_TSYNC_SHIFT; - /* disable system timer temporarily by setting bit 31 */ - wr32(E1000_TSAUXC, 0x8000); - wrfl(); - - /* Set registers so that rollover occurs soon to test this. */ - wr32(E1000_SYSTIMR, 0x); - wr32(E1000_SYSTIML, 0x8000); - wr32(E1000_SYSTIMH, 0x00FF); - wrfl(); - - /* enable system timer by clearing bit 31 */ - wr32(E1000_TSAUXC, 0x0); - wrfl(); - - timecounter_init(adapter-clock, -adapter-cycles, -ktime_to_ns(ktime_get_real())); - /* -* Synchronize our NIC clock against system wall clock. NIC -* time stamp reading requires ~3us per sample, each sample -* was pretty stable even under load = only require 10 -* samples for each offset comparison. -*/ - memset(adapter-compare, 0, sizeof(adapter-compare)); - adapter-compare.source = adapter-clock; - adapter-compare.target = ktime_get_real; - adapter-compare.num_samples = 10; - timecompare_update(adapter-compare, 0); - break; - case e1000_82576: - /* -* Initialize hardware timer: we keep it running just in case -* that some program needs it later on. -*/ - memset(adapter-cycles, 0, sizeof(adapter-cycles)); - adapter-cycles.read = igb_read_clock; - adapter-cycles.mask = CLOCKSOURCE_MASK(64); - adapter-cycles.mult = 1; - /** -* Scale the NIC clock cycle by a
Re: [E1000-devel] interface counters
On Tue, 2011-12-13 at 04:16 -0800, Bokhan Artem wrote:

> Hello!
>
> Is it possible to update interface counters more often than every 2
> secs? Probably with some changes to the source.

Yes, it is possible, and in fact several drivers do update a small set of stats in real time, or when called via the update_stats entry point. What driver were you curious about?

Re: [E1000-devel] WARNING: at net/core/dev.c:1904 skb_gso_segment+0x146/0x298()
cc: e1000-devel

On Wed, 23 Nov 2011 16:30:34 -0800 Paweł Staszewski pstaszew...@itcare.pl wrote:

> After upgrade from 2.6.38.2 to 3.1.2 I have this in dmesg:
>
> [ 600.266497] WARNING: at net/core/dev.c:1904 skb_gso_segment+0x146/0x298()
> [ 600.266500] Hardware name: X8DTU-6+
> [ 600.266503] 802.1Q VLAN Support: caps=(0x20115833, 0x0) len=2816 data_len=2776 ip_summed=1

It seems no one ever replied. Can you give us more details about the traffic and network configuration that reproduces the panic? What does the output of 'ip address' show? vconfig?

It seems as if GRO is pushing a packet into the stack to be forwarded that GSO is mad about due to checksum != CHECKSUM_PARTIAL, especially when stacked upon macvlan and/or vlan: see dev.c:1904 in the 3.1 kernel.

> [ 600.266506] Modules linked in: macvlan
> [ 600.266511] Pid: 0, comm: kworker/0:1 Not tainted 3.1.2 #1
> [ 600.266513] Call Trace:
> [ 600.266515] <IRQ> [8103449c] warn_slowpath_common+0x80/0x98
> [ 600.266527] [81034548] warn_slowpath_fmt+0x41/0x43
> [ 600.266530] [813c838f] skb_gso_segment+0x146/0x298
> [ 600.266535] [8103994e] ? local_bh_enable+0xd/0xf
> [ 600.266540] [813cc646] dev_hard_start_xmit+0x35a/0x57d
> [ 600.266544] [8103994e] ? local_bh_enable+0xd/0xf
> [ 600.266548] [813cccb2] dev_queue_xmit+0x449/0x4ef
> [ 600.266554] [813ffc4d] ip_finish_output2+0x1c4/0x201
> [ 600.266560] [813ffd1c] ip_finish_output+0x92/0x97
> [ 600.266562] [813ffe7c] T.1037+0x4f/0x56
> [ 600.266565] [8145] ip_output+0x58/0x5b
> [ 600.266567] [813fc4f0] ip_forward_finish+0x44/0x48
> [ 600.266569] [813fc7f4] ip_forward+0x300/0x36c
> [ 600.266572] [813fb144] ip_rcv_finish+0x2a4/0x2ce
> [ 600.266575] [813faea0] ? inet_del_protocol+0x37/0x37
> [ 600.266577] [813fb431] T.935+0x4c/0x53
> [ 600.266579] [813fb6bc] ip_rcv+0x237/0x263
> [ 600.266582] [813cb76b] __netif_receive_skb+0x41d/0x44f
> [ 600.266584] [813cb891] process_backlog+0xf4/0x1d3
> [ 600.266587] [813cbfee] net_rx_action+0x74/0x1cb
> [ 600.266589] [81039a74] __do_softirq+0xc8/0x1a4
> [ 600.266591] [81039b31] ? __do_softirq+0x185/0x1a4
> [ 600.266595] [814a8bec] call_softirq+0x1c/0x30
> [ 600.266599] [8100385d] do_softirq+0x41/0x7e
> [ 600.266601] [8103987b] irq_exit+0x44/0x74
> [ 600.266603] [81003182] do_IRQ+0x98/0xaf
> [ 600.266606] [814a0f2e] common_interrupt+0x6e/0x6e
> [ 600.266608] <EOI> [8100887e] ? mwait_idle+0x7e/0xa4
> [ 600.266613] [81008836] ? mwait_idle+0x36/0xa4
> [ 600.266615] [81001da7] cpu_idle+0x5f/0x91
> [ 600.266620] [81acca55] start_secondary+0x192/0x196
> [ 600.266622] ---[ end trace 15512840060b2da9 ]---
>
> Network controller: Intel Corporation 82598EB 10-Gigabit AT CX4 Network Connection (rev 01)
>
> ethtool -i eth4
> driver: ixgbe
> version: 3.4.8-k
> firmware-version: 1.12-2
> bus-info: :04:00.0
>
> ethtool -k eth4
> Offload parameters for eth4:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp-segmentation-offload: on
> udp-fragmentation-offload: off
> generic-segmentation-offload: on
> generic-receive-offload: on
> large-receive-offload: off
> ntuple-filters: off
> receive-hashing: on

Re: [E1000-devel] [BUG] e1000: possible deadlock scenario caught by lockdep
CC'd netdev, and e1000-devel

On Thu, 17 Nov 2011 17:27:00 -0800 Steven Rostedt rost...@goodmis.org wrote:

> I hit the following lockdep splat:
>
> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 3.2.0-rc2-test+ #14
> -------------------------------------------------------
> reboot/2316 is trying to acquire lock:
>  ((&(&adapter->watchdog_task)->work)){+.+...}, at: [81069553] wait_on_work+0x0/0xac
>
> but task is already holding lock:
>  (&adapter->mutex){+.+...}, at: [81359b1d] __e1000_shutdown+0x56/0x1f5
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&adapter->mutex){+.+...}:
>        [8108261a] lock_acquire+0x103/0x158
>        [8150bcf3] __mutex_lock_common+0x6a/0x441
>        [8150c13d] mutex_lock_nested+0x1b/0x1d
>        [81359288] e1000_watchdog+0x56/0x4a4
>        [8106a1b0] process_one_work+0x1ef/0x3e0
>        [8106b4e0] worker_thread+0xda/0x15e
>        [8106f00e] kthread+0x9f/0xa7
>        [81514e84] kernel_thread_helper+0x4/0x10
>
> -> #0 ((&(&adapter->watchdog_task)->work)){+.+...}:
>        [81081e4a] __lock_acquire+0xa29/0xd06
>        [8108261a] lock_acquire+0x103/0x158
>        [81069590] wait_on_work+0x3d/0xac
>        [8106a616] __cancel_work_timer+0xb9/0xff
>        [8106a66e] cancel_delayed_work_sync+0x12/0x14
>        [81355c8f] e1000_down_and_stop+0x2e/0x4a
>        [813581ed] e1000_down+0x116/0x176
>        [81359b4a] __e1000_shutdown+0x83/0x1f5
>        [81359cd6] e1000_shutdown+0x1a/0x43
>        [8126fdad] pci_device_shutdown+0x29/0x3d
>        [8130c601] device_shutdown+0xbe/0xf9
>        [81065b17] kernel_restart_prepare+0x31/0x38
>        [81065b32] kernel_restart+0x14/0x51
>        [81065cd8] sys_reboot+0x157/0x1b0
>        [81513882] system_call_fastpath+0x16/0x1b
>
> other info that might help us debug this:
>
>  Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&adapter->mutex);
>                                lock((&(&adapter->watchdog_task)->work));
>                                lock(&adapter->mutex);
>   lock((&(&adapter->watchdog_task)->work));
>
>  *** DEADLOCK ***
>
> 2 locks held by reboot/2316:
>  #0: (reboot_mutex){+.+.+.}, at: [81065c20] sys_reboot+0x9f/0x1b0
>  #1: (&adapter->mutex){+.+...}, at: [81359b1d] __e1000_shutdown+0x56/0x1f5
>
> stack backtrace:
> Pid: 2316, comm: reboot Not tainted 3.2.0-rc2-test+ #14
> Call Trace:
>  [81503eb2] print_circular_bug+0x1f8/0x209
>  [81081e4a] __lock_acquire+0xa29/0xd06
>  [81069553] ? wait_on_cpu_work+0x94/0x94
>  [8108261a] lock_acquire+0x103/0x158
>  [81069553] ? wait_on_cpu_work+0x94/0x94
>  [810c7caf] ? trace_preempt_on+0x2a/0x2f
>  [81069590] wait_on_work+0x3d/0xac
>  [81069553] ? wait_on_cpu_work+0x94/0x94
>  [8106a616] __cancel_work_timer+0xb9/0xff
>  [8106a66e] cancel_delayed_work_sync+0x12/0x14
>  [81355c8f] e1000_down_and_stop+0x2e/0x4a
>  [813581ed] e1000_down+0x116/0x176
>  [81359b4a] __e1000_shutdown+0x83/0x1f5
>  [8150d51c] ? _raw_spin_unlock+0x33/0x56
>  [8130c583] ? device_shutdown+0x40/0xf9
>  [81359cd6] e1000_shutdown+0x1a/0x43
>  [81510757] ? sub_preempt_count+0xa1/0xb4
>  [8126fdad] pci_device_shutdown+0x29/0x3d
>  [8130c601] device_shutdown+0xbe/0xf9
>  [81065b17] kernel_restart_prepare+0x31/0x38
>  [81065b32] kernel_restart+0x14/0x51
>  [81065cd8] sys_reboot+0x157/0x1b0
>  [81072ccb] ? hrtimer_cancel+0x17/0x24
>  [8150c304] ? do_nanosleep+0x74/0xac
>  [8125c72d] ? trace_hardirqs_off_thunk+0x3a/0x3c
>  [8150e066] ? error_sti+0x5/0x6
>  [810c7c80] ? time_hardirqs_off+0x2a/0x2f
>  [8125c6ee] ? trace_hardirqs_on_thunk+0x3a/0x3f
>  [8150db5d] ? retint_swapgs+0x13/0x1b
>  [8150db5d] ? retint_swapgs+0x13/0x1b
>  [81082a78] ? trace_hardirqs_on_caller+0x12d/0x164
>  [810a74ce] ? audit_syscall_entry+0x11c/0x148
>  [8125c6ee] ? trace_hardirqs_on_thunk+0x3a/0x3f
>  [81513882] system_call_fastpath+0x16/0x1b
>
> The issue comes from two recent commits:
>
> commit a4010afef585b7142eb605e3a6e4210c0e1b2957
> Author: Jesse Brandeburg jesse.brandeb...@intel.com
> Date: Wed Oct 5 07:24:41 2011 +
>
>     e1000: convert hardware management from timers to threads
>
> and
>
> commit 0ef4eedc2e98edd51cd106e1f6a27178622b7e57
> Author: Jesse Brandeburg jesse.brandeb...@intel.com
> Date: Wed Oct 5 07:24:51 2011 +
>
>     e1000: convert to private mutex from rtnl
>
> What we have is on __e1000_shutdown():
>
> 	mutex_lock(&adapter->mutex);
>
> 	if (netif_running(netdev)) {
> 		WARN_ON(test_bit(__E1000_RESETTING, &adapter->flags
Re: [E1000-devel] [BUG] e1000: possible deadlock scenario caught by lockdep
On Fri, 18 Nov 2011 08:57:37 -0800 Jesse Brandeburg jesse.brandeb...@intel.com wrote: CC'd netdev, and e1000-devel On Thu, 17 Nov 2011 17:27:00 -0800 Steven Rostedt rost...@goodmis.org wrote: Here you see that we are calling cancel_delayed_work_sync(adapter-watchdog_task); The problem is that adapter-watchdog_task grabs the mutex adapter-mutex. If the work has started and it blocked on that mutex, the cancel_delayed_work_sync() will block indefinitely and we have a deadlock. Not sure what's the best way around this. Can we call e1000_down() without grabbing the adapter-mutex? Thanks for the report, I'll look at it today and see if I can work out a way to avoid the bonk. this is a proposed patch to fix the issue: if it works for you please let me know and I will submit it officially through our process e1000: fix lockdep splat in shutdown handler From: Jesse Brandeburg jesse.brandeb...@intel.com as reported by Steven Rostedt, e1000 has a lockdep splat added during the recent merge window. The issue is that cancel_delayed_work is called while holding our private mutex. There is no reason that I can see to hold the mutex during pci shutdown, it was more just paranoia that I put the mutex_lock around the call to e1000_down. in a quick survey lots of drivers handle locking differently when being called by the pci layer. The assumption here is that we don't need the mutexes' protection in this function because the driver could not be unloaded while in the shutdown handler which is only called at reboot or poweroff. 
Reported-by: Steven Rostedt <rost...@goodmis.org>
Signed-off-by: Jesse Brandeburg <jesse.brandeb...@intel.com>
---
 drivers/net/ethernet/intel/e1000/e1000_main.c |    8 +---
 1 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index cf480b5..97b46ba 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -4716,8 +4716,6 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake)

 	netif_device_detach(netdev);

-	mutex_lock(&adapter->mutex);
-
 	if (netif_running(netdev)) {
 		WARN_ON(test_bit(__E1000_RESETTING, &adapter->flags));
 		e1000_down(adapter);
@@ -4725,10 +4723,8 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake)

 #ifdef CONFIG_PM
 	retval = pci_save_state(pdev);
-	if (retval) {
-		mutex_unlock(&adapter->mutex);
+	if (retval)
 		return retval;
-	}
 #endif

 	status = er32(STATUS);
@@ -4783,8 +4779,6 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake)
 	if (netif_running(netdev))
 		e1000_free_irq(adapter);

-	mutex_unlock(&adapter->mutex);
-
 	pci_disable_device(pdev);

 	return 0;
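
To make the deadlock in this thread concrete, here is a minimal user-space sketch in C. It is not kernel code and not how lockdep is implemented; it only models "who waits for whom" for the two parties Steven describes and checks whether the waits form a cycle. The names `SHUTDOWN`, `WATCHDOG`, `model_shutdown` and `has_cycle` are invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical actors in the toy model (not kernel identifiers). */
enum actor { SHUTDOWN, WATCHDOG, NUM_ACTORS };

/* waits_for[a] is the actor that 'a' is blocked on, or -1 if none. */
struct wait_graph {
	int waits_for[NUM_ACTORS];
};

/* Model __e1000_shutdown() with or without the private mutex held. */
static struct wait_graph model_shutdown(bool shutdown_holds_mutex)
{
	struct wait_graph g;

	/* cancel_delayed_work_sync() makes shutdown wait for the watchdog. */
	g.waits_for[SHUTDOWN] = WATCHDOG;
	/* Worst case: the watchdog has started and wants the mutex. It is
	 * only blocked if the shutdown path is the one holding that mutex. */
	g.waits_for[WATCHDOG] = shutdown_holds_mutex ? SHUTDOWN : -1;
	return g;
}

/* Follow the wait chain from 'start'; coming back to 'start' is a deadlock. */
static bool has_cycle(const struct wait_graph *g, int start)
{
	int cur;
	int steps;

	assert(start >= 0 && start < NUM_ACTORS);
	cur = g->waits_for[start];
	for (steps = 0; cur != -1 && steps < NUM_ACTORS; steps++) {
		if (cur == start)
			return true;
		cur = g->waits_for[cur];
	}
	return false;
}
```

With the mutex held across e1000_down(), `has_cycle()` reports a cycle; with the lock dropped, as in the proposed patch, the wait chain ends at the watchdog and there is nothing to deadlock on.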
Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
On Mon, 24 Oct 2011 23:29:34 -0700 Michael Wang <wang...@linux.vnet.ibm.com> wrote:

> Maybe you can just search for the macro E1000_TXDCTL_DMA_BURST_ENABLE in drivers/net/e1000e/e1000.h and change it to:
>
> #define E1000_TXDCTL_DMA_BURST_ENABLE \
>         (E1000_TXDCTL_GRAN | /* set descriptor granularity */ \
>          E1000_TXDCTL_COUNT_DESC | \
>          (0 << 16) | /* wthresh must be +1 more than desired */ \
>          (1 << 8) | /* hthresh */ \
>          0x1f) /* pthresh */
>
> This will do the write-back even when only one descriptor has completed. If that solves the problem, we can think about a good solution.

I can already tell you that this will fix the problem, but wthresh=1 is more like the hardware default after reset, I think. Doing this will prevent the bursting behavior that got us the performance improvement this patch was made for, which is bad.

That is why we are looking at a solution that likely involves two flush writes via the flush partial descriptors bits. Just set bit 31 in TIDV and RDTR twice in a row and then make sure the write is flushed. If you wish to implement that and give it a try, that would be useful information. We haven't had time yet to get a full repro going.
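
For readers who want to see how the three threshold values in the quoted macro end up in one register word, here is a small stand-alone C sketch. The shift positions (0, 8, 16) come straight from the macro above; the 6-bit field width and all names (`txdctl_thresholds`, `txdctl_wthresh`, `TXDCTL_THRESH_MASK`) are assumptions made for this illustration and should be checked against the controller datasheet before being relied on.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions taken from the shifts in the quoted macro. */
#define TXDCTL_PTHRESH_SHIFT 0
#define TXDCTL_HTHRESH_SHIFT 8
#define TXDCTL_WTHRESH_SHIFT 16
/* Assumption: each threshold field is 6 bits wide. */
#define TXDCTL_THRESH_MASK 0x3fu

/* Pack prefetch, host and write-back thresholds into one 32-bit value. */
static uint32_t txdctl_thresholds(uint32_t pthresh, uint32_t hthresh,
				  uint32_t wthresh)
{
	assert(pthresh <= TXDCTL_THRESH_MASK);
	assert(hthresh <= TXDCTL_THRESH_MASK);
	assert(wthresh <= TXDCTL_THRESH_MASK);

	return (wthresh << TXDCTL_WTHRESH_SHIFT) |
	       (hthresh << TXDCTL_HTHRESH_SHIFT) |
	       (pthresh << TXDCTL_PTHRESH_SHIFT);
}

/* Extract the write-back threshold again, e.g. when reading a register dump. */
static uint32_t txdctl_wthresh(uint32_t txdctl)
{
	return (txdctl >> TXDCTL_WTHRESH_SHIFT) & TXDCTL_THRESH_MASK;
}
```

The suggested change corresponds to `txdctl_thresholds(0x1f, 1, 0)`, while the wthresh=1 setting Jesse mentions would be `txdctl_thresholds(0x1f, 1, 1)`. The granularity and count-descriptor bits from the macro are deliberately left out of this sketch.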
Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
On Fri, 14 Oct 2011 10:04:26 -0700 Flavio Leitner <f...@redhat.com> wrote:

> Hi,
>
> I have received a few reports so far that 82571EB models are hitting the "Detected Hardware Unit Hang" issue after upgrading the kernel.
>
> Further debugging with an instrumented kernel revealed that the socket buffer time stamp matches the last time e1000_xmit_frame() was called, and that the time stamp of the last e1000_clean_tx_irq() run is prior to the one in the socket buffer.
>
> However, ~1 second later, an interrupt is fired and the old entry is found. Sometimes the scheduled print_hang_task dumps the information _after_ the old entry is sent (it shows an empty ring), indicating that the HW TX unit isn't really stuck and apparently just missed the signal to initiate the transmission.
>
> Order of events:
> (1) skb is pushed down
> (2) e1000_xmit_frame() is called
> (3) ring is filled with one entry
> (4) TDT is updated
> (5) nothing happens for a little more than 1 second
> (6) interrupt is fired
> (7) e1000_clean_tx_irq() is called
> (8) finds the entry not ready with an old time stamp, schedules print_hang_task and stops the TX queue
> (9) print_hang_task runs and dumps the info, but the old entry is now sent
> (10) apparently the TX queue is back

Flavio, thanks for the detailed info. Please be sure to supply us the bugzilla number.

TDH is probably not moving due to the writeback threshold settings in TXDCTL. The netperf UDP_RR test is likely a good way to test this. I don't think the sequence is quite what you said. We are going to work with the hardware team to get a sequence that works right, and we should have a fix for you soon.
> The following commit seems to be related to the symptoms seen above:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=3a3b75860527a11ba5035c6aa576079245d09e2a
>
> From: Jesse Brandeburg <jesse.brandeb...@intel.com>
> Date: Wed, 29 Sep 2010 21:38:49 + (+)
> Subject: e1000e: use hardware writeback batching
> X-Git-Tag: v2.6.37-rc1~147^2~299
> X-Git-Url: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Ftorvalds%2Flinux-2.6.git;a=commitdiff_plain;h=3a3b75860527a11ba5035c6aa576079245d09e2a
>
> e1000e: use hardware writeback batching
>
> Most e1000e parts support batching writebacks. The problem with this is that when some of the TADV or TIDV timers are not set, Tx can sit forever.
>
> This is solved in this patch with write flushes using the Flush Partial Descriptors (FPD) bit in TIDV and RDTR.
>
> This improves bus utilization and removes partial writes on e1000e, particularly from 82571 parts in S5500 chipset based machines.
>
> Only ES2LAN and 82571/2 parts are included in this optimization, to reduce testing load.
>
> We have modified the instrumented kernel to include the following patch, which disables the writeback batching feature, to narrow down the problem:
>
> --- debug/drivers/net/e1000e/82571.c.orig 2011-10-11 14:00:44.0 -0300
> +++ debug/drivers/net/e1000e/82571.c 2011-10-11 15:02:51.0 -0300
> @@ -2028,8 +2028,7 @@ struct e1000_info e1000_82571_info = {
>  				  | FLAG_RESET_OVERWRITES_LAA /* errata */
>  				  | FLAG_TARC_SPEED_MODE_BIT /* errata */
>  				  | FLAG_APME_CHECK_PORT_B,
> -	.flags2			= FLAG2_DISABLE_ASPM_L1 /* errata 13 */
> -				  | FLAG2_DMA_BURST,
> +	.flags2			= FLAG2_DISABLE_ASPM_L1, /* errata 13 */
>  	.pba			= 38,
>  	.max_hw_frame_size	= DEFAULT_JUMBO,
>
> and the customer confirmed that the issue has disappeared since then.
> Board info:
> 1e:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
> 1e:00.0 0200: 8086:10bc (rev 06)
>         Subsystem: 103c:704b
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin B routed to IRQ 224
>         Region 0: Memory at fd4e (32-bit, non-prefetchable) [size=128K]
>         Region 1: Memory at fd40 (32-bit, non-prefetchable) [size=512K]
>         Region 2: I/O ports at 7000 [size=32]
>         Capabilities: [c8] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                 Address: fee0 Data: 4073
>         Capabilities: [e0] Express (v1) Endpoint, MSI 00
>                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 512ns, L1 64us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>                 DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported
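
Step (8) in the order of events earlier in this thread boils down to a time-stamp comparison. The following user-space C sketch shows that kind of check under simplifying assumptions: the struct, its fields and the function name are hypothetical, time is a plain monotonically increasing tick counter rather than jiffies, and the real driver consults additional state before declaring a hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one transmit ring entry. */
struct tx_entry {
	uint64_t time_stamp; /* tick when the frame was queued, 0 if unused */
	bool done;           /* true once the hardware has written back completion */
};

/*
 * An entry looks hung when it is in use, has not been marked done, and
 * was queued more than 'timeout' ticks ago. 'now' must not be earlier
 * than the entry's time stamp.
 */
static bool tx_entry_looks_hung(const struct tx_entry *e, uint64_t now,
				uint64_t timeout)
{
	if (e->time_stamp == 0 || e->done)
		return false;
	assert(now >= e->time_stamp);
	return (now - e->time_stamp) > timeout;
}
```

This also illustrates why a late completion write-back is enough to trigger the message: the frame may already be on the wire, but as long as the entry is not marked done when the timeout expires, the check cannot tell the difference.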
Re: [E1000-devel] 82574 DMA Burst Mode Enablement
On Wed, 28 Sep 2011 11:39:54 -0700 Denis Radovanovic <denis.radovano...@riverbed.com> wrote:

> We are currently testing small packet performance on 82574, comparing it to 82571. Initial pktgen measurements have shown a significant difference in performance that is most visible when running bidirectional traffic with 256 byte packets. Looking at the e1000e driver, we noticed that flag FLAG2_DMA_BURST is enabled for 82571 and 82572 but it is not enabled for 82574. After enabling the flag, the 82574 performance significantly improved, approaching that of the 82571.

At the time the feature was implemented we didn't have the bandwidth to validate it on other parts besides 82571/2.

As it stands, yes, you can enable it, but there will likely be some bugs that you will run into that we already know about but don't fully have fixed in the code. The bugs might result in tx hangs or other issues.

I do agree that there are significant performance gains to be had via this feature, if the bugs can all be worked out.

If this is a feature that you would really like implemented, please use your Intel Field Agent or TME contacts to document your requirement so we can consider it for future releases.

Thanks,
Jesse
Re: [E1000-devel] Possible to stop external IPMI/BMC access (port 623) by bringing iface up?
On Wed, 28 Sep 2011 11:23:50 -0700 Carsten Aulbert <carsten.aulb...@aei.mpg.de> wrote:

> But now we reinstalled several machines with Debian Squeeze and suddenly we can only query the BMC when eth0 is down. The kernel we use is exactly the same (2.6.32.28 or 2.6.32.46 currently), i.e. same binary .deb package, same config, only the userland is changed.

This is probably the driver touching a register that prevents IPMI traffic from flowing to the BMC. It may be a patch that Debian made that broke it; I don't generally track Debian's forks of the kernel. :-)

Can you send the output from the ethregs tool before down and after down? ethregs is available on e1000.sf.net in the downloads area.
Re: [E1000-devel] [PATCH net-next-2.6] e1000: don't enable dma receives until after dma address has been setup
On Wed, 14 Sep 2011 17:31:38 -0700 Dean Nelson <dnel...@redhat.com> wrote:

> Doing an 'ifconfig ethN down' followed by an 'ifconfig ethN up' on a qemu-kvm guest system configured with two e1000 NICs can result in an 'unable to handle kernel paging request at 0001' or 'bad page map in process ...' or something similar.

<snip>

> The corruption appears to result from the following...
>
> . An 'ifconfig ethN down' gets us into e1000_close(), which through a number of subfunctions results in:
>     1. E1000_RCTL_EN being cleared in RCTL register. [e1000_down()]
>     2. dma_free_coherent() being called. [e1000_free_rx_resources()]
>
> . An 'ifconfig ethN up' gets us into e1000_open(), which through a number of subfunctions results in:
>     1. dma_alloc_coherent() being called. [e1000_setup_rx_resources()]
>     2. E1000_RCTL_EN being set in RCTL register. [e1000_setup_rctl()]
>     3. E1000_RCTL_EN being cleared in RCTL register. [e1000_configure_rx()]
>     4. RDLEN, RDBAH and RDBAL registers being set to reflect the dma page allocated in step 1. [e1000_configure_rx()]
>     5. E1000_RCTL_EN being set in RCTL register. [e1000_configure_rx()]
>
> During the 'ifconfig ethN up' there is a window opened, starting in step 2 where the receives are enabled up until they are disabled in step 3, in which the address of the receive descriptor dma page known by the NIC is still the previous one which was freed during the 'ifconfig ethN down'. If this memory has been reallocated for some other use and the NIC feels so inclined, it will write to that former dma page with predictably unpleasant results.
>
> I realize that in the guest, we're dealing with an e1000 NIC that is software emulated by qemu-kvm. The problem doesn't appear to occur on bare-metal. Andy suspects that this is because in the emulator link-up is essentially instant and traffic can start flowing immediately. Whereas on bare-metal, link-up usually seems to take at least a few milliseconds. And this might be enough to prevent traffic from flowing into the device inside the window where E1000_RCTL_EN is set.

Nice analysis Dean. Yes, we shouldn't enable rx before we have the hardware all ready. You didn't mention, however, that the hardware is reset in e1000_down, which will clear the RDBAL/RDBAH in real hardware.

> So perhaps a modification needs to be made to the qemu-kvm e1000 NIC emulator to delay the link-up. But in defense of the emulator, it seems like a bad idea to enable dma operations before the address of the memory to be involved has been made known.

The hardware reset code in kvm should also reset many registers to their defaults (almost all of them, in fact), which may also end up solving the problem.

> The following patch no longer enables receives in e1000_setup_rctl() but leaves them however they were. It only enables receives in e1000_configure_rx(), and only after the dma address has been made known to the hardware.

I still like your patch better as it is more correct. We could also correct the kvm virtual hardware driver.

> There are two places where e1000_setup_rctl() gets called. The one in e1000_configure() is followed immediately by a call to e1000_configure_rx(), so there's really no change functionally (except for the removal of the problem window). The other is in __e1000_shutdown() and is not followed by a call to e1000_configure_rx(), so there is a change functionally. But consider...
>
> . An 'ifconfig ethN down' (just as described above).
>
> . A 'suspend' of the system, which (I'm assuming) will find its way into e1000_suspend() which calls __e1000_shutdown() resulting in:
>     1. E1000_RCTL_EN being set in RCTL register. [e1000_setup_rctl()]
>
> And again we've re-opened the problem window for some unknown amount of time.
>
> Signed-off-by: Andy Gospodarek <a...@greyhouse.net>
> Signed-off-by: Dean Nelson <dnel...@redhat.com>
> ---
>
> The patch below is Andy's version of a patch I came up with to address this problem. I liked his version better. Functionally there was no difference between the two.
>
> Running my version of the patch, the reproducer (see script below) ran for 5 days without issue before I stopped it. Without the patch, former dma pages were corrupted in a very short timeframe and fairly frequently (relatively speaking).
>
> Note that I'm also running with a debug patch that, after step 5 has completed (mentioned above under an 'ifconfig ethN up'...), scans the previous dma page to see if it has been 'corrupted'. So I found a higher percentage of occurrences than one would find if one waits for a kernel BUG.
>
> The reproducer for this problem is:
>
>     cat > reproducer.sh << 'EOF'
>     #!/bin/bash
>     typeset -i i=0
>     echo "eth1:down"
>     ifconfig eth1 down
>     sleep 2
>     while :; do
>         i=$i+1
>         ifconfig eth0 down &
>         ifconfig eth1 up &
>         echo "$i | eth0:down eth1:up"
>         wait
>         sleep 2
>         ifconfig eth0 up &
>         ifconfig eth1 down &
>         echo "$i | eth0:up eth1:down"
>         wait
>         sleep 2
>     done
>     EOF
>
> The e1000e looks to have the
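
The ordering problem described in this thread can be shown with a toy register model in plain C. This is only a sketch: `struct toy_nic` and the `toy_*` functions are invented for the illustration, the model assumes (as in the emulated case discussed above) that the ring base register still holds the freed address when the interface comes up, and `TOY_RCTL_EN` simply mirrors the receive-enable bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RCTL_EN 0x2u /* receive enable bit in the toy RCTL register */

struct toy_nic {
	uint32_t rctl;        /* receive control register */
	uint64_t rx_ring_dma; /* descriptor ring base the NIC currently knows */
	uint64_t stale_dma;   /* address that was freed by the last 'down' */
	bool unsafe_seen;     /* RX was enabled while pointing at freed memory */
};

/* State after 'ifconfig down': RX off, ring base still the freed page. */
static struct toy_nic toy_nic_after_down(uint64_t freed_dma)
{
	struct toy_nic nic = { 0, freed_dma, freed_dma, false };
	return nic;
}

static void toy_write_rctl(struct toy_nic *nic, uint32_t val)
{
	nic->rctl = val;
	if ((val & TOY_RCTL_EN) && nic->rx_ring_dma == nic->stale_dma)
		nic->unsafe_seen = true;
}

/* Old ordering: steps 2 to 5 of the 'ifconfig up' list above. */
static void toy_up_old(struct toy_nic *nic, uint64_t new_dma)
{
	toy_write_rctl(nic, nic->rctl | TOY_RCTL_EN);  /* step 2: window opens */
	toy_write_rctl(nic, nic->rctl & ~TOY_RCTL_EN); /* step 3 */
	nic->rx_ring_dma = new_dma;                    /* step 4 */
	toy_write_rctl(nic, nic->rctl | TOY_RCTL_EN);  /* step 5 */
}

/* Patched ordering: enable receives only after the new base is programmed. */
static void toy_up_new(struct toy_nic *nic, uint64_t new_dma)
{
	toy_write_rctl(nic, nic->rctl & ~TOY_RCTL_EN);
	nic->rx_ring_dma = new_dma;
	toy_write_rctl(nic, nic->rctl | TOY_RCTL_EN);
}
```

Running `toy_up_old()` on a NIC produced by `toy_nic_after_down()` sets `unsafe_seen`, while `toy_up_new()` never does, which is the window the patch closes.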
Re: [E1000-devel] e1000e: NIC not working (after resume?)
On Fri, Sep 9, 2011 at 6:43 AM, Frederik Himpe <fhi...@telenet.be> wrote:

> [Crossposting to e1000 mailing list]
>
> I have a Dell Latitude E6400 which has a network card supported by the e1000e driver. Often (I think after a suspend/resume cycle), the network card does not work at all: the NIC is correctly seen by ifconfig, but running ethtool just returns: No such device. dhclient -v gives the impression that it's correctly sending out DHCPDISCOVER packets on the NIC, but a tcpdump running on the same machine does not see any packets going out.
>
> I'm using Debian's 3.0.0-3 kernel (corresponding with Linux 3.0.3).
>
> Full lspci, .config and dmesg output at http://artipc10.vub.ac.be/~frederik/e1000e/
>
> Here is some relevant summary. How can I find out what is going wrong?
>
> # ifconfig eth0
> eth0      Link encap:Ethernet  HWaddr 00:21:70:e1:bb:4c
>           UP BROADCAST PROMISC MULTICAST  MTU:1500  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
>           Interrupt:22 Memory:f6ae-f6b0
>
> # ethtool eth0
> Settings for eth0:
> Cannot get device settings: No such device
> Cannot get wake-on-lan settings: No such device
> Cannot get message level: No such device
> Cannot get link status: No such device
> No data available
>
> # lspci -vvnn
> 00:19.0 Ethernet controller [0200]: Intel Corporation 82567LM Gigabit Network Connection [8086:10f5] (rev 03)
>         Subsystem: Dell Device [1028:0233]
>         Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Interrupt: pin A routed to IRQ 22
>         Region 0: Memory at f6ae (32-bit, non-prefetchable) [disabled] [size=128K]
>         Region 1: Memory at f6adb000 (32-bit, non-prefetchable) [disabled] [size=4K]
>         Region 2: I/O ports at efe0 [disabled] [size=32]
>         Capabilities: [c8] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D3 NoSoftRst- PME-Enable+ DSel=0 DScale=1 PME+

The above is why nothing seems to work right. The kernel runtime power management thinks there is no link on the port, and so is putting the port into D3. We need to make sure that our driver wakes the device to read link state (the problem is that wake from D3, plus the time to get link, can take ~4 seconds).

>         Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
>                 Address: fee0300c Data: 4182
>         Capabilities: [e0] PCI Advanced Features
>                 AFCap: TP+ FLR+
>                 AFCtrl: FLR-
>                 AFStatus: TP-
>         Kernel driver in use: e1000e
>
> # dmesg | grep -E "e1000e|eth0"
> [ 1.027437] e1000e: Intel(R) PRO/1000 Network Driver - 1.3.10-k2
> [ 1.027441] e1000e: Copyright(c) 1999 - 2011 Intel Corporation.
> [ 1.027480] e1000e :00:19.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
> [ 1.027491] e1000e :00:19.0: setting latency timer to 64
> [ 1.027605] e1000e :00:19.0: irq 43 for MSI/MSI-X
> [ 1.231440] e1000e :00:19.0: eth0: (PCI Express:2.5GT/s:Width x1) 00:21:70:e1:bb:4c
> [ 1.231444] e1000e :00:19.0: eth0: Intel(R) PRO/1000 Network Connection
> [ 1.231470] e1000e :00:19.0: eth0: MAC: 7, PHY: 8, PBA No: 1004FF-0FF
> [ 22.896268] e1000e :00:19.0: irq 43 for MSI/MSI-X
> [ 22.952097] e1000e :00:19.0: irq 43 for MSI/MSI-X
> [ 22.954132] ADDRCONF(NETDEV_UP): eth0: link is not ready
> [ 24.504903] e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx
> [ 24.506413] e1000e :00:19.0: eth0: 10/100 speed: disabling TSO
> [ 24.508402] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [ 34.788022] eth0: no IPv6 routers present
> [ 41.444922] e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx
> [ 41.446325] e1000e :00:19.0: eth0: 10/100 speed: disabling TSO
> [25136.918393] e1000e :00:19.0: PME# enabled
> [25142.488050] e1000e :00:19.0: BAR 0: set to [mem 0xf6ae-0xf6af] (PCI address [0xf6ae-0xf6af])
> [25142.488058] e1000e :00:19.0: BAR 1: set to [mem 0xf6adb000-0xf6adbfff] (PCI address [0xf6adb000-0xf6adbfff])
> [25142.488066] e1000e :00:19.0: BAR 2: set to [io 0xefe0-0xefff] (PCI address [0xefe0-0xefff])
> [25142.488085] e1000e :00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
> [25142.488110] e1000e :00:19.0: restoring config space at offset 0x1 (was 0x10, writing 0x100107)
> [25142.488167] e1000e :00:19.0: PME# disabled
> [25142.510966] e1000e :00:19.0: PCI INT A disabled
> [25142.510971] e1000e :00:19.0: PME# enabled
> [25143.468668] e1000e :00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
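
The "Status: D3 ... PME-Enable+ ... PME+" line that Jesse points at is lspci's decoding of the PCI power management control/status register (PMCSR). As a rough illustration of what that decoding involves, here is a stand-alone C sketch. The bit positions follow the PCI power management register layout (power state in bits 1:0, PME enable in bit 8, PME status in bit 15); the macro and function names are made up for this example, and the sample value in the usage note is constructed, not read from the reporter's machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PM_STATE_MASK 0x0003u /* PMCSR bits 1:0: current power state */
#define PM_PME_ENABLE 0x0100u /* PMCSR bit 8 */
#define PM_PME_STATUS 0x8000u /* PMCSR bit 15 */

/* Name of the power state encoded in a PMCSR value. */
static const char *pm_state_name(uint16_t pmcsr)
{
	static const char *const names[] = { "D0", "D1", "D2", "D3hot" };

	return names[pmcsr & PM_STATE_MASK];
}

/* Anything other than D0 must be woken before device registers are usable. */
static bool pm_needs_wake(uint16_t pmcsr)
{
	return (pmcsr & PM_STATE_MASK) != 0;
}

static bool pm_pme_enabled(uint16_t pmcsr)
{
	return (pmcsr & PM_PME_ENABLE) != 0;
}
```

For a constructed value such as 0x8103, `pm_state_name()` yields "D3hot" and `pm_needs_wake()` is true, which matches the situation in the lspci output: the device is asleep, so link state cannot be read until it has been brought back to D0.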
Re: [E1000-devel] vlan steering
On Tue, 6 Sep 2011 02:19:44 -0700 bill4carson <bill4car...@gmail.com> wrote:

> Hi, guys
>
> Just a quick question about vlan steering: does 82599 support this feature? I didn't see any description of it in the 82576/82599 specification.

I think what you're looking for is the VMDQ mode of the hardware, where either VLAN id or MAC address selects which queue. Our drivers currently don't implement this support fully.

> The description below is my understanding of vlan steering; correct me if I'm wrong about this concept.
>
> [ASCII diagram: packets from wire -> sort method -> Queue 1, Queue 2, ..., Queue x]

The picture is kinda hosed, but what happens is that a packet is received by the hardware, the queue is picked based on hardware configuration, and the packet is delivered to a descriptor in that queue.

> The last-stage queue sort method matters most. Vlan steering sorts packets from the wire into different queues based on the vlan ID in the L2 packet; this could relieve upper-layer protocols of the burden of sorting in software.

We currently already offload the vlan ID (and strip it from the packet), so there isn't a whole lot of offload overhead, AFAIK.
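
As a way to picture the "sort method" discussed in this thread, here is a toy C model of VLAN-based queue selection. It is purely illustrative: real hardware does this with filter and pool registers, and `struct vlan_rule` and `pick_queue` are names invented for the sketch. The only protocol fact it relies on is that the VLAN ID occupies the low 12 bits of the 802.1Q tag control field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One steering rule: frames tagged with vlan_id go to 'queue'. */
struct vlan_rule {
	uint16_t vlan_id;
	unsigned queue;
};

/*
 * Pick a receive queue from the 802.1Q tag control information (TCI).
 * Priority and DEI bits are ignored; unmatched VLANs use default_queue.
 */
static unsigned pick_queue(const struct vlan_rule *rules, size_t n_rules,
			   uint16_t tci, unsigned default_queue)
{
	uint16_t vid = tci & 0x0fff; /* low 12 bits carry the VLAN ID */
	size_t i;

	assert(rules != NULL || n_rules == 0);
	for (i = 0; i < n_rules; i++) {
		if (rules[i].vlan_id == vid)
			return rules[i].queue;
	}
	return default_queue;
}
```

With rules mapping VLAN 100 to queue 1 and VLAN 200 to queue 2, a frame tagged with VLAN 200 lands in queue 2 regardless of its priority bits, and untagged-rule traffic such as VLAN 300 falls back to the default queue.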
Re: [E1000-devel] e1000e check_mng_mode issue
On Tue, 23 Aug 2011 14:38:18 -0700 Andy Cress <andy.cr...@us.kontron.com> wrote:

> Tushar,
>
> Thanks for running this down. So that means that the current driver implementation would never allow a NIC which has a BMC sideband connection physically to ever power off the PHY. That doesn't seem like the right approach. I realize that it would be difficult to convey whether an IPMI session is active or not and it

I'm not an IPMI expert, but as I understand it, if the BMC firmware is *running on our networking chip* then I think we could identify that. If we are running as JUST the smbus transport layer, we just have a simple wire interface for smbus that can receive packets from an external BMC at any time. The NIC actually knows nothing about the smbus connection except maybe that it *might* have added a MAC address to our receive filter.

> may not be desirable to power down the PHY if there could be incoming IPMI management traffic, but there should be a way to detect if the IPMI configuration has the channel enabled or not. If IPMI LAN is not enabled, the PHY can safely be powered down.

In the case of an external BMC, this becomes really difficult to know (I'm not sure it isn't possible, however).

> Obviously it would be simple to ignore these bits in the driver to get it to work, but that's not optimal. If the BMC is asserting these bits without regard to the configuration of the IPMI LAN channels, perhaps that is where the bug could be pursued, to fix the firmware? Or should the driver use another mechanism to discern whether IPMI LAN is enabled or not (KCS, ...)?

We have had zero luck getting external BMC vendors to update their code. One time we managed to get Intel to fix its external BMC. The lesson we've learned so far is that changing BMCs is really, really hard. I think the bits are actually decided by the type of NVRAM image we are loading; it is statically set at manufacturing time.

> Andy
>
> -----Original Message-----
> From: Dave, Tushar N [mailto:tushar.n.d...@intel.com]
> Sent: Tuesday, August 23, 2011 5:07 PM
> To: Andy Cress; e1000-devel@lists.sourceforge.net
> Subject: RE: e1000e check_mng_mode issue
>
> Andy,
> These bits get set by internal BMC firmware when the code is initialized. It does not depend on whether there is currently an active IPMI session.
> -Tushar
>
> -----Original Message-----
> From: Andy Cress [mailto:andy.cr...@us.kontron.com]
> Sent: Friday, August 12, 2011 2:30 PM
> To: Dave, Tushar N; e1000-devel@lists.sourceforge.net
> Subject: RE: e1000e check_mng_mode issue
>
> Tushar,
> But if 'management mode' means that IPMI LAN is enabled or in use, then this indication is yielding a false result, because IPMI LAN is disabled. Those bits are always set regardless of the state of the IPMI LAN configuration. So what drives those bits? Does the IPMI firmware drive them, or do they depend on the NIC firmware, or ...?
> Andy
>
> -----Original Message-----
> From: Dave, Tushar N [mailto:tushar.n.d...@intel.com]
> Sent: Friday, August 12, 2011 4:58 PM
> To: Andy Cress; e1000-devel@lists.sourceforge.net
> Subject: RE: e1000e check_mng_mode issue
>
> Andy,
> The name of the defined constant (i.e. E1000_MNG_IAMT_MODE) is a little confusing. The objective of e1000_check_mng_mode_generic() is to check whether Management is enabled or not. It doesn't care which MNG mode is enabled (e.g. AMT or IPMI). If Management is enabled, then FWSM (bits 3:1) should have the value 0x3 (this value is loaded from EEPROM word 13h). So all e1000_check_mng_mode_generic() does is check whether FWSM bits 3:1 are equal to 0x3.
> Let me know if you have any more queries.
> -Tushar
>
> -----Original Message-----
> From: Andy Cress [mailto:andy.cr...@us.kontron.com]
> Sent: Friday, August 12, 2011 1:23 PM
> To: Dave, Tushar N; e1000-devel@lists.sourceforge.net
> Subject: RE: e1000e check_mng_mode issue
>
> Tushar,
> Right, my eth0 is 80003ES2LAN. Attached is the 'ethtool -e eth0' output (eth0.e).
> Andy
>
> -----Original Message-----
> From: Dave, Tushar N [mailto:tushar.n.d...@intel.com]
> Sent: Friday, August 12, 2011 12:48 PM
> To: Andy Cress; e1000-devel@lists.sourceforge.net
> Subject: RE: e1000e check_mng_mode issue
>
> Andy,
> Thanks for your patience. I am looking into this. (Assuming your eth0 device is 80003ES2LAN.) Can you provide 'ethtool -e eth0' output?
> -Tushar
>
> -----Original Message-----
> From: Andy Cress [mailto:andy.cr...@us.kontron.com]
> Sent: Tuesday, August 09, 2011 2:21 PM
> To: e1000-devel@lists.sourceforge.net
> Subject: [E1000-devel] e1000e check_mng_mode issue
>
> This may apply to other NICs with an IPMI BMC instead of AMT, but here's my configuration:
> Baseboard: Intel S5000PAL
> Onboard NICs (2): 80003ES2LAN
> And this has an IPMI BMC on the baseboard with sideband connections to the onboard NICs.
>
> # ethtool -i eth0
> driver: e1000e
> version: 1.0.2.5-NAPI
> firmware-version: 1.0-0
> bus-info: :07:00.0
>
> For the e1000e driver, the
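
The check Tushar describes in this thread (bits 3:1 of the firmware status register equal to 0x3) can be written out in a few lines. The following is a stand-alone C sketch of that comparison, not the driver source: it takes the register value as a plain argument instead of reading hardware, and the macro and function names are chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FWSM_MODE_MASK   0xEu /* bits 3:1 of the firmware status word */
#define FWSM_MODE_SHIFT  1
#define MNG_MODE_ENABLED 0x3u /* value cited in the thread for "management enabled" */

/* True when the mode field (bits 3:1) of 'fwsm' holds MNG_MODE_ENABLED. */
static bool check_mng_mode(uint32_t fwsm)
{
	return (fwsm & FWSM_MODE_MASK) ==
	       (MNG_MODE_ENABLED << FWSM_MODE_SHIFT);
}
```

Note what the check does not look at: nothing in it reflects whether an IPMI LAN channel is configured or a session is active, which is exactly the limitation Andy runs into. A register value of 0x6 (or 0x7, since bit 0 is outside the field) reports management mode regardless of the BMC's LAN configuration.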
Re: [E1000-devel] e1000e check_mng_mode issue
On Wed, 24 Aug 2011 10:02:22 -0700 Andy Cress <andy.cr...@us.kontron.com> wrote:

> Thanks Jesse, that helps.
>
> The upshot of all this is that I believe we need an alternative way of detecting mng_mode for onboard IPMI BMCs, which could be implemented in a check_mng_mode_ipmi() routine perhaps. The cases I'm most interested in are Intel S5000PAL and S5520UR motherboards (80003ES2LAN and 82575EB NICs, respectively).
>
> Option 1: check_mng_mode_ipmi()
> I know that this information could be queried using a local KCS interface with the IPMI GetChannelAccess command, which should not be too difficult if the OpenIPMI driver is already there (#ifdef CONFIG_IPMI).

What about something not so kernel based? It seems to me all we are missing is a user-space override that communicates to the driver: "yeah, I know (as the administrator) that I'm not using IPMI, so port power down is okay."

We could do this with a small driver enhancement, possibly using ethtool private flags, that would allow the driver to override the check_mng_mode_generic result and power down the phy anyway.

This functionality in the driver would allow a user-space script to query IPMI, make sure it was disabled, and then enable the override via ethtool.

> Option 2:
> Allow a compile-time driver option to toggle whether check_mng_mode_generic() returns according to the existing hard-coded bits, or returns a hard-coded zero (allowing those functions to occur), which would require the user to ensure that IPMI LAN had been disabled before exercising this.
>
> Would you like me to take a stab at implementing option 1, or do you have a better idea?
>
> Andy
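
Jesse's suggestion of an administrator override can be summarized as a single decision function. The sketch below is only one possible reading of that idea, written as plain C: the flag name `TOY_PRIV_FLAG_IGNORE_MNG` and the function `phy_may_power_down` are hypothetical, and no such private flag is claimed to exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical private flag an administrator could set from user space. */
#define TOY_PRIV_FLAG_IGNORE_MNG (1u << 0)

/*
 * Decide whether the PHY may be powered down.
 * mng_mode_reported: result of the existing management-mode check.
 * priv_flags:        administrator-controlled flags (assumed interface).
 */
static bool phy_may_power_down(bool mng_mode_reported, uint32_t priv_flags)
{
	if (!mng_mode_reported)
		return true; /* nothing claims the port for management */
	/* Management bits are set; honor an explicit administrator override. */
	return (priv_flags & TOY_PRIV_FLAG_IGNORE_MNG) != 0;
}
```

The default behavior stays as it is today (management bits set means the PHY stays powered), and only an explicit opt-in changes it, which keeps the risk with the administrator who has verified that IPMI LAN is disabled.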
Re: [E1000-devel] Spam
2011/8/1 CLOSE Dave <dave.cl...@us.thalesgroup.com>:

> I've tried asking privately to the owner of this list but have seen no response. Is there some reason why we can't filter this crap? Does anyone manage the list and remove offenders?

Hi Dave, yeah, sorry about the spam to this list, but we don't want to make it a closed list because it is a community support mailing list, and because of that we don't limit the traffic to members only.

Unfortunately SourceForge's spam control via mailman leaves a lot to be desired. We've configured it as much as we can to block spam, but it still lets ~5 or so spams a week through. I hardly ever see them, however, because my local spam filter (spambayes) catches them.

If you have any suggestions or anything else we can do, let us know.

- Jesse
Re: [E1000-devel] non_eop_descs
On Tue, Aug 2, 2011 at 6:12 PM, Richard Scobie rich...@sauce.co.nz wrote:
> I have a NAS box set up with bridged interfaces, a couple of which are 82598EB AF. The host boxes direct attached to these are performing more or less identical tasks, but one interface shows no non_eop_descs and the other many.
> [snip]
> What is this measuring and/or what causes them please?

This statistic counts how many times the hardware chained buffers together to make a single received frame for the host. For instance, if every RX buffer is 2kB and you have jumbo frames enabled, or, in the case of 82599, RSC (receive side coalescing, like hardware LRO), then to receive 9kB you might need 5 * 2kB buffers. Each of the first four would not have EOP (end of packet) set, and the fifth would.

This is basically a debug statistic that gives us developers a better picture of what kind of receives the hardware/driver is doing.
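The 9kB example above can be worked through in a few lines. This is only a sketch of the arithmetic, not driver code; the function name `descriptors_for_frame` is made up for illustration, and it assumes every RX buffer has the same size.

```python
def descriptors_for_frame(frame_bytes: int, buffer_bytes: int) -> tuple:
    """Return (buffers used, buffers without EOP) for one received frame.

    Hypothetical helper: assumes fixed-size RX buffers, as in the
    2kB example from the message above.
    """
    if frame_bytes <= 0 or buffer_bytes <= 0:
        raise ValueError("sizes must be positive")
    used = -(-frame_bytes // buffer_bytes)  # ceiling division
    return used, used - 1                   # only the last buffer carries EOP


if __name__ == "__main__":
    used, non_eop = descriptors_for_frame(9 * 1024, 2 * 1024)
    print(f"{used} buffers, {non_eop} without EOP")
```

With standard 1500-byte frames and 2kB buffers the second value is always zero, which would be consistent with one interface showing no non_eop_descs at all.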
Re: [E1000-devel] Getting information from Users
On Thu, Jun 16, 2011 at 12:21 PM, Martin Owens docto...@gmail.com wrote:
> Hey devels,
>
> I'm updating my bug report with requested information and thought I might as well make a script to automatically pull all the information together.

This is awesome, thank you. I think it needs some minor tweaks, however, to make sure we get all relevant info.

> http://paste.ubuntu.com/628133/
>
> This script is very simple to use:
>
>   sudo ./collect-info.sh ethX
>
> Then just post the tar.gz file containing all the required info. It submits only the relevant section of the dmesg log from the modprobe, and takes care to select the driver the eth device is using. It should work for any eth driver, not just e1000e.
>
> It would be good if we could point users to the script when they try to report bugs. Please pass upstream to the linux-networking mailing list as required.
>
> Best Regards, Martin Owens
Re: [E1000-devel] Question on net_stats->rx_dropped setting to 0
You have to overrun the FIFO on the hardware to see an rx_dropped error from the hardware. Currently your CPU is fast enough to keep up with the packet load.

On Wed, May 25, 2011 at 7:32 PM, Filo FeFi j11...@yahoo.com wrote:
> Ah! I've been looking for it in kernel version 2.6.18, which doesn't seem to have the function. I should have mentioned it.
>
> In my test, I'm sending many packets to the ixgbe:
>   kernel 2.6.18
>   ixgbe 3.3.9 w/o NAPI
>
> I'm seeing ixgbe's call to netif_rx() returning NET_RX_DROP, and it is incrementing adapter->rx_dropped_backlog. However, this value isn't reported by ifconfig's RX dropped. I can see ixgbe_ethtool.c sends it to ethtool, so I can use that; also, as per Eric Dumazet's earlier email, I see the /proc/net/softnet_stat drop count being incremented in the netif_rx() function.
>
> But so far, I keep seeing 0 in ifconfig's RX dropped. I'm wondering under what situation I can see something other than 0.
>
> Thanks, Ching
>
> --- On Wed, 5/25/11, Alexander Duyck alexander.h.du...@intel.com wrote:
>> From: Alexander Duyck alexander.h.du...@intel.com
>> Subject: Re: [E1000-devel] Question on net_stats->rx_dropped setting to 0
>> To: Filo FeFi j11...@yahoo.com
>> Cc: e1000-devel@lists.sourceforge.net, Skidmore, Donald C donald.c.skidm...@intel.com
>> Date: Wednesday, May 25, 2011, 11:57 AM
>>
>> The function should be around line 1500 in /net/core/dev.c of the Linux kernel. I've included a link to it in lxr below.
>>
>> http://lxr.linux.no/#linux+v2.6.39/net/core/dev.c#L1498
>>
>> Thanks, Alex
>>
>> On 05/25/2011 02:41 PM, Filo FeFi wrote:
>>> Hi Don,
>>>
>>> Could you please elaborate a little on dev_forward_skb()? Where can I find that function? I was about to conclude that ixgbe always reports 0 for RX drop, but I would like to know the correct answer.
>>>
>>> Thanks, Ching
>>>
>>> --- On Mon, 5/23/11, Skidmore, Donald C donald.c.skidm...@intel.com wrote:
>>>> From: Skidmore, Donald C donald.c.skidm...@intel.com
>>>> Subject: RE: [E1000-devel] Question on net_stats->rx_dropped setting to 0
>>>> To: Filo FeFi j11...@yahoo.com, e1000-devel@lists.sourceforge.net
>>>> Date: Monday, May 23, 2011, 5:55 PM
>>>>
>>>> Hi Ching,
>>>>
>>>> As you noted, we (ixgbe) don't modify this value, other than initializing it to zero. However, elsewhere in the stack it is modified, one example being dev_forward_skb(). So ixgbe devices may report rx_dropped as something other than 0.
>>>>
>>>> Thanks, -Don
>>>>
>>>> -----Original Message-----
>>>> From: Filo FeFi [mailto:j11...@yahoo.com]
>>>> Sent: Thursday, May 19, 2011 7:19 PM
>>>> To: e1000-devel@lists.sourceforge.net
>>>> Subject: [E1000-devel] Question on net_stats->rx_dropped setting to 0
>>>>
>>>> Dear ixgbe developers:
>>>>
>>>> I'm debugging a problem where some frames get dropped by the ixgbe driver (version 2.0.44-k2), i.e. /proc/net/dev drop is not 0. Reading the ixgbe-3.3.9/2.0.44.13/2.0.44.14 source, I see the line (in ixgbe_main.c ixgbe_update_stats()):
>>>>
>>>>   net_stats->rx_dropped = 0;
>>>>
>>>> So, does this mean that ixgbe always reports 0 for RX dropped? Under what circumstances would /proc/net/dev's drop count for ixgbe be incremented/changed from 0?
>>>>
>>>> Thank you,
>>>> Ching Tai (650) 506-1454
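For anyone who wants to watch the /proc/net/softnet_stat drop count mentioned in this thread, here is a minimal sketch. It assumes the usual layout of that file (one line per CPU, hexadecimal fields, with the second field being the backlog drop counter); the function name `softnet_drops` and the sample text are made up for illustration, so check the layout against your own kernel before relying on it.

```python
def softnet_drops(text: str) -> list:
    """Return the per-CPU drop counters from softnet_stat-style text.

    Assumption: fields are hexadecimal and the second field on each
    line is the number of packets dropped because the backlog was full.
    """
    drops = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            drops.append(int(fields[1], 16))
    return drops


if __name__ == "__main__":
    # Invented sample; on a live system read /proc/net/softnet_stat instead.
    sample = "0000a1b2 00000003 00000000\n000004d2 00000000 00000001\n"
    print(softnet_drops(sample))
```

Sampling this before and after a test run and comparing the totals gives the number of packets netif_rx() dropped, independently of what ifconfig reports.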
Re: [E1000-devel] 82754L spontaneous freeze networking woes continue in 2.6.37
On 1/31/2011 4:06 PM, Allan, Bruce W wrote:
>> -----Original Message-----
>> From: Nix [mailto:n...@esperi.org.uk]
>> Sent: Monday, January 31, 2011 3:31 PM
>> To: Allan, Bruce W
>> Cc: e1000-devel@lists.sourceforge.net
>> Subject: Re: [E1000-devel] 82754L spontaneous freeze networking woes continue in 2.6.37
>>
>> On 31 Jan 2011, Bruce W. Allan spake thusly:
>>>> From: Nix [mailto:n...@esperi.org.uk]
>>>> I'm not so sure anymore. In 2.6.35.4, everything works -- but in 2.6.35.4, the lspci output is *exactly the same*, i.e. even there lspci claims that ASPM L0s and L1 are enabled. This seems unlikely, since even if the L0s/L1 state persists across a poweroff, the problem disappears upon a simple reboot into 2.6.35.4, and does not recur in that kernel release.
>>>
>>> Which kernel versions? The above mentioned are all the same???
>>
>> Yes. 2.6.35.4..2.6.37 have no differences whatsoever in their lspci output for my 82574L cards. I am... confuzzled, but am happy to try turning L0s/L1 off (if I can figure out how to do it: setpci is... not the most friendly of tools and I've never even looked at its manpage before).
>
> ASPM is enabled/disabled via bits 1:0 of byte 16 in the Express Endpoint capability register. First see what is in this byte with the following:
>
>   # setpci -s [[[[<domain>]:]<bus>]:][<slot>][.[<func>]] CAP_EXP+10.b
>
> where [[[[<domain>]:]<bus>]:][<slot>][.[<func>]] is the slot information for your 82574. I'm guessing that command will return 43 (hex) to indicate ASPM L0s (bit 0) and ASPM L1 (bit 1) are both enabled, based on your previous lspci output. Now, re-write the byte with bits 1:0 set to 10b (or 42 hex) to disable ASPM L0s:
>
>   # setpci -s [[[[<domain>]:]<bus>]:][<slot>][.[<func>]] CAP_EXP+10.b=42
>
> or 00b (40 hex) to disable both ASPM L0s and L1:
>
>   # setpci -s [[[[<domain>]:]<bus>]:][<slot>][.[<func>]] CAP_EXP+10.b=40
>
> and verify with 'lspci -vvv' that ASPM L0s [and L1] are disabled.

Please, for our benefit, file a bug at e1000.sf.net (if you have not already) so you can attach the .config and full dmesg file from a non-working kernel. Please also attach the full lspci -vvv output.

The reason I'm asking for this is that the kernel may actually be configured to not do ASPM at all (CONFIG_PCIEASPM=n), but it still tries to be helpful by printing strings as if it did something [1].

[1] http://lxr.linux.no/linux+v2.6.37/include/linux/pci-aspm.h#L41
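The bit arithmetic behind the 43 / 42 / 40 values quoted above can be checked with a small sketch. It only manipulates the byte value; it does not talk to hardware, and the helper names (`describe_aspm`, `with_aspm`) are made up for illustration.

```python
ASPM_L0S = 0x01   # bit 0 of the byte read via CAP_EXP+10.b
ASPM_L1 = 0x02    # bit 1 of the same byte
ASPM_MASK = ASPM_L0S | ASPM_L1


def describe_aspm(link_ctl_byte: int) -> dict:
    """Report which ASPM states the low two bits enable."""
    return {
        "L0s": bool(link_ctl_byte & ASPM_L0S),
        "L1": bool(link_ctl_byte & ASPM_L1),
    }


def with_aspm(link_ctl_byte: int, l0s: bool, l1: bool) -> int:
    """Return the byte with bits 1:0 rewritten and all other bits kept."""
    value = link_ctl_byte & ~ASPM_MASK & 0xFF
    if l0s:
        value |= ASPM_L0S
    if l1:
        value |= ASPM_L1
    return value


if __name__ == "__main__":
    current = 0x43  # the value guessed in the message above
    print(describe_aspm(current))
    print(f"L0s off:      {with_aspm(current, False, True):02x}")
    print(f"L0s + L1 off: {with_aspm(current, False, False):02x}")
```

Preserving the upper bits matters: if the byte read back is something other than 43, write back that value with only bits 1:0 changed rather than blindly writing 42 or 40.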
Re: [E1000-devel] NAPI in e1000e
2010/11/1 xiaolin chan...@yeah.net:
> In the e1000 driver, there is ew32(IMC, ~0) in e1000_intr before scheduling adapter->napi. However, there is no such operation in e1000e. My question is whether the NIC hardware IRQ is disabled during NAPI/ksoftirqd processing?

Yes, it is disabled by the IAM (auto-mask) register when the interrupt is asserted and the ICR register is read.
Re: [E1000-devel] [PATCH] e1000e: Intel 82571EB: Don't wait for MNG cycle on unmanaged chips
On Fri, Aug 27, 2010 at 12:10 PM, Kyle Moffett kyle.d.moff...@boeing.com wrote:
> The Intel 82571EB chipset can be used in an unmanaged configuration as a fast dual-port Gig-E controller. Unfortunately a board constructed that way would fail to correctly come up because the driver polls for the completion of a management cycle that will never occur.
>
> To resolve this problem, we disable the poll and error return on chips whose EEPROMs indicate no management. As a protection against misconfigured chipsets, we still delay for the entire management poll timeout.
>
> Signed-off-by: Kyle Moffett kyle.d.moff...@boeing.com

Hi Kyle, thanks for submitting this patch. Are you fixing this problem for a device that is a LOM? The reason I ask is that most if not all of our current EEPROM images require some firmware interaction to correctly initialize the PHY when the part is reset, even for the no_mng (no manageability) case. Your code below will avoid reading and waiting for the cfg_done bit, which means the firmware could end up racing with the driver, with both trying to configure the part.

Was there a specific bug you were trying to fix, and can you reply (privately to me, if you prefer) with your ethtool -e ethX output? The concern here is that you may simply have an out-of-date EEPROM image; updating it might fix the original issue and get the driver to work correctly, as the behavior you are describing is not how it should work according to our design. At the very least we would like to reproduce your issue here so we can investigate further.

Jesse
Re: [E1000-devel] carrier detection issues at 10GB on XAUI with ixgbe driver on 2.6.27 x86 board
On Tue, 2010-05-18 at 13:53 -0700, Chris Friesen wrote:
> I'm seeing some strange behaviour with an 82599 using XAUI at 10GB. Intermittently we get a scenario where it seems to get stuck in the following loop:
>
>   link detected as up
>   40-45ms delay
>   link detected as down
>   2 sec delay
>
> The following is a detailed timeline for one specific event. Timestamps are in microseconds:
>
>   762926422: link tests as down, LINKS register 0x34480100
>   (100ms gap)
>   763026224: audit detects link up, LINKS register 0x744bef80
>   763026254: receive link state change interrupt via pci message, triggers watchdog to run (but it's already running)
>   763026260: link up message printed to log stream
>   763026383: link tests as up, LINKS register 0x744bef80
>   (45ms gap)
>   763071115: receive link state change interrupt via pci message
>   763071134: link tests as down, LINKS register 0x34480f00
>   763170935: link tests as down, LINKS register 0x34480100
>
> Basically, as far as I can tell the LINKS register values match what we would expect to see if the far end was going up and down. However, the logs we have from the switch card (which admittedly don't give register-level information) don't show it bouncing the link up and down this fast. Any ideas what might be happening here?

None that immediately come to mind. I forwarded this to our hardware engineering team, however, to take a look.

> To save some time looking at the datasheet, the relevant bits in the LINKS register are interpreted as follows. For the 2nd and 3rd values I'll only give the deltas against the previous one.
>
>   744bef80:
>     link is up
>     10G align status good
>     10G lane sync status good all lanes
>     signal detected on all four lanes of 10G parallel
>     link status is up
>   34480f00:
>     link not up
>     10G align status failed
>     10g parallel lane sync status is failed
>     link status is down
>   34480100:
>     signal detected only on lane 0 of 10G parallel (lanes 1/2/3 no signal detected)
>
> So basically we start out with no signal, then after 100ms we transition to the proper register values for a normal up link, then 45ms later we lose the alignment status (but still have good lane sync status), and finally 100ms later we lose signal detect on 3 of the 4 lanes.

Does the link eventually come up? We may need to get an EEPROM dump from the 82599 part you're working on as well. ethtool -e ethX should suffice.

-- Jesse Brandeburg
This email sent via Evolution, powered by Linux
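The three LINKS values in the timeline can be decoded mechanically. The sketch below is an illustration only: the mask values (link up in bit 30, 10G align status in bit 17, per-lane signal detect in bits 11:8) are my assumption of the layout, recalled from the ixgbe register definitions, and should be verified against the 82599 datasheet. They do reproduce the interpretation given in the message above.

```python
# Assumed bit positions; verify against the 82599 datasheet before use.
LINKS_UP = 0x40000000         # link up
LINKS_10G_ALIGN = 0x00020000  # 10G align status
LINKS_SIGNAL = 0x00000F00     # signal detect, one bit per XAUI lane
LINKS_SIGNAL_SHIFT = 8


def decode_links(value: int) -> dict:
    """Decode the few LINKS fields discussed in this thread."""
    signal = (value & LINKS_SIGNAL) >> LINKS_SIGNAL_SHIFT
    return {
        "link_up": bool(value & LINKS_UP),
        "align_ok": bool(value & LINKS_10G_ALIGN),
        "lanes_with_signal": [lane for lane in range(4) if signal & (1 << lane)],
    }


if __name__ == "__main__":
    for raw in (0x744BEF80, 0x34480F00, 0x34480100):
        print(f"{raw:08x}: {decode_links(raw)}")
```

Logging the decoded fields next to each timestamp makes it easier to see which condition (signal, alignment, or link) changes first in each bounce.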
Re: [E1000-devel] Missing VLAN Header
On Mon, Mar 22, 2010 at 7:57 AM, Andreas Grau andreas.g...@ipvs.uni-stuttgart.de wrote:
> Hi,
>
> We are currently experimenting with vlan on a 10GE i82599 nic. Linux 2.6.18 with ixgbe version 2.0.62.4-NAPI is used on top of XEN 3.1.2. For the experiments we are using the following scenario:
>
>    ---------   ----------   ----------   ---------
>   | domU.1  | | dom0.1   | | dom0.2   | | domU.2  |
>   |         | |          | |          | |         |
>   | 1.0.0.1 | |          | |          | | 1.0.0.2 |
>   | vlan100 | |  bridge  | |  bridge  | | vlan100 |
>   |         | |          | |          | |         |
>   |  eth0   | | vif eth0 | | vif eth0 | |  eth0   |
>    ---------   ----------   ----------   ---------
>       |         |    |       |    |         |
>        ---------      ---crossover---  ------
>                         10GE cable

There is a bug with respect to VLAN header stripping on 82599 (it is not disabled correctly in promisc mode). The fix is pretty simple, but is not released yet.

> We now execute in domU.1 ping 1.0.0.2. Unfortunately the ping-request is not answered.
>
> Running on dom0.1 tcpdump -i eth0 gives (as expected):
>   16:44:21 vlan 100, p 0, ARP, Request who-has 1.0.0.2 tell 1.0.0.1, length 28
>
> Running on dom0.2 tcpdump -i eth0 gives:
>   16:44:21 ARP, Request who-has 1.0.0.2 tell 1.0.0.1, length 42
>
> For some reason the vlan header is removed? Could anyone tell me why?
>
> Cheers
> Andreas
>
> PS: Running the same scenario using another gigabit nic and the igb driver, everything works.

We'll have a release out soon with that fix.
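To confirm from a raw capture whether the VLAN header survived, one can look for the 802.1Q tag directly in the frame bytes. This is a minimal sketch with a made-up function name (`vlan_tag`) and hand-built sample frames; it only handles a single tag with TPID 0x8100.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def vlan_tag(frame: bytes):
    """Return (vlan_id, priority) if the frame carries an 802.1Q tag, else None."""
    if len(frame) < 18:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)  # after dst + src MAC
    if ethertype != TPID_8021Q:
        return None
    (tci,) = struct.unpack_from("!H", frame, 14)
    return tci & 0x0FFF, tci >> 13


if __name__ == "__main__":
    macs = bytes.fromhex("ffffffffffff") + bytes.fromhex("020000000001")
    arp_payload = bytes(28)  # placeholder payload, contents irrelevant here
    tagged = macs + struct.pack("!HH", TPID_8021Q, 100) + struct.pack("!H", 0x0806) + arp_payload
    untagged = macs + struct.pack("!H", 0x0806) + arp_payload
    print(vlan_tag(tagged), vlan_tag(untagged))
```

In the scenario above, frames seen on dom0.1 would decode as VLAN 100 with priority 0, while the frames seen on dom0.2 would come back as untagged, matching the two tcpdump outputs.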
Re: [E1000-devel] recent e100 fixes cause kernel panic?
Added netdev, the place to talk about in-kernel driver problems.

On Thu, 2010-03-11 at 22:39 -0700, Stephen Hemminger wrote:
> ----- Ed Ravin era...@panix.com wrote:
>> I'm using the Vyatta kenwood Linux distribution, which is currently at 2.6.31-1. I upgraded to their latest version, and began seeing kernel panics shortly after starting to use ssh/scp on the network connected to an e100 NIC. I was able to reproduce the problem immediately after booting up - sometimes it even crashed during the boot. One of the crash logs is attached.

Ed, thanks for the report. It looks like these patches introduced a new problem. The e100 hardware has a tricky data structure that seems to cause problems on some CPU architectures (particularly ARM).

>> Since the problem seemed to be related to e100.c, I reverted the two commits to e100.c that had taken place since I last built the kernel for this box:
>>
>>   Author: Roger Oksanen roger.oksa...@cs.helsinki.fi
>>   Date: Fri Dec 18 20:18:21 2009 -0800
>>   e100: Fix broken cbs accounting due to missing memset.
>>
>>   Author: Roger Oksanen roger.oksa...@cs.helsinki.fi
>>   Date: Sun Nov 29 17:17:29 2009 -0800
>>   e100: Use pci pool to work around GFP_ATOMIC order 5 memory allocation failu
>>
>> I rebuilt the kernel and it's not panicking anymore.

So you reverted both, and it's good news that things are working again. Can you try reverting just one or the other and let us know if things still break for you?

> The Vyatta kernel for 2.6.31 is based on the 2.6.31.10 + unionfs. These two patches came from the 2.6.31.10 -stable update.

This is the only report of this issue I have heard so far, so something must be a little unique to your system or workload, given that the driver mostly works. I'm looking more closely into the panic trace now; maybe I can figure it out from there.

-- Jesse Brandeburg
This email sent via Evolution, powered by Linux
Re: [E1000-devel] Intel 2598EB 10-Gigabit AT dropped rx packet
--
Mirko Corosu
Network and system administrator
Computing Center, Istituto Nazionale Fisica Nucleare
Via Dodecaneso 33, 16146 Genova, Italy
http://www.ge.infn.it
"If everything seems to be coming toward you, you are probably in the wrong lane."

-- Jesse Brandeburg
This email sent via Evolution, powered by Linux
Re: [E1000-devel] e1000_clean_tx_irq: Detected Tx Unit Hang
On Thu, Mar 4, 2010 at 3:19 AM, Metal Thrashing Mad thrash.d...@gmail.com wrote:
> Just read the mail from Nikita, about fixeep-82573-dspd.sh.

I didn't see that mail.

> That script is only for 82573. Running the script returns - No appropriate hardware found for this fixup. Knowing full well that doing the following could render my card useless, void the warranty I modified the script to return true for the model I have.

Okay, but why?

> Running iperf with a total of 128 inbound connections with a -t of 6000 a few times has not broke anything. Looks like this script may have fixed things. Iptraf was showing consistent 80,xxx kbit/s

Did this test usually fail before?

> Here's an eeprom dump (after the script was ran)
>
> ethtool -e eth0
> Offset  Values
> ------  ------
> 0x0000  00 0e 0c c2 82 04 10 02 ff ff 00 10 ff ff ff ff
> 0x0010  60 d2 03 00 0b 64 76 14 86 80 7c 10 86 80 85 b2

So the 0x85 one byte from the end changed from 0x84 when you ran that script. Looking in the handy dandy manual for your 82541 posted at sourceforge, EEPROM address map section, I see that the bit you changed is in word 0xF: 0xb284 became 0xb285 (aka bit 0). Bit zero is: reserved.

Looking into our internal documentation, that bit really shouldn't be doing anything if you are at 1Gb/s link. My guess is you're going to see the problem again.

> 0x0020  dd 20 55 55 00 00 90 2f 00 32 12 00 20 1e 12 00
> 0x0030  20 1e 12 00 20 1e 12 00 20 1e 09 00 00 02 00 00
> 0x0040  0c 00 a6 93 0b 28 00 00 00 04 ff ff ff ff ff ff
> 0x0050  ff ff ff ff ff ff ff ff ff ff ff ff ff ff 02 06
> 0x0060  00 01 00 40 16 12 07 40 ff ff ff ff ff ff ff ff
> 0x0070  ff ff ff ff ff ff ff ff ff ff ff ff ff ff d4 19
>
> If that doesn't show up correctly http://pastebin.ca/1822468
>
> Here's an ethtool -S
>
> ethtool -S eth0
> NIC statistics:
>      rx_packets: 102700084
>      tx_packets: 72630664
>      rx_bytes: 136903466843
>      tx_bytes: 44351377340
>      rx_broadcast: 3743
>      tx_broadcast: 93
>      rx_multicast: 0
>      tx_multicast: 6
>      rx_errors: 0
>      tx_errors: 0
>      tx_dropped: 0
>      multicast: 0
>      collisions: 0
>      rx_length_errors: 0
>      rx_over_errors: 0
>      rx_crc_errors: 0
>      rx_frame_errors: 0
>      rx_no_buffer_count: 3
>      rx_missed_errors: 0
>      tx_aborted_errors: 0
>      tx_carrier_errors: 0
>      tx_fifo_errors: 0
>      tx_heartbeat_errors: 0
>      tx_window_errors: 0
>      tx_abort_late_coll: 0
>      tx_deferred_ok: 2
>      tx_single_coll_ok: 0
>      tx_multi_coll_ok: 0
>      tx_timeout_count: 0
>      tx_restart_queue: 821779
>      rx_long_length_errors: 0
>      rx_short_length_errors: 0
>      rx_align_errors: 0
>      tx_tcp_seg_good: 1573008
>      tx_tcp_seg_failed: 0
>      rx_flow_control_xon: 2
>      rx_flow_control_xoff: 2
>      tx_flow_control_xon: 71929941
>      tx_flow_control_xoff: 71896901
>      rx_long_byte_count: 136903466843
>      rx_csum_offload_good: 102696253
>      rx_csum_offload_errors: 0
>      alloc_rx_buff_failed: 0
>      tx_smbus: 0
>      rx_smbus: 0
>      dropped_smbus: 0
>
> Another pastebin link for the above http://pastebin.ca/1822475
>
> If you need anymore hardware information to update that script, let me know.
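The step from "byte 0x84 became 0x85" to "bit 0 of word 0xF changed" in the reply above relies on the EEPROM being organised as 16-bit little-endian words. A minimal sketch of that conversion follows; the helper names are made up, and the "before" row is reconstructed from the reply (only the 0x84 byte differs), not taken from an actual earlier dump.

```python
def eeprom_words(row: str) -> list:
    """Turn a row of hex bytes from 'ethtool -e' into 16-bit little-endian words."""
    data = bytes.fromhex(row)
    return [data[i] | (data[i + 1] << 8) for i in range(0, len(data) - 1, 2)]


def changed_bits(before: int, after: int) -> list:
    """List the bit positions that differ between two 16-bit words."""
    diff = before ^ after
    return [bit for bit in range(16) if diff & (1 << bit)]


if __name__ == "__main__":
    # Row at byte offset 0x0010, i.e. words 0x8 through 0xF.
    after_row = "60 d2 03 00 0b 64 76 14 86 80 7c 10 86 80 85 b2"
    before_row = "60 d2 03 00 0b 64 76 14 86 80 7c 10 86 80 84 b2"  # reconstructed
    word_f_before = eeprom_words(before_row)[7]
    word_f_after = eeprom_words(after_row)[7]
    print(f"word 0xF: {word_f_before:#06x} -> {word_f_after:#06x}, "
          f"bits changed: {changed_bits(word_f_before, word_f_after)}")
```

Diffing two full dumps word by word this way is a quick check that a fixup script touched only the bit it was meant to.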
Re: [E1000-devel] e1000_clean_tx_irq: Detected Tx Unit Hang
On Mon, Mar 1, 2010 at 3:37 AM, Thrash Dude thrash.d...@gmail.com wrote:
> Seems to be a rather common issue with the e1000 module. I searched the archives back to 2005. Plenty of reports, no solutions.

There are some solutions, one of which is to try loading the driver with TxDescriptorStep=4 TxDescriptors=1024

> The NIC does drop the link, PC does not hang. The link does become active again. Wouldn't be such an issue, although this PC is a file server for streaming audio and video files exported across nfs and cifs shares. Quite an annoying problem to get 55minutes into a movie to have the link die.

For some of the recent times, have you been streaming using CIFS or NFS? What version of NFS? What client machine/OS did you test with? What streaming software were you using to play the movie on the remote machine?

> NOTE: No the link does not die with every movie. This seems to be completely random. I can flood the _server_ with 15 incoming connections continuously for 30 minutes and there's no problem. Or I can simply ping -c4 server and receive a Tx Unit Hang.

So maybe it's not actually related to traffic levels?

> Machine specs -
>   Slackware x86_64 -current
>   Pure Virgin Kernel 2.6.32.8 (have noticed issue with previous kernels)
>   7GB Ram
>   AMD RS780
>
> Migrated same card to another machine to rule out +4GB question that is always. And another Chipset to test. Intel P45, 2GB Ram - same issue

This is actually a promising development because we might actually have something close to that system here. What slot did you plug it into? What is the barcode number on your adapter? XX-XXX.

The other (bad) option is that, since the problem follows the adapter, it could be the adapter. Have you double-checked cooling of the NIC? Do you have another identical NIC you can try? You can probably get warranty support for the one you have, to get a replacement.

> VMware Player is currently installed. Issue presents itself when VMware is removed and/or VMware modules are not loaded.
>
> See below for modinfo, dmesg, IRQ's, lspci and some ethtool output
>
> Partial dmesg
> [43503.704198] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [43503.704199]   Tx Queue 0
> [43503.704200]   TDH c7
> [43503.704201]   TDT da
> [43503.704201]   next_to_use da
> [43503.704202]   next_to_clean c8
> [43503.704202] buffer_info[next_to_clean]
> [43503.704203]   time_stamp 1029335c6
> [43503.704203]   next_to_watch c9
> [43503.704204]   jiffies 102933c78
> [43503.704205]   next_to_watch.status 0
> [43505.704209] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [43505.704211]   Tx Queue 0
> [43505.704212]   TDH c7
> [43505.704212]   TDT da
> [43505.704213]   next_to_use da
> [43505.704214]   next_to_clean c8
> [43505.704214] buffer_info[next_to_clean]
> [43505.704215]   time_stamp 1029335c6
> [43505.704215]   next_to_watch c9
> [43505.704216]   jiffies 102934448
> [43505.704216]   next_to_watch.status 0
> [43507.704182] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [43507.704183]   Tx Queue 0
> [43507.704184]   TDH c7
> [43507.704185]   TDT da
> [43507.704185]   next_to_use da
> [43507.704186]   next_to_clean c8
> [43507.704186] buffer_info[next_to_clean]
> [43507.704187]   time_stamp 1029335c6
> [43507.704187]   next_to_watch c9
> [43507.704188]   jiffies 102934c18
> [43507.704189]   next_to_watch.status 0

Wow, that's a mess; please fix your mail client next time. What I do see in the above is that it appears to be a legitimate Tx hang. We have some debug code you can run that can help us diagnose it. Would you be able to run that?

> modinfo e1000|grep ^version
> version: 7.3.21-k5-NAPI
>
> ethtool eth0
> Settings for eth0:
>   Supported ports: [ TP ]
>   Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
>   Supports auto-negotiation: Yes
>   Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
>   Advertised auto-negotiation: Yes
>   Speed: 1000Mb/s
>   Duplex: Full
>   Port: Twisted Pair
>   PHYAD: 0
>   Transceiver: internal
>   Auto-negotiation: on
>   Supports Wake-on: umbg
>   Wake-on: g
>   Current message level: 0x0007 (7)
>   Link detected: yes
>
> ethtool -i eth0
> driver: e1000
> version: 7.3.21-k5-NAPI
> firmware-version: N/A
> bus-info: 0000:02:06.0
>
> ethtool -g eth0
> Ring parameters for eth0:
> Pre-set maximums:
>   RX: 4096
>   RX Mini: 0
>   RX Jumbo: 0
>   TX: 4096
> Current hardware settings:
>   RX: 256
>   RX Mini: 0
>   RX Jumbo: 0
>   TX: 256
>
> ethtool -k eth0
> Offload parameters for eth0:
> rx-checksumming: on
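The numbers in the three hang reports above can be put side by side with a little arithmetic. This sketch is not the driver's hang-detection logic; it only shows how the quoted values relate, assuming the hex fields are plain ring indices and jiffies counts, and using the TX ring size of 256 from the `ethtool -g` output. The helper names are made up for illustration.

```python
def pending_descriptors(tdh: int, tdt: int, ring_size: int) -> int:
    """Descriptors queued between the hardware head and the tail pointer."""
    return (tdt - tdh) % ring_size


def estimate_hz(jiffies_a: int, jiffies_b: int, seconds_a: float, seconds_b: float) -> int:
    """Infer the kernel tick rate from two (jiffies, dmesg timestamp) samples."""
    return round((jiffies_b - jiffies_a) / (seconds_b - seconds_a))


if __name__ == "__main__":
    pending = pending_descriptors(0xC7, 0xDA, 256)
    hz = estimate_hz(0x102933C78, 0x102934448, 43503.704198, 43505.704209)
    stuck_jiffies = 0x102933C78 - 0x1029335C6  # jiffies minus time_stamp, first report
    print(f"{pending} descriptors pending, HZ ~ {hz}, "
          f"oldest buffer stuck ~ {stuck_jiffies / hz:.2f} s at first report")
```

TDH and TDT are identical in all three reports, two seconds apart each, so the hardware head pointer did not advance at all over that interval.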
Re: [E1000-devel] New thread: page allocation failure with E1000 (seems to be reproducible)
In the future, please copy net...@vger.kernel.org on networking issues.

On Mon, Mar 1, 2010 at 9:34 AM, Richard Hartmann richih.mailingl...@gmail.com wrote:
> Hi Jesse,
>
>> the memory allocation failures (order:0), while unexpected, are not fatal, and the e1000 driver is written to handle the failures during allocation. Does something else happen to the system after this or does operation continue?
>
> I can not be sure, but I _think_ some bogus data made it into userspace. I did have some binary in a text string I received logged, which is a tad unusual.

Hm, if that did occur it would be bad. But it does sound like operation continued, which is good.

>> You might be able to try the sysctl tweak to reserve a little more memory for driver allocations.
>>
>>   # sysctl vm.min_free_kbytes
>>   # sysctl -e vm.min_free_kbytes=<double what you have>
>
> I will try that, thanks.
>
>> have you increased the number of rx/tx descriptors in use by e1000?
>
> No. Should I?

I wouldn't recommend it if you're already having issues getting order:0 allocations; it would just make the problem worse. I wanted to make sure you had not.

Jesse
Re: [E1000-devel] e1000e-1.1.2 Compile errors with 2.4.37 and gcc 2.95.3
removed netdev,

On Tue, Feb 9, 2010 at 3:21 AM, Ben Hutchings bhutchi...@solarflare.com wrote:
> On Tue, 2010-02-09 at 10:58 +0100, Marco Schwarz wrote:
>> Hi,
>>
>> I get the following output when trying to compile e1000e-1.1.2 with Linux Kernel 2.4.37 and gcc 2.95.3 (e1000-8.0.16 compiles fine): [...]
>
> netdev only deals with recent 2.6 kernels. I'm amazed that Intel still wastes time on 2.4.

This didn't make it to my Intel address for some reason. I'll figure out the build issues and we may re-release a driver with the fix.
Re: [E1000-devel] Ixgbe and VLAN filtering
Port Server Adapter, 82598 controller
OpenSuse 11.2, 64 bit, kernel version: 2.6.31.5
ixgbe-2.0.44.14-NAPI

I have 4 ixgbe interfaces (eth1, eth2, eth3, eth4), and I would like to bridge them. I would like to bridge only some specified VLANs (101 and 102). I have to cope with mass traffic, so effective VLAN filtering is very important. I would like to use the 82598 controller's HW VLAN filtering. I use the following script:

    input_eths="eth1 eth2 eth3 eth4"
    input_vlans="101 102"

    echo
    echo "Setting up input interfaces ..."
    for eth in $input_eths
    do
        echo $eth
        ifconfig $eth 0.0.0.0 up
        for vlan in $input_vlans
        do
            vconfig add $eth $vlan
            ifconfig $eth.$vlan up
        done
    done

    echo
    echo "Setting up bridge ..."
    brctl addbr br0
    for eth in $input_eths
    do
        for vlan in $input_vlans
        do
            brctl addif br0 $eth.$vlan
        done
    done
    ifconfig br0 up

My question is the following: if I use the vconfig utility to specify VLANs, does it result in HW VLAN filtering in the 82598 controller, or is VLAN filtering performed only in Linux (in the ixgbe driver or in the Linux network stack)?

Thanks,
Gyorgy Szaniszlo
Ericsson Hungary Ltd.

Yes, when using vconfig, the ixgbe driver is given the VLAN information and sets the appropriate bits in the HW to do the filtering in the hardware.

sln

==
Mr. Shannon Nelson LAN Access Division, Intel Corp.
shannon.nel...@intel.com I don't speak for Intel
(503) 712-7659 Parents can't afford to be squeamish.
==

-- Jesse Brandeburg This email sent via Evolution, powered by Linux
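For readers who want to check the answer above on their own system, here is a minimal sketch. It assumes a newer ethtool and kernel than the ones in this thread, where ethtool -k lists a feature called rx-vlan-filter; the helper name vlan_filter_state is invented for illustration. The helper only parses text on stdin, so it can be tried against saved output.

```shell
# Read `ethtool -k <iface>` output on stdin and print the state of the
# rx-vlan-filter feature: "on", "off", or "unknown" if the line is absent
# (older ethtool/kernel combinations do not report this feature).
vlan_filter_state() {
    awk -F': *' '
        $1 ~ /rx-vlan-filter$/ { split($2, a, " "); print a[1]; found = 1 }
        END { if (!found) print "unknown" }'
}
```

Usage would be along the lines of ethtool -k eth1 | vlan_filter_state. An "unknown" result says nothing about the hardware; it only means this ethtool did not print the feature.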
Re: [E1000-devel] [Bugme-new] [Bug 14748] New: e1000e NIC not working after reboot
On Mon, Dec 7, 2009 at 2:01 PM, Brandeburg, Jesse jesse.brandeb...@intel.com wrote:
> On Mon, 7 Dec 2009, Andrew Morton wrote:
>> When I power up my system the NIC is working properly. After every reboot the NIC is not working. I mean the eth0 is created, but neither dhcpcd gets IP nor static setup helps
>
> We have a userspace tool called ethregs downloadable from http://downloads.sourceforge.net/project/e1000/Register%20Dump%20Tool/1.7.2/ethregs-1.7.2.tar.gz?use_mirror=iweb
>
> If it is not too much trouble, can you build this tool and run it before (when the port is working) and after (when the link didn't come up)? You can attach the output to the bug; replying to this thread would be best.

I've looked at the ethregs dumps. The good news is that it looks like the hardware succeeds in self-initializing, but for ethregs-fails.txt, did you load the driver? It appears you did not, or at least didn't do

# ip link set eth0 up
# ethregs > regs.txt

I also looked at the lspci -vvv information, and in both cases MSI was enabled, but in the fails case the value in the data field for the MSI vector is different, which seems a little strange, but I'm not sure if it is responsible for the failure.

If the driver was loaded and DHCP failed, what happens when you run ethtool -t eth0 offline?

When the driver is loaded and DHCP fails, can you assign an address manually (and bring the interface up) and have it work?

One more thing to note: please send cat /proc/interrupts from 10 seconds apart when the driver is loaded and the port is UP, but not working. dhcpcd and dhclient both have a tendency to put the port DOWN after they fail to get an address, so that's why you may need to do the # ip link command above before gathering /proc/interrupts.

Is your BIOS up to date?

Thanks, sorry for the delay, let's see if we can figure out what is up.
Jesse
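The request for two /proc/interrupts snapshots taken 10 seconds apart can be turned into a quick comparison. The following is a rough sketch, not a tool from the e1000 project: the function names irq_total and irq_delta are invented here, and the parsing assumes the usual layout of an IRQ number, one count per CPU, then the controller and device names.

```shell
# Sum the per-CPU counts on every line of a /proc/interrupts snapshot
# that mentions the given name (for example eth0).
# $1 = snapshot file, $2 = name to look for
irq_total() {
    awk -v name="$2" '
        index($0, name) {
            # Field 1 is the IRQ number; counts follow until the first
            # non-numeric field (the interrupt controller name).
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^[0-9]+$/) total += $i; else break
            }
        }
        END { print total + 0 }' "$1"
}

# Print how many interrupts arrived between two snapshots.
# $1 = earlier snapshot, $2 = later snapshot, $3 = name to look for
irq_delta() {
    echo $(( $(irq_total "$2" "$3") - $(irq_total "$1" "$3") ))
}
```

A possible way to use it: cat /proc/interrupts > before.txt; sleep 10; cat /proc/interrupts > after.txt; irq_delta before.txt after.txt eth0. A delta of zero while the port is up would suggest interrupts are not being delivered, which is the kind of thing the snapshots are meant to reveal.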
Re: [E1000-devel] 82567V-3 PXE boot
On Mon, 2010-01-11 at 01:12 -0800, kelon.hu...@emerson.com wrote:
> Hi, Support:
>
> These days I encounter an issue and urgently need your support. When I make a new Initrd.img of RHEL5.3 in order to boot from PXE, the 82567V-3 driver cannot be found during the RHEL5.3 installation. I made sure that I have amended the files such as modules.alias, module-info and pci.ids. I also added the 82567V-3 driver, e1000e.ko, to overwrite the old e1000e.ko in the file modules.cgz. Could you please give me some support? Is the 82567V-3 not supported for PXE boot? Thanks!!
>
> BTW, e1000e.ko is made from e1000e-1.1.2.tar.

Did you force the module to be loaded with --preload in the initrd creation? In Red Hat, mkinitrd has a --preload=e1000e option to force a module to be loaded out of the initrd. You may also need --with=e1000e.

-- Jesse Brandeburg This email sent via Evolution, powered by Linux
Re: [E1000-devel] Excessive frame dropping on 82574L
On Mon, 2009-12-21 at 18:53 -0700, Richard Scobie wrote:
> I have a low end server, Core 2 Duo 2.8, 4GB, used to back up using rsync over a 82574L interface. Kernel 2.6.30.9-102.fc11.x86_64 (e1000e 0.3.3.4-k4). It is using MSI-X interrupts.
>
> It's suffering somewhat due to dropping frames:
>
> RX packets:294914332 errors:0 dropped:95203 overruns:0 frame:0
> TX packets:355842341 errors:0 dropped:0 overruns:0 carrier:0
>
> and ethtool shows rx_missed_errors: 95203.
>
> Googling shows these are caused by the RX FIFO filling up.

Hi Richard, can you give the whole ethtool -S output? Depending on the value of rx_no_buffer_count, you may be able to do something.

The other thing to send is the output of lspci -vvv for your system. I'm curious if ASPM is enabled for the ethernet port or its upstream port.

The other thing we may be able to do is provide a patch to enable GRO if at all possible (which should help significantly if it is not already enabled). You can check with ethtool -k ethX, but I guess it may already be on.

Is flow control enabled to your switch? Are you using jumbo frames?

There was a FIFO (flow control) configuration issue in several versions of the e1000e driver in the kernel. If that was the case, disabling flow control might help you:

ethtool -A ethX autoneg off rx off tx off

ethtool -G ethX rx 4096 will max out the number of rx descriptors.

You also may benefit from decreasing the interrupt rate using ethtool -C ethX rx-usecs 125 (8000 interrupts per second), because you're not doing a latency-sensitive workload.

Please also provide /proc/interrupts and ethtool -e ethX, and if you are feeling gung-ho, the output of the ethregs utility available at sourceforge (you'll have to build it) in the Register Dump utility section.
-- Jesse Brandeburg This email sent via Evolution, powered by Linux
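The relationship between rx-usecs and the interrupt rate quoted above (125 microseconds giving 8000 interrupts per second) is plain arithmetic: one second divided by the minimum gap between interrupts. A small sketch, with helper names made up for illustration, treating rx-usecs purely as that minimum gap:

```shell
# Maximum interrupts per second for a given rx-usecs value
# (1,000,000 microseconds per second, integer division).
usecs_to_irqs_per_sec() {
    echo $((1000000 / $1))
}

# The reverse: the rx-usecs value needed for a target interrupt rate.
irqs_per_sec_to_usecs() {
    echo $((1000000 / $1))
}
```

For example, usecs_to_irqs_per_sec 125 gives the 8000 figure from the message, and irqs_per_sec_to_usecs 4000 would suggest rx-usecs 250 for half that rate. Some drivers give small rx-usecs values special meanings (such as selecting an adaptive mode), so the arithmetic only applies to values the driver treats as a literal interval.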
Re: [E1000-devel] LRO botch with 82598EB 2.0.44.14-NAPI
-eth2.44115: Flags [.], cksum 0x99b4 (correct), ack 39426, win 382, length 0
17:21:51.463283 IP (tos 0x0, ttl 64, id 101, offset 0, flags [DF], proto TCP (6), length 1500)
    nh2-eth2.44115 > nh1-eth2.55716: Flags [.], ack 1, win 382, length 1460
17:21:51.463288 IP (tos 0x0, ttl 64, id 56746, offset 0, flags [DF], proto TCP (6), length 40)
    nh1-eth2.55716 > nh2-eth2.44115: Flags [.], cksum 0x99b4 (correct), ack 39426, win 382, length 0
17:21:52.484305 IP (tos 0x0, ttl 64, id 56747, offset 0, flags [DF], proto TCP (6), length 1500)
    nh1-eth2.55716 > nh2-eth2.44115: Flags [.], ack 39426, win 382, length 1460
17:21:52.484327 IP (tos 0x0, ttl 64, id 102, offset 0, flags [DF], proto TCP (6), length 40)
    nh2-eth2.44115 > nh1-eth2.55716: Flags [.], cksum 0x99b6 (correct), ack 1, win 382, length 0
17:21:52.484332 IP (tos 0x0, ttl 64, id 56748, offset 0, flags [DF], proto TCP (6), length 40)
    nh1-eth2.55716 > nh2-eth2.44115: Flags [.], cksum 0x99b4 (correct), ack 39426, win 382, length 0

-- Jesse Brandeburg This email sent via Evolution, powered by Linux
Re: [E1000-devel] Intel e1000e: eth0: Detected Tx Unit Hang
On Thu, Jan 1, 2009 at 11:56 AM, Soeren Sonnenburg ker...@nn7.de wrote:
> Dear list,
>
> I just recently observed a strange problem with an onboard 82567LF-2 Intel ethernet controller. It completely stopped working and this desktop machine required a powerdown to get it to work again. This happened with 2.6.27.10. However, the machine was working for weeks before using an older kernel version (2.6.27.*, no binary modules, intel skyburg mainboard, e8400 c2d cpu).
>
> Relevant details follow, can someone make sense of this?

What does cat /proc/interrupts say? Please also include at least ethtool -e eth0 length 256. Do you have the IOMMU enabled? Your full dmesg would let me know. Also, please double check you're running the latest BIOS for your motherboard.

> 00:19.0 Ethernet controller: Intel Corporation 82567LF-2 Gigabit Network Connection
>
> $ dmesg | grep <relevant parts>
> Intel(R) PRO/1000 Network Driver - version 7.3.20-k3-NAPI
> e1000e: Intel(R) PRO/1000 Network Driver - 0.3.3.3-k6
> :00:19.0: eth0: Intel(R) PRO/1000 Network Connection
> Intel(R) Gigabit Ethernet Network Driver - version 1.2.45-k2
> :00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) XX:XX:XX:XX:XX:XX
> :00:19.0: eth0: Intel(R) PRO/1000 Network Connection
> :00:19.0: eth0: MAC: 5, PHY: 8, PBA No: ff-0ff
> ADDRCONF(NETDEV_UP): eth0: link is not ready
> :00:19.0: eth0: Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [...]
> [uptime of a couple of days, with lots of network i/o]
> [...]
> saa7146 (1) saa7146_i2c_writeout [irq]: timed out waiting for end of xfer
> saa7146 (1) saa7146_i2c_writeout [irq]: timed out waiting for end of xfer
> :00:19.0: eth0: Detected Tx Unit Hang:
>   TDH                  <ff>
>   TDT                  <1>
>   next_to_use          <1>
>   next_to_clean        <ff>
> buffer_info[next_to_clean]:
>   time_stamp           <1104ec2c3>
>   next_to_watch        <ff>
>   jiffies              <1104ec8f0>
>   next_to_watch.status <0>
> <snip lots of hangs...>

What is the frequency of the hangs? What kind of traffic are you using?
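One detail that can be read from the Tx Unit Hang report above is how long the stuck buffer had been waiting: time_stamp is the jiffies value recorded when the buffer at next_to_clean was queued, and jiffies is the value when the hang was detected. The sketch below does that subtraction. The helper names are invented, and the HZ value needed to convert jiffies to milliseconds is an assumption, since the kernel configuration is not given in the thread.

```shell
# Difference between two hexadecimal jiffies values as printed in the
# hang report (pass them without a 0x prefix): $1 = time_stamp, $2 = jiffies
jiffies_delta() {
    echo $((0x$2 - 0x$1))
}

# Convert a jiffies count to milliseconds. $1 = jiffies count, $2 = HZ
# (the kernel tick rate; 250 and 1000 are common, but it must be checked
# against the running kernel's configuration).
jiffies_to_ms() {
    echo $(($1 * 1000 / $2))
}
```

For the report above, jiffies_delta 1104ec2c3 1104ec8f0 gives 1581 jiffies, which would be about 1.6 seconds at HZ=1000 or about 6.3 seconds at HZ=250.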
Re: [E1000-devel] how to repair corrupted EEPROM/NVM?
On Fri, 2008-08-29 at 11:13 +0200, Pierre Ossman wrote:
> Brandeburg, Jesse [EMAIL PROTECTED] wrote:
>> You have contacted your laptop vendor and told them about this, right?
>
> I've tried, but they're currently giving me the whole "have you checked the cable" runaround. I'll see if I can get anyone on the phone today...
>
> No luck for a new EEPROM image. All they could do is replace the entire motherboard, which means they want to have access to the machine for two weeks. Not really a workable solution as I need it on a daily basis...

Yeah, I certainly understand that.

> I'll try to get my paws on that other R61 and copy its EEPROM. Is there anything other than the EEPROM that is specific to each machine? The machines are identical hardware-wise. I need to know what to change in his image before I burn it into my device.

The only change should be the MAC address; I would only mess with the first few bytes.

I'm concerned that you're actually having some other driver problem that isn't getting reported, like a weird semaphore issue or something. Can you build the e1000e-0.4.1.7.tar.gz driver from sourceforge and try it?

While you're doing that, please build (and install if you so choose) with this:

--- e1000_osdep.h~ 2008-08-20 15:03:54.0 -0700
+++ e1000_osdep.h 2008-08-29 09:02:05.0 -0700
@@ -63,8 +63,8 @@
 #define ETH_ADDR_LEN ETH_ALEN
-#define DEBUGOUT(S)
-#define DEBUGOUT1(S, A...)
+#define DEBUGOUT(S) printk(KERN_DEBUG S)
+#define DEBUGOUT1(S, A...) printk(KERN_DEBUG S, A)
 #define DEBUGFUNC(F) DEBUGOUT(F "\n")
 #define DEBUGOUT2 DEBUGOUT1

When you rebuild and load the driver it will log a ton of stuff; please send it in a reply.

The thing that most disturbed me is that you said your BIOS could still read the MAC address (if it wasn't pulling it from somewhere else). This indicates that it is likely your EEPROM is still there and intact.
Jesse
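Before changing anything in a copied EEPROM image, it may help to confirm which bytes hold the MAC address. The sketch below rests on an assumption that should be verified against the adapter's documentation: that on this adapter family the first six bytes of the EEPROM (the row at offset 0x0000 in an ethtool -e dump) are the MAC address. The helper name eeprom_mac is invented for illustration, and it only reads a dump; it does not write anything.

```shell
# Read `ethtool -e <iface>` output on stdin and print the first six
# bytes of the row at offset 0x0000, formatted like a MAC address.
# Assumes the dump layout "0x0000:  b0 b1 b2 ..." with space-separated bytes.
eeprom_mac() {
    awk '$1 == "0x0000:" {
        printf "%s:%s:%s:%s:%s:%s\n", $2, $3, $4, $5, $6, $7
        exit
    }'
}
```

Something like ethtool -e eth0 offset 0 length 16 | eeprom_mac can then be compared with the address printed on the machine's label or shown by the BIOS. If the two do not match, the assumption about the layout does not hold for that device and the image should not be edited on that basis.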
Re: [E1000-devel] Port showing link down at random times
On Wed, 2008-08-27 at 09:26 +1000, Leigh Sharpe wrote:
> Hi All,
>
> I'm having a problem with my e1000 cards intermittently shutting down. This is happening across multiple cards, on multiple systems. I get the following messages in the syslog at the time:
>
> Aug 26 22:58:36 ElizaQOS kernel: e1000: eth13: e1000_watchdog: NIC Link is Down
>
> ethtool and mii-tool both show a different status for the affected port:
>
> ---
> [EMAIL PROTECTED]:~$ sudo mii-tool -v eth13
> eth13: negotiated 100baseTx-FD flow-control, link ok
>   product info: vendor 00:50:43, model 2 rev 5
>   basic mode: autonegotiation enabled
>   basic status: autonegotiation complete, link ok
>   capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
>   advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
>   link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control

What does ethtool -i eth13 say? Also, can you tell us if you see the Intel AMT BIOS pre-boot screen come up? We have heard lots of reports of people having interaction problems with AMT, but believe the driver to have solved most of them now. ethtool -i will say, but you didn't report your driver version.

Would you be willing to try the e1000e driver from sourceforge? Version 0.4.1.7 would be the best. You would have to manually remove e1000 and install e1000e in its place; changing modprobe.conf is probably necessary.

Another option is to run a more recent kernel, but that is much harder to get set up than just upgrading our driver.
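Since the driver name and version are the first things asked for in most of these threads, a short helper for pulling them out of ethtool -i output may be handy when filing a report. This is a sketch only; the function name driver_and_version is made up, and it relies on the "driver:" and "version:" lines that ethtool -i prints.

```shell
# Read `ethtool -i <iface>` output on stdin and print "<driver> <version>".
# Matches the field names exactly, so "firmware-version" is not mistaken
# for "version".
driver_and_version() {
    awk -F': ' '
        $1 == "driver"  { d = $2 }
        $1 == "version" { v = $2 }
        END { print d " " v }'
}
```

For example, ethtool -i eth13 | driver_and_version. On recent upstream kernels the version line no longer carries a driver-specific number, so the kernel release from uname -r is the more useful thing to report there.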