Hi all,

I'm running a 3.10 kernel with the built-in ixgbe driver.
I have an 82599 device (eth5) that is being added to a bond (bond0), and there is an application which is trying to listen for multicast packets on 224.0.0.1 port 2106.

Somehow (and I'm not sure how yet) we get into a state where on bootup the multicast packets are being dropped by the hardware. Running "ethtool -S eth5|grep multicast" gives a constant zero count.

If I run "ip link set dev eth5 promisc off;ip link set dev eth5 promisc on" then the multicast stuff starts working. (Actually just the first command is enough to make it start working.)

Comparing the ethtool register dump for the 82599 between working and non-working systems I noticed that the MCSTCTRL registers were different--on the non-working system it was 0x0, on the working system it was 0x4.

I instrumented the ixgbe driver around the areas where we touch the MTA and MCSTCTRL registers and it showed the following on boot for the non-working system:

[ 11.457825] bonding: bond0: enslaving eth5 as a backup interface with a down link.
[ 11.476557] ixgbe 0000:05:00.1 eth5: Clearing MTA
[ 11.480451] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 11.486198] ixgbe 0000:05:00.1 eth5: bit-vector = 0x010
[ 11.490677] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 11.496418] ixgbe 0000:05:00.1 eth5: bit-vector = 0x010
[ 11.500892] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 11.506631] ixgbe 0000:05:00.1 eth5: bit-vector = 0x020
[ 11.511122] ixgbe 0000:05:00.1 eth5: ixgbe_update_mc_addr_list_generic, writing 4 to MCSTCTRL
[ 11.519189] ixgbe 0000:05:00.1 eth5: ixgbe_update_mc_addr_list_generic Complete
[ 11.526096] device eth5 entered promiscuous mode
[ 11.529958] ixgbe: ixgbe_set_rx_mode: promiscuous mode enable without vlan tags complete.
[ 11.545349] ixgbe 0000:05:00.1 eth5: changing MTU from 1500 to 9216
[ 11.668964] ixgbe 0000:05:00.1 eth5: ixgbe_init_rx_addrs_generic, writing 0 to MCSTCTRL
[ 11.676450] ixgbe 0000:05:00.1 eth5: Clearing MTA

When I ran "ip link set dev eth5 promisc off", I got the following:

[ 1984.953818] device eth5 left promiscuous mode
[ 1984.953842] ixgbe 0000:05:00.1 eth5: Clearing MTA
[ 1984.957745] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957748] ixgbe 0000:05:00.1 eth5: bit-vector = 0x010
[ 1984.957749] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957751] ixgbe 0000:05:00.1 eth5: bit-vector = 0x010
[ 1984.957753] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957755] ixgbe 0000:05:00.1 eth5: bit-vector = 0x020
[ 1984.957757] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957759] ixgbe 0000:05:00.1 eth5: bit-vector = 0x39B
[ 1984.957760] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957769] ixgbe 0000:05:00.1 eth5: bit-vector = 0xFB0
[ 1984.957771] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957772] ixgbe 0000:05:00.1 eth5: bit-vector = 0x020
[ 1984.957805] ixgbe 0000:05:00.1 eth5: ixgbe_update_mc_addr_list_generic, writing 4 to MCSTCTRL
[ 1984.957806] ixgbe 0000:05:00.1 eth5: ixgbe_update_mc_addr_list_generic Complete
[ 1984.957808] ixgbe: ixgbe_set_rx_mode: promiscuous mode disabled.

Interestingly, in both the working and non-working systems, after initial boot "ethtool -d eth5" shows:

Unicast Promiscuous: disabled
Multicast Promiscuous: disabled

In both cases "ip link" shows:

eth5: <BROADCAST,MULTICAST,PROMISC,SLAVE,UP,LOWER_UP>

so why doesn't the hardware show them as "enabled"?

On the non-working system, if I run "ip link set dev eth5 promisc off;ip link set dev eth5 promisc on" then the multicast starts working and "ethtool -d eth5" shows:

Unicast Promiscuous: enabled
Multicast Promiscuous: enabled

So...anyone got any ideas?
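
For reference, the bit-vector values in the instrumented logs look consistent with a 12-bit hash taken over the last two bytes of each multicast MAC address, which then selects one bit in the MTA. Below is a minimal Python sketch of that hash, modelled on a reading of ixgbe_mta_vector() and the MTA update code. The helper names (parse_mac, mta_vector, mta_slot, decode_mcstctrl) are made up for illustration, and the register layout (filter-enable as bit 2 of MCSTCTRL, filter type in the low two bits) is an assumption that has not been checked against the 3.10 tree.

```python
# Illustrative sketch only; bit positions are assumptions, not verified driver facts.
MCSTCTRL_MFE = 0x4       # assumed: multicast filter enable bit
MCSTCTRL_MO_MASK = 0x3   # assumed: filter type ("multicast offset") field


def parse_mac(text):
    """Turn 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    return bytes(int(part, 16) for part in text.split(":"))


def mta_vector(mac, filter_type=0):
    """12-bit hash over the last two MAC bytes, selected by filter type."""
    # (right shift applied to byte 4, left shift applied to byte 5)
    shifts = {0: (4, 4), 1: (3, 5), 2: (2, 6), 3: (0, 8)}
    right, left = shifts[filter_type]
    return ((mac[4] >> right) | (mac[5] << left)) & 0xFFF


def mta_slot(vector):
    """Map a hash to (32-bit MTA register index, bit within that register)."""
    return (vector >> 5) & 0x7F, vector & 0x1F


def decode_mcstctrl(value):
    """Split a MCSTCTRL value into the two fields assumed above."""
    return {"mfe": bool(value & MCSTCTRL_MFE),
            "filter_type": value & MCSTCTRL_MO_MASK}


if __name__ == "__main__":
    for mac in ("01:00:5e:00:00:01", "33:33:00:00:00:02", "01:00:5e:00:00:fb"):
        vec = mta_vector(parse_mac(mac))
        print(mac, hex(vec), mta_slot(vec))
    print(decode_mcstctrl(0x4), decode_mcstctrl(0x0))
```

With filter type 0, 01:00:5e:00:00:01 (the Ethernet mapping of 224.0.0.1) hashes to 0x010 and 01:00:5e:00:00:fb hashes to 0xFB0, both of which match values in the logs. Under the same assumptions, a MCSTCTRL value of 0x4 means the hash filter is enabled with filter type 0, while 0x0 means the filter-enable bit is clear.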
I'm confused why the MULTICAST/PROMISC flags aren't represented by the underlying hardware initially.

I'm also confused why, once the slave is part of the bond, registering for multicast membership doesn't seem to call ixgbe_update_mc_addr_list_generic(). (How does that work anyway? I'm having a hard time figuring out how setsockopt(IP_ADD_MEMBERSHIP) actually adds multicast addresses to the hardware table.)

Is it possible that there's a mismatch between what the hardware actually has enabled and what the networking core thinks is enabled?

Lastly, I've had one report of a similar issue with hardware using the tg3 driver, and running "ip link set dev eth5 promisc off;ip link set dev eth5 promisc on" cleared the issue there as well. If that's repeatable it would seem to point at something above the hardware driver layer as the culprit.

Chris
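
For reference, here is a minimal Python sketch of the userspace side of the setsockopt(IP_ADD_MEMBERSHIP) question, together with the standard IPv4-to-Ethernet multicast mapping (low 23 bits of the group placed under the 01:00:5e prefix). That mapped MAC is the kind of address a driver would be asked to filter on. The function names are hypothetical, and the kernel path from the socket option down to the driver's rx-mode callback is not shown, since it is exactly what is in question here.

```python
import socket
import struct


def ipv4_group_to_mac(group):
    """Map an IPv4 multicast group to its Ethernet multicast MAC address."""
    addr = struct.unpack("!I", socket.inet_aton(group))[0]
    if (addr >> 28) != 0xE:
        raise ValueError("not an IPv4 multicast address: %s" % group)
    low23 = addr & 0x7FFFFF  # only the low 23 bits survive the mapping
    mac = bytes([0x01, 0x00, 0x5E,
                 (low23 >> 16) & 0x7F, (low23 >> 8) & 0xFF, low23 & 0xFF])
    return ":".join("%02x" % b for b in mac)


def join_group(group, port, ifaddr="0.0.0.0"):
    """Open a UDP socket, bind it to the port, and join the multicast group.

    ifaddr selects the local interface by address; 0.0.0.0 leaves the
    choice to the kernel.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # struct ip_mreq: group address followed by local interface address
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(ifaddr))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock


if __name__ == "__main__":
    print(ipv4_group_to_mac("224.0.0.1"))
    listener = join_group("224.0.0.1", 2106)
    data, sender = listener.recvfrom(2048)
    print(sender, len(data))
```

Running the listener while watching the driver instrumentation would show whether a join on the bond results in a call to ixgbe_update_mc_addr_list_generic() on the slave.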