Hi all,
I'm running a 3.10 kernel with the builtin ixgbe driver.
I have an 82599 device (eth5) that is being added to a bond (bond0), and there
is an application which is trying to listen for multicast packets on 224.0.0.1
port 2106.
Somehow (and I'm not sure how yet) we get into a state where on bootup the
multicast packets are being dropped by the hardware. Running "ethtool -S
eth5|grep multicast" gives a constant zero count.
If I run "ip link set dev eth5 promisc off;ip link set dev eth5 promisc on"
then the multicast stuff starts working. (Actually just the first command is
enough to make it start working.)
Comparing the ethtool register dump for the 82599 between working and
non-working systems, I noticed that the MCSTCTRL register was different: on
the non-working system it was 0x0, on the working system it was 0x4.
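For reference, the two values can be decoded with a few lines of Python. This is only a sketch: it assumes the MCSTCTRL layout as I read it from the ixgbe headers (bits 1:0 are the multicast offset that selects the hash bits, bit 2 is the multicast filter enable, IXGBE_MCSTCTRL_MFE = 0x4), and decode_mcstctrl is a made-up helper name, not anything from the driver.

```python
# Assumed layout of the 82599 MCSTCTRL register (check against ixgbe_type.h
# and the datasheet before relying on it):
MCSTCTRL_MO_MASK = 0x3   # bits 1:0, multicast offset (selects the hash bits)
MCSTCTRL_MFE = 0x4       # bit 2, multicast filter enable


def decode_mcstctrl(value):
    """Split a raw MCSTCTRL value into its (assumed) fields."""
    return {
        "filter_type": value & MCSTCTRL_MO_MASK,
        "mfe": bool(value & MCSTCTRL_MFE),
    }


if __name__ == "__main__":
    # The two values seen in the register dumps.
    for raw in (0x0, 0x4):
        print(hex(raw), decode_mcstctrl(raw))
```

If that reading is right, 0x0 means the multicast table array is not consulted at all, which would fit the hardware dropping multicast frames no matter what is in the MTA.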
I instrumented the ixgbe driver around the areas where we touch the MTA and
MCSTCTRL registers and it showed the following on boot for the non-working
system:
[ 11.457825] bonding: bond0: enslaving eth5 as a backup interface with a down link.
[ 11.476557] ixgbe 0000:05:00.1 eth5: Clearing MTA
[ 11.480451] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 11.486198] ixgbe 0000:05:00.1 eth5: bit-vector = 0x010
[ 11.490677] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 11.496418] ixgbe 0000:05:00.1 eth5: bit-vector = 0x010
[ 11.500892] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 11.506631] ixgbe 0000:05:00.1 eth5: bit-vector = 0x020
[ 11.511122] ixgbe 0000:05:00.1 eth5: ixgbe_update_mc_addr_list_generic, writing 4 to MCSTCTRL
[ 11.519189] ixgbe 0000:05:00.1 eth5: ixgbe_update_mc_addr_list_generic Complete
[ 11.526096] device eth5 entered promiscuous mode
[ 11.529958] ixgbe: ixgbe_set_rx_mode: promiscuous mode enable without vlan tags complete.
[ 11.545349] ixgbe 0000:05:00.1 eth5: changing MTU from 1500 to 9216
[ 11.668964] ixgbe 0000:05:00.1 eth5: ixgbe_init_rx_addrs_generic, writing 0 to MCSTCTRL
[ 11.676450] ixgbe 0000:05:00.1 eth5: Clearing MTA
When I ran "ip link set dev eth5 promisc off", I got the following:
[ 1984.953818] device eth5 left promiscuous mode
[ 1984.953842] ixgbe 0000:05:00.1 eth5: Clearing MTA
[ 1984.957745] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957748] ixgbe 0000:05:00.1 eth5: bit-vector = 0x010
[ 1984.957749] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957751] ixgbe 0000:05:00.1 eth5: bit-vector = 0x010
[ 1984.957753] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957755] ixgbe 0000:05:00.1 eth5: bit-vector = 0x020
[ 1984.957757] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957759] ixgbe 0000:05:00.1 eth5: bit-vector = 0x39B
[ 1984.957760] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957769] ixgbe 0000:05:00.1 eth5: bit-vector = 0xFB0
[ 1984.957771] ixgbe 0000:05:00.1 eth5: Adding the multicast addresses:
[ 1984.957772] ixgbe 0000:05:00.1 eth5: bit-vector = 0x020
[ 1984.957805] ixgbe 0000:05:00.1 eth5: ixgbe_update_mc_addr_list_generic, writing 4 to MCSTCTRL
[ 1984.957806] ixgbe 0000:05:00.1 eth5: ixgbe_update_mc_addr_list_generic Complete
[ 1984.957808] ixgbe: ixgbe_set_rx_mode: promiscuous mode disabled.
[ 1984.957808] ixgbe: ixgbe_set_rx_mode: promiscuous mode disabled.
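To relate the "bit-vector" values in these logs to actual addresses, here is a rough sketch of the hash as I understand it from ixgbe_mta_vector() for filter type 0 (12 bits taken from the last two bytes of the MAC). The helper names are invented for illustration, the other filter types are left out, and the formula should be checked against the driver source rather than taken from this sketch.

```python
import ipaddress


def ipv4_mcast_to_mac(group):
    """Map an IPv4 multicast group to its Ethernet MAC (RFC 1112):
    01:00:5e followed by the low 23 bits of the group address."""
    low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF
    return bytes([0x01, 0x00, 0x5E,
                  (low23 >> 16) & 0x7F, (low23 >> 8) & 0xFF, low23 & 0xFF])


def mta_vector(mac, filter_type=0):
    """12-bit MTA hash; only filter type 0 (address bits 47:36) is sketched."""
    if filter_type != 0:
        raise NotImplementedError("only filter type 0 is sketched here")
    return ((mac[4] >> 4) | (mac[5] << 4)) & 0xFFF


def mta_position(vector):
    """Return (register index, bit index) within the 128 x 32-bit MTA."""
    return (vector >> 5) & 0x7F, vector & 0x1F


if __name__ == "__main__":
    for group in ("224.0.0.1", "224.0.0.251"):
        vec = mta_vector(ipv4_mcast_to_mac(group))
        print(group, "0x%03X" % vec, mta_position(vec))
```

Under these assumptions 224.0.0.1 hashes to 0x010, which matches the entry present in both logs, and 0xFB0 would be consistent with 224.0.0.251. Different addresses can share a vector (33:33:00:00:00:01 also gives 0x010 with this formula), so the logged values alone do not identify the addresses.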
Interestingly, in both the working and non-working systems, after initial boot
"ethtool -d eth5" shows:
Unicast Promiscuous: disabled
Multicast Promiscuous: disabled
In both cases "ip link" shows:
eth5: <BROADCAST,MULTICAST,PROMISC,SLAVE,UP,LOWER_UP>
so why doesn't the hardware show them as "enabled"?
On the non-working system, if I run "ip link set dev eth5 promisc off;ip link
set dev eth5 promisc on" then the multicast starts working and "ethtool -d
eth5" shows:
Unicast Promiscuous: enabled
Multicast Promiscuous: enabled
So...anyone got any ideas? I'm confused why the MULTICAST/PROMISC flags aren't
reflected in the underlying hardware initially. I'm also confused why, once the
slave is part of the bond, registering for multicast membership doesn't seem to
call ixgbe_update_mc_addr_list_generic(). (How does that work anyway? I'm
having a hard time figuring out how setsockopt(IP_ADD_MEMBERSHIP) actually
adds multicast addresses to the hardware table.)
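For anyone who wants to reproduce this without the application, the join can be sketched in a few lines of Python. The group and port are the ones mentioned above; build_mreq and open_listener are hypothetical helper names, and passing 0.0.0.0 as the local address simply leaves the interface choice to the kernel (on a real setup you would probably pass the bond0 address instead).

```python
import socket

GROUP = "224.0.0.1"   # group and port taken from the description above
PORT = 2106


def build_mreq(group, local_ip="0.0.0.0"):
    """Pack a struct ip_mreq: group address followed by local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(local_ip)


def open_listener(group=GROUP, port=PORT, local_ip="0.0.0.0"):
    """Bind a UDP socket and join the multicast group on the given interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, local_ip))
    return sock


if __name__ == "__main__":
    s = open_listener()
    data, sender = s.recvfrom(65535)   # blocks until a datagram arrives
    print(len(data), "bytes from", sender)
```

Running this while watching dmesg with the instrumented driver should show whether the join reaches ixgbe_update_mc_addr_list_generic() on the slave.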
Is it possible that there's a mismatch between what the hardware actually has
enabled and what the networking core thinks is enabled?
Lastly, I've had one report of a similar issue with hardware using the tg3
driver, and running "ip link set dev eth5 promisc off;ip link set dev eth5
promisc on" cleared the issue there as well. If that's repeatable it would
seem to point at something above the hardware driver layer as the culprit.
Chris
------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
https://lists.sourceforge.net/lists/listinfo/e1000-devel