Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
Dave Johnson writes:
> Ben Greear writes:
> > Currently, VLAN devices offer the ability to 'reorder' the header and explicitly remove the VLAN header. I assume we keep this feature and have the AF_PACKET logic check the device flags to see if it should insert the VLAN header for hw-accel vlans? Either way, if we sniff the underlying device, we should always get the VLAN header.
>
> Yes, but it's more than just a packet socket issue.
>
> A quick look through the hwaccel capable drivers (in 2.6.23) and most are doing something like:
>
>     if (foo->vlgrp && packet_is_tagged)
>         vlan_hwaccel_receive_skb(skb, foo->vlgrp, vlan_tag);
>     else
>         netif_receive_skb(skb);
>
> The important thing here is if the vlan group is NULL, the MAC must be configured to NOT strip the tag.
>
> users of NETIF_F_HW_VLAN_RX:
> ---
> ./drivers/net/8139cp.c: looks ok
> ./drivers/net/acenic.c: *1
> ./drivers/net/amd8111e.c: unsure, probably *1
> ./drivers/net/atl1/atl1_main.c: looks ok
> ./drivers/net/bnx2.c: *2
> ./drivers/net/bonding/bond_main.c: unsure, probably ok
> ./drivers/net/chelsio/cxgb2.c: looks ok
> ./drivers/net/cxgb3/cxgb3_main.c: looks ok
> ./drivers/net/e1000/e1000_main.c: looks ok
> ./drivers/net/ehea/ehea_main.c: unsure, probably ok
> ./drivers/net/forcedeth.c: looks ok
> ./drivers/net/gianfar.c: looks ok
> ./drivers/net/ixgb/ixgb_main.c: looks ok
> ./drivers/net/ns83820.c: unsure, probably ok
> ./drivers/net/r8169.c: looks ok
> ./drivers/net/s2io.c: *1
> ./drivers/net/sky2.c: looks ok
> ./drivers/net/starfire.c: unsure, probably ok
> ./drivers/net/tg3.c: *2
> ./drivers/net/typhoon.c: unsure, probably ok
> ./drivers/s390/net/qeth_main.c: unsure, probably ok
>
> *1: Driver configures the MAC to strip TAGs even if vlan group is NULL. MAC strips the tag, but driver calls netif_rx() or netif_receive_skb() with the packet as untagged. Kernel processes tagged packet as if it was received untagged. Possible security issue.
>
> *2: If chip supports 'ASF', tag is always stripped (see *1 above). Looks ok if ASF is not supported.

Michael,

These changes seem to cause this issue:

    [BNX2]: Fix VLAN on ASF

    Always set up the device to strip incoming VLAN tags when ASF is enabled. ASF firmware will not parse packets correctly if VLAN tags are not stripped.

    Signed-off-by: Michael Chan [EMAIL PROTECTED]
    Signed-off-by: David S. Miller [EMAIL PROTECTED]

    GIT: e29054f92d7d575631691865c1b95bee5bc974cc

and

    [EMAIL PROTECTED], 2003-12-02 02:34:13-08:00, [EMAIL PROTECTED] +1 -0
    [TG3]: Do not set RX_MODE_KEEP_VLAN_TAG when ASF is enabled.

Could you elaborate on whether this is really needed and, if so, whether there is some workaround that could be done instead?

Simply removing the check seemed to work for me, but I'm unsure if this is actually a valid thing to do with these MACs.

--
Dave Johnson
Starent Networks

-
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
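The dispatch pattern and the *1 failure mode described in this message can be modelled in user space. What follows is only a sketch for illustration, not kernel code: `rx_path`, `classify_rx` and the `mac_strips_tag` flag are hypothetical names that do not exist in any driver, and the vlan group is represented by an opaque pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome labels; none of these names exist in the kernel. */
enum rx_path {
    RX_HWACCEL,         /* driver would call vlan_hwaccel_receive_skb() */
    RX_NORMAL,          /* driver would call netif_receive_skb(), frame intact */
    RX_NORMAL_TAG_LOST  /* netif_receive_skb() on a frame whose tag the MAC removed */
};

/*
 * Model of the receive dispatch quoted above.
 *   vlgrp            stand-in for the driver's vlan group (NULL if none registered)
 *   packet_is_tagged the frame arrived on the wire with an 802.1Q tag
 *   mac_strips_tag   the MAC is configured to strip tags on receive
 */
static enum rx_path classify_rx(const void *vlgrp, bool packet_is_tagged,
                                bool mac_strips_tag)
{
    if (vlgrp != NULL && packet_is_tagged)
        return RX_HWACCEL;
    if (packet_is_tagged && mac_strips_tag)
        return RX_NORMAL_TAG_LOST;  /* the "*1" case: tagged traffic looks untagged */
    return RX_NORMAL;
}
```

In this model the only problematic combination is a NULL vlan group together with a MAC that still strips tags, which is the situation flagged as *1 (and *2 when ASF is enabled).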
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
On Fri, 2007-11-02 at 14:08 -0400, Dave Johnson wrote:
> *2: If chip supports 'ASF', tag is always stripped (see *1 above). Looks ok if ASF is not supported.
>
> Michael,
>
> These changes seem to cause this issue:
>
>     [BNX2]: Fix VLAN on ASF
>
>     Always set up the device to strip incoming VLAN tags when ASF is enabled. ASF firmware will not parse packets correctly if VLAN tags are not stripped.
>
>     Signed-off-by: Michael Chan [EMAIL PROTECTED]
>     Signed-off-by: David S. Miller [EMAIL PROTECTED]
>
>     GIT: e29054f92d7d575631691865c1b95bee5bc974cc
>
> and
>
>     [EMAIL PROTECTED], 2003-12-02 02:34:13-08:00, [EMAIL PROTECTED] +1 -0
>     [TG3]: Do not set RX_MODE_KEEP_VLAN_TAG when ASF is enabled.
>
> Could you elaborate on whether this is really needed and, if so, whether there is some workaround that could be done instead?

This is needed for management firmware to work properly. Management firmware expects any VLAN tags to be stripped. Unfortunately, VLAN stripping cannot be configured independently for the driver and the firmware. The workaround is to disable management firmware.

> Simply removing the check seemed to work for me, but I'm unsure if this is actually a valid thing to do with these MACs.

Most of these on-board devices are shipped with management firmware enabled. Removing the check will make the firmware non-functional.

We realize this VLAN limitation is causing problems for many users and we are looking for ways to address it.
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
From: Dave Johnson [EMAIL PROTECTED]
Date: Fri, 2 Nov 2007 14:08:32 -0400

> These changes seem to cause this issue:
>
>     [BNX2]: Fix VLAN on ASF
>
>     Always set up the device to strip incoming VLAN tags when ASF is enabled. ASF firmware will not parse packets correctly if VLAN tags are not stripped.
>
>     Signed-off-by: Michael Chan [EMAIL PROTECTED]
>     Signed-off-by: David S. Miller [EMAIL PROTECTED]
>
>     GIT: e29054f92d7d575631691865c1b95bee5bc974cc
>
> and
>
>     [EMAIL PROTECTED], 2003-12-02 02:34:13-08:00, [EMAIL PROTECTED] +1 -0
>     [TG3]: Do not set RX_MODE_KEEP_VLAN_TAG when ASF is enabled.
>
> Could you elaborate on whether this is really needed and, if so, whether there is some workaround that could be done instead?
>
> Simply removing the check seemed to work for me, but I'm unsure if this is actually a valid thing to do with these MACs.

Unfortunately the ASF firmware is very picky. I think we are stuck with this behavior.
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
David Miller wrote:
> From: Stephen Hemminger [EMAIL PROTECTED]
> Date: Wed, 31 Oct 2007 18:23:37 -0700
>
> > The code in AF_PACKET should fix the skb before passing to user space so that there is no difference between accel and non-accel hardware. Internal choices shouldn't leak to user space. Ditto, the receive checksum offload should be fixed up as well.
>
> The hardware has stripped the VLAN header completely and has not provided it to us at all.

Do the NICs not save the QoS bits in the VLAN header anywhere that we could use to reconstitute the header?

Thanks,
Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc http://www.candelatech.com
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
From: Ben Greear [EMAIL PROTECTED]
Date: Thu, 01 Nov 2007 08:04:31 -0700

> David Miller wrote:
> > The hardware has stripped the VLAN header completely and has not provided it to us at all.
>
> Do the NICs not save the QoS bits in the VLAN header anywhere that we could use to reconstitute the header?

You get the 16-bit VLAN tag.
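For reference, the 16-bit tag (the 802.1Q Tag Control Information) carries the QoS bits asked about: 3 bits of priority, 1 CFI bit, and a 12-bit VLAN ID. A minimal sketch of the decoding, where `vlan_tci_fields` and `decode_tci` are illustrative names and not kernel identifiers:

```c
#include <stdint.h>

/* Illustrative container for the three fields of an 802.1Q TCI. */
struct vlan_tci_fields {
    uint8_t  priority;  /* 3-bit priority (the QoS bits) */
    uint8_t  cfi;       /* 1-bit canonical format indicator */
    uint16_t vid;       /* 12-bit VLAN ID */
};

/* Split a host-order 16-bit TCI into its fields. */
static struct vlan_tci_fields decode_tci(uint16_t tci)
{
    struct vlan_tci_fields f;

    f.priority = (uint8_t)(tci >> 13);          /* top 3 bits */
    f.cfi      = (uint8_t)((tci >> 12) & 0x1);  /* next bit */
    f.vid      = (uint16_t)(tci & 0x0fff);      /* low 12 bits */
    return f;
}
```

A frame that tcpdump would print as "vlan 101, p 0" corresponds to a TCI of 0x0065 under this layout.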
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
Ben Greear writes:
> We should also define what a NIC should do with VLANs it doesn't explicitly know about. I think it should pass them up the stack with VLAN tag intact, but again, perhaps there are reasons not to do that?

Unless the device also supports NETIF_F_HW_VLAN_FILTER, it has no idea which vlans the kernel cares about; it's up to __vlan_hwaccel_rx().

Stephen Hemminger writes:
> The code in AF_PACKET should fix the skb before passing to user space so that there is no difference between accel and non-accel hardware. Internal choices shouldn't leak to user space. Ditto, the receive checksum offload should be fixed up as well.

yep. bad csum on tx packets as reported by tcpdump is also an issue.

Ben Greear writes:
> Currently, VLAN devices offer the ability to 'reorder' the header and explicitly remove the VLAN header. I assume we keep this feature and have the AF_PACKET logic check the device flags to see if it should insert the VLAN header for hw-accel vlans? Either way, if we sniff the underlying device, we should always get the VLAN header.

Yes, but it's more than just a packet socket issue.

A quick look through the hwaccel capable drivers (in 2.6.23) and most are doing something like:

    if (foo->vlgrp && packet_is_tagged)
        vlan_hwaccel_receive_skb(skb, foo->vlgrp, vlan_tag);
    else
        netif_receive_skb(skb);

The important thing here is if the vlan group is NULL, the MAC must be configured to NOT strip the tag.

users of NETIF_F_HW_VLAN_RX:
---
./drivers/net/8139cp.c: looks ok
./drivers/net/acenic.c: *1
./drivers/net/amd8111e.c: unsure, probably *1
./drivers/net/atl1/atl1_main.c: looks ok
./drivers/net/bnx2.c: *2
./drivers/net/bonding/bond_main.c: unsure, probably ok
./drivers/net/chelsio/cxgb2.c: looks ok
./drivers/net/cxgb3/cxgb3_main.c: looks ok
./drivers/net/e1000/e1000_main.c: looks ok
./drivers/net/ehea/ehea_main.c: unsure, probably ok
./drivers/net/forcedeth.c: looks ok
./drivers/net/gianfar.c: looks ok
./drivers/net/ixgb/ixgb_main.c: looks ok
./drivers/net/ns83820.c: unsure, probably ok
./drivers/net/r8169.c: looks ok
./drivers/net/s2io.c: *1
./drivers/net/sky2.c: looks ok
./drivers/net/starfire.c: unsure, probably ok
./drivers/net/tg3.c: *2
./drivers/net/typhoon.c: unsure, probably ok
./drivers/s390/net/qeth_main.c: unsure, probably ok

*1: Driver configures the MAC to strip TAGs even if vlan group is NULL. MAC strips the tag, but driver calls netif_rx() or netif_receive_skb() with the packet as untagged. Kernel processes tagged packet as if it was received untagged. Possible security issue.

*2: If chip supports 'ASF', tag is always stripped (see *1 above). Looks ok if ASF is not supported.

Ben Greear writes:
> Do the NICs not save the QoS bits in the VLAN header anywhere that we could use to reconstitute the header?

Most likely, __vlan_hwaccel_rx() gets the whole 16bit tag and sets skb->priority from it.

Besides the accidental removal with the drivers listed above when there is no vlan group registered, we're still back to the original issue.

Having __vlan_hwaccel_rx() send to the base device would likely require a copy of the skb (at least the head). That completely defeats the point of hwaccel.

At a minimum, __vlan_hwaccel_rx() should probably add the vlan header back on if it doesn't find a vlan device. This way no copy is needed, just shove the header back on (we already have the full 16bits). Once re-added, send to the base device instead of dropping.

That would fix the unknown vlan issue, but known vlans would only go to the vlan device, not the base device. Not sure of an easy fix for this as af_packet can specifically bind to a specified base device. I don't think this would be much of an issue and it probably doesn't need fixing.

--
Dave Johnson
Starent Networks
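The "shove the header back on" idea can be illustrated on a plain byte buffer. This is a user-space sketch under stated assumptions, not the kernel change being proposed: `push_vlan_tag` and `example_push` are hypothetical names, and the caller is assumed to have 4 bytes of writable headroom in front of the frame (which is what lets the payload stay where it is).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDRS_LEN 12  /* destination MAC + source MAC */
#define VLAN_TAG_LEN   4  /* TPID (0x8100) + 16-bit TCI */

/*
 * Re-insert an 802.1Q tag into an untagged Ethernet frame.
 * `frame` points at the destination MAC; the caller guarantees at least
 * VLAN_TAG_LEN writable bytes immediately before it. Only the 12 address
 * bytes are moved; the ethertype and payload are not copied.
 * Returns the new start of the frame, 4 bytes earlier.
 */
static uint8_t *push_vlan_tag(uint8_t *frame, uint16_t tci)
{
    uint8_t *start;

    assert(frame != NULL);
    start = frame - VLAN_TAG_LEN;
    memmove(start, frame, ETH_ADDRS_LEN);  /* slide the MAC addresses forward */
    start[12] = 0x81;                      /* TPID, network byte order */
    start[13] = 0x00;
    start[14] = (uint8_t)(tci >> 8);       /* TCI, network byte order */
    start[15] = (uint8_t)(tci & 0xff);
    return start;
}

/* Usage: tag a broadcast ARP header with VLAN 101; returns 1 on success. */
static int example_push(void)
{
    uint8_t buf[VLAN_TAG_LEN + 14] = {0};
    uint8_t *frame = buf + VLAN_TAG_LEN;   /* leave headroom in front */
    uint8_t *tagged;

    memset(frame, 0xff, 6);                /* broadcast destination */
    frame[12] = 0x08;                      /* ethertype ARP (0x0806) */
    frame[13] = 0x06;
    tagged = push_vlan_tag(frame, 101);
    return tagged == buf &&
           tagged[12] == 0x81 && tagged[13] == 0x00 &&
           tagged[14] == 0x00 && tagged[15] == 101 &&
           tagged[16] == 0x08 && tagged[17] == 0x06;
}
```

Moving the 12 address bytes rather than the payload is the design choice that matches the "no copy is needed" argument; whether a given skb has that headroom available is a separate question the sketch does not address.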
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
> > The code in AF_PACKET should fix the skb before passing to user space so that there is no difference between accel and non-accel hardware. Internal choices shouldn't leak to user space. Ditto, the receive checksum offload should be fixed up as well.
>
> yep. bad csum on tx packets as reported by tcpdump is also an issue.

With TX CKO enabled, there isn't any checksum to fix up when a tx packet is sniffed, so I'm not sure what can be done in the kernel apart from an unpalatable "disable CKO and all which depend upon it when entering promiscuous mode."

Having the tap calculate a checksum would be equally bad for performance, and would frankly be incorrect anyway because it would give the user the false impression that it was the checksum which went out onto the wire.

One could I suppose try to amend the information passed to allow tcpdump to say "oh, this was a tx packet on the same machine on which I am tracing so don't worry about checksum mismatch" but I have to wonder if it is _really_ worth it. Already someone has to deal with seeing TCP segments > the MSS thanks to TSO.

(Actually tcpdump got rather confused about that too since the IP length of those was 0, but IIRC we got that patched to use the length of zero as an "ah, this was TSO so wing it" heuristic.)

rick jones
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
From: Dave Johnson [EMAIL PROTECTED]
Date: Thu, 1 Nov 2007 17:36:22 -0400

> bad csum on tx packets as reported by tcpdump is also an issue.

We provide a tag to userspace that tcpdump should use to see that the HW is going to checksum the packet, and therefore it should elide trying to verify the checksums.

It's not a kernel issue.
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
From: Rick Jones [EMAIL PROTECTED]
Date: Thu, 01 Nov 2007 14:48:45 -0700

> One could I suppose try to amend the information passed to allow tcpdump to say "oh, this was a tx packet on the same machine on which I am tracing so don't worry about checksum mismatch"

We do this already!
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
David Miller wrote:
> From: Rick Jones [EMAIL PROTECTED]
> Date: Thu, 01 Nov 2007 14:48:45 -0700
>
> > One could I suppose try to amend the information passed to allow tcpdump to say "oh, this was a tx packet on the same machine on which I am tracing so don't worry about checksum mismatch"
>
> We do this already!

I'll try to go pester folks in tcpdump-workers then.

rick
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
From: Rick Jones [EMAIL PROTECTED]
Date: Thu, 01 Nov 2007 15:04:12 -0700

> David Miller wrote:
> > From: Rick Jones [EMAIL PROTECTED]
> > Date: Thu, 01 Nov 2007 14:48:45 -0700
> >
> > > One could I suppose try to amend the information passed to allow tcpdump to say "oh, this was a tx packet on the same machine on which I am tracing so don't worry about checksum mismatch"
> >
> > We do this already!
>
> I'll try to go pester folks in tcpdump-workers then.

The thing to check is TP_STATUS_CSUMNOTREADY.

When using mmap(), it will be provided in the descriptor. When using recvmsg() it will be provided via a PACKET_AUXDATA control message when enabled via the PACKET_AUXDATA socket option.
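For the recvmsg() case, the check could look roughly like the sketch below. The socket option, control-message type, `struct tpacket_auxdata` and TP_STATUS_CSUMNOTREADY come from <linux/if_packet.h>; the helper names `enable_auxdata`, `csum_not_ready` and `example_flagged` are made up here, and `example_flagged` builds a control buffer by hand only so the parsing can be exercised without a raw socket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263
#endif

/* Ask the kernel to attach a PACKET_AUXDATA control message to each packet. */
static int enable_auxdata(int fd)
{
    int one = 1;

    return setsockopt(fd, SOL_PACKET, PACKET_AUXDATA, &one, sizeof(one));
}

/*
 * Inspect the control messages of a msghdr filled in by recvmsg().
 * Returns 1 if TP_STATUS_CSUMNOTREADY is set, 0 if it is clear,
 * and -1 if no PACKET_AUXDATA control message is present.
 */
static int csum_not_ready(struct msghdr *msg)
{
    struct cmsghdr *cmsg;

    assert(msg != NULL);
    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        struct tpacket_auxdata aux;

        if (cmsg->cmsg_level != SOL_PACKET || cmsg->cmsg_type != PACKET_AUXDATA)
            continue;
        if (cmsg->cmsg_len < CMSG_LEN(sizeof(aux)))
            continue;
        memcpy(&aux, CMSG_DATA(cmsg), sizeof(aux));
        return (aux.tp_status & TP_STATUS_CSUMNOTREADY) ? 1 : 0;
    }
    return -1;
}

/* Self-check without a socket: hand-build one auxdata control message. */
static int example_flagged(unsigned int status)
{
    union {
        char buf[CMSG_SPACE(sizeof(struct tpacket_auxdata))];
        struct cmsghdr align;  /* forces suitable alignment of buf */
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    struct tpacket_auxdata aux;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    memset(&aux, 0, sizeof(aux));
    aux.tp_status = status;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_PACKET;
    cmsg->cmsg_type = PACKET_AUXDATA;
    cmsg->cmsg_len = CMSG_LEN(sizeof(aux));
    memcpy(CMSG_DATA(cmsg), &aux, sizeof(aux));
    return csum_not_ready(&msg);
}
```

A capture tool would call `enable_auxdata()` once on its packet socket, pass a control buffer to recvmsg(), and skip checksum verification for any packet where `csum_not_ready()` returns 1.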
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
David Miller wrote:
> From: Rick Jones [EMAIL PROTECTED]
>
> > I'll try to go pester folks in tcpdump-workers then.
>
> The thing to check is TP_STATUS_CSUMNOTREADY.
>
> When using mmap(), it will be provided in the descriptor. When using recvmsg() it will be provided via a PACKET_AUXDATA control message when enabled via the PACKET_AUXDATA socket option.

Figures... the dailies and weeklies for tar files of tcpdump and libpcap source are fubar... again. I've sent email to tcpdump-workers about that one. If that isn't resolved quickly I'll learn how to access their CVS (pick an SCM, any SCM...)

I did an apt-get of debian lenny's tcpdump and sources:

    hpcpc103:~# tcpdump -V
    tcpdump version 3.9.8
    libpcap version 0.9.8

and that seems to show the false checksum failure and not use TP_STATUS_CSUMNOTREADY; at least that didn't appear in a grepping of the sources. At first I thought it might be using it, but then I realized that my snaplen was too short to get the whole TSO'ed frame, so tcpdump wasn't even trying to verify. After disabling TSO on the NIC, leaving CKO on, and making my snaplen 1500, I could see it was doing undesirable stuff.

I'll see what top of trunk has at some point and what the folks there think of adding in a change.

rick jones
expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
Depending on the network driver, I'm seeing different behavior if a .1q packet is received on a PF_PACKET, SOCK_RAW, ETH_P_ALL socket.

On devices that do not use NETIF_F_HW_VLAN_RX, the packet socket gets the complete packet with vlan tag included as the driver simply calls netif_receive_skb() or equivalent. packet_rcv() then gets the whole thing, vlan tag included, and sends this through the socket. vlan_skb_recv() also gets these all and will drop them because there are no vlans configured.

Example, e100 driver gives this to tcpdump:

# ifconfig eth1 up
# tcpdump -s 2000 -e -n -i eth1
tcpdump: WARNING: eth1: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 2000 bytes
14:11:03.707178 00:0b:82:05:22:0a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, arp who-has 192.168.101.191 tell 192.168.101.131
14:11:04.215164 00:0b:82:05:22:05 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, arp who-has 192.168.101.191 tell 192.168.101.130
14:11:04.658940 00:0b:82:05:22:0c > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, arp who-has 192.168.101.191 tell 192.168.101.135
14:11:05.706070 00:0b:82:05:22:0a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, arp who-has 192.168.101.191 tell 192.168.101.131
14:11:05.939195 00:b0:c2:e8:d8:1c > 33:33:00:00:00:01, ethertype 802.1Q (0x8100), length 122: vlan 108, p 0, ethertype IPv6, fe80::2b0:c2ff:fee8:d81c > ff02::1: icmp6: router advertisement [class 0xe0]
14:11:07.222302 00:b0:c2:e8:d8:1c > 33:33:00:00:00:01, ethertype 802.1Q (0x8100), length 122: vlan 110, p 0, ethertype IPv6, fe80::2b0:c2ff:fee8:d81c > ff02::1: icmp6: router advertisement [class 0xe0]
14:11:08.486953 00:b0:c2:e8:d8:1c > 01:00:5e:00:00:05, ethertype 802.1Q (0x8100), length 134: vlan 110, p 0, ethertype IPv4, IP 192.168.110.20 > 224.0.0.5: OSPFv2, Hello (1), length: 80
14:11:11.528569 00:30:48:22:63:50 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 154: vlan 208, p 0, ethertype IPv4, IP 195.180.3.200.33350 > 195.180.3.255.111: UDP, length: 108
14:11:12.642762 00:0b:82:05:22:05 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, arp who-has 192.168.101.191 tell 192.168.101.130
14:11:12.642766 00:0b:82:05:22:05 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, arp who-has 192.168.101.191 tell 192.168.101.130

The packet socket gets everything including the vlan tag as I'd expect.

But on the bnx2 driver (for example) I get 2 different behaviors:

1) If no vlan interfaces are configured, it calls netif_receive_skb() because there isn't a vlan group registered via bnx2_vlan_rx_register().

# ifconfig eth1 up
# tcpdump -s 2000 -e -n -i eth1
tcpdump: WARNING: eth1: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 2000 bytes
14:21:27.170505 00:0b:82:05:22:05 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: arp who-has 192.168.101.191 tell 192.168.101.130
14:21:27.170577 00:0b:82:05:22:05 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: arp who-has 192.168.101.191 tell 192.168.101.130
14:21:27.495814 00:0b:82:05:22:0c > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: arp who-has 192.168.101.191 tell 192.168.101.135
14:21:27.495881 00:0b:82:05:22:0c > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: arp who-has 192.168.101.191 tell 192.168.101.135
14:21:28.151070 00:0b:82:05:22:05 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: arp who-has 192.168.101.191 tell 192.168.101.130
14:21:28.166780 00:b0:c2:e8:d8:1c > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 118: fe80::2b0:c2ff:fee8:d81c > ff02::1: icmp6: router advertisement [class 0xe0]
14:21:28.476404 00:0b:82:05:22:0c > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: arp who-has 192.168.101.191 tell 192.168.101.135
14:21:28.492099 00:b0:c2:e8:d8:1c > 01:00:5e:00:00:05, ethertype IPv4 (0x0800), length 130: IP 192.168.110.20 > 224.0.0.5: OSPFv2, Hello (1), length: 80
14:21:28.631439 00:19:b9:e7:8a:d7 > 33:33:ff:e7:8a:d7, ethertype IPv6 (0x86dd), length 78: :: > ff02::1:ffe7:8ad7: icmp6: neighbor sol: who has fd4d:5643:2886:67:219:b9ff:fee7:8ad7
14:21:28.671611 00:0b:82:05:22:0a > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: arp who-has 192.168.101.191 tell 192.168.101.131
14:21:28.671684 00:0b:82:05:22:0a > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: arp who-has 192.168.101.191 tell 192.168.101.131

The packet handed to netif_receive_skb() does not have the vlan tag on it. This allows all these packets to be processed by not only the packet ptype handler, but also ip, arp, etc... This seems very wrong as all vlan packets are stripped
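The difference between the two captures comes down to whether bytes 12 and 13 of the frame delivered to the packet socket hold the 802.1Q TPID (0x8100). A small sketch of that check on a raw frame buffer follows; `frame_vlan_id`, `example_tagged` and `example_untagged` are illustrative names, and the two sample headers are hand-built to resemble the first ARP frame in each capture rather than taken from real packet bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the VLAN ID (0..4095) of an Ethernet frame carrying an 802.1Q tag,
 * or -1 if the frame is untagged or too short to tell.
 */
static int frame_vlan_id(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < 16)                          /* 12 address bytes + 4 tag bytes */
        return -1;
    if (frame[12] != 0x81 || frame[13] != 0x00)
        return -1;                         /* ethertype is not 802.1Q */
    return ((frame[14] & 0x0f) << 8) | frame[15];
}

/* Header as a non-accelerated driver would deliver it: tag still present. */
static int example_tagged(void)
{
    const uint8_t f[18] = {
        0xff, 0xff, 0xff, 0xff, 0xff, 0xff,  /* broadcast destination */
        0x00, 0x0b, 0x82, 0x05, 0x22, 0x0a,  /* source MAC */
        0x81, 0x00, 0x00, 0x65,              /* 802.1Q, vlan 101, p 0 */
        0x08, 0x06                           /* inner ethertype ARP */
    };

    return frame_vlan_id(f, sizeof(f));
}

/* Same header after the MAC stripped the tag: looks like plain ARP. */
static int example_untagged(void)
{
    const uint8_t f[14] = {
        0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
        0x00, 0x0b, 0x82, 0x05, 0x22, 0x05,
        0x08, 0x06
    };

    return frame_vlan_id(f, sizeof(f));
}
```

Once the tag has been stripped there is nothing left in the frame bytes to distinguish VLAN traffic from untagged traffic, which is why the stripped packets reach the ip and arp handlers as well.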
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
On Wed, 31 Oct 2007 14:43:51 -0400 Dave Johnson [EMAIL PROTECTED] wrote:
> Depending on the network driver, I'm seeing different behavior if a .1q packet is received on a PF_PACKET, SOCK_RAW, ETH_P_ALL socket.
>
> On devices that do not use NETIF_F_HW_VLAN_RX, the packet socket gets the complete packet with vlan tag included as the driver simply calls netif_receive_skb() or equivalent. packet_rcv() then gets the whole thing, vlan tag included, and sends this through the socket. vlan_skb_recv() also gets these all and will drop them because there are no vlans configured.

The VLAN acceleration grabs and hides the tag. It is a design flaw that should be fixed, feel free to post a patch.

--
Stephen Hemminger [EMAIL PROTECTED]
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
From: Ben Greear [EMAIL PROTECTED]
Date: Wed, 31 Oct 2007 18:06:34 -0700

> DaveM did the HW Accel for VLANs if I remember correctly...perhaps he has some input?

Not really, I'm busy and also not motivated to work on this, someone else will need to.
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
Stephen Hemminger wrote:
> On Wed, 31 Oct 2007 14:43:51 -0400 Dave Johnson [EMAIL PROTECTED] wrote:
> > Depending on the network driver, I'm seeing different behavior if a .1q packet is received on a PF_PACKET, SOCK_RAW, ETH_P_ALL socket.
> >
> > On devices that do not use NETIF_F_HW_VLAN_RX, the packet socket gets the complete packet with vlan tag included as the driver simply calls netif_receive_skb() or equivalent. packet_rcv() then gets the whole thing, vlan tag included, and sends this through the socket. vlan_skb_recv() also gets these all and will drop them because there are no vlans configured.
>
> The VLAN acceleration grabs and hides the tag. It is a design flaw that should be fixed, feel free to post a patch.

There may be several ways to 'fix' this. Perhaps it would be worth discussing what we want the end result to be at least?

Should we always pass the vlan header up to raw sockets as part of the data payload? Or, maybe pass it in an auxiliary message such as how timestamps may be passed? The first option seems cleaner, but maybe there are performance problems with this approach?

We should also define what a NIC should do with VLANs it doesn't explicitly know about. I think it should pass them up the stack with VLAN tag intact, but again, perhaps there are reasons not to do that?

DaveM did the HW Accel for VLANs if I remember correctly...perhaps he has some input?

Thanks,
Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc http://www.candelatech.com
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
Stephen Hemminger wrote:
> The code in AF_PACKET should fix the skb before passing to user space so that there is no difference between accel and non-accel hardware. Internal choices shouldn't leak to user space. Ditto, the receive checksum offload should be fixed up as well.

Ok, I guess that will fix the sniffing issues and any user-space bridging type applications.

Currently, VLAN devices offer the ability to 'reorder' the header and explicitly remove the VLAN header. I assume we keep this feature and have the AF_PACKET logic check the device flags to see if it should insert the VLAN header for hw-accel vlans? Either way, if we sniff the underlying device, we should always get the VLAN header.

What about drivers and filtering VLANs? It seems there is still a difference between software vlans and hw-accel in this case.

Thanks,
Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc http://www.candelatech.com
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
Ben Greear wrote:
> Stephen Hemminger wrote:
> > On Wed, 31 Oct 2007 14:43:51 -0400 Dave Johnson [EMAIL PROTECTED] wrote:
> > > Depending on the network driver, I'm seeing different behavior if a .1q packet is received on a PF_PACKET, SOCK_RAW, ETH_P_ALL socket.
> > >
> > > On devices that do not use NETIF_F_HW_VLAN_RX, the packet socket gets the complete packet with vlan tag included as the driver simply calls netif_receive_skb() or equivalent. packet_rcv() then gets the whole thing, vlan tag included, and sends this through the socket. vlan_skb_recv() also gets these all and will drop them because there are no vlans configured.
> >
> > The VLAN acceleration grabs and hides the tag. It is a design flaw that should be fixed, feel free to post a patch.
>
> There may be several ways to 'fix' this. Perhaps it would be worth discussing what we want the end result to be at least?
>
> Should we always pass the vlan header up to raw sockets as part of the data payload? Or, maybe pass it in an auxiliary message such as how timestamps may be passed? The first option seems cleaner, but maybe there are performance problems with this approach?
>
> We should also define what a NIC should do with VLANs it doesn't explicitly know about. I think it should pass them up the stack with VLAN tag intact, but again, perhaps there are reasons not to do that?
>
> DaveM did the HW Accel for VLANs if I remember correctly...perhaps he has some input?

The code in AF_PACKET should fix the skb before passing to user space so that there is no difference between accel and non-accel hardware. Internal choices shouldn't leak to user space. Ditto, the receive checksum offload should be fixed up as well.
Re: expected behavior of PF_PACKET on NETIF_F_HW_VLAN_RX device?
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Wed, 31 Oct 2007 18:23:37 -0700

> The code in AF_PACKET should fix the skb before passing to user space so that there is no difference between accel and non-accel hardware. Internal choices shouldn't leak to user space. Ditto, the receive checksum offload should be fixed up as well.

The hardware has stripped the VLAN header completely and has not provided it to us at all.

In my opinion trying to cobble one up by hand using the known TAG is worse than not providing a VLAN header at all.