[PATCH] rhashtable: don't attempt to grow when at max_size
From: Johannes Berg johannes.b...@intel.com The conversion of mac80211's station table to rhashtable had a bug that I found by accident in code review, that hadn't been found as rhashtable apparently managed to have a maximum hash chain length of one (!) in all our testing. In order to test the bug and verify the fix I set my rhashtable's max_size very low (4) in order to force getting hash collisions. At that point, rhashtable WARNed in rhashtable_insert_rehash() but didn't actually reject the hash table insertion. This caused it to lose insertions - my master list of stations would have 9 entries, but the rhashtable only had 5. This may warrant a deeper look, but that WARN_ON() just shouldn't happen. Fix this by not returning true from rht_grow_above_100() when the rhashtable's max_size has been reached - in this case the user is explicitly configuring it to be at most that big, so even if it's now above 100% it shouldn't attempt to resize. This fixes the lost insertion issue and consequently allows my code to display its error (and verify my fix for it.) Signed-off-by: Johannes Berg johannes.b...@intel.com --- include/linux/rhashtable.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h index e23d242d1230..dbcbcc59aa92 100644 --- a/include/linux/rhashtable.h +++ b/include/linux/rhashtable.h @@ -282,7 +282,8 @@ static inline bool rht_shrink_below_30(const struct rhashtable *ht, static inline bool rht_grow_above_100(const struct rhashtable *ht, const struct bucket_table *tbl) { - return atomic_read(ht-nelems) tbl-size; + return atomic_read(ht-nelems) tbl-size + (!ht-p.max_size || tbl-size ht-p.max_size); } /* The bucket lock is selected based on the hash and protects mutations -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rhashtable: don't attempt to grow when at max_size
From: Johannes Berg johan...@sipsolutions.net Date: Thu, 23 Apr 2015 16:38:43 +0200 From: Johannes Berg johannes.b...@intel.com The conversion of mac80211's station table to rhashtable had a bug that I found by accident in code review, that hadn't been found as rhashtable apparently managed to have a maximum hash chain length of one (!) in all our testing. In order to test the bug and verify the fix I set my rhashtable's max_size very low (4) in order to force getting hash collisions. At that point, rhashtable WARNed in rhashtable_insert_rehash() but didn't actually reject the hash table insertion. This caused it to lose insertions - my master list of stations would have 9 entries, but the rhashtable only had 5. This may warrant a deeper look, but that WARN_ON() just shouldn't happen. Fix this by not returning true from rht_grow_above_100() when the rhashtable's max_size has been reached - in this case the user is explicitly configuring it to be at most that big, so even if it's now above 100% it shouldn't attempt to resize. This fixes the lost insertion issue and consequently allows my code to display its error (and verify my fix for it.) Signed-off-by: Johannes Berg johannes.b...@intel.com It looks fine to me, but I'll let Herbert and Thomas review this. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: CCM/GCM implementation defect
On 04/23/2015 07:45 AM, Steffen Klassert wrote: On Thu, Apr 23, 2015 at 11:26:20AM +0800, Herbert Xu wrote: Hi: It looks like our IPsec implementations of CCM and GCM are buggy in that they don't include the IV in the authentication calculation. Seems like crypto_rfc4106_crypt() passes the associated data it got from ESP directly to gcm, without chaining with the IV. This definitely breaks interoperability with anyone who implements them correctly. The fact that there have been no reports on this probably means that nobody has run into this in the field yet. On the security side, this is probably not a big deal for CCM because it always verifies the authentication tag after decryption. But for GCM this may be a DoS issue as an attacker could modify the IV without triggering the authentication check and thus cause an unnecessary decryption. For both CCM and GCM this will result in random data injected as a packet into the network stack which hopefully will be dropped. In order to fix this without breaking backwards compatibility, my plan is to introduce new templates such as rfc4106v2 which implement the RFC correctly. The existing templates will be retained so that current users aren't broken by the fix. Adding a second template for the correct implementation is probaply the only thing that we can do if we don't want to break backwards compatibility. But maybe we can add a warning to the old implementation, such that users notice that they use a broken version. Unless we have a cryptographer indicate to us how that this mistake does not significantly reduce or break the confidentiality and authentication, I do not think we should keep a known broken implementation around. Lets say this brings GCM security down to ROT13, then it just needs to die without keeping interoperability. Additionally, looking at how long we suffered and still suffer from defaulting to something broken (sha2_256 truncation) I'm really not in favour of keeping broken crypto implementations around. Paul -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2] ibmveth: Fix off-by-one error in ibmveth_change_mtu()
From: David Gibson da...@gibson.dropbear.id.au Date: Thu, 23 Apr 2015 14:43:05 +1000 AFAIK the PAPR document which defines the virtual device interface used by the ibmveth driver doesn't specify a specific maximum MTU. So, in the ibmveth driver, the maximum allowed MTU is determined by the maximum allocated buffer size of 64k (corresponding to one page in the common case) minus the per-buffer overhead IBMVETH_BUFF_OH (which has value 22 for 14 bytes of ethernet header, plus 8 bytes for an opaque handle). This suggests a maximum allowable MTU of 65514 bytes, but in fact the driver only permits a maximum MTU of 65513. This is because there is a instead of an = in ibmveth_change_mtu(), which only permits an MTU which is strictly smaller than the buffer size, rather than allowing the buffer to be completely filled. This patch fixes the buglet. Signed-off-by: David Gibson da...@gibson.dropbear.id.au Acked-by: Thomas Falcon tlfal...@linux.vnet.ibm.com Applied, thank you. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rhashtable: don't attempt to grow when at max_size
On 04/23/2015 06:09 PM, Johannes Berg wrote: On Thu, 2015-04-23 at 11:59 -0400, David Miller wrote: This fixes the lost insertion issue and consequently allows my code to display its error (and verify my fix for it.) It looks fine to me, but I'll let Herbert and Thomas review this. Oh, sorry, didn't know Herbert was also involved with this. Would be good to add Fixes tags, commit in question seems to be: ccd57b1bd324 (rhashtable: Add immediate rehash during insertion) Anyway, that's a good idea. I really went for this looks obvious and patched it, and it started working for me thereafter. I don't know anything about side effects here. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fix tcp fin memory accounting
From: Eric Dumazet eric.duma...@gmail.com Date: Thu, 23 Apr 2015 07:02:47 -0700 + if (!tcp_send_head(sk)) { + tp-snd_nxt++; + return; + } I'm not so sure about this. Why is this needed? Otherwise patch looks fine to me. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rhashtable: don't attempt to grow when at max_size
On Thu, 2015-04-23 at 11:59 -0400, David Miller wrote: This fixes the lost insertion issue and consequently allows my code to display its error (and verify my fix for it.) It looks fine to me, but I'll let Herbert and Thomas review this. Oh, sorry, didn't know Herbert was also involved with this. Anyway, that's a good idea. I really went for this looks obvious and patched it, and it started working for me thereafter. I don't know anything about side effects here. johannes -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH RFC] openvswitch: add support for netpoll
This patch simply forwards unicast netpoll packets via one of physical interface in datapath depending on source mac address from the skb. It seems possible to use common net flow classification for netpoll but there is no way to guarantee presence of right flow in kernel cache. Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru --- net/openvswitch/vport-internal_dev.c | 74 ++ net/openvswitch/vport-netdev.c | 63 - net/openvswitch/vport-netdev.h | 15 +++ 3 files changed, 148 insertions(+), 4 deletions(-) diff --git a/net/openvswitch/vport-internal_dev.c b/net/openvswitch/vport-internal_dev.c index 6a55f7105505..d1eb09ac09e8 100644 --- a/net/openvswitch/vport-internal_dev.c +++ b/net/openvswitch/vport-internal_dev.c @@ -23,6 +23,7 @@ #include linux/etherdevice.h #include linux/ethtool.h #include linux/skbuff.h +#include linux/netpoll.h #include net/dst.h #include net/xfrm.h @@ -66,11 +67,77 @@ static struct rtnl_link_stats64 *internal_dev_get_stats(struct net_device *netde return stats; } +#ifdef CONFIG_NET_POLL_CONTROLLER + +static void internal_dev_poll_controller(struct net_device *dev) +{ +} + +static struct netdev_vport *get_local_netdev_vport(struct net_device *dev) +{ + struct datapath *dp = internal_dev_priv(dev)-vport-dp; + + return netdev_vport_priv(ovs_lookup_vport(dp, OVSP_LOCAL)); +} + +static int internal_dev_netpoll_setup(struct net_device *internal_dev, + struct netpoll_info *info) +{ + struct netdev_vport *local = get_local_netdev_vport(internal_dev); + struct netdev_vport *lower; + struct list_head *iter; + int ret = -EOPNOTSUPP; + + ASSERT_RTNL(); + /* succeed if at least one lower device can handle netpoll */ + netdev_for_each_lower_private(local-dev, lower, iter) + if (!ovs_netdev_netpoll_enable(lower)) + ret = 0; + /* enable netpoll on local device as mark for new devices */ + if (!ret local-dev != internal_dev) + ovs_netdev_netpoll_enable(local); + return ret; +} + +static void internal_dev_netpoll_cleanup(struct net_device *internal_dev) +{ + struct netdev_vport *local = get_local_netdev_vport(internal_dev); + struct netdev_vport *lower; + struct list_head *iter; + + /* FIXME needs reference counting for more than one netpoll in dp */ + + ASSERT_RTNL(); + netdev_for_each_lower_private(local-dev, lower, iter) + ovs_netdev_netpoll_disable(lower); + ovs_netdev_netpoll_disable(local); +} + +static void internal_dev_netpoll_xmit(struct sk_buff *skb, + struct net_device *internal_dev) +{ + struct netdev_vport *local = get_local_netdev_vport(internal_dev); + struct netdev_vport *lower; + struct list_head *iter; + + netdev_for_each_lower_private_rcu(local-dev, lower, iter) + if (!ovs_netdev_netpoll_send(lower, skb)) + return; /* Unicast only, first gets all. */ + dev_kfree_skb_irq(skb); +} + +#endif /* CONFIG_NET_POLL_CONTROLLER */ + /* Called with rcu_read_lock_bh. */ static int internal_dev_xmit(struct sk_buff *skb, struct net_device *netdev) { rcu_read_lock(); - ovs_vport_receive(internal_dev_priv(netdev)-vport, skb, NULL); +#ifdef CONFIG_NET_POLL_CONTROLLER + if (unlikely(netpoll_tx_running(netdev))) + internal_dev_netpoll_xmit(skb, netdev); + else +#endif + ovs_vport_receive(internal_dev_priv(netdev)-vport, skb, NULL); rcu_read_unlock(); return 0; } @@ -122,6 +189,11 @@ static const struct net_device_ops internal_dev_netdev_ops = { .ndo_set_mac_address = eth_mac_addr, .ndo_change_mtu = internal_dev_change_mtu, .ndo_get_stats64 = internal_dev_get_stats, +#ifdef CONFIG_NET_POLL_CONTROLLER + .ndo_poll_controller = internal_dev_poll_controller, + .ndo_netpoll_setup = internal_dev_netpoll_setup, + .ndo_netpoll_cleanup = internal_dev_netpoll_cleanup, +#endif }; static struct rtnl_link_ops internal_dev_link_ops __read_mostly = { diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c index 4776282c6417..324fb078d32a 100644 --- a/net/openvswitch/vport-netdev.c +++ b/net/openvswitch/vport-netdev.c @@ -26,6 +26,7 @@ #include linux/rtnetlink.h #include linux/skbuff.h #include linux/openvswitch.h +#include linux/netpoll.h #include net/llc.h @@ -86,10 +87,59 @@ static struct net_device *get_dpdev(const struct datapath *dp) return netdev_vport_priv(local)-dev; } +#ifdef CONFIG_NET_POLL_CONTROLLER +int ovs_netdev_netpoll_enable(struct netdev_vport *netdev_vport) +{ + struct netpoll *np; + int err; + + if (netdev_vport-np) + return 0; + + np = kzalloc(sizeof(*np), GFP_KERNEL);
Re: [PATCH] net: phy: micrel: don't do clock-mode-select if we got NULL clock
From: Niklas Cassel niklas.cas...@axis.com Date: Thu, 23 Apr 2015 15:37:11 +0200 Signed-off-by: Niklas Cassel nikl...@axis.com --- drivers/net/phy/micrel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 1190fd8..a422036 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -548,7 +548,7 @@ static int kszphy_probe(struct phy_device *phydev) } clk = devm_clk_get(phydev-dev, rmii-ref); - if (!IS_ERR(clk)) { + if (!IS_ERR_OR_NULL(clk)) { unsigned long rate = clk_get_rate(clk); bool rmii_ref_clk_sel_25_mhz; I do not see anyone in any other networking driver checking for a NULL return from devm_clk_get(). So either everyone else is wrong, or your change is. I want to find out which before applying anything. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net 0/3] tipc: three bug fixes
From: Jon Maloy jon.ma...@ericsson.com Date: Thu, 23 Apr 2015 09:37:37 -0400 A set of unrelated corrections; one for the tipc netns implementation, one regarding problems with random link resets, and one removing a an erroneous refcount decrement when reading link statistsics via netlink. Series applied, thanks Jon. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fix tcp fin memory accounting
On Thu, 2015-04-23 at 11:54 -0400, David Miller wrote: From: Eric Dumazet eric.duma...@gmail.com Date: Thu, 23 Apr 2015 07:02:47 -0700 + if (!tcp_send_head(sk)) { + tp-snd_nxt++; + return; + } I'm not so sure about this. Why is this needed? Otherwise patch looks fine to me. I guess I need to add a comment then ;) If we want to pretend FIN was sent, we also need to tweak tp-snd_nxt to match new tskb-end_seq (or tp-write_seq). I tested following packetdrill script and confirmed that if I do not tweak snd_nxt, last packet sent is incorrect : . 5001:5001(0) ack 2 This might be because our stack relies that we never coalesce something on one already sent skb (we do this check in tcp_sendmsg() for example) Also note the funny thing : The FIN that is finally sent also has a Push flag. I believe its fine to leave this, as it is not incorrect at protocol perspective, and removing the push would add some logic when we split the packet to remove the ACKed part. // Initialize a server socket. 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 S 0:0(0) win 32792 mss 1000,sackOK,nop,nop,nop,wscale 7 +0 S. 0:0(0) ack 1 mss 1460,nop,nop,sackOK,nop,wscale 6 +.010 . 1:1(0) ack 1 win 257 +0 accept(3, ..., ...) = 4 +0.010 write(4, ..., 5000) = 5000 +0 P. 1:5001(5000) ack 1 // this assumes kernel is not able to allocate fresh skb to // hold FIN +.010 shutdown(4, SHUT_RDWR) = 0 +0 . 1:1(0) ack 5001 win 257 // Yeah ! note the FIN that we 'retransmit' also gets a Push +0.209 FP. 5001:5001(0) ack 1 +0 read(4, ..., 1000) = 0 +0 write(4, ..., 1000) = -1 EPIPE (Broken pipe) +.010 close(4) = 0 // TLP at 2*RTT after the original send of FIN. +.201 FP. 5001:5001(0) ack 1 // Receive an ACK for the remaining outstanding data. +.010 . 1:1(0) ack 5002 win 257 // Receive a FIN. +.010 F. 1:1(0) ack 5002 win 257 // Make sure final packet is correct +0 . 5002:5002(0) ack 2 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: NAPI: Polling function requesting extra calls - difference between 3.18.11 and 4.0.0
On Thu, 2015-04-23 at 20:18 +0200, Rafał Miłecki wrote: Can you help us with this, please? Does bgmac use NAPI incorrectly? Were there some important changes in 3.19 or 4.0? Right they were important changes in NAPI indeed : 3b47d30396ba net: gro: add a per device gro flush timer d75b1ade567f net: less interrupt masking in NAPI bc9ad166e38a net: introduce napi_schedule_irqoff() This requested some fixes in various drivers : commit f31ec95fa19e07a8beebcc0297284f23aa57967e commit 24e579c8898aa641ede3149234906982290934e5 commit 6088beef3f7517717bd21d90b379714dd0837079 commit f104fedc0da126abe93dd0f4a9fa13e5133bf9df commit 7a05dc64e2e4c611d89007b125b20c0d2a4d31a5 commit 001ce546bb537bb5b7821f05633556a0c9787e32 commit 3079c652141f9d6377417a7e8fd650c9948df65e commit 8acdf999accfd95093db17f33a58429a38782060 commit 6a6dc08ff6395f58be3ee568cb970ea956f16819 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] ethernet: myri10ge: use arch_phys_wc_add()
From: Luis R. Rodriguez mcg...@do-not-panic.com Date: Tue, 21 Apr 2015 13:09:45 -0700 From: Luis R. Rodriguez mcg...@suse.com This driver already uses ioremap_wc() on the same range so when write-combining is available that will be used instead. ... Signed-off-by: Luis R. Rodriguez mcg...@suse.com I can't wait forever for the driver maintainers to review this, so I'm applying it. Thanks Luis. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fix tcp fin memory accounting
On Thu, 2015-04-23 at 09:16 -0700, Eric Dumazet wrote: On Thu, 2015-04-23 at 11:54 -0400, David Miller wrote: From: Eric Dumazet eric.duma...@gmail.com Date: Thu, 23 Apr 2015 07:02:47 -0700 + if (!tcp_send_head(sk)) { + tp-snd_nxt++; + return; + } I'm not so sure about this. Why is this needed? Otherwise patch looks fine to me. I guess I need to add a comment then ;) If we want to pretend FIN was sent, we also need to tweak tp-snd_nxt to match new tskb-end_seq (or tp-write_seq). I tested following packetdrill script and confirmed that if I do not tweak snd_nxt, last packet sent is incorrect : . 5001:5001(0) ack 2 This might be because our stack relies that we never coalesce something on one already sent skb (we do this check in tcp_sendmsg() for example) Well, real reason is that tp-snd_nxt is not touched in retransmit paths, but when new data is sent (tcp_event_new_data_sent()) -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: net: unix: garbage: fixed several comment and whitespace style issues
From: Jason Eastman eastman.jason.li...@gmail.com Date: Wed, 22 Apr 2015 00:56:42 -0600 fixed several comment and whitespace style issues Signed-off-by: Jason Eastman eastman.jason.li...@gmail.com Applied, thanks. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Suspicious RCU usage in bridge with Linux v4.0-9362-g1fc149933fd4
On Thu, Apr 23, 2015 at 6:07 PM, Josh Boyer jwbo...@fedoraproject.org wrote: Hi All, We've had a user report the following backtrace from the bridge module with a recent Linus' tree. Has anything like this been reported yet? If you have any questions on setup, the user is CC'd. I too observed similar backtrace once(not able to reproduce it again) while I was trying to check inconsistent lock state[1] with lockdep enabled. Regards, Sudeep [1] https://lkml.org/lkml/2015/4/23/329 ---8 === [ INFO: suspicious RCU usage. ] 4.0.0 #269 Not tainted --- include/trace/events/ipi.h:68 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 1 RCU used illegally from extended quiescent state! no locks held by swapper/0/0. stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0 #269 Hardware name: ARM-Versatile Express [c00151f1] (unwind_backtrace) from [c0011971] (show_stack+0x11/0x14) [c0011971] (show_stack) from [c05627c7] (dump_stack+0x73/0x8c) [c05627c7] (dump_stack) from [c0013e63] (handle_IPI+0x257/0x410) [c0013e63] (handle_IPI) from [c000932b] (gic_handle_irq+0x4f/0x50) [c000932b] (gic_handle_irq) from [c00121ff] (__irq_svc+0x3f/0x64) Exception stack(0xc083bf30 to 0xc083bf78) bf20: 0001 0001 c0841938 bf40: c083a000 c083d5fc c08ba5c8 c083d598 c08b938f c056afc8 bf60: 0008 c083bf78 c005d157 c000f288 4133 [c00121ff] (__irq_svc) from [c000f288] (arch_cpu_idle+0x30/0x34) [c000f288] (arch_cpu_idle) from [c00555d5] (cpu_startup_entry+0x339/0x404) [c00555d5] (cpu_startup_entry) from [c07d1a4f] (start_kernel+0x32f/0x338) [c07d1a4f] (start_kernel) from [8000807f] (0x8000807f) === [ INFO: suspicious RCU usage. ] 4.0.0 #269 Not tainted --- include/trace/events/ipi.h:84 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 1 RCU used illegally from extended quiescent state! no locks held by swapper/0/0. stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0 #269 Hardware name: ARM-Versatile Express [c00151f1] (unwind_backtrace) from [c0011971] (show_stack+0x11/0x14) [c0011971] (show_stack) from [c05627c7] (dump_stack+0x73/0x8c) [c05627c7] (dump_stack) from [c0013f85] (handle_IPI+0x379/0x410) [c0013f85] (handle_IPI) from [c000932b] (gic_handle_irq+0x4f/0x50) [c000932b] (gic_handle_irq) from [c00121ff] (__irq_svc+0x3f/0x64) Exception stack(0xc083bf30 to 0xc083bf78) bf20: 0001 0001 c0841938 bf40: c083a000 c083d5fc c08ba5c8 c083d598 c08b938f c056afc8 bf60: 0008 c083bf78 c005d157 c000f288 4133 [c00121ff] (__irq_svc) from [c000f288] (arch_cpu_idle+0x30/0x34) [c000f288] (arch_cpu_idle) from [c00555d5] (cpu_startup_entry+0x339/0x404) [c00555d5] (cpu_startup_entry) from [c07d1a4f] (start_kernel+0x32f/0x338) [c07d1a4f] (start_kernel) from [8000807f] (0x8000807f) -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] ethernet: amd: AMD_XGBE should depend on HAS_DMA
From: Geert Uytterhoeven ge...@linux-m68k.org Date: Thu, 23 Apr 2015 19:59:33 +0200 If NO_DMA=y: ... Signed-off-by: Geert Uytterhoeven ge...@linux-m68k.org Applied. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: NAPI: Polling function requesting extra calls - difference between 3.18.11 and 4.0.0
On 23 April 2015 at 20:33, Eric Dumazet eric.duma...@gmail.com wrote: On Thu, 2015-04-23 at 11:28 -0700, Eric Dumazet wrote: On Thu, 2015-04-23 at 20:18 +0200, Rafał Miłecki wrote: Can you help us with this, please? Does bgmac use NAPI incorrectly? Were there some important changes in 3.19 or 4.0? Right they were important changes in NAPI indeed : 3b47d30396ba net: gro: add a per device gro flush timer d75b1ade567f net: less interrupt masking in NAPI bc9ad166e38a net: introduce napi_schedule_irqoff() This requested some fixes in various drivers : commit f31ec95fa19e07a8beebcc0297284f23aa57967e commit 24e579c8898aa641ede3149234906982290934e5 commit 6088beef3f7517717bd21d90b379714dd0837079 commit f104fedc0da126abe93dd0f4a9fa13e5133bf9df commit 7a05dc64e2e4c611d89007b125b20c0d2a4d31a5 commit 001ce546bb537bb5b7821f05633556a0c9787e32 commit 3079c652141f9d6377417a7e8fd650c9948df65e commit 8acdf999accfd95093db17f33a58429a38782060 commit 6a6dc08ff6395f58be3ee568cb970ea956f16819 It looks like following fix is needed : diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c index de77d3a74abc..21e3c38c7c75 100644 --- a/drivers/net/ethernet/broadcom/bgmac.c +++ b/drivers/net/ethernet/broadcom/bgmac.c @@ -1260,7 +1260,7 @@ static int bgmac_poll(struct napi_struct *napi, int weight) /* Poll again if more events arrived in the meantime */ if (bgmac_read(bgmac, BGMAC_INT_STATUS) (BGMAC_IS_TX0 | BGMAC_IS_RX)) - return handled; + return weight; if (handled weight) { napi_complete(napi); Yeah, I'm flashing my device with the same patch applied right now :) -- Rafał -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] ethernet: arc: ARC_EMAC and EMAC_ROCKCHIP should depend on HAS_DMA
If NO_DMA=y: drivers/built-in.o: In function `arc_emac_tx_clean': emac_main.c:(.text+0x2decde): undefined reference to `dma_unmap_single' drivers/built-in.o: In function `arc_emac_rx': emac_main.c:(.text+0x2dee1c): undefined reference to `dma_unmap_single' emac_main.c:(.text+0x2dee72): undefined reference to `dma_map_single' emac_main.c:(.text+0x2dee7e): undefined reference to `dma_mapping_error' drivers/built-in.o: In function `arc_emac_probe': (.text+0x2df2ee): undefined reference to `dmam_alloc_coherent' drivers/built-in.o: In function `arc_emac_open': emac_main.c:(.text+0x2df6d8): undefined reference to `dma_map_single' emac_main.c:(.text+0x2df6e4): undefined reference to `dma_mapping_error' drivers/built-in.o: In function `arc_emac_tx': emac_main.c:(.text+0x2df9e4): undefined reference to `dma_map_single' emac_main.c:(.text+0x2df9f0): undefined reference to `dma_mapping_error' Signed-off-by: Geert Uytterhoeven ge...@linux-m68k.org --- drivers/net/ethernet/arc/Kconfig | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/arc/Kconfig b/drivers/net/ethernet/arc/Kconfig index 8e262e2b39b63fc5..dea29ee24da4a28c 100644 --- a/drivers/net/ethernet/arc/Kconfig +++ b/drivers/net/ethernet/arc/Kconfig @@ -25,8 +25,7 @@ config ARC_EMAC_CORE config ARC_EMAC tristate ARC EMAC support select ARC_EMAC_CORE - depends on OF_IRQ - depends on OF_NET + depends on OF_IRQ OF_NET HAS_DMA ---help--- On some legacy ARC (Synopsys) FPGA boards such as ARCAngel4/ML50x non-standard on-chip ethernet device ARC EMAC 10/100 is used. @@ -35,7 +34,7 @@ config ARC_EMAC config EMAC_ROCKCHIP tristate Rockchip EMAC support select ARC_EMAC_CORE - depends on OF_IRQ OF_NET REGULATOR + depends on OF_IRQ OF_NET REGULATOR HAS_DMA ---help--- Support for Rockchip RK3066/RK3188 EMAC ethernet controllers. This selects Rockchip SoC glue layer support for the -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] ethernet: amd: AMD_XGBE should depend on HAS_DMA
If NO_DMA=y: drivers/built-in.o: In function `xgbe_probe': xgbe-main.c:(.text+0x2def0a): undefined reference to `dma_set_mask' xgbe-main.c:(.text+0x2def20): undefined reference to `dma_supported' drivers/built-in.o: In function `xgbe_rx_poll': xgbe-drv.c:(.text+0x2e0320): undefined reference to `dma_sync_single_for_cpu' xgbe-drv.c:(.text+0x2e035e): undefined reference to `dma_sync_single_for_cpu' drivers/built-in.o: In function `xgbe_unmap_rdata': xgbe-desc.c:(.text+0x2e5fe4): undefined reference to `dma_unmap_page' xgbe-desc.c:(.text+0x2e5ffa): undefined reference to `dma_unmap_single' xgbe-desc.c:(.text+0x2e604a): undefined reference to `dma_unmap_page' xgbe-desc.c:(.text+0x2e6084): undefined reference to `dma_unmap_page' drivers/built-in.o: In function `xgbe_alloc_pages': xgbe-desc.c:(.text+0x2e6156): undefined reference to `dma_map_page' xgbe-desc.c:(.text+0x2e6164): undefined reference to `dma_mapping_error' drivers/built-in.o: In function `xgbe_free_ring': xgbe-desc.c:(.text+0x2e63d4): undefined reference to `dma_unmap_page' xgbe-desc.c:(.text+0x2e640e): undefined reference to `dma_unmap_page' xgbe-desc.c:(.text+0x2e644a): undefined reference to `dma_free_coherent' drivers/built-in.o: In function `xgbe_init_ring': xgbe-desc.c:(.text+0x2e64d4): undefined reference to `dma_alloc_coherent' drivers/built-in.o: In function `xgbe_map_tx_skb': xgbe-desc.c:(.text+0x2e6628): undefined reference to `dma_map_single' xgbe-desc.c:(.text+0x2e6638): undefined reference to `dma_mapping_error' xgbe-desc.c:(.text+0x2e66b2): undefined reference to `dma_map_single' xgbe-desc.c:(.text+0x2e66c2): undefined reference to `dma_mapping_error' xgbe-desc.c:(.text+0x2e6762): undefined reference to `dma_map_page' xgbe-desc.c:(.text+0x2e6772): undefined reference to `dma_mapping_error' Signed-off-by: Geert Uytterhoeven ge...@linux-m68k.org --- drivers/net/ethernet/amd/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/amd/Kconfig b/drivers/net/ethernet/amd/Kconfig index c638c85f3954bc68..089c269637b725da 100644 --- a/drivers/net/ethernet/amd/Kconfig +++ b/drivers/net/ethernet/amd/Kconfig @@ -179,7 +179,7 @@ config SUNLANCE config AMD_XGBE tristate AMD 10GbE Ethernet driver - depends on (OF_NET || ACPI) HAS_IOMEM + depends on (OF_NET || ACPI) HAS_IOMEM HAS_DMA select PHYLIB select AMD_XGBE_PHY select BITREVERSE -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] ethernet: arc: ARC_EMAC and EMAC_ROCKCHIP should depend on HAS_DMA
From: Geert Uytterhoeven ge...@linux-m68k.org Date: Thu, 23 Apr 2015 19:59:34 +0200 If NO_DMA=y: ... Signed-off-by: Geert Uytterhoeven ge...@linux-m68k.org Applied. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] can: CAN_GRCAN should depend on HAS_DMA
From: Geert Uytterhoeven ge...@linux-m68k.org Date: Thu, 23 Apr 2015 20:04:51 +0200 If NO_DMA=y: ... Signed-off-by: Geert Uytterhoeven ge...@linux-m68k.org Applied. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 2/3] tc: deprecate TC_ACT_QUEUED
On 04/23/2015 08:12 PM, Cong Wang wrote: On Wed, Apr 22, 2015 at 7:46 PM, Alexei Starovoitov a...@plumgrid.com wrote: ... TC_ACT_QUEUED cannot be removed. Only deprecated with backwards compatibility the way this patch did it. That should have been obvious. It is at least the third time I have to repeat that: we really don't care about out-of-tree modules. Everyone MUST read Documentation/stable_api_nonsense.txt. Seriously, what are you talking about ??? $ git grep -n TC_ACT_QUEUED include/uapi/ include/uapi/linux/pkt_cls.h:105:#define TC_ACT_QUEUED 5 This is *UAPI*, no-one is even remotely talking about out-of-tree kernel modules. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 2/3] tc: deprecate TC_ACT_QUEUED
On Thu, Apr 23, 2015 at 11:21 AM, Daniel Borkmann dan...@iogearbox.net wrote: On 04/23/2015 08:12 PM, Cong Wang wrote: On Wed, Apr 22, 2015 at 7:46 PM, Alexei Starovoitov a...@plumgrid.com wrote: ... TC_ACT_QUEUED cannot be removed. Only deprecated with backwards compatibility the way this patch did it. That should have been obvious. It is at least the third time I have to repeat that: we really don't care about out-of-tree modules. Everyone MUST read Documentation/stable_api_nonsense.txt. Seriously, what are you talking about ??? $ git grep -n TC_ACT_QUEUED include/uapi/ include/uapi/linux/pkt_cls.h:105:#define TC_ACT_QUEUED 5 This is *UAPI*, no-one is even remotely talking about out-of-tree kernel modules. I am not stupid. You should figure out we can just leave its definition by removing it, leaving the default as stolen. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: NAPI: Polling function requesting extra calls - difference between 3.18.11 and 4.0.0
On Thu, 2015-04-23 at 11:28 -0700, Eric Dumazet wrote: On Thu, 2015-04-23 at 20:18 +0200, Rafał Miłecki wrote: Can you help us with this, please? Does bgmac use NAPI incorrectly? Were there some important changes in 3.19 or 4.0? Right they were important changes in NAPI indeed : 3b47d30396ba net: gro: add a per device gro flush timer d75b1ade567f net: less interrupt masking in NAPI bc9ad166e38a net: introduce napi_schedule_irqoff() This requested some fixes in various drivers : commit f31ec95fa19e07a8beebcc0297284f23aa57967e commit 24e579c8898aa641ede3149234906982290934e5 commit 6088beef3f7517717bd21d90b379714dd0837079 commit f104fedc0da126abe93dd0f4a9fa13e5133bf9df commit 7a05dc64e2e4c611d89007b125b20c0d2a4d31a5 commit 001ce546bb537bb5b7821f05633556a0c9787e32 commit 3079c652141f9d6377417a7e8fd650c9948df65e commit 8acdf999accfd95093db17f33a58429a38782060 commit 6a6dc08ff6395f58be3ee568cb970ea956f16819 It looks like following fix is needed : diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c index de77d3a74abc..21e3c38c7c75 100644 --- a/drivers/net/ethernet/broadcom/bgmac.c +++ b/drivers/net/ethernet/broadcom/bgmac.c @@ -1260,7 +1260,7 @@ static int bgmac_poll(struct napi_struct *napi, int weight) /* Poll again if more events arrived in the meantime */ if (bgmac_read(bgmac, BGMAC_INT_STATUS) (BGMAC_IS_TX0 | BGMAC_IS_RX)) - return handled; + return weight; if (handled weight) { napi_complete(napi); -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Patch net] igb: pass the correct maxlen for eth_get_headlen()
On 04/23/2015 11:06 AM, Cong Wang wrote: On Wed, Apr 22, 2015 at 8:40 PM, Alexander Duyck alexander.du...@gmail.com wrote: On 04/22/2015 04:23 PM, Cong Wang wrote: On Wed, Apr 22, 2015 at 3:34 PM, Alexander Duyck alexander.du...@gmail.com wrote: On 04/22/2015 02:56 PM, Cong Wang wrote: On Wed, Apr 22, 2015 at 2:42 PM, Alexander Duyck alexander.du...@gmail.com wrote: On 04/22/2015 01:33 PM, Cong Wang wrote: First, make sure you don't miss the TSIP case right above: The frag starting pointer and its size are advanced by: skb_frag_size_sub(frag, IGB_TS_HDR_LEN); ... va += IGB_TS_HDR_LEN; even though we unlikely pull header longer than IGB_RX_HDR_LEN - IGB_TS_HDR_LEN either. So I believe this is a possible bug, one heck of a corner case to get into though. It requires timestamp in packet, size 240 - 256, and a malformed header. The proper fix would probably be to pull the timestamp out of the packet before we add it to the frame. I'll submit a patch to address this. Huh? Doesn't my patch already fix this? skb_frag_size() is always up to date. Or you mean another different problem? Your patch has other issues. I am still NAKing your patch, but there is an issue with igb that you have pointed out. The proper fix would be to deal with the timestamp before we add the page fragment to the skb. If the first frag is always 2K, then this is not a problem either. IGB_TS_HDR_LEN + IGB_RX_HDR_LEN 2K. The problem is skb-tail will get screwed up. Why? We don't copy timestamp into skb header, instead it is saved in shared_info, why skb-tail matters here in TSIP case? TSIP only matters if you have a network header with bad data in it that indicates the size is actually larger than it actually is. The first frag will be 2K in size only if there are multiple frags. The Or it doesn't have frags at all since we just copy it to skb header, right? That is moot since this code only gets called if the skb is nonlinear problem in the TSIP case is that we will put the size reported by the descriptor into the fragment, and then try to pull the size we That size is saved when adding the frag, in TSIP case we just sub it by IGB_TS_HDR_LEN, which seems correct. Yes, the size is saved. But with your solution we could pull the whole fragment but not free it which isn't correct as we shouldn't be left with any 0 sized fragments. determined via eth_get_headlen. So there is a window where that value can be wrong and result in data_len going negative and tail being moved beyond the end of the actual data. This sounds like a different problem if you are saying we should sub the size in the descriptor too. Therefore I don't see why your patch could replace mine. Your patches sort-of fixed the problem, but they introduced other issues in the process. The thing I don't think you are getting is that the code was meant to be mutually exclusive w/ the copy-break code. Either the frag is less than IGB_RX_HDR_SIZE in which case we copy-break and don't attached the fragment, or we should be pulling the header and leaving at least 1 byte in the fragment. The problem with your solution is that you potentially pull the entire fragment in, but you don't free it if the size drops to 0. That is why in the other mail I told you the better solution for igb would likely be to just pass IGB_RX_HDR_SIZE - IGB_TS_HDR_SIZE as that way you end up with at least 1 byte left in the fragment after you pull the header. - Alex -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fix tcp fin memory accounting
From: Eric Dumazet eric.duma...@gmail.com Date: Thu, 23 Apr 2015 09:48:12 -0700 On Thu, 2015-04-23 at 09:16 -0700, Eric Dumazet wrote: On Thu, 2015-04-23 at 11:54 -0400, David Miller wrote: From: Eric Dumazet eric.duma...@gmail.com Date: Thu, 23 Apr 2015 07:02:47 -0700 +if (!tcp_send_head(sk)) { +tp-snd_nxt++; +return; +} I'm not so sure about this. Why is this needed? Otherwise patch looks fine to me. I guess I need to add a comment then ;) If we want to pretend FIN was sent, we also need to tweak tp-snd_nxt to match new tskb-end_seq (or tp-write_seq). I tested following packetdrill script and confirmed that if I do not tweak snd_nxt, last packet sent is incorrect : . 5001:5001(0) ack 2 This might be because our stack relies that we never coalesce something on one already sent skb (we do this check in tcp_sendmsg() for example) Well, real reason is that tp-snd_nxt is not touched in retransmit paths, but when new data is sent (tcp_event_new_data_sent()) That makes sense, thanks for explaining. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net] tcp: avoid looping in tcp_send_fin()
From: Eric Dumazet eduma...@google.com Presence of an unbound loop in tcp_send_fin() had always been hard to explain when analyzing crash dumps involving gigantic dying processes with millions of sockets. Lets try a different strategy : In case of memory pressure, try to add the FIN flag to last packet in write queue, even if packet was already sent. TCP stack will be able to deliver this FIN after a timeout event. Note that this FIN being delivered by a retransmit, it also carries a Push flag given our current implementation. By checking sk_under_memory_pressure(), we anticipate that cooking many FIN packets might deplete tcp memory. In the case we could not allocate a packet, even with __GFP_WAIT allocation, then not sending a FIN seems quite reasonable if it allows to get rid of this socket, free memory, and not block the process from eventually doing other useful work. Signed-off-by: Eric Dumazet eduma...@google.com --- net/ipv4/tcp_output.c | 50 +++- 1 file changed, 29 insertions(+), 21 deletions(-) diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 2ade67b7cdb0..a369e8a70b2c 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2814,7 +2814,8 @@ begin_fwd: /* We allow to exceed memory limits for FIN packets to expedite * connection tear down and (memory) recovery. - * Otherwise tcp_send_fin() could loop forever. + * Otherwise tcp_send_fin() could be tempted to either delay FIN + * or even be forced to close flow without any FIN. */ static void sk_forced_wmem_schedule(struct sock *sk, int size) { @@ -2827,33 +2828,40 @@ static void sk_forced_wmem_schedule(struct sock *sk, int size) sk_memory_allocated_add(sk, amt, status); } -/* Send a fin. The caller locks the socket for us. This cannot be - * allowed to fail queueing a FIN frame under any circumstances. +/* Send a FIN. The caller locks the socket for us. + * We should try to send a FIN packet really hard, but eventually give up. */ void tcp_send_fin(struct sock *sk) { + struct sk_buff *skb, *tskb = tcp_write_queue_tail(sk); struct tcp_sock *tp = tcp_sk(sk); - struct sk_buff *skb = tcp_write_queue_tail(sk); - int mss_now; - /* Optimization, tack on the FIN if we have a queue of -* unsent frames. But be careful about outgoing SACKS -* and IP options. + /* Optimization, tack on the FIN if we have one skb in write queue and +* this skb was not yet sent, or we are under memory pressure. +* Note: in the latter case, FIN packet will be sent after a timeout, +* as TCP stack thinks it has already been transmitted. */ - mss_now = tcp_current_mss(sk); - - if (tcp_send_head(sk)) { - TCP_SKB_CB(skb)-tcp_flags |= TCPHDR_FIN; - TCP_SKB_CB(skb)-end_seq++; + if (tskb (tcp_send_head(sk) || sk_under_memory_pressure(sk))) { +coalesce: + TCP_SKB_CB(tskb)-tcp_flags |= TCPHDR_FIN; + TCP_SKB_CB(tskb)-end_seq++; tp-write_seq++; + if (!tcp_send_head(sk)) { + /* This means tskb was already sent. +* Pretend we included the FIN on previous transmit. +* We need to set tp-snd_nxt to the value it would have +* if FIN had been sent. This is because retransmit path +* does not change tp-snd_nxt. +*/ + tp-snd_nxt++; + return; + } } else { - /* Socket is locked, keep trying until memory is available. */ - for (;;) { - skb = alloc_skb_fclone(MAX_TCP_HEADER, - sk-sk_allocation); - if (skb) - break; - yield(); + skb = alloc_skb_fclone(MAX_TCP_HEADER, sk-sk_allocation); + if (unlikely(!skb)) { + if (tskb) + goto coalesce; + return; } skb_reserve(skb, MAX_TCP_HEADER); sk_forced_wmem_schedule(sk, skb-truesize); @@ -2862,7 +2870,7 @@ void tcp_send_fin(struct sock *sk) TCPHDR_ACK | TCPHDR_FIN); tcp_queue_skb(sk, skb); } - __tcp_push_pending_frames(sk, mss_now, TCP_NAGLE_OFF); + __tcp_push_pending_frames(sk, tcp_current_mss(sk), TCP_NAGLE_OFF); } /* We get here when a process closes a file descriptor (either due to -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] netdev_alloc_pcpu_stats: use less common iterator variable
From: Johannes Berg johan...@sipsolutions.net Date: Thu, 23 Apr 2015 12:06:30 +0200 From: Johannes Berg johannes.b...@intel.com With the CPU iteration variable called 'i', it's relatively easy to have variable shadowing which sparse will warn about. Avoid that by renaming the variable to __cpu which is less likely to be used in the surrounding context. Signed-off-by: Johannes Berg johannes.b...@intel.com Applied, thanks. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Suspicious RCU usage in bridge with Linux v4.0-9362-g1fc149933fd4
Hi All, We've had a user report the following backtrace from the bridge module with a recent Linus' tree. Has anything like this been reported yet? If you have any questions on setup, the user is CC'd. josh [ 29.382235] br0: port 1(tap0) entered forwarding state [ 29.382286] === [ 29.382315] [ INFO: suspicious RCU usage. ] [ 29.382344] 4.1.0-0.rc0.git11.1.fc23.x86_64 #1 Not tainted [ 29.382380] --- [ 29.382409] net/bridge/br_private.h:626 suspicious rcu_dereference_check() usage! [ 29.382455] other info that might help us debug this: [ 29.382507] rcu_scheduler_active = 1, debug_locks = 0 [ 29.382549] 2 locks held by swapper/0/0: [ 29.382576] #0: (((p-forward_delay_timer))){+.-...}, at: [81139f75] call_timer_fn+0x5/0x4f0 [ 29.382660] #1: ((br-lock)-rlock){+.-...}, at: [a0450dc1] br_forward_delay_timer_expired+0x31/0x140 [bridge] [ 29.382754] stack backtrace: [ 29.382787] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.0-0.rc0.git11.1.fc23.x86_64 #1 [ 29.382838] Hardware name: LENOVO 422916G/LENOVO, BIOS A1KT53AUS 04/07/2015 [ 29.382882] 3ebfc20364115825 88003c48 81892d4b [ 29.382943] 81e124e0 88003c78 8110bcd7 [ 29.383004] 8800785c9d00 88065485ac58 880c62002800 880c5fc88ac0 [ 29.383065] Call Trace: [ 29.383084] IRQ [81892d4b] dump_stack+0x4c/0x65 [ 29.383130] [8110bcd7] lockdep_rcu_suspicious+0xe7/0x120 [ 29.383178] [a04520f9] br_fill_ifinfo+0x4a9/0x6a0 [bridge] [ 29.383225] [a045266b] br_ifinfo_notify+0x11b/0x4b0 [bridge] [ 29.383271] [a0450d90] ? br_hold_timer_expired+0x70/0x70 [bridge] [ 29.383320] [a0450de8] br_forward_delay_timer_expired+0x58/0x140 [bridge] [ 29.383371] [a0450d90] ? br_hold_timer_expired+0x70/0x70 [bridge] [ 29.383416] [8113a033] call_timer_fn+0xc3/0x4f0 [ 29.383454] [81139f75] ? call_timer_fn+0x5/0x4f0 [ 29.383493] [8110a90f] ? lock_release_holdtime.part.29+0xf/0x200 [ 29.383541] [a0450d90] ? br_hold_timer_expired+0x70/0x70 [bridge] [ 29.383587] [8113a6a4] run_timer_softirq+0x244/0x490 [ 29.383629] [810b68cc] __do_softirq+0xec/0x670 [ 29.383666] [810b70d5] irq_exit+0x145/0x150 [ 29.383703] [8189f506] smp_apic_timer_interrupt+0x46/0x60 [ 29.383744] [8189d523] apic_timer_interrupt+0x73/0x80 [ 29.383782] EOI [816f131f] ? cpuidle_enter_state+0x5f/0x2f0 [ 29.383832] [816f131b] ? cpuidle_enter_state+0x5b/0x2f0 [ 29.383873] [816f15e7] cpuidle_enter+0x17/0x20 [ 29.383908] [81103c6f] cpu_startup_entry+0x36f/0x5f0 [ 29.383949] [81887cbd] rest_init+0x13d/0x150 [ 29.383986] [821c905b] start_kernel+0x4d2/0x4f3 [ 29.384023] [821c8120] ? early_idt_handlers+0x120/0x120 [ 29.384064] [821c8339] x86_64_start_reservations+0x2a/0x2c [ 29.384105] [821c8485] x86_64_start_kernel+0x14a/0x16d [ 36.100347] IN=br0 OUT= MAC=33:33:00:00:00:02:52:54:00:12:01:00:86:dd SRC=fe80::::5054:00ff:fe12:0100 DST=ff02:::::::0002 LEN=56 TC=0 HOPLIMIT=255 FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0 [ 40.113744] IN=br0 OUT= MAC=33:33:00:00:00:02:52:54:00:12:01:00:86:dd SRC=fe80::::5054:00ff:fe12:0100 DST=ff02:::::::0002 LEN=56 TC=0 HOPLIMIT=255 FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0 [ 44.119727] IN=br0 OUT= MAC=33:33:00:00:00:02:52:54:00:12:01:00:86:dd SRC=fe80::::5054:00ff:fe12:0100 DST=ff02:::::::0002 LEN=56 TC=0 HOPLIMIT=255 FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] can: CAN_GRCAN should depend on HAS_DMA
If NO_DMA=y: drivers/built-in.o: In function `grcan_free_dma_buffers': grcan.c:(.text+0x2d7716): undefined reference to `dma_free_coherent' drivers/built-in.o: In function `grcan_allocate_dma_buffers': grcan.c:(.text+0x2d779c): undefined reference to `dma_alloc_coherent' Signed-off-by: Geert Uytterhoeven ge...@linux-m68k.org --- drivers/net/can/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig index 58808f6514520c86..e8c96b8e86f48e66 100644 --- a/drivers/net/can/Kconfig +++ b/drivers/net/can/Kconfig @@ -112,7 +112,7 @@ config PCH_CAN config CAN_GRCAN tristate Aeroflex Gaisler GRCAN and GRHCAN CAN devices - depends on OF + depends on OF HAS_DMA ---help--- Say Y here if you want to use Aeroflex Gaisler GRCAN or GRHCAN. Note that the driver supports little endian, even though little -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Patch net] igb: pass the correct maxlen for eth_get_headlen()
On Wed, Apr 22, 2015 at 8:40 PM, Alexander Duyck alexander.du...@gmail.com wrote: On 04/22/2015 04:23 PM, Cong Wang wrote: On Wed, Apr 22, 2015 at 3:34 PM, Alexander Duyck alexander.du...@gmail.com wrote: On 04/22/2015 02:56 PM, Cong Wang wrote: On Wed, Apr 22, 2015 at 2:42 PM, Alexander Duyck alexander.du...@gmail.com wrote: On 04/22/2015 01:33 PM, Cong Wang wrote: First, make sure you don't miss the TSIP case right above: The frag starting pointer and its size are advanced by: skb_frag_size_sub(frag, IGB_TS_HDR_LEN); ... va += IGB_TS_HDR_LEN; even though we unlikely pull header longer than IGB_RX_HDR_LEN - IGB_TS_HDR_LEN either. So I believe this is a possible bug, one heck of a corner case to get into though. It requires timestamp in packet, size 240 - 256, and a malformed header. The proper fix would probably be to pull the timestamp out of the packet before we add it to the frame. I'll submit a patch to address this. Huh? Doesn't my patch already fix this? skb_frag_size() is always up to date. Or you mean another different problem? Your patch has other issues. I am still NAKing your patch, but there is an issue with igb that you have pointed out. The proper fix would be to deal with the timestamp before we add the page fragment to the skb. If the first frag is always 2K, then this is not a problem either. IGB_TS_HDR_LEN + IGB_RX_HDR_LEN 2K. The problem is skb-tail will get screwed up. Why? We don't copy timestamp into skb header, instead it is saved in shared_info, why skb-tail matters here in TSIP case? [...] The first frag will be 2K in size only if there are multiple frags. The Or it doesn't have frags at all since we just copy it to skb header, right? problem in the TSIP case is that we will put the size reported by the descriptor into the fragment, and then try to pull the size we That size is saved when adding the frag, in TSIP case we just sub it by IGB_TS_HDR_LEN, which seems correct. determined via eth_get_headlen. So there is a window where that value can be wrong and result in data_len going negative and tail being moved beyond the end of the actual data. This sounds like a different problem if you are saying we should sub the size in the descriptor too. Therefore I don't see why your patch could replace mine. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
NAPI: Polling function requesting extra calls - difference between 3.18.11 and 4.0.0
Hi, Recently Felix improved bgmac driver to be smarter in situation where new packets were Tx/Rx-ed during bgmac_poll execution. It was handled in: eb64e29 bgmac: leave interrupts disabled as long as there is work to do http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=eb64e2923a886441c7b322f138b36029f3fa6a36 With above change, after handling all skb-s in bgmac_poll, bgmac checks if there are new sbk-s to be handled. If so, it doesn't call napi_complete it doesn't enable interrupts. Above commit was merged for 4.1 kernel, but I work with OpenWrt based on older releases, so I have all bgmac changes backported to 3.8.11 and 4.0. With 3.8.11 Felix's change seems to be working fine, I did some debugging: [ 44.97] [bgmac_interrupt] int_status int_mask = 0x0100 [ 44.98] [bgmac_poll] START weight:64 [ 44.99] [bgmac_poll] Read handled:0 skb-s [ 44.99] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:0 BGMAC_IS_RX:0 [ 45.00] [bgmac_poll] Read skb-s weight, calling napi_complete, en IRQs [ 45.01] [bgmac_poll] END returning handled:0 [ 48.04] [bgmac_interrupt] int_status int_mask = 0x0100 [ 48.05] [bgmac_poll] START weight:64 [ 48.06] [bgmac_poll] Read handled:0 skb-s [ 48.06] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:0 BGMAC_IS_RX:0 [ 48.07] [bgmac_poll] Read skb-s weight, calling napi_complete, en IRQs [ 48.08] [bgmac_poll] END returning handled:0 [ 48.10] [bgmac_poll] START weight:64 [ 48.11] [bgmac_poll] Read handled:1 skb-s [ 48.11] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:1 BGMAC_IS_RX:1 [ 48.12] [bgmac_poll] END more skb-s to handle, returning handled:1 [ 48.13] [bgmac_poll] START weight:64 [ 48.13] [bgmac_poll] Read handled:1 skb-s [ 48.14] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:0 BGMAC_IS_RX:0 [ 48.15] [bgmac_poll] Read skb-s weight, calling napi_complete, en IRQs [ 48.16] [bgmac_poll] END returning handled:1 Please note that at 48.12 bgmac decided bgmac_poll should be called again and it didn't call napi_complete didn't enable IRQs. It simply returned 1. NAPI behaved just like we expected, bgmac_poll was called again a moment later. It handled skb-s that appeared meanwhile and ended calling napi_complete and returning 1. Now, it fails badly for me at all when using kernel 4.0.0: [ 52.80] [bgmac_interrupt] int_status int_mask = 0x0100 [ 52.80] [bgmac_poll] START weight:64 [ 52.81] [bgmac_poll] Read handled:0 skb-s [ 52.81] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:0 BGMAC_IS_RX:0 [ 52.82] [bgmac_poll] Read skb-s weight, calling napi_complete, en IRQs [ 52.83] [bgmac_poll] END returning handled:0 [ 54.61] [bgmac_interrupt] int_status int_mask = 0x0100 [ 54.62] [bgmac_poll] START weight:64 [ 54.63] [bgmac_poll] Read handled:0 skb-s [ 54.63] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:0 BGMAC_IS_RX:0 [ 54.64] [bgmac_poll] Read skb-s weight, calling napi_complete, en IRQs [ 54.65] [bgmac_poll] END returning handled:0 [ 55.90] [bgmac_interrupt] int_status int_mask = 0x0100 [ 55.91] [bgmac_poll] START weight:64 [ 55.92] [bgmac_poll] Read handled:1 skb-s [ 55.92] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:1 BGMAC_IS_RX:1 [ 55.93] [bgmac_poll] END more skb-s to handle, returning handled:1 As you can see, at 55.93 bgmac also decided bgmac_poll should be called again. It didn't call napi_complete, didn't enable IRQs. It simply returned 1. Just like with kernel 3.8.11. It this case however, NAPI never called bgmac_poll again. It resulted in bgmac hanging (IRQs are off, bgmac expects bgmac_poll to be called). Can you help us with this, please? Does bgmac use NAPI incorrectly? Were there some important changes in 3.19 or 4.0? -- Rafał diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c index 4929932..40ce34d 100644 --- a/drivers/net/ethernet/broadcom/bgmac.c +++ b/drivers/net/ethernet/broadcom/bgmac.c @@ -1219,6 +1219,7 @@ static irqreturn_t bgmac_interrupt(int irq, void *dev_id) struct bgmac *bgmac = netdev_priv(dev_id); u32 int_status = bgmac_read(bgmac, BGMAC_INT_STATUS); + pr_info([%s] int_status:0x%08x int_mask:0x%08x = 0x%08x\n, __FUNCTION__, int_status, bgmac-int_mask, int_status bgmac-int_mask); int_status = bgmac-int_mask; if (!int_status) @@ -1240,22 +1241,31 @@ static int bgmac_poll(struct napi_struct *napi, int weight) { struct bgmac *bgmac = container_of(napi, struct bgmac, napi); int handled = 0; + u32 int_status; + pr_info([%s] START weight:%d\n, __FUNCTION__, weight); /* Ack */ bgmac_write(bgmac, BGMAC_INT_STATUS, ~0); bgmac_dma_tx_free(bgmac, bgmac-tx_ring[0]); handled += bgmac_dma_rx_read(bgmac,
Re: [PATCH][FIX 4.1] bgmac: fix requests for extra polling calls from NAPI
On Thu, 2015-04-23 at 20:56 +0200, Rafał Miłecki wrote: After d75b1ade567f (net: less interrupt masking in NAPI) polling function has to return whole budget when it wants NAPI to call it again. Signed-off-by: Rafał Miłecki zaj...@gmail.com Cc: Felix Fietkau n...@openwrt.org Fixes: eb64e2923a886 (bgmac: leave interrupts disabled as long as there is work to do) --- drivers/net/ethernet/broadcom/bgmac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Signed-off-by: Eric Dumazet eduma...@google.com Thanks ! -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH][FIX 4.1] bgmac: fix requests for extra polling calls from NAPI
After d75b1ade567f (net: less interrupt masking in NAPI) polling function has to return whole budget when it wants NAPI to call it again. Signed-off-by: Rafał Miłecki zaj...@gmail.com Cc: Felix Fietkau n...@openwrt.org Fixes: eb64e2923a886 (bgmac: leave interrupts disabled as long as there is work to do) --- drivers/net/ethernet/broadcom/bgmac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c index de77d3a..21e3c38 100644 --- a/drivers/net/ethernet/broadcom/bgmac.c +++ b/drivers/net/ethernet/broadcom/bgmac.c @@ -1260,7 +1260,7 @@ static int bgmac_poll(struct napi_struct *napi, int weight) /* Poll again if more events arrived in the meantime */ if (bgmac_read(bgmac, BGMAC_INT_STATUS) (BGMAC_IS_TX0 | BGMAC_IS_RX)) - return handled; + return weight; if (handled weight) { napi_complete(napi); -- 1.8.4.5 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] altera tse: add support for fixed-links.
From: Andreas Oetken ennoerlan...@googlemail.com Date: Wed, 22 Apr 2015 15:22:22 +0200 + /* check if a fixed-link is defined in device-tree */ + if (of_phy_is_fixed_link(priv-device-of_node)) { + if (of_phy_register_fixed_link(priv-device-of_node) + 0) { You should store the return value of of_phy_register_fixed_link() and return that as the error code when it fails. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Intel-wired-lan] [Patch net] igb: pass the correct maxlen for eth_get_headlen()
On 04/23/2015 12:30 PM, Cong Wang wrote: (Off-topic...) On Wed, Apr 22, 2015 at 4:23 PM, Cong Wang cw...@twopensource.com wrote: The code looks correct to me now, except it is suspicious skb-len is not updated after skb_copy_to_linear_data() while skb-tail is advanced already. I need to think more before submitting a patch. I feel like we need the following patch, maybe skb-len is updated somewhere else by skb-tail - skb-head, otherwise we are screwed? Maybe in skb_add_rx_frag? You might take a look at it. diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index a0a9b1f..66e6fb6 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -6843,7 +6843,6 @@ static void igb_pull_tail(struct igb_ring *rx_ring, skb_frag_size_sub(frag, IGB_TS_HDR_LEN); frag-page_offset += IGB_TS_HDR_LEN; skb-data_len -= IGB_TS_HDR_LEN; - skb-len -= IGB_TS_HDR_LEN; /* move va to start of packet data */ va += IGB_TS_HDR_LEN; @@ -6856,12 +6855,12 @@ static void igb_pull_tail(struct igb_ring *rx_ring, /* align pull length to size of long to optimize memcpy performance */ skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long))); + __skb_put(skb, pull_len); /* update all of the pointers */ skb_frag_size_sub(frag, pull_len); frag-page_offset += pull_len; skb-data_len -= pull_len; - skb-tail += pull_len; } /** Seriously? Are you even reading the code? The fragment is already a part of the skb, as such it is already included in skb-len. By re-adding the header you are adding bytes that aren't there. All we were doing is moving data from a fragment to the linear portion. No offense but your starting to waste my time with these silly patch ideas. The patches I submitted to intel-wired-lan fix the issue that you found, and there aren't any new issues that it creates so the issue is resolved. And like I said if you need to fix this in stable just subtract IGB_TS_HDR_LEN from the header length scanned in eth_get_headlen and it will resolve the issue you reported and that way we can fix the issue and avoid pulling a fragment down to size 0. - Alex -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] checkpatch: suggest using eth_zero_addr() and eth_broadcast_addr()
On 22.04.2015 00:27, Joe Perches wrote: On Tue, 2015-04-21 at 23:44 +0200, Mateusz Kulikowski wrote: On 21.04.2015 23:22, Joe Perches wrote: On Tue, 2015-04-21 at 22:57 +0200, Mateusz Kulikowski wrote: (...) (...) True, True; If you prefer $line and ability to --fix - I'll use that in v3 I suppose you could do both $line and $stat and the fix would only work when it's on a single line. Perhaps something like this would work: if ($line =~ /whatever/ || (defined($stat) $stat =~ /whatever/)) { if (WARN(...) $fix) { fixed[$fixlinenr] =~ s/whatever/appropriate/; } } Isn't it enough to just match $stat and do fix for line (that in some cases will just not match)? One more thing I noticed funny behavior about $stat matches - it reports the same error several times (including as scope whole file) Is it feature or feature or I missed something? Ex. file: -- cut int foo(void) { baz(); memset(a, b, 0); bar(); } -- cut Output of (@master) -- cut $ scripts/checkpatch.pl -f test42.c --types MEMSET ERROR: memset to 0's uses 0 as the 2nd argument, not the 3rd #1: FILE: test42.c:1: +int foo(void) { baz(); memset(a, b, 0); bar(); } ERROR: memset to 0's uses 0 as the 2nd argument, not the 3rd #2: FILE: test42.c:2: +{ baz(); memset(a, b, 0); bar(); } ERROR: memset to 0's uses 0 as the 2nd argument, not the 3rd #4: FILE: test42.c:4: +memset(a, b, 0); total: 3 errors, 0 warnings, 6 lines checked NOTE: Used message types: MEMSET test42.c has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. --cut Regards, Mateusz -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rhashtable: don't attempt to grow when at max_size
On Thu, 2015-04-23 at 21:46 +0100, Thomas Graf wrote: On 04/23/15 at 04:38pm, Johannes Berg wrote: From: Johannes Berg johannes.b...@intel.com The conversion of mac80211's station table to rhashtable had a bug that I found by accident in code review, that hadn't been found as rhashtable apparently managed to have a maximum hash chain length of one (!) in all our testing. This is the desired chain length ;-) Sure. But I had a bug in my handling of collisions, so I explicitly wanted to test them. After all, they are in some ways expected in a hash table :) At that point, rhashtable WARNed in rhashtable_insert_rehash() but didn't actually reject the hash table insertion. This caused it to lose insertions - my master list of stations would have 9 entries, but the rhashtable only had 5. This may warrant a deeper look, but that WARN_ON() just shouldn't happen. The warning got fixed recently (51bb8e331b) and rhashtable_insert_rehash() now only allows a single rehash if at max_size already. It will now return -EBUSY. Insertions may still fail while the table is above 100% utilization so this fix is absolutely needed though. Yeah just failing would be a bit strange. Fix this by not returning true from rht_grow_above_100() when the rhashtable's max_size has been reached - in this case the user is explicitly configuring it to be at most that big, so even if it's now above 100% it shouldn't attempt to resize. Good catch. I wonder whether we want to trigger a periodic rehash in an interval in this situation or just leave this up to the user to setup a timer himself. You could just document it that it's probably useful if max_size is set? I'm just going to be setting max_size for debug purposes, so don't really care all that much. johannes -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: NAPI: Polling function requesting extra calls - difference between 3.18.11 and 4.0.0
On Thu, 2015-04-23 at 21:08 +0200, Rafał Miłecki wrote: As you obviously noticed, I fixed bgmac now and submitted the fix. Just wanted to say a thanks for helping with that! Sure, it was a pleasure ;) -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rhashtable: don't attempt to grow when at max_size
On 04/23/15 at 04:38pm, Johannes Berg wrote: From: Johannes Berg johannes.b...@intel.com The conversion of mac80211's station table to rhashtable had a bug that I found by accident in code review, that hadn't been found as rhashtable apparently managed to have a maximum hash chain length of one (!) in all our testing. This is the desired chain length ;-) In order to test the bug and verify the fix I set my rhashtable's max_size very low (4) in order to force getting hash collisions. At that point, rhashtable WARNed in rhashtable_insert_rehash() but didn't actually reject the hash table insertion. This caused it to lose insertions - my master list of stations would have 9 entries, but the rhashtable only had 5. This may warrant a deeper look, but that WARN_ON() just shouldn't happen. The warning got fixed recently (51bb8e331b) and rhashtable_insert_rehash() now only allows a single rehash if at max_size already. It will now return -EBUSY. Insertions may still fail while the table is above 100% utilization so this fix is absolutely needed though. Fix this by not returning true from rht_grow_above_100() when the rhashtable's max_size has been reached - in this case the user is explicitly configuring it to be at most that big, so even if it's now above 100% it shouldn't attempt to resize. Good catch. I wonder whether we want to trigger a periodic rehash in an interval in this situation or just leave this up to the user to setup a timer himself. This fixes the lost insertion issue and consequently allows my code to display its error (and verify my fix for it.) Signed-off-by: Johannes Berg johannes.b...@intel.com Acked-by: Thomas Graf tg...@suug.ch -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: NAPI: Polling function requesting extra calls - difference between 3.18.11 and 4.0.0
On 23 April 2015 at 20:28, Eric Dumazet eric.duma...@gmail.com wrote: On Thu, 2015-04-23 at 20:18 +0200, Rafał Miłecki wrote: Can you help us with this, please? Does bgmac use NAPI incorrectly? Were there some important changes in 3.19 or 4.0? Right they were important changes in NAPI indeed : 3b47d30396ba net: gro: add a per device gro flush timer d75b1ade567f net: less interrupt masking in NAPI bc9ad166e38a net: introduce napi_schedule_irqoff() As you obviously noticed, I fixed bgmac now and submitted the fix. Just wanted to say a thanks for helping with that! -- Rafał -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Patch net] igb: pass the correct maxlen for eth_get_headlen()
(Off-topic...) On Wed, Apr 22, 2015 at 4:23 PM, Cong Wang cw...@twopensource.com wrote: The code looks correct to me now, except it is suspicious skb-len is not updated after skb_copy_to_linear_data() while skb-tail is advanced already. I need to think more before submitting a patch. I feel like we need the following patch, maybe skb-len is updated somewhere else by skb-tail - skb-head, otherwise we are screwed? diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index a0a9b1f..66e6fb6 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -6843,7 +6843,6 @@ static void igb_pull_tail(struct igb_ring *rx_ring, skb_frag_size_sub(frag, IGB_TS_HDR_LEN); frag-page_offset += IGB_TS_HDR_LEN; skb-data_len -= IGB_TS_HDR_LEN; - skb-len -= IGB_TS_HDR_LEN; /* move va to start of packet data */ va += IGB_TS_HDR_LEN; @@ -6856,12 +6855,12 @@ static void igb_pull_tail(struct igb_ring *rx_ring, /* align pull length to size of long to optimize memcpy performance */ skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long))); + __skb_put(skb, pull_len); /* update all of the pointers */ skb_frag_size_sub(frag, pull_len); frag-page_offset += pull_len; skb-data_len -= pull_len; - skb-tail += pull_len; } /** -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] net: netcp: remove call to netif_carrier_(on/off) for MAC to Phy interface
Currently when interface type is MAC to Phy, netif_carrier_(on/off) is called which is not needed as Phy lib already updates the carrier status to net stack. This is needed only for other interface types Signed-off-by: Murali Karicheri m-kariche...@ti.com --- drivers/net/ethernet/ti/netcp_ethss.c |8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/ti/netcp_ethss.c b/drivers/net/ethernet/ti/netcp_ethss.c index 2bef655..0a1ef2e 100644 --- a/drivers/net/ethernet/ti/netcp_ethss.c +++ b/drivers/net/ethernet/ti/netcp_ethss.c @@ -1765,7 +1765,9 @@ static void netcp_ethss_link_state_action(struct gbe_priv *gbe_dev, ALE_PORT_STATE, ALE_PORT_STATE_FORWARD); - if (ndev slave-open) + if (ndev slave-open + ((slave-link_interface != SGMII_LINK_MAC_PHY) + (slave-link_interface != XGMII_LINK_MAC_PHY))) netif_carrier_on(ndev); } else { writel(mac_control, GBE_REG_ADDR(slave, emac_regs, @@ -1773,7 +1775,9 @@ static void netcp_ethss_link_state_action(struct gbe_priv *gbe_dev, cpsw_ale_control_set(gbe_dev-ale, slave-port_num, ALE_PORT_STATE, ALE_PORT_STATE_DISABLE); - if (ndev) + if (ndev + ((slave-link_interface != SGMII_LINK_MAC_PHY) + (slave-link_interface != XGMII_LINK_MAC_PHY))) netif_carrier_off(ndev); } -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Suspicious RCU usage in bridge with Linux v4.0-9362-g1fc149933fd4
On Thu, Apr 23, 2015 at 10:07 AM, Josh Boyer jwbo...@fedoraproject.org wrote: [ 29.382235] br0: port 1(tap0) entered forwarding state [ 29.382286] === [ 29.382315] [ INFO: suspicious RCU usage. ] [ 29.382344] 4.1.0-0.rc0.git11.1.fc23.x86_64 #1 Not tainted [ 29.382380] --- [ 29.382409] net/bridge/br_private.h:626 suspicious rcu_dereference_check() usage! [ 29.382455] other info that might help us debug this: [ 29.382507] rcu_scheduler_active = 1, debug_locks = 0 [ 29.382549] 2 locks held by swapper/0/0: [ 29.382576] #0: (((p-forward_delay_timer))){+.-...}, at: [81139f75] call_timer_fn+0x5/0x4f0 [ 29.382660] #1: ((br-lock)-rlock){+.-...}, at: [a0450dc1] br_forward_delay_timer_expired+0x31/0x140 [bridge] [ 29.382754] stack backtrace: [ 29.382787] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.0-0.rc0.git11.1.fc23.x86_64 #1 [ 29.382838] Hardware name: LENOVO 422916G/LENOVO, BIOS A1KT53AUS 04/07/2015 [ 29.382882] 3ebfc20364115825 88003c48 81892d4b [ 29.382943] 81e124e0 88003c78 8110bcd7 [ 29.383004] 8800785c9d00 88065485ac58 880c62002800 880c5fc88ac0 [ 29.383065] Call Trace: [ 29.383084] IRQ [81892d4b] dump_stack+0x4c/0x65 [ 29.383130] [8110bcd7] lockdep_rcu_suspicious+0xe7/0x120 [ 29.383178] [a04520f9] br_fill_ifinfo+0x4a9/0x6a0 [bridge] [ 29.383225] [a045266b] br_ifinfo_notify+0x11b/0x4b0 [bridge] [ 29.383271] [a0450d90] ? br_hold_timer_expired+0x70/0x70 [bridge] [ 29.383320] [a0450de8] br_forward_delay_timer_expired+0x58/0x140 [bridge] [ 29.383371] [a0450d90] ? br_hold_timer_expired+0x70/0x70 [bridge] [ 29.383416] [8113a033] call_timer_fn+0xc3/0x4f0 [ 29.383454] [81139f75] ? call_timer_fn+0x5/0x4f0 [ 29.383493] [8110a90f] ? lock_release_holdtime.part.29+0xf/0x200 [ 29.383541] [a0450d90] ? br_hold_timer_expired+0x70/0x70 [bridge] [ 29.383587] [8113a6a4] run_timer_softirq+0x244/0x490 [ 29.383629] [810b68cc] __do_softirq+0xec/0x670 [ 29.383666] [810b70d5] irq_exit+0x145/0x150 [ 29.383703] [8189f506] smp_apic_timer_interrupt+0x46/0x60 [ 29.383744] [8189d523] apic_timer_interrupt+0x73/0x80 [ 29.383782] EOI [816f131f] ? cpuidle_enter_state+0x5f/0x2f0 [ 29.383832] [816f131b] ? cpuidle_enter_state+0x5b/0x2f0 [ 29.383873] [816f15e7] cpuidle_enter+0x17/0x20 [ 29.383908] [81103c6f] cpu_startup_entry+0x36f/0x5f0 [ 29.383949] [81887cbd] rest_init+0x13d/0x150 [ 29.383986] [821c905b] start_kernel+0x4d2/0x4f3 [ 29.384023] [821c8120] ? early_idt_handlers+0x120/0x120 [ 29.384064] [821c8339] x86_64_start_reservations+0x2a/0x2c [ 29.384105] [821c8485] x86_64_start_kernel+0x14a/0x16d We are dereferencing a RCU pointer with rtnl assert, but in the timer context we only have br-lock. It looks like we need to take RCU read lock on that path rather than asserting rtnl lock. Thanks for the report. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 2/3] tc: deprecate TC_ACT_QUEUED
On Thu, Apr 23, 2015 at 3:13 PM, Alexei Starovoitov a...@plumgrid.com wrote: On 4/23/15 1:45 PM, Jamal Hadi Salim wrote: 1) the _XMIT semantics are useful on the egress side because in fact we do have queues and they can be attached to qdiscs etc. The TC_ACT_XXX codes were _intentional_ since ingress works as a classifier shell. then it is worse mess than I thought :( Why call it _qdisc_ then? and have special and convoluted handling for it in qdisc_create, qdisc_graft and other places? It needs to glue into qdisc layer, unify the abstractions. Are you planning to queue things on ingress? I thought that was the whole purpose of ingress qdisc. There is no queue for ingress qdisc. why then we have dev-ingress_queue? Glue layer, qdisc ties too much with txq, which is what I want to solve. We have struct netdev_rx_queue too, but look at it, there is nothing but some sysfs stuffs needed by RPS. If queueing was never a goal, may be we should kill ingress qdisc and replace it with a simple shim that only does cls/act. The code overall will get much simpler and faster. As I said in response to your patch for skb-data, we need to align ingress with egress, rather than making more differences, this includes qdisc's too. We do have per-cpu queues for ingress, at least in theory we have a queue for ingress to use. Of course, RX is naturally different from TX, which is the root cause why we have so many differences here. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 2/3] tc: deprecate TC_ACT_QUEUED
On 04/23/15 18:13, Alexei Starovoitov wrote: On 4/23/15 1:45 PM, Jamal Hadi Salim wrote: then it is worse mess than I thought :( Why call it _qdisc_ then? and have special and convoluted handling for it in qdisc_create, qdisc_graft and other places? Convinience for tooling and using existing abstractions is the historical. You can attach qdiscs to netdevs. You can then use all sorts of well understood tools. Are you planning to queue things on ingress? I thought that was the whole purpose of ingress qdisc. why then we have dev-ingress_queue? So you are planning to add queues? If you are that is a different discussion (and the use case needs some clarity). If queueing was never a goal, may be we should kill ingress qdisc and replace it with a simple shim that only does cls/act. The code overall will get much simpler and faster. Attaching to the device was considered simpler based on the tooling thought process. On whether we should queue on ingress - it was not considered useful if the packets were going to the host i.e we want to proceed to completion likely queueing to some socket layer further upstream. For packets being forwarded we already had egress qdiscs which had queues so it didnt seem to make sense to enqueue on ingress for such use cases. There was a use case later where multiple ingress ports had to be shared and ifb was born - which is pseudo temporary enqueueing on ingress. Feels like falling into rabbit hole. The fact that qdiscs dealt with these codes directly allows for specialized handling. Moving them to a generic function seems to defeat that purpose. So I am siding with Cong on this. that's not what patch 1 is doing. It is still doing specialized handling... but in light of what you said above, it looks like much bigger cleanup is needed. We'll continue arguing when I refactor this set and resubmit when net-next reopens. Sorry - i was refereing to the patch where you agregated things after the qdisc invokes a classifier. cheers, jamal -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 2/3] tc: deprecate TC_ACT_QUEUED
Jamal Hadi Salim j...@mojatatu.com wrote: 2) the ACT_QUEUED vs STOLEN was supposed to have semantics of something that was stolen (eg redirection should definetely have been returning STOLEN not QUEUED); something that queues for later re-injection (with any/all metadata) was intended to use QUEUED. I believe netfilter may have followed suit and introduced similar codes (so it would be interesting to see how they use them). Hooks (Targets) don't queue themselves, i.e. NF_QUEUE tells the netfilter core that the skb is to be handed off to nf_queue machinery, while NF_STOLEN is the more obvious don't touch this skb ever again. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Intel-wired-lan] [Patch net] igb: pass the correct maxlen for eth_get_headlen()
On Thu, Apr 23, 2015 at 12:45 PM, Alexander Duyck alexander.h.du...@redhat.com wrote: On 04/23/2015 12:30 PM, Cong Wang wrote: (Off-topic...) On Wed, Apr 22, 2015 at 4:23 PM, Cong Wang cw...@twopensource.com wrote: The code looks correct to me now, except it is suspicious skb-len is not updated after skb_copy_to_linear_data() while skb-tail is advanced already. I need to think more before submitting a patch. I feel like we need the following patch, maybe skb-len is updated somewhere else by skb-tail - skb-head, otherwise we are screwed? Maybe in skb_add_rx_frag? You might take a look at it. I saw that. diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index a0a9b1f..66e6fb6 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -6843,7 +6843,6 @@ static void igb_pull_tail(struct igb_ring *rx_ring, skb_frag_size_sub(frag, IGB_TS_HDR_LEN); frag-page_offset += IGB_TS_HDR_LEN; skb-data_len -= IGB_TS_HDR_LEN; - skb-len -= IGB_TS_HDR_LEN; /* move va to start of packet data */ va += IGB_TS_HDR_LEN; @@ -6856,12 +6855,12 @@ static void igb_pull_tail(struct igb_ring *rx_ring, /* align pull length to size of long to optimize memcpy performance */ skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long))); + __skb_put(skb, pull_len); /* update all of the pointers */ skb_frag_size_sub(frag, pull_len); frag-page_offset += pull_len; skb-data_len -= pull_len; - skb-tail += pull_len; } /** Seriously? Are you even reading the code? The fragment is already a part of the skb, as such it is already included in skb-len. By re-adding the header you are adding bytes that aren't there. All we were doing is moving data from a fragment to the linear portion. Hmm, I thought frags are counted only in skb-data_len, I misunderstood skb_headlen() and skb_pagelen(). No offense but your starting to waste my time with these silly patch ideas. You can always ignore any off-topic discussion, which I already warned you in the beginning... The patches I submitted to intel-wired-lan fix the issue that you found, and there aren't any new issues that it creates so the issue is resolved. And like I said if you need to fix this in stable just subtract IGB_TS_HDR_LEN from the header length scanned in eth_get_headlen and it will resolve the issue you reported and that way we can fix the issue and avoid pulling a fragment down to size 0. Please keep non-off-topic discussion from off-topic ones. I will reply there. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 2/3] tc: deprecate TC_ACT_QUEUED
On 4/23/15 1:45 PM, Jamal Hadi Salim wrote: 1) the _XMIT semantics are useful on the egress side because in fact we do have queues and they can be attached to qdiscs etc. The TC_ACT_XXX codes were _intentional_ since ingress works as a classifier shell. then it is worse mess than I thought :( Why call it _qdisc_ then? and have special and convoluted handling for it in qdisc_create, qdisc_graft and other places? Are you planning to queue things on ingress? I thought that was the whole purpose of ingress qdisc. why then we have dev-ingress_queue? If queueing was never a goal, may be we should kill ingress qdisc and replace it with a simple shim that only does cls/act. The code overall will get much simpler and faster. Feels like falling into rabbit hole. The fact that qdiscs dealt with these codes directly allows for specialized handling. Moving them to a generic function seems to defeat that purpose. So I am siding with Cong on this. that's not what patch 1 is doing. It is still doing specialized handling... but in light of what you said above, it looks like much bigger cleanup is needed. We'll continue arguing when I refactor this set and resubmit when net-next reopens. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 2/3] tc: deprecate TC_ACT_QUEUED
On Wed, Apr 22, 2015 at 7:46 PM, Alexei Starovoitov a...@plumgrid.com wrote: TC_ACT_QUEUED cannot be removed. Only deprecated with backwards compatibility the way this patch did it. That should have been obvious. It is at least the third time I have to repeat that: we really don't care about out-of-tree modules. Everyone MUST read Documentation/stable_api_nonsense.txt. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 2/3] tc: deprecate TC_ACT_QUEUED
On Thu, Apr 23, 2015 at 5:59 PM, Alexei Starovoitov a...@plumgrid.com wrote: On 4/23/15 3:51 PM, Jamal Hadi Salim wrote: So you are planning to add queues? If you are that is a different discussion (and the use case needs some clarity). nope. I wasn't planning to do that. For packets being forwarded we already had egress qdiscs which had queues so it didnt seem to make sense to enqueue on ingress for such use cases. There was a use case later where multiple ingress ports had to be shared and ifb was born - which is pseudo temporary enqueueing on ingress. agree. imo ifb approach is more flexible, since it has full hierarchy of qdiscs. As you're saying above and from the old ifb logs I thought that ifb is _temporary_ and long term plan is to use ingress_queue, but looks like this is not the case. Also not too long ago we decided that we don't want another ingress qdisc. Therefore I think it makes sense to simplify the code and boost performance. Which performance problem did you see? Did you hit the spinlock bottleneck? That spinlock should be eliminated before you try to do more, and it can be replaced by RCU read lock after John's work (I am still waiting for his patch btw). I'm not proposing to change tooling, of course. From iproute2 point of view we'll still have ingress qdisc. Right now we're pointlessly allocating memory for it and for ingress_queue, whereas we only need to call cls/act. I'm proposing to kill them and used tcf_proto in net_device instead. Right now to reach cls in critical path on ingress we do: rxq = skb-dev-ingress_queue sch = rxq-qdisc sch-enqueue sch-filter_list with a bunch of 'if' conditions and useless memory accesses in-between. If we add 'ingress_filter_list' directly to net_device, it will be just: skb-dev-ingress_filter_list which is huge performance boost. Code size will shrink as well. iproute2 and all existing tools will work as-is. Looks as win-win to me. Nope, it breaks Qdisc abstraction, filters never attach to netdevice directly. You should stop, since you really don't understand the code. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH next 1/3] ipvlan: Defer multicast / broadcast processing to a work-queue
From: Mahesh Bandewar mahe...@google.com Date: Thu, 23 Apr 2015 19:54:29 -0700 On Thu, Apr 23, 2015 at 5:32 PM, Eric Dumazet eric.duma...@gmail.com wrote: On Thu, 2015-04-23 at 14:29 -0700, Mahesh Bandewar wrote: +static void ipvlan_multicast_enqueue(struct ipvl_port *port, + struct sk_buff *skb) +{ + if (skb-protocol == htons(ETH_P_PAUSE)) + return; But what happens to this packet ? It seems leaked ? Hmm, will take care of it in v2. Please also provide a proper [PATCH 0/3] introduction posting stating what the patch series is doing, and why. Thanks. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][FIX 4.1] bgmac: fix requests for extra polling calls from NAPI
On 2015-04-23 20:56, Rafał Miłecki wrote: After d75b1ade567f (net: less interrupt masking in NAPI) polling function has to return whole budget when it wants NAPI to call it again. Signed-off-by: Rafał Miłecki zaj...@gmail.com Cc: Felix Fietkau n...@openwrt.org Fixes: eb64e2923a886 (bgmac: leave interrupts disabled as long as there is work to do) Acked-by: Felix Fietkau n...@openwrt.org - Felix -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Patch net] igb: pass the correct maxlen for eth_get_headlen()
On Thu, Apr 23, 2015 at 11:40 AM, Alexander Duyck alexander.du...@gmail.com wrote: On 04/23/2015 11:06 AM, Cong Wang wrote: On Wed, Apr 22, 2015 at 8:40 PM, Alexander Duyck alexander.du...@gmail.com wrote: On 04/22/2015 04:23 PM, Cong Wang wrote: On Wed, Apr 22, 2015 at 3:34 PM, Alexander Duyck alexander.du...@gmail.com wrote: On 04/22/2015 02:56 PM, Cong Wang wrote: On Wed, Apr 22, 2015 at 2:42 PM, Alexander Duyck alexander.du...@gmail.com wrote: On 04/22/2015 01:33 PM, Cong Wang wrote: First, make sure you don't miss the TSIP case right above: The frag starting pointer and its size are advanced by: skb_frag_size_sub(frag, IGB_TS_HDR_LEN); ... va += IGB_TS_HDR_LEN; even though we unlikely pull header longer than IGB_RX_HDR_LEN - IGB_TS_HDR_LEN either. So I believe this is a possible bug, one heck of a corner case to get into though. It requires timestamp in packet, size 240 - 256, and a malformed header. The proper fix would probably be to pull the timestamp out of the packet before we add it to the frame. I'll submit a patch to address this. Huh? Doesn't my patch already fix this? skb_frag_size() is always up to date. Or you mean another different problem? Your patch has other issues. I am still NAKing your patch, but there is an issue with igb that you have pointed out. The proper fix would be to deal with the timestamp before we add the page fragment to the skb. If the first frag is always 2K, then this is not a problem either. IGB_TS_HDR_LEN + IGB_RX_HDR_LEN 2K. The problem is skb-tail will get screwed up. Why? We don't copy timestamp into skb header, instead it is saved in shared_info, why skb-tail matters here in TSIP case? TSIP only matters if you have a network header with bad data in it that indicates the size is actually larger than it actually is. OK. The first frag will be 2K in size only if there are multiple frags. The Or it doesn't have frags at all since we just copy it to skb header, right? That is moot since this code only gets called if the skb is nonlinear problem in the TSIP case is that we will put the size reported by the descriptor into the fragment, and then try to pull the size we That size is saved when adding the frag, in TSIP case we just sub it by IGB_TS_HDR_LEN, which seems correct. Yes, the size is saved. But with your solution we could pull the whole fragment but not free it which isn't correct as we shouldn't be left with any 0 sized fragments. Good point, then TSIP case should check frag_size == 0 case, the rest handles 0-size case correctly, see below. determined via eth_get_headlen. So there is a window where that value can be wrong and result in data_len going negative and tail being moved beyond the end of the actual data. This sounds like a different problem if you are saying we should sub the size in the descriptor too. Therefore I don't see why your patch could replace mine. Your patches sort-of fixed the problem, but they introduced other issues in the process. The thing I don't think you are getting is that the code was meant to be mutually exclusive w/ the copy-break code. Either the frag is less than IGB_RX_HDR_SIZE in which case we copy-break and don't attached the fragment, or we should be pulling the header and leaving at least 1 byte in the fragment. The problem with your solution is that you potentially pull the entire fragment in, but you don't free it if the size drops to Only if it is a malformed packet, and it is still safe as long as we don't run over the buffer size. Also I have a upper limit check for IGB_RX_HDR_SIZE after eth_get_headlen(). 0. That is why in the other mail I told you the better solution for igb I am sure eth_get_headlen() handles 0 case correctly, it simply returns 0. would likely be to just pass IGB_RX_HDR_SIZE - IGB_TS_HDR_SIZE as that way you end up with at least 1 byte left in the fragment after you pull the header. The code should already handle pull_len==0 correctly. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: CCM/GCM implementation defect
On Thu, Apr 23, 2015 at 03:24:59PM +0200, Martin Willi wrote: Do you have any pointer for me where this is defined? Why is it needed, given that GCM implicitly authenticates the IV by using it in Y0? The IV if present must be covered by the ICV. This is required by RFC4303 (section 2). But really it's quite obvious. If you don't authenticate the IV, then I can easily inject random crap into your network by changing the IV. Cheers, -- Email: Herbert Xu herb...@gondor.apana.org.au Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Resource usages in Linux drivers
Sergei Shtylyov sergei.shtyl...@cogentembedded.com : On 4/23/2015 1:08 PM, Jia-Ju Bai wrote: [...] I also find many drivers do not use these managed APIs, especially in ethernet card drivers (like e100, r8169). Is it possible to change them? Patches welcome. :-) I respectfully disagree. If someone believes basic resouce management to be too hard or error-prone to handle, he should imvho seriously rise the bar and reconsider the way he wants to contribute to the kernel. I may hope he who reads e100.c to think about DMA api, bql or rx ring holes avoidance to quote a few ones. Managed API ? Mildly... -- Ueimor -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] checkpatch: suggest using eth_zero_addr() and eth_broadcast_addr()
On Thu, 2015-04-23 at 16:54 -0700, Joe Perches wrote: On Thu, 2015-04-23 at 21:53 +0200, Mateusz Kulikowski wrote: I noticed funny behavior about $stat matches - it reports the same error several times (including as scope whole file) Is it feature or feature or I missed something? You have to make sure the first character of $stat is a + if ($stat =~ /\+(?:.*)\bmemfoo Make that if ($stat =~ /^\+(?:.*)\bmemfoo -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH next 1/3] ipvlan: Defer multicast / broadcast processing to a work-queue
Processing multicast / broadcast in fast path is performance draining and having more links means more clonning and bringing performance down further. Broadcast; in particular, need to be given to all the virtual links. Earlier tricks of enabling broadcast bit for IPv4 only interfaces are not really working since it fails autoconf. Which means enabling braodcast for all the links if protocol specific hacks do not have to be added into the driver. This patch defers all (incoming as well as outgoing) multicast traffic to a work-queue leaving only the unicast traffic in the fast-path. Now if we need to apply any additional tricks to further reduce the impact of this (multicast / broadcast) type of traffic, it can be implemented while processing this work without affecting the fast-path. Signed-off-by: Mahesh Bandewar mahe...@google.com --- drivers/net/ipvlan/ipvlan.h | 5 ++ drivers/net/ipvlan/ipvlan_core.c | 134 +-- drivers/net/ipvlan/ipvlan_main.c | 5 ++ 3 files changed, 96 insertions(+), 48 deletions(-) diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h index 54549a6223dd..953a97492fab 100644 --- a/drivers/net/ipvlan/ipvlan.h +++ b/drivers/net/ipvlan/ipvlan.h @@ -39,6 +39,8 @@ #define IPVLAN_MAC_FILTER_SIZE (1 IPVLAN_MAC_FILTER_BITS) #define IPVLAN_MAC_FILTER_MASK (IPVLAN_MAC_FILTER_SIZE - 1) +#define IPVLAN_QBACKLOG_LIMIT 1000 + typedef enum { IPVL_IPV6 = 0, IPVL_ICMPV6, @@ -93,6 +95,8 @@ struct ipvl_port { struct hlist_head hlhead[IPVLAN_HASH_SIZE]; struct list_headipvlans; struct rcu_head rcu; + struct work_struct wq; + struct sk_buff_head backlog; int count; u16 mode; }; @@ -112,6 +116,7 @@ void ipvlan_set_port_mode(struct ipvl_port *port, u32 nval); void ipvlan_init_secret(void); unsigned int ipvlan_mac_hash(const unsigned char *addr); rx_handler_result_t ipvlan_handle_frame(struct sk_buff **pskb); +void ipvlan_process_multicast(struct work_struct *work); int ipvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev); void ipvlan_ht_addr_add(struct ipvl_dev *ipvlan, struct ipvl_addr *addr); struct ipvl_addr *ipvlan_find_addr(const struct ipvl_dev *ipvlan, diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c index c30b5c300c05..58891666088c 100644 --- a/drivers/net/ipvlan/ipvlan_core.c +++ b/drivers/net/ipvlan/ipvlan_core.c @@ -189,64 +189,85 @@ unsigned int ipvlan_mac_hash(const unsigned char *addr) return hash IPVLAN_MAC_FILTER_MASK; } -static void ipvlan_multicast_frame(struct ipvl_port *port, struct sk_buff *skb, - const struct ipvl_dev *in_dev, bool local) +void ipvlan_process_multicast(struct work_struct *work) { - struct ethhdr *eth = eth_hdr(skb); + struct ipvl_port *port = container_of(work, struct ipvl_port, wq); + struct ethhdr *ethh; struct ipvl_dev *ipvlan; - struct sk_buff *nskb; + struct sk_buff *skb, *nskb; + struct sk_buff_head list; unsigned int len; unsigned int mac_hash; int ret; + u8 pkt_type; + bool hlocal, dlocal; - if (skb-protocol == htons(ETH_P_PAUSE)) - return; - - rcu_read_lock(); - list_for_each_entry_rcu(ipvlan, port-ipvlans, pnode) { - if (local (ipvlan == in_dev)) - continue; + __skb_queue_head_init(list); - mac_hash = ipvlan_mac_hash(eth-h_dest); - if (!test_bit(mac_hash, ipvlan-mac_filters)) - continue; + spin_lock_bh(port-backlog.lock); + skb_queue_splice_tail_init(port-backlog, list); + spin_unlock_bh(port-backlog.lock); - ret = NET_RX_DROP; - len = skb-len + ETH_HLEN; - nskb = skb_clone(skb, GFP_ATOMIC); - if (!nskb) - goto mcast_acct; + while ((skb = __skb_dequeue(list)) != NULL) { + ethh = eth_hdr(skb); + hlocal = ether_addr_equal(ethh-h_source, port-dev-dev_addr); + mac_hash = ipvlan_mac_hash(ethh-h_dest); - if (ether_addr_equal(eth-h_dest, ipvlan-phy_dev-broadcast)) - nskb-pkt_type = PACKET_BROADCAST; + if (ether_addr_equal(ethh-h_dest, port-dev-broadcast)) + pkt_type = PACKET_BROADCAST; else - nskb-pkt_type = PACKET_MULTICAST; + pkt_type = PACKET_MULTICAST; + + dlocal = false; + rcu_read_lock(); + list_for_each_entry_rcu(ipvlan, port-ipvlans, pnode) { + if (hlocal (ipvlan-dev == skb-dev)) { + dlocal = true; + continue; + } +
[PATCH next 3/3] ipvlan: Always set broadcast bit in multicast filter
Earlier tricks of setting broadcast bit only when IPv4 address is added onto interface are not good enough especially when autoconf comes in play. Setting them on always is performance drag but now that multicast / broadcast is not processed in fast-path; enabling broadcast will let autoconf work correctly without affecting performance characteristics of the device. Signed-off-by: Mahesh Bandewar mahe...@google.com --- drivers/net/ipvlan/ipvlan_main.c | 20 ++-- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index a16d3017fdc3..8fe9d5ff331e 100644 --- a/drivers/net/ipvlan/ipvlan_main.c +++ b/drivers/net/ipvlan/ipvlan_main.c @@ -218,17 +218,6 @@ static void ipvlan_change_rx_flags(struct net_device *dev, int change) dev_set_allmulti(phy_dev, dev-flags IFF_ALLMULTI? 1 : -1); } -static void ipvlan_set_broadcast_mac_filter(struct ipvl_dev *ipvlan, bool set) -{ - struct net_device *dev = ipvlan-dev; - unsigned int hashbit = ipvlan_mac_hash(dev-broadcast); - - if (set !test_bit(hashbit, ipvlan-mac_filters)) - __set_bit(hashbit, ipvlan-mac_filters); - else if (!set test_bit(hashbit, ipvlan-mac_filters)) - __clear_bit(hashbit, ipvlan-mac_filters); -} - static void ipvlan_set_multicast_mac_filter(struct net_device *dev) { struct ipvl_dev *ipvlan = netdev_priv(dev); @@ -243,6 +232,12 @@ static void ipvlan_set_multicast_mac_filter(struct net_device *dev) netdev_for_each_mc_addr(ha, dev) __set_bit(ipvlan_mac_hash(ha-addr), mc_filters); + /* Turn-on broadcast bit irrespective of address family, +* since broadcast is deferred to a work-queue, hence no +* impact on unicast fast-path processing. +*/ + __set_bit(ipvlan_mac_hash(dev-broadcast), mc_filters); + bitmap_copy(ipvlan-mac_filters, mc_filters, IPVLAN_MAC_FILTER_SIZE); } @@ -710,7 +705,6 @@ static int ipvlan_add_addr4(struct ipvl_dev *ipvlan, struct in_addr *ip4_addr) */ if (netif_running(ipvlan-dev)) ipvlan_ht_addr_add(ipvlan, addr); - ipvlan_set_broadcast_mac_filter(ipvlan, true); return 0; } @@ -727,8 +721,6 @@ static void ipvlan_del_addr4(struct ipvl_dev *ipvlan, struct in_addr *ip4_addr) list_del(addr-anode); ipvlan-ipv4cnt--; WARN_ON(ipvlan-ipv4cnt 0); - if (!ipvlan-ipv4cnt) - ipvlan_set_broadcast_mac_filter(ipvlan, false); kfree_rcu(addr, rcu); return; -- 2.2.0.rc0.207.ga3a616c -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH next 2/3] ipvlan: Process fragmented multicast frames correctly
Multicast processing in IPvlan was faulty as is. Eric Dumazet pointed out that fragmented packets wont be processed correctly unless defrag step is introduced. This patch adds the defrag step before driver attempts to process multicast frame(s). Signed-off-by: Mahesh Bandewar mahe...@google.com --- drivers/net/ipvlan/ipvlan_core.c | 8 include/net/ip.h | 1 + 2 files changed, 9 insertions(+) diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c index 58891666088c..282d32b16a87 100644 --- a/drivers/net/ipvlan/ipvlan_core.c +++ b/drivers/net/ipvlan/ipvlan_core.c @@ -532,6 +532,10 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev) return dev_forward_skb(ipvlan-phy_dev, skb); } else if (is_multicast_ether_addr(eth-h_dest)) { + skb = ip_check_defrag(skb, IP_DEFRAG_IPVLAN); + if (!skb) + return RX_HANDLER_CONSUMED; + ipvlan_multicast_enqueue(ipvlan-port, skb); return NET_XMIT_SUCCESS; } @@ -617,6 +621,10 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb, int addr_type; if (is_multicast_ether_addr(eth-h_dest)) { + skb = ip_check_defrag(skb, IP_DEFRAG_IPVLAN); + if (!skb) + return RX_HANDLER_CONSUMED; + if (ipvlan_external_frame(skb, port)) { ipvlan_multicast_enqueue(port, skb); return RX_HANDLER_CONSUMED; diff --git a/include/net/ip.h b/include/net/ip.h index d14af7edd197..1fb432b346a8 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -476,6 +476,7 @@ enum ip_defrag_users { IP_DEFRAG_VS_FWD, IP_DEFRAG_AF_PACKET, IP_DEFRAG_MACVLAN, + IP_DEFRAG_IPVLAN, }; int ip_defrag(struct sk_buff *skb, u32 user); -- 2.2.0.rc0.207.ga3a616c -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] checkpatch: suggest using eth_zero_addr() and eth_broadcast_addr()
On Thu, 2015-04-23 at 21:53 +0200, Mateusz Kulikowski wrote: On 22.04.2015 00:27, Joe Perches wrote: On Tue, 2015-04-21 at 23:44 +0200, Mateusz Kulikowski wrote: On 21.04.2015 23:22, Joe Perches wrote: On Tue, 2015-04-21 at 22:57 +0200, Mateusz Kulikowski wrote: (...) (...) True, True; If you prefer $line and ability to --fix - I'll use that in v3 I suppose you could do both $line and $stat and the fix would only work when it's on a single line. Perhaps something like this would work: if ($line =~ /whatever/ || (defined($stat) $stat =~ /whatever/)) { if (WARN(...) $fix) { fixed[$fixlinenr] =~ s/whatever/appropriate/; } } Isn't it enough to just match $stat and do fix for line (that in some cases will just not match)? Yeah, that'd work too. One more thing I noticed funny behavior about $stat matches - it reports the same error several times (including as scope whole file) Is it feature or feature or I missed something? You have to make sure the first character of $stat is a + if ($stat =~ /\+(?:.*)\bmemfoo -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Resource usages in Linux drivers
From: Florian Fainelli f.faine...@gmail.com Date: Thu, 23 Apr 2015 17:40:12 -0700 Well, for one, we could have a device managed register_netdev() which cleans up resources in case of failures and calls free_netdev() automatically, but is that adding much value? I don't think it's worth it at all. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 2/3] tc: deprecate TC_ACT_QUEUED
On 4/23/15 3:51 PM, Jamal Hadi Salim wrote: So you are planning to add queues? If you are that is a different discussion (and the use case needs some clarity). nope. I wasn't planning to do that. For packets being forwarded we already had egress qdiscs which had queues so it didnt seem to make sense to enqueue on ingress for such use cases. There was a use case later where multiple ingress ports had to be shared and ifb was born - which is pseudo temporary enqueueing on ingress. agree. imo ifb approach is more flexible, since it has full hierarchy of qdiscs. As you're saying above and from the old ifb logs I thought that ifb is _temporary_ and long term plan is to use ingress_queue, but looks like this is not the case. Also not too long ago we decided that we don't want another ingress qdisc. Therefore I think it makes sense to simplify the code and boost performance. I'm not proposing to change tooling, of course. From iproute2 point of view we'll still have ingress qdisc. Right now we're pointlessly allocating memory for it and for ingress_queue, whereas we only need to call cls/act. I'm proposing to kill them and used tcf_proto in net_device instead. Right now to reach cls in critical path on ingress we do: rxq = skb-dev-ingress_queue sch = rxq-qdisc sch-enqueue sch-filter_list with a bunch of 'if' conditions and useless memory accesses in-between. If we add 'ingress_filter_list' directly to net_device, it will be just: skb-dev-ingress_filter_list which is huge performance boost. Code size will shrink as well. iproute2 and all existing tools will work as-is. Looks as win-win to me. The fact that qdiscs dealt with these codes directly allows for specialized handling. Moving them to a generic function seems to defeat that purpose. So I am siding with Cong on this. that's not what patch 1 is doing. It is still doing specialized handling... but in light of what you said above, it looks like much bigger cleanup is needed. We'll continue arguing when I refactor this set and resubmit when net-next reopens. Sorry - i was refereing to the patch where you agregated things after the qdisc invokes a classifier. hmm. There I also didn't convert all qdiscs into single helper. Only those that have exactly the same logic after tc_classify. All other qdiscs have custom handling. No worries, it's hard to review things while traveling. Been there :) -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net] inet: fix possible panic in reqsk_queue_unlink()
From: Eric Dumazet eduma...@google.com [ 3897.923145] BUG: unable to handle kernel NULL pointer dereference at 0080 [ 3897.931025] IP: [a9f27686] reqsk_timer_handler+0x1a6/0x243 There is a race when reqsk_timer_handler() and tcp_check_req() call inet_csk_reqsk_queue_unlink() on the same req at the same time. Before commit fa76ce7328b2 (inet: get rid of central tcp/dccp listener timer), listener spinlock was held and race could not happen. To solve this bug, we change reqsk_queue_unlink() to not assume req must be found, and we return a status, to conditionally release a refcount on the request sock. This also means tcp_check_req() in non fastopen case might or not consume req refcount, so tcp_v6_hnd_req() tcp_v4_hnd_req() have to properly handle this. (Same remark for dccp_check_req() and its callers) inet_csk_reqsk_queue_drop() is now too big to be inlined, as it is called 4 times in tcp and 3 times in dccp. Fixes: fa76ce7328b2 (inet: get rid of central tcp/dccp listener timer) Signed-off-by: Eric Dumazet eduma...@google.com Reported-by: Yuchung Cheng ych...@google.com --- include/net/inet_connection_sock.h | 20 --- include/net/request_sock.h | 18 -- net/dccp/ipv4.c|3 +- net/dccp/ipv6.c|3 +- net/dccp/minisocks.c |3 -- net/ipv4/inet_connection_sock.c| 34 +++ net/ipv4/tcp_ipv4.c|3 +- net/ipv4/tcp_minisocks.c |7 +++-- net/ipv6/tcp_ipv6.c|3 +- 9 files changed, 48 insertions(+), 46 deletions(-) diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h index 7b5887cd1172..48a815823587 100644 --- a/include/net/inet_connection_sock.h +++ b/include/net/inet_connection_sock.h @@ -279,12 +279,6 @@ static inline void inet_csk_reqsk_queue_add(struct sock *sk, void inet_csk_reqsk_queue_hash_add(struct sock *sk, struct request_sock *req, unsigned long timeout); -static inline void inet_csk_reqsk_queue_removed(struct sock *sk, - struct request_sock *req) -{ - reqsk_queue_removed(inet_csk(sk)-icsk_accept_queue, req); -} - static inline void inet_csk_reqsk_queue_added(struct sock *sk, const unsigned long timeout) { @@ -306,19 +300,7 @@ static inline int inet_csk_reqsk_queue_is_full(const struct sock *sk) return reqsk_queue_is_full(inet_csk(sk)-icsk_accept_queue); } -static inline void inet_csk_reqsk_queue_unlink(struct sock *sk, - struct request_sock *req) -{ - reqsk_queue_unlink(inet_csk(sk)-icsk_accept_queue, req); -} - -static inline void inet_csk_reqsk_queue_drop(struct sock *sk, -struct request_sock *req) -{ - inet_csk_reqsk_queue_unlink(sk, req); - inet_csk_reqsk_queue_removed(sk, req); - reqsk_put(req); -} +void inet_csk_reqsk_queue_drop(struct sock *sk, struct request_sock *req); void inet_csk_destroy_sock(struct sock *sk); void inet_csk_prepare_forced_close(struct sock *sk); diff --git a/include/net/request_sock.h b/include/net/request_sock.h index fe41f3ceb008..9f4265ce8892 100644 --- a/include/net/request_sock.h +++ b/include/net/request_sock.h @@ -212,24 +212,6 @@ static inline int reqsk_queue_empty(struct request_sock_queue *queue) return queue-rskq_accept_head == NULL; } -static inline void reqsk_queue_unlink(struct request_sock_queue *queue, - struct request_sock *req) -{ - struct listen_sock *lopt = queue-listen_opt; - struct request_sock **prev; - - spin_lock(queue-syn_wait_lock); - - prev = lopt-syn_table[req-rsk_hash]; - while (*prev != req) - prev = (*prev)-dl_next; - *prev = req-dl_next; - - spin_unlock(queue-syn_wait_lock); - if (del_timer(req-rsk_timer)) - reqsk_put(req); -} - static inline void reqsk_queue_add(struct request_sock_queue *queue, struct request_sock *req, struct sock *parent, diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c index 2b4f21d34df6..ccf4c5629b3c 100644 --- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -453,7 +453,8 @@ static struct sock *dccp_v4_hnd_req(struct sock *sk, struct sk_buff *skb) iph-saddr, iph-daddr); if (req) { nsk = dccp_check_req(sk, skb, req); - reqsk_put(req); + if (!nsk) + reqsk_put(req); return nsk; } nsk = inet_lookup_established(sock_net(sk), dccp_hashinfo, diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index 9d0551092c6c..5165571f397a 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -301,7
Re: CCM/GCM implementation defect
On Thu, Apr 23, 2015 at 01:45:34PM +0200, Steffen Klassert wrote: Adding a second template for the correct implementation is probaply the only thing that we can do if we don't want to break backwards compatibility. But maybe we can add a warning to the old implementation, such that users notice that they use a broken version. If we are going to do a warning I think the place to do it would be in xfrm_algo.c. We could add an insecure/warning flag and if then print a warning if said algorithm is used. Cheers, -- Email: Herbert Xu herb...@gondor.apana.org.au Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Resource usages in Linux drivers
On 23/04/15 16:19, Francois Romieu wrote: Sergei Shtylyov sergei.shtyl...@cogentembedded.com : On 4/23/2015 1:08 PM, Jia-Ju Bai wrote: [...] I also find many drivers do not use these managed APIs, especially in ethernet card drivers (like e100, r8169). Is it possible to change them? Patches welcome. :-) I respectfully disagree. Me too, most of the device managed conversions we have seen were bogus because they were done in a semi-automated way without understanding the peculiarities of the network devices, that is the separation between init/open/close that most other device drivers do not have. A typical example is this: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6892b41d9701283085b655c6086fb57a5d63fa47 If someone believes basic resouce management to be too hard or error-prone to handle, he should imvho seriously rise the bar and reconsider the way he wants to contribute to the kernel. Well, for one, we could have a device managed register_netdev() which cleans up resources in case of failures and calls free_netdev() automatically, but is that adding much value? I may hope he who reads e100.c to think about DMA api, bql or rx ring holes avoidance to quote a few ones. Managed API ? Mildly... -- Florian -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fix tcp fin memory accounting
On 04/21/2015 07:09 PM, Eric Dumazet wrote: Note that this patch adds a deadlock possibility in some stress situations. If a process owning some tcp socket dies, and tcp_mem[2] is already hit, all sk_stream_alloc_skb() can return NULL and we loop in tcp_send_fin(), making no progress because we can not free any tcp memory. Ugh. Thanks for fixing this Eric! Josh -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fix tcp fin memory accounting
On Thu, 2015-04-23 at 08:48 -0500, Josh Hunt wrote: On 04/21/2015 07:09 PM, Eric Dumazet wrote: Note that this patch adds a deadlock possibility in some stress situations. If a process owning some tcp socket dies, and tcp_mem[2] is already hit, all sk_stream_alloc_skb() can return NULL and we loop in tcp_send_fin(), making no progress because we can not free any tcp memory. Ugh. Thanks for fixing this Eric! No problem ;) For the record, I've tested this followup patch that I'll formally submit when net-next reopens : If there is one already sent skb in write queue (we look at the tail of course), and : - We are under TCP memory pressure, - or allocation of a fresh skb using GFP_KERNEL failed, - we add the FIN flag that will eventually be sent later after a timeout. net/ipv4/tcp_output.c | 41 1 file changed, 21 insertions(+), 20 deletions(-) diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 2ade67b7cdb0..fe6558eb64f3 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2827,33 +2827,34 @@ static void sk_forced_wmem_schedule(struct sock *sk, int size) sk_memory_allocated_add(sk, amt, status); } -/* Send a fin. The caller locks the socket for us. This cannot be - * allowed to fail queueing a FIN frame under any circumstances. +/* Send a FIN. The caller locks the socket for us. + * We should try to send a FIN packet really hard, but eventually give up. */ void tcp_send_fin(struct sock *sk) { + struct sk_buff *skb, *tskb = tcp_write_queue_tail(sk); struct tcp_sock *tp = tcp_sk(sk); - struct sk_buff *skb = tcp_write_queue_tail(sk); - int mss_now; - /* Optimization, tack on the FIN if we have a queue of -* unsent frames. But be careful about outgoing SACKS -* and IP options. + /* Optimization, tack on the FIN if we have one skb in write queue and +* this skb was not yet sent, or we are under memory pressure. +* Note: in the latter case, FIN packet will be sent after a timeout, +* as TCP stack thinks it has been transmitted once. */ - mss_now = tcp_current_mss(sk); - - if (tcp_send_head(sk)) { - TCP_SKB_CB(skb)-tcp_flags |= TCPHDR_FIN; - TCP_SKB_CB(skb)-end_seq++; + if (tskb (tcp_send_head(sk) || sk_under_memory_pressure(sk))) { +coalesce: + TCP_SKB_CB(tskb)-tcp_flags |= TCPHDR_FIN; + TCP_SKB_CB(tskb)-end_seq++; tp-write_seq++; + if (!tcp_send_head(sk)) { + tp-snd_nxt++; + return; + } } else { - /* Socket is locked, keep trying until memory is available. */ - for (;;) { - skb = alloc_skb_fclone(MAX_TCP_HEADER, - sk-sk_allocation); - if (skb) - break; - yield(); + skb = alloc_skb_fclone(MAX_TCP_HEADER, sk-sk_allocation); + if (unlikely(!skb)) { + if (tskb) + goto coalesce; + return; } skb_reserve(skb, MAX_TCP_HEADER); sk_forced_wmem_schedule(sk, skb-truesize); @@ -2862,7 +2863,7 @@ void tcp_send_fin(struct sock *sk) TCPHDR_ACK | TCPHDR_FIN); tcp_queue_skb(sk, skb); } - __tcp_push_pending_frames(sk, mss_now, TCP_NAGLE_OFF); + __tcp_push_pending_frames(sk, tcp_current_mss(sk), TCP_NAGLE_OFF); } /* We get here when a process closes a file descriptor (either due to -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net-next v2] be2net: log link status
On 04/23/2015 08:31 AM, Sathya Perla wrote: -Original Message- From: Ivan Vecera [mailto:ivec...@redhat.com] The driver unlike other drivers does not log link state changes. v2: added current link speed to log message Ivan, I disagree with the v2 change. I think your original intention was just to log a message when the link goes up or down asynchronously (i.e., without any user intervention.) After alerting the user, if the user wants to know other link properties like speed, duplex etc then the ethtool cmd needs to be used; there is no need to log a message with those details. Yes, this was my original intention... to see these async link events in the system log. In this case the v1 should be used. Ivan -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH net-next v2] be2net: log link status
-Original Message- From: Ivan Vecera [mailto:ivec...@redhat.com] The driver unlike other drivers does not log link state changes. v2: added current link speed to log message Ivan, I disagree with the v2 change. I think your original intention was just to log a message when the link goes up or down asynchronously (i.e., without any user intervention.) After alerting the user, if the user wants to know other link properties like speed, duplex etc then the ethtool cmd needs to be used; there is no need to log a message with those details. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 2/3] tc: deprecate TC_ACT_QUEUED
On 04/23/2015 04:46 AM, Alexei Starovoitov wrote: ... The other two threads degenerated into non-technical comments. Yep. :-/ Anyway, this set was RFC to answer my main question whether I should continue with tc cleanup or stop right here. I got my answer. I think it's worth proceeding with this, it's long overdue. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[patch added to the 3.12 stable tree] net: ethernet: pcnet32: Setup the SRAM and NOUFLO on Am79C97{3, 5}
From: Markos Chandras markos.chand...@imgtec.com This patch has been added to the 3.12 stable tree. If you have any objections, please let us know. === commit 87f966d97b89774162df04d2106c6350c8fe4cb3 upstream. On a MIPS Malta board, tons of fifo underflow errors have been observed when using u-boot as bootloader instead of YAMON. The reason for that is that YAMON used to set the pcnet device to SRAM mode but u-boot does not. As a result, the default Tx threshold (64 bytes) is now too small to keep the fifo relatively used and it can result to Tx fifo underflow errors. As a result of which, it's best to setup the SRAM on supported controllers so we can always use the NOUFLO bit. Cc: netdev@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: Don Fry pcne...@frontier.com Signed-off-by: Markos Chandras markos.chand...@imgtec.com Signed-off-by: David S. Miller da...@davemloft.net Signed-off-by: Jiri Slaby jsl...@suse.cz --- drivers/net/ethernet/amd/pcnet32.c | 31 +-- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/amd/pcnet32.c b/drivers/net/ethernet/amd/pcnet32.c index 2d8e28819779..048743573230 100644 --- a/drivers/net/ethernet/amd/pcnet32.c +++ b/drivers/net/ethernet/amd/pcnet32.c @@ -1516,7 +1516,7 @@ pcnet32_probe1(unsigned long ioaddr, int shared, struct pci_dev *pdev) { struct pcnet32_private *lp; int i, media; - int fdx, mii, fset, dxsuflo; + int fdx, mii, fset, dxsuflo, sram; int chip_version; char *chipname; struct net_device *dev; @@ -1553,7 +1553,7 @@ pcnet32_probe1(unsigned long ioaddr, int shared, struct pci_dev *pdev) } /* initialize variables */ - fdx = mii = fset = dxsuflo = 0; + fdx = mii = fset = dxsuflo = sram = 0; chip_version = (chip_version 12) 0x; switch (chip_version) { @@ -1586,6 +1586,7 @@ pcnet32_probe1(unsigned long ioaddr, int shared, struct pci_dev *pdev) chipname = PCnet/FAST III 79C973; /* PCI */ fdx = 1; mii = 1; + sram = 1; break; case 0x2626: chipname = PCnet/Home 79C978; /* PCI */ @@ -1609,6 +1610,7 @@ pcnet32_probe1(unsigned long ioaddr, int shared, struct pci_dev *pdev) chipname = PCnet/FAST III 79C975; /* PCI */ fdx = 1; mii = 1; + sram = 1; break; case 0x2628: chipname = PCnet/PRO 79C976; @@ -1637,6 +1639,31 @@ pcnet32_probe1(unsigned long ioaddr, int shared, struct pci_dev *pdev) dxsuflo = 1; } + /* +* The Am79C973/Am79C975 controllers come with 12K of SRAM +* which we can use for the Tx/Rx buffers but most importantly, +* the use of SRAM allow us to use the BCR18:NOUFLO bit to avoid +* Tx fifo underflows. +*/ + if (sram) { + /* +* The SRAM is being configured in two steps. First we +* set the SRAM size in the BCR25:SRAM_SIZE bits. According +* to the datasheet, each bit corresponds to a 512-byte +* page so we can have at most 24 pages. The SRAM_SIZE +* holds the value of the upper 8 bits of the 16-bit SRAM size. +* The low 8-bits start at 0x00 and end at 0xff. So the +* address range is from 0x up to 0x17ff. Therefore, +* the SRAM_SIZE is set to 0x17. The next step is to set +* the BCR26:SRAM_BND midway through so the Tx and Rx +* buffers can share the SRAM equally. +*/ + a-write_bcr(ioaddr, 25, 0x17); + a-write_bcr(ioaddr, 26, 0xc); + /* And finally enable the NOUFLO bit */ + a-write_bcr(ioaddr, 18, a-read_bcr(ioaddr, 18) | (1 11)); + } + dev = alloc_etherdev(sizeof(*lp)); if (!dev) { ret = -ENOMEM; -- 2.3.5 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH net-next v2] be2net: log link status
-Original Message- From: Joe Perches [mailto:j...@perches.com] On Wed, 2015-04-22 at 15:43 +0200, Ivan Vecera wrote: The driver unlike other drivers does not log link state changes. ... Question for the emulex folk: Is the dom argument to link_status_query necessary? It's currently always 0. It's needed only if the PF wants to query the link_status of a VF, which it doesn't do currently... Yes, its un-necessary and can be removed (in a separate patch I guess :-)). thanks, -Sathya -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH net-next] be2net: log link status
-Original Message- From: Ivan Vecera [mailto:ivec...@redhat.com] The driver unlike other drivers does not log link state changes. Cc: Sathya Perla sathya.pe...@emulex.com Cc: Subbu Seetharaman subbu.seethara...@emulex.com Cc: Ajit Khaparde ajit.khapa...@emulex.com Signed-off-by: Ivan Vecera ivec...@redhat.com --- drivers/net/ethernet/emulex/benet/be_main.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index fb0bc3c..e349131 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -662,6 +662,8 @@ void be_link_status_update(struct be_adapter *adapter, u8 link_status) netif_carrier_on(netdev); else netif_carrier_off(netdev); + + netdev_info(netdev, Link is %s\n, link_status ? Up : Down); } Acked-by: Sathya Perla sathya.pe...@emulex.com -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: CCM/GCM implementation defect
On 4/23/2015 6:26 AM, Herbert Xu wrote: Hi: It looks like our IPsec implementations of CCM and GCM are buggy This applies also to GMAC (rfc4543), right? in that they don't include the IV in the authentication calculation. This definitely breaks interoperability with anyone who implements them correctly. The fact that there have been no reports on this probably means that nobody has run into this in the field yet. Does this mean that even the test vectors (crypto/testmgr.h) are broken? Thanks, Horia -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Resource usages in Linux drivers
Thank you! On 04/23/2015 05:19 PM, Sergei Shtylyov wrote: Hello. On 4/23/2015 9:45 AM, Jia-Ju Bai wrote: Dear Sir, I am very sorry to trouble you. I find that resource management is error-prone when writing Linux drivers, and many problems may occur, such as resource leaks. Meanwhile, I find that many applied patches in the kernel mailing list focus on releasing allocated resources, especially in error-handling paths. Therefore, I have a question: is it possible to automatically release allocated resources in drivers before unloading and in error-handling paths? Yes, there's managed device API, look for functions starting with devm_. There's one limitation though: it can be used only in the driver's probe() method, so can't be used when e.g. network device is being opened. I think many APIs, such as kmalloc, can also be managed like garbage collection in Java. Maybe the performance is a matter. I am looking forward to your reply, thanks! Such questions should actually be asked on the mailing lists, not personally. I am sorry for that, and I will cc to the mailing lists and other maintainers. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 1/3] tc: fix return values of ingress qdisc
On 04/22/15 at 04:29pm, Cong Wang wrote: On Wed, Apr 22, 2015 at 3:04 PM, Alexei Starovoitov a...@plumgrid.com wrote: On 4/21/15 9:59 PM, Cong Wang wrote: On Tue, Apr 21, 2015 at 12:27 PM, Alexei Starovoitov a...@plumgrid.com wrote: ingress qdisc should return NET_XMIT_* values just like all other qdiscs. XMIT already means egress... may be then it should be renamed as well. from include/linux/netdevice.h: /* qdisc -enqueue() return codes. */ #define NET_XMIT_SUCCESS0x00 ... the point is that qdisc-enqeue() must return NET_XMIT_* values. ingress qdisc is violating this and therefore should be fixed. XMIT is non-sense for ingress, you really need to pick another name for it if TC_ACT_OK isn't okay for you (it is okay for me). You transmit into a qdisc. If that terminology doesn't suit you then rename it to NET_QUEUE_* but moving away from returning TC_ACT_* is definitely the right thing to do here. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 3/3] tc: cleanup tc_classify
On 04/22/15 at 04:38pm, Cong Wang wrote: On Wed, Apr 22, 2015 at 3:27 PM, Alexei Starovoitov a...@plumgrid.com wrote: On 4/21/15 10:05 PM, Cong Wang wrote: On Tue, Apr 21, 2015 at 12:27 PM, Alexei Starovoitov a...@plumgrid.com wrote: introduce tc_classify_act() and qdisc_drop_bypass() helper functions to reduce copy-paste among different qdiscs I like this cleanup. It aligns all skb dropping in qdiscs to a qdisc_drop*() function. I don't think qdisc_drop_bypass() is more readable than without it, maybe you need a better name, or just leave the code as it is. what would be a better name? I'm open to suggestions. My reading for qdisc_drop_bypass() is it bypasses packet dropping for some case, apparently doesn't match its definition. I can't think out a better name therefore I don't think it deserves a function, just leave as it is. Interesting logic ;-) -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: CCM/GCM implementation defect
On Thu, Apr 23, 2015 at 12:03:38PM +0300, Horia Geantă wrote: This applies also to GMAC (rfc4543), right? No RFC4543 appears to be correctly implemented. Does this mean that even the test vectors (crypto/testmgr.h) are broken? Indeed. The test vectors appear to be generated either through our implementation or by one that is identical to us. Cheers, -- Email: Herbert Xu herb...@gondor.apana.org.au Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] netdev_alloc_pcpu_stats: use less common iterator variable
From: Johannes Berg johannes.b...@intel.com With the CPU iteration variable called 'i', it's relatively easy to have variable shadowing which sparse will warn about. Avoid that by renaming the variable to __cpu which is less likely to be used in the surrounding context. Signed-off-by: Johannes Berg johannes.b...@intel.com --- include/linux/netdevice.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index bcbde799ec69..6719041d8851 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -2021,10 +2021,10 @@ struct pcpu_sw_netstats { ({ \ typeof(type) __percpu *pcpu_stats = alloc_percpu(type); \ if (pcpu_stats) { \ - int i; \ - for_each_possible_cpu(i) { \ + int __cpu; \ + for_each_possible_cpu(__cpu) { \ typeof(type) *stat; \ - stat = per_cpu_ptr(pcpu_stats, i); \ + stat = per_cpu_ptr(pcpu_stats, __cpu); \ u64_stats_init(stat-syncp); \ } \ } \ -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Resource usages in Linux drivers
On 04/23/2015 05:29 PM, Jia-Ju Bai wrote: Thank you! On 04/23/2015 05:19 PM, Sergei Shtylyov wrote: Hello. On 4/23/2015 9:45 AM, Jia-Ju Bai wrote: Dear Sir, I am very sorry to trouble you. I find that resource management is error-prone when writing Linux drivers, and many problems may occur, such as resource leaks. Meanwhile, I find that many applied patches in the kernel mailing list focus on releasing allocated resources, especially in error-handling paths. Therefore, I have a question: is it possible to automatically release allocated resources in drivers before unloading and in error-handling paths? Yes, there's managed device API, look for functions starting with devm_. There's one limitation though: it can be used only in the driver's probe() method, so can't be used when e.g. network device is being opened. I think many APIs, such as kmalloc, can also be managed like garbage collection in Java. Maybe the performance is a matter. I am looking forward to your reply, thanks! Such questions should actually be asked on the mailing lists, not personally. I am sorry for that, and I will cc to the mailing lists and other maintainers. I find that some common APIs are not managed, such as napi_enable and napi_start_queue. Is it possible to provide managed APIs for them? I also find many drivers do not use these managed APIs, especially in ethernet card drivers (like e100, r8169). Is it possible to change them? Best wishes, Jia-Ju Bai -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: CCM/GCM implementation defect
Hi Herbert, Does this mean that even the test vectors (crypto/testmgr.h) are broken? Indeed. The test vectors appear to be generated either through our implementation or by one that is identical to us. I'm not sure about that. RFC4106 refers to [1] for test vectors, which is still available at web.archive.org [2]. When looking for example at Test Case 3, this is the same as in a newer revision of the document [3]. That looks exactly the same as aes_gcm_enc_tv_template[2] from testmgr.h. We by the way use test vectors in userland from the same document to verify our own GCM backend, our OpenSSL backend and an AESNI/PCLMULQD backend. And I've never heard of any incompatibilities. Regards Martin [1]http://csrc.nist.gov/CryptoToolkit/modes/proposedmodes/gcm/gcm-spec.pdf [2]http://web.archive.org/web/20070712195408/http://csrc.nist.gov/CryptoToolkit/modes/proposedmodes/gcm/gcm-spec.pdf [3]http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/gcm/gcm-revised-spec.pdf -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: CCM/GCM implementation defect
On Thu, Apr 23, 2015 at 11:58:52AM +0200, Martin Willi wrote: I'm not sure about that. RFC4106 refers to [1] for test vectors, which is still available at web.archive.org [2]. When looking for example at Test Case 3, this is the same as in a newer revision of the document [3]. That looks exactly the same as aes_gcm_enc_tv_template[2] from testmgr.h. These are GCM test vectors, not RFC4106 test vectors so they are of no use when you're testing whether the IPsec IV (which is not the same thing as the GCM IV) is included in the authentication or not. AFAIK GCM itself is implemented correctly. It's only the IPsec wrapper around it (rfc4106 in particular) that's broken. Cheers, -- Email: Herbert Xu herb...@gondor.apana.org.au Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Resource usages in Linux drivers
On 4/23/2015 1:08 PM, Jia-Ju Bai wrote: I am very sorry to trouble you. I find that resource management is error-prone when writing Linux drivers, and many problems may occur, such as resource leaks. Meanwhile, I find that many applied patches in the kernel mailing list focus on releasing allocated resources, especially in error-handling paths. Therefore, I have a question: is it possible to automatically release allocated resources in drivers before unloading and in error-handling paths? Yes, there's managed device API, look for functions starting with devm_. There's one limitation though: it can be used only in the driver's probe() method, so can't be used when e.g. network device is being opened. I think many APIs, such as kmalloc, can also be managed like garbage collection in Java. Maybe the performance is a matter. I am looking forward to your reply, thanks! Such questions should actually be asked on the mailing lists, not personally. I am sorry for that, and I will cc to the mailing lists and other maintainers. I find that some common APIs are not managed, such as napi_enable and napi_start_queue. Is it possible to provide managed APIs for them? No, they're only called from ndo_open() method IIRC. The device managed APIs can only be called at the device probing time. I also find many drivers do not use these managed APIs, especially in ethernet card drivers (like e100, r8169). Is it possible to change them? Patches welcome. :-) Best wishes, Jia-Ju Bai WBR, Sergei -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: CCM/GCM implementation defect
Hi Steffen, It looks like our IPsec implementations of CCM and GCM are buggy in that they don't include the IV in the authentication calculation. Seems like crypto_rfc4106_crypt() passes the associated data it got from ESP directly to gcm, without chaining with the IV. Do you have any pointer for me where this is defined? Why is it needed, given that GCM implicitly authenticates the IV by using it in Y0? Also, I've just verified that a tunnel between the Windows Filtering Platform and Linux 3.13 using AES128GCM16 works just fine. So if we do something wrong, the problem does not only affect Linux. Regards Martin -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] net: phy: micrel: don't do clock-mode-select if we got NULL clock
Signed-off-by: Niklas Cassel nikl...@axis.com --- drivers/net/phy/micrel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 1190fd8..a422036 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -548,7 +548,7 @@ static int kszphy_probe(struct phy_device *phydev) } clk = devm_clk_get(phydev-dev, rmii-ref); - if (!IS_ERR(clk)) { + if (!IS_ERR_OR_NULL(clk)) { unsigned long rate = clk_get_rate(clk); bool rmii_ref_clk_sel_25_mhz; -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net 0/3] tipc: three bug fixes
A set of unrelated corrections; one for the tipc netns implementation, one regarding problems with random link resets, and one removing a an erroneous refcount decrement when reading link statistsics via netlink. Erik Hugne (2): tipc: fix random link reset problem tipc: fix node refcount issue Ying Xue (1): tipc: fix topology server broken issue net/tipc/link.c | 1 - net/tipc/server.c | 9 +++-- net/tipc/socket.c | 3 ++- 3 files changed, 5 insertions(+), 8 deletions(-) -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net 3/3] tipc: fix node refcount issue
From: Erik Hugne erik.hu...@ericsson.com When link statistics is dumped over netlink, we iterate over the list of peer nodes and append each links statistics to the netlink msg. In the case where the dump is resumed after filling up a nlmsg, the node refcnt is decremented without having been incremented previously which may cause the node reference to be freed. When this happens, the following info/stacktrace will be generated, followed by a crash or undefined behavior. We fix this by removing the erroneous call to tipc_node_put inside the loop that iterates over nodes. [ 384.312303] INFO: trying to register non-static key. [ 384.313110] the code is fine but needs lockdep annotation. [ 384.313290] turning off the locking correctness validator. [ 384.313290] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.0.0+ #13 [ 384.313290] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 384.313290] 88003c6d0290 88003cc03ca8 8170adf1 0007 [ 384.313290] 82728730 88003cc03d38 810a6a6d 001d7200 [ 384.313290] 88003c6d0ab0 88003cc03ce8 0285 0001 [ 384.313290] Call Trace: [ 384.313290] IRQ [8170adf1] dump_stack+0x4c/0x65 [ 384.313290] [810a6a6d] __lock_acquire+0xf3d/0xf50 [ 384.313290] [810a7375] lock_acquire+0xd5/0x290 [ 384.313290] [a0043e8c] ? link_timeout+0x1c/0x170 [tipc] [ 384.313290] [a0043e70] ? link_state_event+0x4e0/0x4e0 [tipc] [ 384.313290] [81712890] _raw_spin_lock_bh+0x40/0x80 [ 384.313290] [a0043e8c] ? link_timeout+0x1c/0x170 [tipc] [ 384.313290] [a0043e8c] link_timeout+0x1c/0x170 [tipc] [ 384.313290] [810c4698] call_timer_fn+0xb8/0x490 [ 384.313290] [810c45e0] ? process_timeout+0x10/0x10 [ 384.313290] [810c5a2c] run_timer_softirq+0x21c/0x420 [ 384.313290] [a0043e70] ? link_state_event+0x4e0/0x4e0 [tipc] [ 384.313290] [8105a954] __do_softirq+0xf4/0x630 [ 384.313290] [8105afdd] irq_exit+0x5d/0x60 [ 384.313290] [8103ade1] smp_apic_timer_interrupt+0x41/0x50 [ 384.313290] [817144a0] apic_timer_interrupt+0x70/0x80 [ 384.313290] EOI [8100db10] ? default_idle+0x20/0x210 [ 384.313290] [8100db0e] ? default_idle+0x1e/0x210 [ 384.313290] [8100e61a] arch_cpu_idle+0xa/0x10 [ 384.313290] [81099803] cpu_startup_entry+0x2c3/0x530 [ 384.313290] [810d2893] ? clockevents_register_device+0x113/0x200 [ 384.313290] [81038b0f] start_secondary+0x13f/0x170 Fixes: 8a0f6ebe8494 (tipc: involve reference counter for node structure) Signed-off-by: Erik Hugne erik.hu...@ericsson.com Signed-off-by: Jon Maloy jon.ma...@ericsson.com --- net/tipc/link.c | 1 - 1 file changed, 1 deletion(-) diff --git a/net/tipc/link.c b/net/tipc/link.c index a6b30df..57be6e6 100644 --- a/net/tipc/link.c +++ b/net/tipc/link.c @@ -2143,7 +2143,6 @@ int tipc_nl_link_dump(struct sk_buff *skb, struct netlink_callback *cb) err = __tipc_nl_add_node_links(net, msg, node, prev_link); tipc_node_unlock(node); - tipc_node_put(node); if (err) goto out; -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net 1/3] tipc: fix topology server broken issue
From: Ying Xue ying@windriver.com When a new topology server is launched in a new namespace, its listening socket is inserted into the init ns namespace's socket hash table rather than the one owned by the new namespace. Although the socket's namespace is forcedly changed to the new namespace later, the socket is still stored in the socket hash table of init ns namespace. When a client created in the new namespace connects its own topology server, the connection is failed as its server's socket could not be found from its own namespace's socket table. If __sock_create() instead of original sock_create_kern() is used to create the server's socket through specifying an expected namesapce, the socket will be inserted into the specified namespace's socket table, thereby avoiding to the topology server broken issue. Fixes: 76100a8a64bc (tipc: fix netns refcnt leak) Reported-by: Erik Hugne erik.hu...@ericsson.com Signed-off-by: Ying Xue ying@windriver.com Signed-off-by: Jon Maloy jon.ma...@ericsson.com --- net/tipc/server.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/net/tipc/server.c b/net/tipc/server.c index ab6183c..77ff03e 100644 --- a/net/tipc/server.c +++ b/net/tipc/server.c @@ -102,7 +102,7 @@ static void tipc_conn_kref_release(struct kref *kref) } saddr-scope = -TIPC_NODE_SCOPE; kernel_bind(sock, (struct sockaddr *)saddr, sizeof(*saddr)); - sk_release_kernel(sk); + sock_release(sock); con-sock = NULL; } @@ -321,12 +321,9 @@ static struct socket *tipc_create_listen_sock(struct tipc_conn *con) struct socket *sock = NULL; int ret; - ret = sock_create_kern(AF_TIPC, SOCK_SEQPACKET, 0, sock); + ret = __sock_create(s-net, AF_TIPC, SOCK_SEQPACKET, 0, sock, 1); if (ret 0) return NULL; - - sk_change_net(sock-sk, s-net); - ret = kernel_setsockopt(sock, SOL_TIPC, TIPC_IMPORTANCE, (char *)s-imp, sizeof(s-imp)); if (ret 0) @@ -376,7 +373,7 @@ static struct socket *tipc_create_listen_sock(struct tipc_conn *con) create_err: kernel_sock_shutdown(sock, SHUT_RDWR); - sk_release_kernel(sock-sk); + sock_release(sock); return NULL; } -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net 2/3] tipc: fix random link reset problem
From: Erik Hugne erik.hu...@ericsson.com In the function tipc_sk_rcv(), the stack variable 'err' is only initialized to TIPC_ERR_NO_PORT for the first iteration over the link input queue. If a chain of messages are received from a link, failure to lookup the socket for any but the first message will cause the message to bounce back out on a random link. We fix this by properly initializing err. Signed-off-by: Erik Hugne erik.hu...@ericsson.com Signed-off-by: Jon Maloy jon.ma...@ericsson.com --- net/tipc/socket.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/tipc/socket.c b/net/tipc/socket.c index ee90d74..9074b5c 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -1764,13 +1764,14 @@ static int tipc_sk_enqueue(struct sk_buff_head *inputq, struct sock *sk, int tipc_sk_rcv(struct net *net, struct sk_buff_head *inputq) { u32 dnode, dport = 0; - int err = -TIPC_ERR_NO_PORT; + int err; struct sk_buff *skb; struct tipc_sock *tsk; struct tipc_net *tn; struct sock *sk; while (skb_queue_len(inputq)) { + err = -TIPC_ERR_NO_PORT; skb = NULL; dport = tipc_skb_peek_port(inputq, dport); tsk = tipc_sk_lookup(net, dport); -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: CCM/GCM implementation defect
On Thu, Apr 23, 2015 at 11:26:20AM +0800, Herbert Xu wrote: Hi: It looks like our IPsec implementations of CCM and GCM are buggy in that they don't include the IV in the authentication calculation. Seems like crypto_rfc4106_crypt() passes the associated data it got from ESP directly to gcm, without chaining with the IV. This definitely breaks interoperability with anyone who implements them correctly. The fact that there have been no reports on this probably means that nobody has run into this in the field yet. On the security side, this is probably not a big deal for CCM because it always verifies the authentication tag after decryption. But for GCM this may be a DoS issue as an attacker could modify the IV without triggering the authentication check and thus cause an unnecessary decryption. For both CCM and GCM this will result in random data injected as a packet into the network stack which hopefully will be dropped. In order to fix this without breaking backwards compatibility, my plan is to introduce new templates such as rfc4106v2 which implement the RFC correctly. The existing templates will be retained so that current users aren't broken by the fix. Adding a second template for the correct implementation is probaply the only thing that we can do if we don't want to break backwards compatibility. But maybe we can add a warning to the old implementation, such that users notice that they use a broken version. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH next 1/3] ipvlan: Defer multicast / broadcast processing to a work-queue
On Thu, 2015-04-23 at 14:29 -0700, Mahesh Bandewar wrote: +static void ipvlan_multicast_enqueue(struct ipvl_port *port, + struct sk_buff *skb) +{ + if (skb-protocol == htons(ETH_P_PAUSE)) + return; But what happens to this packet ? It seems leaked ? + + spin_lock(port-backlog.lock); + if (skb_queue_len(port-backlog) IPVLAN_QBACKLOG_LIMIT) { + __skb_queue_tail(port-backlog, skb); + spin_unlock(port-backlog.lock); + } else { + spin_unlock(port-backlog.lock); + atomic_long_inc(skb-dev-rx_dropped); + kfree_skb(skb); + } + schedule_work(port-wq); No point calling schedule_work(port-wq); if packet was dropped. We are under pressure, so don't add extra cpu cycles ;) -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
rhashtable: Add cap on number of elements in hash table
On Thu, Apr 23, 2015 at 11:59:19AM -0400, David Miller wrote: From: Johannes Berg johan...@sipsolutions.net Date: Thu, 23 Apr 2015 16:38:43 +0200 From: Johannes Berg johannes.b...@intel.com The conversion of mac80211's station table to rhashtable had a bug that I found by accident in code review, that hadn't been found as rhashtable apparently managed to have a maximum hash chain length of one (!) in all our testing. In order to test the bug and verify the fix I set my rhashtable's max_size very low (4) in order to force getting hash collisions. At that point, rhashtable WARNed in rhashtable_insert_rehash() but didn't actually reject the hash table insertion. This caused it to lose insertions - my master list of stations would have 9 entries, but the rhashtable only had 5. This may warrant a deeper look, but that WARN_ON() just shouldn't happen. Fix this by not returning true from rht_grow_above_100() when the rhashtable's max_size has been reached - in this case the user is explicitly configuring it to be at most that big, so even if it's now above 100% it shouldn't attempt to resize. This fixes the lost insertion issue and consequently allows my code to display its error (and verify my fix for it.) Signed-off-by: Johannes Berg johannes.b...@intel.com It looks fine to me, but I'll let Herbert and Thomas review this. Thanks for the heads up. It seems that I lost track somewhere along the line. I meant to add an explicit limit on the overall number of entries since that was what users like netlink expected but never got around to doing it. Instead it seems that we're currently relying on the rht_grow_above_100 to protect us. So here is a patch that adds an explicit limit and fixes the problem Johannes reported. ---8--- We currently have no limit on the number of elements in a hash table. This is very bad especially considering that some rhashtable users had such a limit before the conversion and relied on it for defence against DoS attacks. We already have a maximum hash table size limit but its enforcement is only by luck and results in a nasty WARN_ON. This patch adds a new paramater insecure_max_entries which becomes the cap on the table. If unset it defaults to max_size. If it is also zero it means that there is no cap on the number of elements in the table. However, the table will grow whenever the utilisation hits 100% and if that growth fails, you will get ENOMEM on insertion. As allowing 100% utilisation is potentially dangerous, the name contains the word insecure. Note that the cap is not a hard limit. This is done for performance reasons as enforcing a hard limit will result in use of atomic ops that are heavier than the ones we currently use. The reasoning is that we're only guarding against a gross over- subscription of the table, rather than a small breach of the limit. Reported-by: Johannes Berg johannes.b...@intel.com Signed-off-by: Herbert Xu herb...@gondor.apana.org.au diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h index e23d242..95b5a36 100644 --- a/include/linux/rhashtable.h +++ b/include/linux/rhashtable.h @@ -17,6 +17,7 @@ #ifndef _LINUX_RHASHTABLE_H #define _LINUX_RHASHTABLE_H +#include linux/atomic.h #include linux/compiler.h #include linux/errno.h #include linux/jhash.h @@ -100,6 +101,7 @@ struct rhashtable; * @key_len: Length of key * @key_offset: Offset of key in struct to be hashed * @head_offset: Offset of rhash_head in struct to be hashed + * @insecure_max_entries: Maximum number of entries (may be exceeded) * @max_size: Maximum size while expanding * @min_size: Minimum size while shrinking * @nulls_base: Base value to generate nulls marker @@ -115,6 +117,7 @@ struct rhashtable_params { size_t key_len; size_t key_offset; size_t head_offset; + unsigned intinsecure_max_entries; unsigned intmax_size; unsigned intmin_size; u32 nulls_base; @@ -282,7 +285,20 @@ static inline bool rht_shrink_below_30(const struct rhashtable *ht, static inline bool rht_grow_above_100(const struct rhashtable *ht, const struct bucket_table *tbl) { - return atomic_read(ht-nelems) tbl-size; + return atomic_read(ht-nelems) tbl-size + (!ht-p.max_size || tbl-size ht-p.max_size); +} + +/** + * rht_grow_above_max - returns true if table is above maximum + * @ht:hash table + * @tbl: current table + */ +static inline bool rht_grow_above_max(const struct rhashtable *ht, + const struct bucket_table *tbl) +{ + return ht-p.insecure_max_entries + atomic_read(ht-nelems) = ht-p.insecure_max_entries; } /* The bucket lock is selected based on the hash and protects mutations @@ -611,6 +627,10 @@
[PATCH] ehea: Fix memory hook reference counting crashes
The recent commit to only register the EHEA memory hotplug hooks on adapter probe has a few problems. Firstly the reference counting is wrong for multiple adapters, in that the hooks are registered multiple times. Secondly the check in the tear down path is backward. Finally the error path doesn't decrement the count. The multiple registration of the hooks is the biggest problem, as it leads to oopses when the system is rebooted, and/or errors during memory hotplug, eg: $ ./mem-on-off-test.sh -r 2 ... ehea: memory is going offline ehea: LPAR memory changed - re-initializing driver ehea: re-initializing driver complete ehea: memory is going offline ehea: LPAR memory changed - re-initializing driver ehea: opcode=26c ret=fffc arg1=8303 arg2=0 arg3=7006d600 arg4=3fded arg5=200 arg6=0 arg7=0 ehea: register_rpage_mr failed ehea: registering mr failed ehea: register MR failed - driver inoperable! ehea: memory is going offline Fixes: aa183323312d (ehea: Register memory hotplug, reboot and crash hooks on adapter probe) Signed-off-by: Michael Ellerman m...@ellerman.id.au --- drivers/net/ethernet/ibm/ehea/ehea_main.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/ibm/ehea/ehea_main.c b/drivers/net/ethernet/ibm/ehea/ehea_main.c index c05e50759621..00d86be0c831 100644 --- a/drivers/net/ethernet/ibm/ehea/ehea_main.c +++ b/drivers/net/ethernet/ibm/ehea/ehea_main.c @@ -3347,7 +3347,7 @@ static int ehea_register_memory_hooks(void) { int ret = 0; - if (atomic_inc_and_test(ehea_memory_hooks_registered)) + if (atomic_inc_return(ehea_memory_hooks_registered) 1) return 0; ret = ehea_create_busmap(); @@ -3381,12 +3381,14 @@ out3: out2: unregister_reboot_notifier(ehea_reboot_nb); out: + atomic_dec(ehea_memory_hooks_registered); return ret; } static void ehea_unregister_memory_hooks(void) { - if (atomic_read(ehea_memory_hooks_registered)) + /* Only remove the hooks if we've registered them */ + if (atomic_read(ehea_memory_hooks_registered) == 0) return; unregister_reboot_notifier(ehea_reboot_nb); -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] net: netcp: remove call to netif_carrier_(on/off) for MAC to Phy interface
On Friday 24 April 2015 12:47 AM, Murali Karicheri wrote: Currently when interface type is MAC to Phy, netif_carrier_(on/off) is called which is not needed as Phy lib already updates the carrier status to net stack. This is needed only for other interface types Signed-off-by: Murali Karicheri m-kariche...@ti.com --- Acked-by: Mugunthan V N mugunthan...@ti.com Regards Mugunthan V N -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH next 1/3] ipvlan: Defer multicast / broadcast processing to a work-queue
On Thu, Apr 23, 2015 at 9:28 PM, David Miller da...@davemloft.net wrote: From: Mahesh Bandewar mahe...@google.com Date: Thu, 23 Apr 2015 19:54:29 -0700 On Thu, Apr 23, 2015 at 5:32 PM, Eric Dumazet eric.duma...@gmail.com wrote: On Thu, 2015-04-23 at 14:29 -0700, Mahesh Bandewar wrote: +static void ipvlan_multicast_enqueue(struct ipvl_port *port, + struct sk_buff *skb) +{ + if (skb-protocol == htons(ETH_P_PAUSE)) + return; But what happens to this packet ? It seems leaked ? Hmm, will take care of it in v2. Please also provide a proper [PATCH 0/3] introduction posting stating what the patch series is doing, and why. Will do. Thanks. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] net/tg3: Release IRQs on permanent error
When having permanent EEH error, the PCI device will be removed from the system. For this case, we shouldn't set pcierr_recovery to true wrongly, which blocks the driver to release the allocated interrupts and their handlers. Eventually, we can't disable MSI or MSIx successfully because of the MSI or MSIx interrupts still have associated interrupt actions, which is turned into following stack dump. Oops: Exception in kernel mode, sig: 5 [#1] : [c03b76a8] .free_msi_irqs+0x80/0x1a0 (unreliable) [c039f388] .pci_remove_bus_device+0x98/0x110 [c00790f4] .pcibios_remove_pci_devices+0x9c/0x128 [c0077b98] .handle_eeh_events+0x2d8/0x4b0 [c00782d0] .eeh_event_handler+0x130/0x1c0 [c0022bd4] .kernel_thread+0x54/0x70 Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com --- drivers/net/ethernet/broadcom/tg3.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c index 1270b18..069952f 100644 --- a/drivers/net/ethernet/broadcom/tg3.c +++ b/drivers/net/ethernet/broadcom/tg3.c @@ -18129,7 +18129,9 @@ static pci_ers_result_t tg3_io_error_detected(struct pci_dev *pdev, rtnl_lock(); - tp-pcierr_recovery = true; + /* We needn't recover from permanent error */ + if (state == pci_channel_io_frozen) + tp-pcierr_recovery = true; /* We probably don't have netdev yet */ if (!netdev || !netif_running(netdev)) -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: CCM/GCM implementation defect
On Thu, Apr 23, 2015 at 03:24:59PM +0200, Martin Willi wrote: Do you have any pointer for me where this is defined? Why is it needed, given that GCM implicitly authenticates the IV by using it in Y0? Actually you're quite right. Both GCM and CCM implicitly includes the IV in the authentication tag. In fact after reviewing the two relevant RFCs (4106/4309) it seems that we are correct after all since they explicitly exclude the IV from the AAD. Now we just need to figre out whether we're still OK with RFC4543. Sorry for the false alarm. Cheers, -- Email: Herbert Xu herb...@gondor.apana.org.au Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] ethernet: myri10ge: use arch_phys_wc_add()
On Thu, Apr 23, 2015 at 02:28:40PM -0400, David Miller wrote: I can't wait forever for the driver maintainers to review this, so I'm applying it. FWIW, I replied to Luis's patch yesterday. It appears on the netdev list and patchwork. Checked the recipient list, and it includes netdev, but not your address. I guess I will not make that mistake again. https://patchwork.ozlabs.org/patch/463481/ -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html