Re: [iproute2 PATCH] bridge: fix vlan show stats formatting

2018-10-22 Thread Stephen Hemminger
On Sat, 20 Oct 2018 15:42:33 +0200 Tobias Jungel wrote: > The output of -statistics vlan show was broken by a previous change for json > output. This aligns the format to vlan show. > > Signed-off-by: Tobias Jungel Applied, thanks

Re: [PATCH iproute2-next] Tree wide: Drop sockaddr_nl arg

2018-10-22 Thread Stephen Hemminger
On Fri, 19 Oct 2018 13:44:18 -0700 David Ahern wrote: > From: David Ahern > > No command, filter, or print function uses the sockaddr_nl arg, > so just drop it. > > Signed-off-by: David Ahern Acked-by: Stephen Hemminger

[PATCH iproute2] doc/man: spelling fixes

2018-10-18 Thread Stephen Hemminger
Use ispell and codespell to find/fix spelling errors in documentation and man pages. Signed-off-by: Stephen Hemminger --- doc/actions/actions-general | 14 +++--- doc/actions/ifb-README | 18 +- doc/actions/mirred-usage| 6 +++--- man/man8/ip-link.8

Re: Form sk_buff from DMA page

2018-10-17 Thread Stephen Hemminger
On Wed, 17 Oct 2018 17:32:33 + Keyur Amrutbhai Patel wrote: > Hi, > > Can anyone help me on how to form sk_buff from DMA page? Basically I get > complete packet from DMA as single page. > > Regards, > Keyur > This email and any attachments are intended for the sole use of the named >

Re: [PATCH net] r8169: fix NAPI handling under high load

2018-10-16 Thread Stephen Hemminger
On Tue, 16 Oct 2018 23:17:31 +0200 Holger Hoffstätte wrote: > On 10/16/18 22:37, Heiner Kallweit wrote: > > rtl_rx() and rtl_tx() are called only if the respective bits are set > > in the interrupt status register. Under high load NAPI may not be > > able to process all data (work_done ==

Re: [PATCH net] r8169: fix NAPI handling under high load

2018-10-16 Thread Stephen Hemminger
On Tue, 16 Oct 2018 22:37:31 +0200 Heiner Kallweit wrote: > rtl_rx() and rtl_tx() are called only if the respective bits are set > in the interrupt status register. Under high load NAPI may not be > able to process all data (work_done == budget) and it will schedule > subsequent calls to the

Re: [PATCH net-next 11/18] vxlan: Add netif_is_vxlan()

2018-10-15 Thread Stephen Hemminger
On Mon, 15 Oct 2018 13:30:41 -0700 Jakub Kicinski wrote: > On Mon, 15 Oct 2018 23:27:41 +0300, Ido Schimmel wrote: > > On Mon, Oct 15, 2018 at 01:16:42PM -0700, Stephen Hemminger wrote: > > > On Mon, 15 Oct 2018 22:57:48 +0300 > > > Ido Schimmel wrote: > > >

Re: [PATCH net-next 11/18] vxlan: Add netif_is_vxlan()

2018-10-15 Thread Stephen Hemminger
On Mon, 15 Oct 2018 22:57:48 +0300 Ido Schimmel wrote: > On Mon, Oct 15, 2018 at 11:57:56AM -0700, Jakub Kicinski wrote: > > On Sat, 13 Oct 2018 17:18:38 +, Ido Schimmel wrote: > > > Add the ability to determine whether a netdev is a VxLAN netdev by > > > calling the above mentioned

Re: [PATCH iproute 2/2] utils: fix get_rtnl_link_stats_rta stats parsing

2018-10-15 Thread Stephen Hemminger
On Thu, 11 Oct 2018 14:24:03 +0200 Lorenzo Bianconi wrote: > > > iproute2 walks through the list of available tunnels using netlink > > > protocol in order to get device info instead of reading > > > them from proc filesystem. However the kernel reports device statistics > > > using

Re: [PATCH iproute2] macsec: fix off-by-one when parsing attributes

2018-10-15 Thread Stephen Hemminger
On Fri, 12 Oct 2018 17:34:12 +0200 Sabrina Dubroca wrote: > I seem to have had a massive brainfart with uses of > parse_rtattr_nested(). The rtattr* array must have MAX+1 elements, and > the call to parse_rtattr_nested must have MAX as its bound. Let's fix > those. > > Fixes: b26fc590ce62 ("ip:
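
The sizing rule quoted above (an array of MAX+1 slots, with MAX passed as the bound) can be sketched with a stand-in parser. Everything named `demo_*` or `DEMO_*` below is hypothetical and written only for this illustration; it mimics the indexing convention and is not iproute2's parse_rtattr_nested().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute types; valid values run from 1 to DEMO_ATTR_MAX. */
enum { DEMO_ATTR_UNSPEC, DEMO_ATTR_A, DEMO_ATTR_B, DEMO_ATTR_END_ };
#define DEMO_ATTR_MAX (DEMO_ATTR_END_ - 1)

struct demo_attr {
    int type;
    int value;
};

/*
 * Stand-in parser: stores a pointer to each attribute at tb[type] when
 * type <= max. It touches tb[0] through tb[max], so the caller must
 * provide max + 1 elements. Returns the number of attributes stored.
 */
static int demo_parse(const struct demo_attr **tb, int max,
                      const struct demo_attr *attrs, size_t n)
{
    int stored = 0;

    assert(tb != NULL && max >= 0);
    memset(tb, 0, sizeof(*tb) * (size_t)(max + 1));
    for (size_t i = 0; i < n; i++) {
        if (attrs[i].type >= 0 && attrs[i].type <= max) {
            tb[attrs[i].type] = &attrs[i];
            stored++;
        }
    }
    return stored;
}

/* Correct pairing: MAX + 1 slots in the array, MAX as the bound. */
static int demo_lookup_b(const struct demo_attr *attrs, size_t n)
{
    const struct demo_attr *tb[DEMO_ATTR_MAX + 1];

    demo_parse(tb, DEMO_ATTR_MAX, attrs, n);
    return tb[DEMO_ATTR_B] ? tb[DEMO_ATTR_B]->value : -1;
}
```

Declaring the array with only DEMO_ATTR_MAX elements, or passing DEMO_ATTR_MAX + 1 as the bound, would let the parser write one slot past the end of the array in this sketch.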

Re: [PATCH iproute2] json: make 0xhex handle u64

2018-10-15 Thread Stephen Hemminger
On Fri, 12 Oct 2018 17:34:32 +0200 Sabrina Dubroca wrote: > Stephen converted macsec's sci to use 0xhex, but 0xhex handles > unsigned int's, not 64 bits ints. Thus, the output of the "ip macsec > show" command is mangled, with half of the SCI replaced with 0s: > > # ip macsec show > 11:
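
A minimal sketch of why a 64-bit identifier cannot go through a formatter that takes an unsigned int. The helper names are hypothetical and this is not iproute2's JSON print code; the exact symptom there (half of the SCI shown as zeros) depends on how the value is passed, while this sketch only shows that the upper bits are lost. It assumes a platform where unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Narrowing formatter: the value is cut down to unsigned int first. */
static int format_hex_narrow(char *buf, size_t len, uint64_t v)
{
    return snprintf(buf, len, "%x", (unsigned int)v);
}

/* 64-bit formatter: PRIx64 expands to the right conversion for uint64_t. */
static int format_hex_u64(char *buf, size_t len, uint64_t v)
{
    return snprintf(buf, len, "%" PRIx64, v);
}
```

With a value such as 0x0011223344550001, the narrowing version keeps only the low 32 bits, while the 64-bit version prints all of them.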

Re: [iproute PATCH] bridge: fdb: Fix for missing keywords in non-JSON output

2018-10-15 Thread Stephen Hemminger
On Tue, 9 Oct 2018 14:44:08 +0200 Phil Sutter wrote: > While migrating to JSON print library, some keywords were dropped from > standard output by accident. Add them back to unbreak output parsers. > > Fixes: c7c1a1ef51aea ("bridge: colorize output and use JSON print library") > Signed-off-by:

Re: [Bug 201423] New: eth0: hw csum failure

2018-10-15 Thread Stephen Hemminger
On Mon, 15 Oct 2018 08:41:47 -0700 Eric Dumazet wrote: > On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger > wrote: > > > > > > > > Begin forwarded message: > > > > Date: Sun, 14 Oct 2018 10:42:48 + > > From: bugzilla-dae...@bugzilla.kernel.or

Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-15 Thread Stephen Hemminger
Begin forwarded message: Date: Sun, 14 Oct 2018 10:42:48 + From: bugzilla-dae...@bugzilla.kernel.org To: step...@networkplumber.org Subject: [Bug 201423] New: eth0: hw csum failure https://bugzilla.kernel.org/show_bug.cgi?id=201423 Bug ID: 201423 Summary: eth0: hw

Re: [rtnetlink] Potential bug in Linux (rt)netlink code

2018-10-12 Thread Stephen Hemminger
On Fri, 12 Oct 2018 09:30:40 +0200 Henning Rogge wrote: > Hi, > > I am working on a self-written routing agent > (https://github.com/OLSR/OONF) and am stuck on a problem with netlink > that I cannot explain with a userspace error. > > I am using a netlink socket for setting routes >

Re: [PATCH net-next v2] vxlan: support NTF_USE refresh of fdb entries

2018-10-12 Thread Stephen Hemminger
> 1 file changed, 7 insertions(+), 3 deletions(-) Acked-by: Stephen Hemminger

Re: [PATCH iproute2 net-next] bridge: add support for backup port

2018-10-12 Thread Stephen Hemminger
On Fri, 12 Oct 2018 14:42:55 +0300 Nikolay Aleksandrov wrote: > This patch adds support for the new backup port option that can be set > on a bridge port. If the port's carrier goes down all of the traffic > gets redirected to the configured backup port. We add the following new > arguments: > $

Re: [PATCH net-next] net: bridge: add support for per-port vlan stats

2018-10-12 Thread Stephen Hemminger
On Fri, 12 Oct 2018 13:41:16 +0300 Nikolay Aleksandrov wrote: > This patch adds an option to have per-port vlan stats instead of the > default global stats. The option can be set only when there are no port > vlans in the bridge since we need to allocate the stats if it is set > when vlans are

Re: [PATCH net-next 0/9] net: Kernel side filtering for route dumps

2018-10-11 Thread Stephen Hemminger
On Thu, 11 Oct 2018 08:06:18 -0700 David Ahern wrote: > From: David Ahern > > Implement kernel side filtering of route dumps by protocol (e.g., which > routing daemon installed the route), route type (e.g., unicast), table > id and nexthop device. > > iproute2 has been doing this filtering in

Re: [PATCH iproute 2/2] utils: fix get_rtnl_link_stats_rta stats parsing

2018-10-11 Thread Stephen Hemminger
On Thu, 11 Oct 2018 14:24:03 +0200 Lorenzo Bianconi wrote: > > > iproute2 walks through the list of available tunnels using netlink > > > protocol in order to get device info instead of reading > > > them from proc filesystem. However the kernel reports device statistics > > > using

Re: [PATCH stable 4.9 00/29] backport of IP fragmentation fixes

2018-10-10 Thread Stephen Hemminger
On Tue, 9 Oct 2018 21:15:04 -0700 Florian Fainelli wrote: > > > > Strange, I do not see "ip: use rb trees for IP frag queue." in this list ? > > And it was not in Stephen's backport to 4.14 either, wait, looks like it > was somehow squashed into "net: sk_buff rbnode reorg". Stephen, was >

Re: [PATCH iproute 2/2] utils: fix get_rtnl_link_stats_rta stats parsing

2018-10-10 Thread Stephen Hemminger
On Wed, 10 Oct 2018 17:00:58 +0200 Lorenzo Bianconi wrote: > iproute2 walks through the list of available tunnels using netlink > protocol in order to get device info instead of reading > them from proc filesystem. However the kernel reports device statistics > using

Re: [sky2 driver] 88E8056 PCI-E Gigabit Ethernet Controller not working after suspend

2018-10-10 Thread Stephen Hemminger
On Wed, 10 Oct 2018 03:16:40 +0200 Laurent Bigonville wrote: > Le 9/10/18 à 22:09, Stephen Hemminger a écrit : > > On Tue, 9 Oct 2018 19:30:30 +0200 > > Laurent Bigonville wrote: > > > >> Hello, > >> > >> On my desktop (Asus MB with dual E

Re: [sky2 driver] 88E8056 PCI-E Gigabit Ethernet Controller not working after suspend

2018-10-09 Thread Stephen Hemminger
On Tue, 9 Oct 2018 19:30:30 +0200 Laurent Bigonville wrote: > Hello, > > On my desktop (Asus MB with dual Ethernet port), when waking up after > suspend, the network card is not detecting the link. > > I have to rmmod the sky2 driver and then modprobing it again. > > lspci shows me: > >

Re: [PATCH net-next] net: core: change bool members of struct net_device to bitfield members

2018-10-08 Thread Stephen Hemminger
On Mon, 8 Oct 2018 22:00:51 +0200 Heiner Kallweit wrote: > * > + * @uc_promisc:Counter that indicates promiscuous mode > + * has been enabled due to the need to listen to > + * additional unicast addresses in a device that > + * does

Re: [PATCH iproute2-next] tc: jsonify output of q_fifo

2018-10-04 Thread Stephen Hemminger
On Thu, 4 Oct 2018 17:08:34 -0700 Jakub Kicinski wrote: > Print limits correctly in JSON context. > > Signed-off-by: Jakub Kicinski > --- > tc/q_fifo.c | 9 ++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/tc/q_fifo.c b/tc/q_fifo.c > index

Re: [PATCH net] vxlan: use nla_put_flag for ttl inherit

2018-10-04 Thread Stephen Hemminger
On Fri, 28 Sep 2018 09:08:26 +0800 Hangbin Liu wrote: > Phil pointed out that there is a mismatch between vxlan and geneve ttl > inherit. > We should define it as a flag and use nla_put_flag to export this option. > > Fixes: 8fd780698745b ("vxlan: fill ttl inherit info") > Reported-by: Phil

Re: [PATCH iproute2] lib/libnetlink: fix response seq check

2018-10-03 Thread Stephen Hemminger
On Wed, 3 Oct 2018 16:01:40 -0700 Vlad Dumitrescu wrote: > Hi, > > On Fri, Sep 28, 2018 at 10:14 AM wrote: > > > > From: Vlad Dumitrescu > > > > Taking a one-iovec example, with rtnl->seq at 42. iovlen == 1, seq > > becomes 43 on line 604, and a message is sent with nlmsg_seq == 43. If > > a

Re: [PATCH RFC v2 net-next 00/25] rtnetlink: Add support for rigid checking of data in dump request

2018-10-03 Thread Stephen Hemminger
On Mon, 1 Oct 2018 17:28:26 -0700 David Ahern wrote: > How to resolve the problem of not breaking old userspace yet be able to > move forward with new features such as kernel side filtering which are > crucial for efficient operation at high scale? What about forward compatibility? How would

Re: [PATCH iproute2/net-next v2] tc_util: Add support for showing TCA_STATS_BASIC_HW statistics

2018-10-01 Thread Stephen Hemminger
On Mon, 01 Oct 2018 09:08:32 +0200 "Eelco Chaudron" wrote: > On 10 Aug 2018, at 16:48, Eelco Chaudron wrote: > > > On 10 Aug 2018, at 16:44, Stephen Hemminger wrote: > > > >> On Fri, 10 Aug 2018 07:59:30 -0400 > >> Eelco Chaudron wrote: >

Re: [PATCH iproute2-next 01/11] libnetlink: Convert GETADDR dumps to use rtnl_addrdump_req

2018-09-30 Thread Stephen Hemminger
On Sat, 29 Sep 2018 10:59:21 -0700 David Ahern wrote: > From: David Ahern > > Add rtnl_addrdump_req for address dumps using the proper ifaddrmsg > as the header. Convert existing RTM_GETADDR dumps to use it. > > Signed-off-by: David Ahern } > > +int rtnl_addrdump_req(struct rtnl_handle

Re: [PATCH][net-next] ipv6: drop container_of when convert dst to rt6_info

2018-09-30 Thread Stephen Hemminger
On Sun, 30 Sep 2018 13:02:52 +0800 Li RongQing wrote: > we can save container_of computation and return dst directly, > since dst is always first member of struct rt6_info > > Add a BUILD_BUG_ON() to catch any change that could break this > assertion. > > Signed-off-by: Li RongQing I don't
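
The argument in the quoted patch description can be sketched in plain C: when the embedded struct is the first member, container_of subtracts an offset of zero, so a direct cast gives the same pointer, and a compile-time check guards that assumption. The `demo_*` types and macro are hypothetical stand-ins, not the kernel's struct rt6_info or BUILD_BUG_ON(); `_Static_assert` (C11) plays the role of the build-time check here.

```c
#include <assert.h>
#include <stddef.h>

struct demo_dst {
    int refcnt;
};

struct demo_rt {
    struct demo_dst dst;   /* must remain the first member */
    int flags;
};

/* Simplified container_of: step back from the member to its container. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Build-time guard: compilation fails if dst is ever moved. */
_Static_assert(offsetof(struct demo_rt, dst) == 0,
               "dst must stay the first member of struct demo_rt");

static struct demo_rt *rt_via_container_of(struct demo_dst *dst)
{
    return demo_container_of(dst, struct demo_rt, dst);
}

static struct demo_rt *rt_via_cast(struct demo_dst *dst)
{
    /* Valid only because the static assertion above holds. */
    return (struct demo_rt *)dst;
}
```

Both helpers return the same address; the cast version simply relies on the layout instead of computing it.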

Re: [PATCH] hv_netvsc: remove ndo_poll_controller

2018-09-29 Thread Stephen Hemminger
On Sat, 29 Sep 2018 14:52:56 +0200 Stephen Hemminger wrote: > Similar to other patches from Eric. > > As diagnosed by Song Liu, ndo_poll_controller() can > be very dangerous on loaded hosts, since the cpu > calling ndo_poll_controller() might steal all NAPI > contexts (fo

[PATCH] hv_netvsc: remove ndo_poll_controller

2018-09-29 Thread Stephen Hemminger
is generally not able to drain all the queues under load. In netvsc driver it uses NAPI for TX completions. The default poll_napi will do this for us now and avoid the capture. Signed-off-by: Stephen Hemminger Cc: Haiyang Zhang Cc: Eric Dumazet --- drivers/net/hyperv/netvsc_drv.c | 23

Re: re iproute2 - don't return error on success fix

2018-09-27 Thread Stephen Hemminger
On Thu, 27 Sep 2018 15:22:41 +0300 Or Gerlitz wrote: > Something is still broken also after commit b45e300 "libnetlink: don't > return error on success" - when error is returned, the error code is > success.. > > $ tc filter add dev enp33s0f0 protocol ip parent : flower skip_sw > ip_flags

[PATCH] MAINTAINERS: change bridge maintainers

2018-09-27 Thread Stephen Hemminger
I have been doing only reviews, not active development, on bridge code for several years. Roopa and Nikolay have been doing most of the new features and have agreed to take over as new co-maintainers. Signed-off-by: Stephen Hemminger --- MAINTAINERS | 3 ++- 1 file changed, 2 insertions

Re: netlink: 16 bytes leftover after parsing attributes in process `ip'.

2018-09-26 Thread Stephen Hemminger
On Wed, 26 Sep 2018 08:54:43 -0600 David Ahern wrote: > On 9/25/18 11:51 PM, Jiri Benc wrote: > > On Tue, 25 Sep 2018 09:37:41 -0600, David Ahern wrote: > >> For ifaddrmsg ifa_flags aligns with ifi_type which is set by kernel side > >> so this should be ok. > > > > Does the existing user

Fw: [Bug 201233] New: e1000e

2018-09-26 Thread Stephen Hemminger
Begin forwarded message: Date: Tue, 25 Sep 2018 19:44:31 + From: bugzilla-dae...@bugzilla.kernel.org To: step...@networkplumber.org Subject: [Bug 201233] New: e1000e https://bugzilla.kernel.org/show_bug.cgi?id=201233 Bug ID: 201233 Summary: e1000e

Re: netlink: 16 bytes leftover after parsing attributes in process `ip'.

2018-09-25 Thread Stephen Hemminger
On Tue, 25 Sep 2018 14:34:08 +0200 Christian Brauner wrote: > On Tue, Sep 25, 2018, 14:07 Stephen Hemminger > wrote: > > > On Tue, 25 Sep 2018 11:49:10 +0200 > > Christian Brauner wrote: > > > > > On Mon, Sep 24, 2018 at 09:19:06PM -0600, David Ahern wrot

Re: netlink: 16 bytes leftover after parsing attributes in process `ip'.

2018-09-25 Thread Stephen Hemminger
On Tue, 25 Sep 2018 11:49:10 +0200 Christian Brauner wrote: > On Mon, Sep 24, 2018 at 09:19:06PM -0600, David Ahern wrote: > > On top of net-next I am seeing a dmesg error: > > > > netlink: 16 bytes leftover after parsing attributes in process `ip'. > > > > I traced it to address lists and

Re: [PATCH iproute2] iplink_vxlan: take into account preferred_family creating vxlan device

2018-09-25 Thread Stephen Hemminger
On Fri, 21 Sep 2018 15:34:25 +0200 Lorenzo Bianconi wrote: > Take into account the configured preferred_family if neither saddr or > daddr are provided since otherwise vxlan kernel module will use IPv4 as > default remote inet family neglecting the one provided by userspace. > This behaviour was

Re: [PATCH iproute2 1/1] Makefile: Add check target

2018-09-25 Thread Stephen Hemminger
On Fri, 21 Sep 2018 22:29:16 +0200 Petr Vorel wrote: > Signed-off-by: Petr Vorel Applied, thanks.

Re: [PATCH iproute2 v2 0/3] testsuite: make alltests fixes

2018-09-21 Thread Stephen Hemminger
On Thu, 20 Sep 2018 01:36:21 +0200 Petr Vorel wrote: > Hi, > > here are simple fixes to restore 'make alltests'. > Currently it does not run. > > Kind regards, > Petr > > Petr Vorel (3): > testsuite: Fix missing generate_nlmsg > testsuite: Generate generate_nlmsg when needed >

[PATCH iproute2] Makefile: add help target

2018-09-21 Thread Stephen Hemminger
Add help target to Makefile. Signed-off-by: Stephen Hemminger --- Makefile | 12 1 file changed, 12 insertions(+) diff --git a/Makefile b/Makefile index ea2f797c933f..25de3893cae4 100644 --- a/Makefile +++ b/Makefile @@ -71,6 +71,18 @@ all: config.mk for i in $(SUBDIRS

Re: [PATCH iproute2] iplink_vxlan: take into account preferred_family creating vxlan device

2018-09-21 Thread Stephen Hemminger
On Fri, 21 Sep 2018 15:34:25 +0200 Lorenzo Bianconi wrote: > Take into account the configured preferred_family if neither saddr or > daddr are provided since otherwise vxlan kernel module will use IPv4 as > default remote inet family neglecting the one provided by userspace. > This behaviour was

Re: [PATCH v2 0/2] hv_netvsc: associate VF and PV device by serial number

2018-09-20 Thread Stephen Hemminger
On Thu, 20 Sep 2018 15:18:20 +0100 Lorenzo Pieralisi wrote: > On Fri, Sep 14, 2018 at 12:54:55PM -0700, Stephen Hemminger wrote: > > The Hyper-V implementation of PCI controller has concept of 32 bit serial > > number > > (not to be confused with PCI-E serial number)

Re: Bridge connectivity interruptions while devices join or leave the bridge

2018-09-19 Thread Stephen Hemminger
On Wed, 19 Sep 2018 19:45:08 +0300 Ido Schimmel wrote: > On Wed, Sep 19, 2018 at 01:00:15PM +0200, Johannes Wienke wrote: > > This behavior of inheriting the mac address is really unexpected to us. > > Is it documented somewhere? > > Not that I'm aware, but it's a well established behavior.

Re: [PATCH iproute2] libnetlink: fix leak and using unused memory on error

2018-09-17 Thread Stephen Hemminger
On Thu, 13 Sep 2018 12:33:38 -0700 Stephen Hemminger wrote: > If an error happens in multi-segment message (tc only) > then report the error and stop processing further responses. > This also fixes referring to the buffer after free. > > The sequence check is not necessa

Re: [RFC PATCH iproute2-next] System specification health API

2018-09-16 Thread Stephen Hemminger
On Thu, 13 Sep 2018 10:36:04 -0700 Jakub Kicinski wrote: > On Thu, 13 Sep 2018 11:18:15 +0300, Eran Ben Elisha wrote: > > The health spec is targeted for Real Time Alerting, in order to know when > > something bad had happened to a PCI device > > By spec you mean some standards body spec you

Fw: [Bug 201137] New: using traffic control with sfq cause kernel crash

2018-09-15 Thread Stephen Hemminger
Begin forwarded message: Date: Sat, 15 Sep 2018 08:43:09 + From: bugzilla-dae...@bugzilla.kernel.org To: step...@networkplumber.org Subject: [Bug 201137] New: using traffic control with sfq cause kernel crash https://bugzilla.kernel.org/show_bug.cgi?id=201137 Bug ID: 201137

[PATCH v2 2/2] hv_netvsc: pair VF based on serial number

2018-09-14 Thread Stephen Hemminger
Matching network device based on MAC address is problematic since a non VF network device can be created with a duplicate MAC address causing confusion and problems. The VMBus API does provide a serial number that is a better matching method. Signed-off-by: Stephen Hemminger --- drivers/net

[PATCH v2 0/2] hv_netvsc: associate VF and PV device by serial number

2018-09-14 Thread Stephen Hemminger
together here for better review. The PCI changes were submitted previously, but the main review comment was "why do you need this?". This is why. v2 - slot name can be shorter. remove locking when creating pci_slots; see comment for explanation Stephen Hemminger (2): PCI: h

[PATCH v2 1/2] PCI: hv: support reporting serial number as slot information

2018-09-14 Thread Stephen Hemminger
sing GPU's. But the PCI slot infrastructure will handle that. This has a side effect which may also be useful. The common udev network device naming policy uses the slot information (rather than PCI address). Signed-off-by: Stephen Hemminger --- drivers/pci/controller/pci-hyp

Re: [PATCH net-next RFC 6/8] net: make gro configurable

2018-09-14 Thread Stephen Hemminger
On Fri, 14 Sep 2018 13:59:39 -0400 Willem de Bruijn wrote: > diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c > index e5d236595206..8cb8e02c8ab6 100644 > --- a/drivers/net/vxlan.c > +++ b/drivers/net/vxlan.c > @@ -572,6 +572,7 @@ static struct sk_buff *vxlan_gro_receive(struct sock *sk, >

[PATCH iproute2] libnetlink: fix leak and using unused memory on error

2018-09-13 Thread Stephen Hemminger
of the sequence number of the iov. Reported-by: Mahesh Bandewar Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic") Signed-off-by: Stephen Hemminger --- lib/libnetlink.c | 23 +-- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/lib/libnetlink.c b/lib/libnetli

Re: [PATCH iproute2] iproute2: fix use-after-free

2018-09-13 Thread Stephen Hemminger
On Wed, 12 Sep 2018 23:07:20 -0700 Mahesh Bandewar (महेश बंडेवार) wrote: > On Wed, Sep 12, 2018 at 5:33 PM, Stephen Hemminger > wrote: > > > > On Wed, 12 Sep 2018 16:29:28 -0700 > > Mahesh Bandewar wrote: > > > > > From: Mahesh Bandewar > > >

[PATCH] hv_netvsc: fix schedule in RCU context

2018-09-13 Thread Stephen Hemminger
ting RTNL earlier. This is safe because the subchannel work queue does trylock on RTNL and will detect the race. Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic") Signed-off-by: Stephen Hemminger --- drivers/net/hyperv/netvsc_drv.c | 9 +++-- 1 file changed, 3 insertions(+), 6 delet

[PATCH v3 09/30] inet: frags: use rhashtables for reassembly units

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet Some applications still rely on IP fragmentation, and to be fair linux reassembly unit is not working under any serious load. It uses static hash tables of 1024 buckets, and up to 128 items per bucket (!!!) A work queue is supposed to garbage collect items when host is under

[PATCH v3 08/30] rhashtable: add schedule points

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet Rehashing and destroying large hash table takes a lot of time, and happens in process context. It is safe to add cond_resched() in rhashtable_rehash_table() and rhashtable_free_and_destroy() Signed-off-by: Eric Dumazet Acked-by: Herbert Xu Signed-off-by: David S. Miller

[PATCH v3 18/30] inet: frags: get rid of ipfrag_skb_cb/FRAG_CB

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet ip_defrag uses skb->cb[] to store the fragment offset, and unfortunately this integer is currently in a different cache line than skb->next, meaning that we use two cache lines per skb when finding the insertion point. By aliasing skb->ip_defrag_offset and skb->dev, we pack

[PATCH v3 14/30] inet: frags: do not clone skb in ip_expire()

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet An skb_clone() was added in commit ec4fbd64751d ("inet: frag: release spinlock before calling icmp_send()") While fixing the bug at that time, it also added a very high cost for DDOS frags, as the ICMP rate limit is applied after this expensive operation (skb_clone() +

[PATCH v3 02/30] inet: frags: add a pointer to struct netns_frags

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet In order to simplify the API, add a pointer to struct inet_frags. This will allow us to make things less complex. These functions no longer have a struct inet_frags parameter : inet_frag_destroy(struct inet_frag_queue *q /*, struct inet_frags *f */) inet_frag_put(struct

[PATCH v3 22/30] net: modify skb_rbtree_purge to return the truesize of all purged skbs.

2018-09-13 Thread Stephen Hemminger
From: Peter Oskolkov Tested: see the next patch in the series. Suggested-by: Eric Dumazet Signed-off-by: Peter Oskolkov Signed-off-by: Eric Dumazet Cc: Florian Westphal Signed-off-by: David S. Miller (cherry picked from commit 385114dec8a49b5e5945e77ba7de6356106713f4) ---

[PATCH v3 19/30] inet: frags: fix ip6frag_low_thresh boundary

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet Giving an integer to proc_doulongvec_minmax() is dangerous on 64bit arches, since linker might place next to it a non zero value preventing a change to ip6frag_low_thresh. ip6frag_low_thresh is not used anymore in the kernel, but we do not want to prematurely break user

[PATCH v3 12/30] inet: frags: remove inet_frag_maybe_warn_overflow()

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet This function is obsolete, after rhashtable addition to inet defrag. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller (cherry picked from commit 2d44ed22e607f9a285b049de2263e3840673a260) --- include/net/inet_frag.h | 2 --

[PATCH v3 04/30] inet: frags: Convert timers to use timer_setup()

2018-09-13 Thread Stephen Hemminger
From: Kees Cook In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Alexander Aring Cc: Stefan Schmidt Cc: "David S. Miller" Cc: Alexey Kuznetsov Cc:

[PATCH v3 23/30] ipv6: defrag: drop non-last frags smaller than min mtu

2018-09-13 Thread Stephen Hemminger
From: Florian Westphal don't bother with pathological cases, they only waste cycles. IPv6 requires a minimum MTU of 1280 so we should never see fragments smaller than this (except last frag). v3: don't use awkward "-offset + len" v2: drop IPv4 part, which added same check w. IPV4_MIN_MTU (68).
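
The rule described above can be sketched as a small predicate: a fragment shorter than the IPv6 minimum MTU of 1280 is dropped unless it is the last one. The names below are hypothetical, and which length the kernel actually compares (whole packet or payload) is not visible in the truncated snippet, so `frag_len` is an assumption of this sketch.

```c
#include <assert.h>

#define DEMO_IPV6_MIN_MTU 1280u   /* minimum MTU required by IPv6 */

/*
 * Returns 1 if the fragment should be dropped, 0 otherwise.
 * frag_len: length of the received fragment (assumed unit: bytes).
 * is_last:  nonzero when no further fragments follow this one.
 */
static int demo_drop_small_fragment(unsigned int frag_len, int is_last)
{
    return !is_last && frag_len < DEMO_IPV6_MIN_MTU;
}
```

A short final fragment is legitimate because it carries whatever remains of the datagram; a short non-final fragment means the sender fragmented below the minimum MTU.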

[PATCH v3 25/30] net: add rb_to_skb() and other rb tree helpers

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet Generalize private netem_rb_to_skb() TCP rtx queue will soon be converted to rb-tree, so we will need skb_rbtree_walk() helpers. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller (cherry picked from commit 18a4c0eab2623cc95be98a1e6af1ad18e7695977) ---

[PATCH v3 30/30] ip: frags: fix crash in ip_do_fragment()

2018-09-13 Thread Stephen Hemminger
From: Taehee Yoo commit 5d407b071dc369c26a38398326ee2be53651cfe4 upstream A kernel crash occurs when defragmented packet is fragmented in ip_do_fragment(). In defragment routine, skb_orphan() is called and skb->ip_defrag_offset is set. but skb->sk and skb->ip_defrag_offset are same union

[PATCH v3 28/30] ip: add helpers to process in-order fragments faster.

2018-09-13 Thread Stephen Hemminger
From: Peter Oskolkov This patch introduces several helper functions/macros that will be used in the follow-up patch. No runtime changes yet. The new logic (fully implemented in the second patch) is as follows: * Nodes in the rb-tree will now contain not single fragments, but lists of

[PATCH v3 29/30] ip: process in-order fragments efficiently

2018-09-13 Thread Stephen Hemminger
From: Peter Oskolkov This patch changes the runtime behavior of IP defrag queue: incoming in-order fragments are added to the end of the current list/"run" of in-order fragments at the tail. On some workloads, UDP stream performance is substantially improved: RX: ./udp_stream -F 10 -T 2 -l 60

[PATCH v3 27/30] ipv4: frags: precedence bug in ip_expire()

2018-09-13 Thread Stephen Hemminger
From: Dan Carpenter We accidentally removed the parentheses here, but they are required because '!' has higher precedence than '&'. Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.") Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller (cherry picked from commit
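
The precedence issue named above can be reproduced in a few lines. Because unary `!` binds tighter than binary `&`, `!flags & MASK` is parsed as `(!flags) & MASK`. The flag name and value are hypothetical and chosen for illustration; this is not the kernel's ip_expire() code.

```c
#include <assert.h>

#define DEMO_FLAG_FIRST_IN 0x2u   /* hypothetical flag bit */

/*
 * Buggy form: parsed as (!flags) & DEMO_FLAG_FIRST_IN.
 * !flags is 0 or 1, and neither value has bit 0x2 set,
 * so with this mask the result is always 0.
 */
static int first_in_missing_buggy(unsigned int flags)
{
    return !flags & DEMO_FLAG_FIRST_IN;
}

/* Intended form: test the bit first, then negate the result. */
static int first_in_missing_fixed(unsigned int flags)
{
    return !(flags & DEMO_FLAG_FIRST_IN);
}
```

With the parentheses restored, the function reports 1 exactly when the flag bit is clear, regardless of which other bits are set.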

[PATCH v3 26/30] net: sk_buff rbnode reorg

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet commit bffa72cf7f9df842f0016ba03586039296b4caaf upstream skb->rbnode shares space with skb->next, skb->prev and skb->tstamp Current uses (TCP receive ofo queue and netem) need to save/restore tstamp, while skb->dev is either NULL (TCP) or a constant for a given queue

[PATCH v3 21/30] net: speed up skb_rbtree_purge()

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet As measured in my prior patch ("sch_netem: faster rb tree removal"), rbtree_postorder_for_each_entry_safe() is nice looking but much slower than using rb_next() directly, except when tree is small enough to fit in CPU caches (then the cost is the same) Also note that there is

[PATCH v3 24/30] net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet After working on IP defragmentation lately, I found that some large packets defeat CHECKSUM_COMPLETE optimization because of NIC adding zero paddings on the last (small) fragment. While removing the padding with pskb_trim_rcsum(), we set skb->ip_summed to CHECKSUM_NONE,

[PATCH v3 20/30] ip: discard IPv4 datagrams with overlapping segments.

2018-09-13 Thread Stephen Hemminger
Signed-off-by: Peter Oskolkov Signed-off-by: Eric Dumazet Cc: Florian Westphal Acked-by: Stephen Hemminger Signed-off-by: David S. Miller (cherry picked from commit 7969e5c40dfd04799d4341f1b7cd266b6e47f227) --- include/uapi/linux/snmp.h | 1 + net/ipv4/ip_fragment.c| 75

[PATCH v3 16/30] rhashtable: reorganize struct rhashtable layout

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet While under frags DDOS I noticed unfortunate false sharing between @nelems and @params.automatic_shrinking Move @nelems at the end of struct rhashtable so that first cache line is shared between all cpus, because almost never dirtied. Signed-off-by: Eric Dumazet

[PATCH v3 17/30] inet: frags: reorganize struct netns_frags

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet Put the read-mostly fields in a separate cache line at the beginning of struct netns_frags, to reduce false sharing noticed in inet_frag_kill() Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller (cherry picked from commit c2615cf5a761b32bf74e85bddc223dfff3d9b9f0)

[PATCH v3 15/30] ipv6: frags: rewrite ip6_expire_frag_queue()

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet Make it similar to IPv4 ip_expire(), and release the lock before calling icmp functions. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller (cherry picked from commit 05c0b86b9696802fd0ce5676a92a63f1b455bdf3) --- net/ipv6/reassembly.c | 24

[PATCH v3 11/30] inet: frags: get rif of inet_frag_evicting()

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet This refactors ip_expire() since one indentation level is removed. Note: in the future, we should try hard to avoid the skb_clone() since this is a serious performance cost. Under DDOS, the ICMP message won't be sent because of rate limits. Fact that ip6_expire_frag_queue()

[PATCH v3 13/30] inet: frags: break the 2GB limit for frags storage

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet Some users are willing to provision huge amounts of memory to be able to perform reassembly reasonably well under pressure. Current memory tracking is using one atomic_t and integers. Switch to atomic_long_t so that 64bit arches can use more than 2GB, without any cost for
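
The 2GB figure follows from the range of a 32-bit signed counter. A minimal sketch, assuming a platform where int is 32 bits: it only checks which byte counts each type can represent and does not use the kernel's atomic_t or atomic_long_t. The helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Can this byte count be held in a plain int counter? */
static int demo_fits_in_int(long long bytes)
{
    return bytes >= INT_MIN && bytes <= INT_MAX;
}

/* Can this byte count be held in a long counter? */
static int demo_fits_in_long(long long bytes)
{
    return bytes >= LONG_MIN && bytes <= LONG_MAX;
}
```

For 3 GiB (3LL << 30), the int check fails, while the long check succeeds wherever long is 64 bits wide, which matches the "64bit arches" wording in the message above.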

[PATCH v3 10/30] inet: frags: remove some helpers

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet Remove sum_frag_mem_limit(), ip_frag_mem() & ip6_frag_mem() Also since we use rhashtable we can bring back the number of fragments in "grep FRAG /proc/net/sockstat /proc/net/sockstat6" that was removed in commit 434d305405ab ("inet: frag: don't account number of fragment

[PATCH v3 07/30] ipv6: export ip6 fragments sysctl to unprivileged users

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet IPv4 was changed in commit 52a773d645e9 ("net: Export ip fragment sysctl to unprivileged users") The only sysctl that is not per-netns is not used : ip6frag_secret_interval Signed-off-by: Eric Dumazet Cc: Nikolay Borisov Signed-off-by: David S. Miller (cherry picked from

[PATCH v3 06/30] inet: frags: refactor lowpan_net_frag_init()

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet We want to call lowpan_net_frag_init() earlier. Similar to commit "inet: frags: refactor ipv6_frag_init()" This is a prereq to "inet: frags: use rhashtables for reassembly units" Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller (cherry picked from commit

[PATCH v3 00/30] backport of IP fragmentation fixes

2018-09-13 Thread Stephen Hemminger
Took the set of patches from 4.19 to handle IP fragmentation DoS and applied them against 4.14.69. Most of these are from Eric. In a couple case, it required some manual merge conflict resolution. Tested normal IP fragmentation with iperf3 and malicious IP fragments with fragmentsmack. Under

[PATCH v3 05/30] inet: frags: refactor ipv6_frag_init()

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet We want to call inet_frags_init() earlier. This is a prereq to "inet: frags: use rhashtables for reassembly units" Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller (cherry picked from commit 5b975bab23615cd0fdf67af6c9298eb01c4b9f61) --- net/ipv6/reassembly.c |

[PATCH v3 03/30] inet: frags: refactor ipfrag_init()

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet We need to call inet_frags_init() before register_pernet_subsys(), as a prereq for following patch ("inet: frags: use rhashtables for reassembly units") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller (cherry picked from commit

[PATCH v3 01/30] inet: frags: change inet_frags_init_net() return value

2018-09-13 Thread Stephen Hemminger
From: Eric Dumazet We will soon initialize one rhashtable per struct netns_frags in inet_frags_init_net(). This patch changes the return value to eventually propagate an error. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller (cherry picked from commit

Re: [PATCH iproute2] iproute2: fix use-after-free

2018-09-12 Thread Stephen Hemminger
On Wed, 12 Sep 2018 16:29:28 -0700 Mahesh Bandewar wrote: > From: Mahesh Bandewar > > A local program using iproute2 lib pointed out the issue and looking > at the code it is pretty obvious - > > a = (struct nlmsghdr *)b; > ... > free(b); > if (a->nlmsg_seq == seq) > ... >
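
The pattern quoted in this message (alias a buffer, free it, then read through the alias) can be avoided by copying the needed field out before the free. A minimal sketch of that idea, not the actual iproute2 fix: struct msg_hdr is a hypothetical stand-in for struct nlmsghdr, and both helper functions are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nlmsghdr; only the fields used here. */
struct msg_hdr {
    uint32_t len;
    uint32_t seq;
};

/* Allocate a heap buffer holding one header with the given sequence number. */
char *make_msg(uint32_t seq)
{
    struct msg_hdr *h = malloc(sizeof(*h));

    if (!h)
        return NULL;
    h->len = sizeof(*h);
    h->seq = seq;
    return (char *)h;
}

/*
 * Takes ownership of buf and frees it.
 * Returns 1 if the header's sequence number equals seq, 0 otherwise.
 */
int match_seq_and_free(char *buf, uint32_t seq)
{
    struct msg_hdr *h;
    uint32_t got;

    assert(buf != NULL);
    h = (struct msg_hdr *)buf;
    got = h->seq;   /* copy the field while the buffer is still valid */
    free(buf);      /* h now dangles and must not be dereferenced */
    return got == seq;
}
```

The comparison uses the saved copy, so nothing is read through the freed pointer.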

Re: [PATCH iproute2] q_cake: Add printing of no-split-gso option

2018-09-12 Thread Stephen Hemminger
On Wed, 12 Sep 2018 00:32:16 +0200 Toke Høiland-Jørgensen wrote: > When the GSO splitting was turned into dual split-gso/no-split-gso options, > the printing of the latter was left out. Add that, so output is consistent > with the options passed. > > Signed-off-by: Toke Høiland-Jørgensen

Re: [PATCHv3 iproute2] bridge/mdb: fix missing new line when show bridge mdb

2018-09-11 Thread Stephen Hemminger
On Tue, 11 Sep 2018 22:04:50 +0800 Hangbin Liu wrote: > The bridge mdb show is broken on current iproute2. e.g. > ]# bridge mdb show > 34: br0 veth0_br 224.1.1.2 temp 34: br0 veth0_br 224.1.1.1 temp > > After fix: > ]# bridge mdb show > 34: br0 veth0_br 224.1.1.2 temp > 34: br0

Re: [PATCHv2 iproute2] bridge/mdb: fix missing new line when show bridge mdb

2018-09-11 Thread Stephen Hemminger
On Tue, 11 Sep 2018 09:26:35 +0800 Hangbin Liu wrote: > + if (!is_json_context() && !show_stats) > + print_string(PRINT_FP, NULL, "\n", NULL); I just added print_nl to json_print which does what you want.

Re: [PATCH net-next 02/15] sch_netem: Move private queue handler to generic location.

2018-09-10 Thread Stephen Hemminger
On Sat, 08 Sep 2018 13:10:01 -0700 (PDT) David Miller wrote: > By hand copies of SKB list handlers do not belong in individual packet > schedulers. > > Signed-off-by: David S. Miller Thanks for cleaning this up. Signed-off-by: Stephen Hemminger

Re: [iproute PATCH v2] ip-route: Fix segfault with many nexthops

2018-09-10 Thread Stephen Hemminger
On Thu, 6 Sep 2018 15:31:51 +0200 Phil Sutter wrote: > It was possible to crash ip-route by adding an IPv6 route with 37 > nexthop statements. A simple reproducer is: > > | for i in `seq 37`; do > | nhs="nexthop via ::$i "$nhs > | done > | ip -6 route add ::/64 $nhs > > The

Re: [PATCH iproute2 v2] tc/mqprio: Print extra info on invalid args.

2018-09-10 Thread Stephen Hemminger
On Thu, 6 Sep 2018 14:01:17 -0700 Caleb Raitto wrote: > From: Caleb Raitto > > Print the name of the argument that wasn't understood. > > Signed-off-by: Caleb Raitto That is simpler, thanks. Applied

Fw: [Bug 201063] New: kernel panic on heavy network use

2018-09-10 Thread Stephen Hemminger
Begin forwarded message: Date: Sun, 09 Sep 2018 13:45:28 + From: bugzilla-dae...@bugzilla.kernel.org To: step...@networkplumber.org Subject: [Bug 201063] New: kernel panic on heavy network use https://bugzilla.kernel.org/show_bug.cgi?id=201063 Bug ID: 201063

Fw: [Bug 201071] New: Creating a vxlan in state 'up' does not give proper RTM_NEWLINK message

2018-09-10 Thread Stephen Hemminger
Begin forwarded message: Date: Mon, 10 Sep 2018 04:04:37 + From: bugzilla-dae...@bugzilla.kernel.org To: step...@networkplumber.org Subject: [Bug 201071] New: Creating a vxlan in state 'up' does not give proper RTM_NEWLINK message https://bugzilla.kernel.org/show_bug.cgi?id=201071

[PATCH iproute2 3/3] bridge: fix vlan show formatting

2018-09-06 Thread Stephen Hemminger
The output of vlan show was broken by a previous change to use json_print. Clean the code up and return to original format. Note: the JSON syntax has changed to make the bridge vlan show more like other outputs (e.g. ip -j li show). Signed-off-by: Stephen Hemminger --- bridge/br_common.h | 2

[PATCH iproute2 2/3] bridge: use print_json for some outputs

2018-09-06 Thread Stephen Hemminger
Rather than using is_json_context(), use the print_string functions which handle both cases. Signed-off-by: Stephen Hemminger --- bridge/mdb.c | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/bridge/mdb.c b/bridge/mdb.c index 9bdef0262c54..cc1b4547865c 100644

[PATCH iproute2 1/3] bridge: minor change to mdb print

2018-09-06 Thread Stephen Hemminger
Get port ifname once rather than on both sides of if(is_json_context). Signed-off-by: Stephen Hemminger --- bridge/mdb.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/bridge/mdb.c b/bridge/mdb.c index f38dc67c849a..9bdef0262c54 100644 --- a/bridge/mdb.c +++ b/bridge
