Re: [PATCH] net/mlx5: allocate enough space in

2018-10-21 Thread Dan Carpenter
On Sun, Oct 21, 2018 at 01:56:26PM +0300, Or Gerlitz wrote:
> I will re-post your patch, this time to netdev since the original
> commit is there
> and so should be the fix, thanks for reporting/fixing!

I didn't realize it had been posted to netdev already so I deliberately
left that off the CC.  If Dave hasn't applied the original (he probably
has now because he is so quick) then it's fine by me if you fold them
together.

regards,
dan carpenter



Re: [rtnetlink] Potential bug in Linux (rt)netlink code

2018-10-21 Thread Henning Rogge

Does anyone else have an idea how to debug this problem?

Henning Rogge

On 15.10.2018 at 07:25, Henning Rogge wrote:

On 12.10.2018 at 20:51, Stephen Hemminger wrote:

On Fri, 12 Oct 2018 09:30:40 +0200
Henning Rogge  wrote:


Hi,

I am working on a self-written routing agent
(https://github.com/OLSR/OONF) and am stuck on a problem with netlink
that I cannot explain with a userspace error.

I am using a netlink socket for setting routes
(RTM_NEWROUTE/RTM_DELROUTE), querying the kernel for the current routes
in the database (via a RTM_GETROUTE dump) and for getting multicast
messages for ongoing routing changes.

After a few netlink messages I get to the point where the kernel just
does not respond to an RTM_NEWROUTE (no error, no answer, despite the
NLM_F_ACK flag being set)... but sometimes, when the program sends another
route command during shutdown of the routing agent (most times an
RTM_DELROUTE), I get a single netlink packet with a "successful" response
for both the "missing" RTM_NEWROUTE and the new RTM_DELROUTE sequence
number.
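
One way to make the symptom concrete is to match every NLMSG_ERROR ack
against the sequence number it acknowledges. The following is a minimal
userspace sketch, not code from OONF; find_ack() is a hypothetical helper
name. It walks one receive buffer, which may hold several acks (as in the
case above, where two arrived in a single packet), and reports the status
for the requested sequence number.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Scan a buffer filled by recv() on a netlink socket and look for the
 * NLMSG_ERROR message that acknowledges request `seq`.
 *
 * Returns 1 and stores the kernel's status in *status (0 means success,
 * a negative errno means failure) if the ack is present, 0 otherwise.
 */
static int find_ack(void *buf, size_t len, unsigned int seq, int *status)
{
	struct nlmsghdr *nlh = buf;
	int remaining = (int)len;

	assert(buf != NULL && status != NULL);

	for (; NLMSG_OK(nlh, remaining); nlh = NLMSG_NEXT(nlh, remaining)) {
		if (nlh->nlmsg_type != NLMSG_ERROR || nlh->nlmsg_seq != seq)
			continue;
		/* An ack must be large enough to hold struct nlmsgerr. */
		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(struct nlmsgerr)))
			return 0;
		*status = ((struct nlmsgerr *)NLMSG_DATA(nlh))->error;
		return 1;
	}
	return 0;
}
```

Logging the sequence numbers that never produce a hit would show which
request was left unanswered and when its ack finally arrives.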

I am testing two routing agents, each of them in a systemd-nspawn based
container connected over a bridge on the host system on a current Debian
Testing (kernel 4.18.0-1-amd64).

I am directly using the netlink sockets, without any other userspace
library in between.

I have checked the hexdumps of a couple of netlink messages (including
the ones just before the bug happens) by hand and they seem to be okay.

When I tried to add a "netlink listener" socket for further debugging (ip
link add nlmon0 type nlmon) the problem vanished until I removed the
listener socket again.

Any ideas how to debug this problem? Unfortunately I have no short
example program to trigger the bug... I have seen the problem only rarely
over the years (once every couple of months), and until a few days ago I
never managed to reproduce it.

Henning Rogge


Are you reading the responses to your requests?  If you don't read
the response, the socket will get flow blocked.


Yes, I do...

all netlink sockets the program uses are constantly watched for traffic 
coming from the kernel (with an epoll()-based event loop, no edge-trigger).


I even have a rate limitation towards the kernel, only sending a 
"pagesize" full of netlink data towards the kernel, then waiting for the 
reply before sending more (I had the blocking problem a few years ago 
when experimenting with LOTS of routes).


Henning Rogge


Henning Rogge
--
Diplom-Informatiker Henning Rogge , Fraunhofer-Institut für
Kommunikation, Informationsverarbeitung und Ergonomie FKIE
Kommunikationssysteme (KOM)
Zanderstrasse 5, 53177 Bonn, Germany
Phone +49 228 50212-469
mailto:henning.ro...@fkie.fraunhofer.de http://www.fkie.fraunhofer.de


RE: [PATCH v2 1/2] dt-bindings: net: add MDIO bus multiplexer driven by a regmap device

2018-10-21 Thread Pankaj Bansal

> -Original Message-
> From: Pankaj Bansal
> Sent: Thursday, October 18, 2018 10:00 AM
> To: Florian Fainelli ; Andrew Lunn 
> Cc: netdev@vger.kernel.org
> Subject: RE: [PATCH v2 1/2] dt-bindings: net: add MDIO bus multiplexer driven 
> by
> a regmap device
> 
> Hi Florian
> 
> > -Original Message-
> > From: Florian Fainelli [mailto:f.faine...@gmail.com]
> > Sent: Sunday, October 7, 2018 11:32 PM
> > To: Pankaj Bansal ; Andrew Lunn
> > 
> > Cc: netdev@vger.kernel.org
> > Subject: Re: [PATCH v2 1/2] dt-bindings: net: add MDIO bus multiplexer
> > driven by a regmap device
> >
> >
> >
> > On 10/07/18 11:24, Pankaj Bansal wrote:
> > > Add support for an MDIO bus multiplexer controlled by a regmap
> > > device, like an FPGA.
> > >
> > > Tested on a NXP LX2160AQDS board which uses the "QIXIS" FPGA
> > > attached to the i2c bus.
> > >
> > > Signed-off-by: Pankaj Bansal 
> > > ---
> > >
> > > Notes:
> > > V2:
> > >  - Fixed formatting error caused by using space instead of tab
> > >  - Add newline between property list and subnode
> > >  - Add newline between two subnodes
> > >
> > >  .../bindings/net/mdio-mux-regmap.txt | 95 ++
> > >  1 file changed, 95 insertions(+)
> > >
> > > diff --git
> > > a/Documentation/devicetree/bindings/net/mdio-mux-regmap.txt
> > > b/Documentation/devicetree/bindings/net/mdio-mux-regmap.txt
> > > new file mode 100644
> > > index ..88ebac26c1c5
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/net/mdio-mux-regmap.txt
> > > @@ -0,0 +1,95 @@
> > > +Properties for an MDIO bus multiplexer controlled by a regmap
> > > +
> > > +This is a special case of a MDIO bus multiplexer.  A regmap device,
> > > +like an FPGA, is used to control which child bus is connected.  The
> > > +mdio-mux node must be a child of the device that is controlled by a
> regmap.
> > > +The driver currently only supports devices with upto 32-bit registers.
> >
> > I would omit any sort of details about Linux constructs designed to
> > support specific needs (e.g: regmap) as well as putting driver
> > limitations into a binding document because it really ought to be 
> > restricted to
> describing hardware.
> >
> 
> Actually the platform driver mdio-mux-regmap.c, is generalization of mdio-mux-
> mmioreg.c As such, it doesn't describe any new hardware, so no new properties
> are described by it.
> The only new property is compatible field.
> I don't know how to describe this driver otherwise.  Can you please help?

I thought about this further. mdio-mux-regmap.c is not meant to control a
specific device; it is meant to control some registers of its parent device.
Therefore, IMO this should not be a platform driver, and there should not be
any "compatible" property associated with it.

Rather, this driver should expose APIs that are called by the parent
device's driver.

What are your thoughts on this?

> 
> > > +
> > > +Required properties in addition to the generic multiplexer properties:
> > > +
> > > +- compatible : string, must contain "mdio-mux-regmap"
> > > +
> > > +- reg : integer, contains the offset of the register that controls the 
> > > bus
> > > + multiplexer. it can be 32 bit number.
> >
> > Technically it could be any "reg" property size, the only requirement
> > is that it must be of the appropriate size with respect to what the
> > parent node of this "mdio-mux-regmap" node has, determined by #address-
> cells/#size-cells width.
> 
> We are reading only single cell of this property using "of_property_read_u32".
> That is why I put the size in this.
> 
> >
> > > +
> > > +- mux-mask : integer, contains an 32 bit mask that specifies which
> > > + bits in the register control the actual bus multiplexer.  The
> > > + 'reg' property of each child mdio-mux node must be constrained by
> > > + this mask.
> >
> > Same thing here.
> 
> We are reading only single cell of this property using "of_property_read_u32".
> That is why I put the size in this.
> 
> >
> > Since this is a MDIO mux, I would invite you to specify what the
> > correct #address-cells/#size-cells values should be (1, and 0
> > respectively as your example correctly shows).
> >
> 
> I will mention #address-cells/#size-cells values
> 
> > > +
> > > +Example:
> > > +
> > > +The FPGA node defines a i2c connected FPGA with a register space of
> > > +0x30
> > bytes.
> > > +For the "EMI2" MDIO bus, register 0x54 (BRDCFG4) controls the mux
> > > +on that
> > bus.
> > > +A bitmask of 0x07 means that bits 0, 1 and 2 (bit 0 is lsb) are the
> > > +bits on
> > > +BRDCFG4 that control the actual mux.
> > > +
> > > +i2c@200 {
> > > + compatible = "fsl,vf610-i2c";
> > > + #address-cells = <1>;
> > > + #size-cells = <0>;
> > > + reg = <0x0 0x200 0x0 0x1>;
> > > + interrupts = <0 34 0x4>; // Level high type
> > > + clock-names = "i2c";
> > > + clocks = < 4 7>;
> > > + fsl-scl-gpio = < 15 0>;
> > > + status = "okay";
> > > +
> > > + /* The FPGA node */
> 

Re: pull-request: bpf-next 2018-10-21

2018-10-21 Thread David Miller
From: Daniel Borkmann 
Date: Sun, 21 Oct 2018 21:24:26 +0200

> The following pull-request contains BPF updates for your *net-next* tree.
> 
> The main changes are:
 ...
> Please consider pulling these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

Pulled, thanks Daniel.


RE: [PATCH] net: ethernet:fec: Consistently use SPEED_ prefix

2018-10-21 Thread Andy Duan
From: netdev-ow...@vger.kernel.org 
> All other calls to phy_set_max_speed() use the SPEED_ prefix. Make the
> FEC driver follow this common pattern. This makes no different to
> generated code since SPEED_1000 is 1000, and SPEED_100 is 100.
> 

Please also note in the commit message which commit introduced this: 58056c1e1b0e
("net: ethernet: Use phy_set_max_speed() to limit advertised speed")

Andy
> Reported-by: Corentin Labbe 
> Signed-off-by: Andrew Lunn 
> ---
>  drivers/net/ethernet/freescale/fec_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/fec_main.c
> b/drivers/net/ethernet/freescale/fec_main.c
> index 6db69ba30dcd..b067eaf8b792 100644
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -1948,7 +1948,7 @@ static int fec_enet_mii_probe(struct
> net_device *ndev)
> 
>   /* mask with MAC supported features */
>   if (fep->quirks & FEC_QUIRK_HAS_GBIT) {
> - phy_set_max_speed(phy_dev, 1000);
> + phy_set_max_speed(phy_dev, SPEED_1000);
>   phy_remove_link_mode(phy_dev,
>ETHTOOL_LINK_MODE_1000baseT_Half_BIT);
>  #if !defined(CONFIG_M5272)
> @@ -1956,7 +1956,7 @@ static int fec_enet_mii_probe(struct
> net_device *ndev)  #endif
>   }
>   else
> - phy_set_max_speed(phy_dev, 100);
> + phy_set_max_speed(phy_dev, SPEED_100);
> 
>   fep->link = 0;
>   fep->full_duplex = 0;
> --
> 2.19.0



Re: [PATCH net] r8169: fix NAPI handling under high load

2018-10-21 Thread Jonathan Woithe
On Fri, 19 Oct 2018 17:59:21 +1030, Jonathan Woithe wrote:
> On 10/18/18 08:15, Jonathan Woithe wrote:
> > On Thu, Oct 18, 2018 at 08:03:32AM +0200, Heiner Kallweit wrote:
> > > Proposed fix is here:
> > > https://patchwork.ozlabs.org/patch/985014/
> > > Would be good if you could test it. Thanks!
> > 
> > I should be able to do so tomorrow.  Which kernel would you like me to apply
> > the patch to?
>
> It turns out I couldn't compile 4.18.15 conveniently ...
> I could compile 4.14 though - I had previously confirmed that the problem
> was still seen with that kernel.  I therefore applied the fix to 4.14 (which
> was trivial) and started a test with that.  The system has been running
> without exhibiting the effects of the bug for over five hours.  With an
> unpatched 4.14, symptoms were typically seen within 30 minutes.
> 
> I will leave the system to operate over the weekend to be sure, but at this
> stage it seems likely that the patch will resolve this long-standing
> difficulty that we've experienced.  I will report back on Monday.

Our test system has now been running for just under 3 days (69 hours) and
there has not been any incidence of the problem with this patch applied to
the 4.14 r8169 driver.  Without the patch, multiple problems are logged
within 30 minutes.

Given this, I would conclude that this patch fixes the problem for us. 
Although belated (since the patch has already been accepted):

  Tested-by: Jonathan Woithe 

Thank you very much for your work which led to the patch!  It means we can
finally provide an upgrade path for systems in the field which are equipped
with the r8169 hardware.

For reference, the problem being tested on our systems is the one discussed
in the "r8169 regression: UDP packets dropped intermittantly" thread on this
mailing list.

Regards
  jonathan


Re: [PATCH bpf-next] selftests/bpf: enable (uncomment) all tests in test_libbpf.sh

2018-10-21 Thread Jesper Dangaard Brouer
On Sun, 21 Oct 2018 16:37:08 +0100
Quentin Monnet  wrote:

> 2018-10-21 11:57 UTC+0200 ~ Jesper Dangaard Brouer 
> > On Sat, 20 Oct 2018 23:00:24 +0100
> > Quentin Monnet  wrote:
> >   
> 
> [...]
> 
> >> --- a/tools/testing/selftests/bpf/test_libbpf.sh
> >> +++ b/tools/testing/selftests/bpf/test_libbpf.sh
> >> @@ -33,17 +33,11 @@ trap exit_handler 0 2 3 6 9
> >>   
> >>   libbpf_open_file test_l4lb.o
> >>   
> >> -# TODO: fix libbpf to load noinline functions
> >> -# [warning] libbpf: incorrect bpf_call opcode
> >> -#libbpf_open_file test_l4lb_noinline.o
> >> +libbpf_open_file test_l4lb_noinline.o
> >>   
> >> -# TODO: fix test_xdp_meta.c to load with libbpf
> >> -# [warning] libbpf: test_xdp_meta.o doesn't provide kernel version
> >> -#libbpf_open_file test_xdp_meta.o
> >> +libbpf_open_file test_xdp_meta.o
> >>   
> >> -# TODO: fix libbpf to handle .eh_frame
> >> -# [warning] libbpf: relocation failed: no section(10)
> >> -#libbpf_open_file ../../../../samples/bpf/tracex3_kern.o
> >> +libbpf_open_file ../../../../samples/bpf/tracex3_kern.o  
> > 
> > I don't like the ../../../../samples/bpf/ reference (even-through I
> > added this TODO), as the kselftests AFAIK support installing the
> > selftests and then this tests will fail.
> > Maybe we can find another example kern.o file?
> > (which isn't compiled with -target bpf)  
> 
> Hi Jesper, yeah maybe making the test rely on something from samples/bpf
> instead of just the selftests/bpf directory is not a good idea. But
> there is no program compiled without the "-target-bpf" in that
> directory. What we could do is explicitly compile one without the flag
> in the Makefile, as in the patch below, but I am not sure that doing so
> is acceptable?

I think it makes sense to have a test program compiled without the
"-target bpf" flag, as that will happen for users.  And I guess we can add
some more specific tests that are related to "-target bpf".

> Or should tests for libbpf have a directory of their own,
> with another Makefile?

Hmm, I'm not sure about that idea.

My plan in naming the test "libbpf_open_file" was that we add more
libbpf_ prefixed tests to the test_libbpf.sh script, which should
cover more aspects of the _base_ libbpf functionality.

> Another question regarding the test with test_xdp_meta.o: does the fix I
> suggested (setting a version in the .C file) makes sense, or did you
> leave this test for testing someday that libbpf would be able to open
> even programs that do not set a version (in which case this is still not
> the case if program type is not provided, and in fact my fix ruins
> everything? :s).

Well, yes.  I was hinting at whether we should relax the version requirement
for e.g. XDP BPF progs.

> ---
>  tools/testing/selftests/bpf/Makefile| 10 ++
>  tools/testing/selftests/bpf/test_libbpf.sh  | 14 +-
>  tools/testing/selftests/bpf/test_xdp_meta.c |  2 ++
>  3 files changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/Makefile 
> b/tools/testing/selftests/bpf/Makefile
> index e39dfb4e7970..ecd79b7fb107 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -135,6 +135,16 @@ endif
>  endif
>  endif
>  
> +# Have one program compiled without "-target bpf" to test whether libbpf 
> loads
> +# it successfully
> +$(OUTPUT)/test_xdp.o: test_xdp.c
> + $(CLANG) $(CLANG_FLAGS) \
> + -O2 -emit-llvm -c $< -o - | \
> + $(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@
> +ifeq ($(DWARF2BTF),y)
> + $(BTF_PAHOLE) -J $@
> +endif
> +
>  $(OUTPUT)/%.o: %.c
>   $(CLANG) $(CLANG_FLAGS) \
>-O2 -target bpf -emit-llvm -c $< -o - |  \
> diff --git a/tools/testing/selftests/bpf/test_libbpf.sh 
> b/tools/testing/selftests/bpf/test_libbpf.sh
> index 156d89f1edcc..b45962a44243 100755
> --- a/tools/testing/selftests/bpf/test_libbpf.sh
> +++ b/tools/testing/selftests/bpf/test_libbpf.sh
> @@ -33,17 +33,13 @@ trap exit_handler 0 2 3 6 9
>  
>  libbpf_open_file test_l4lb.o
>  
> -# TODO: fix libbpf to load noinline functions
> -# [warning] libbpf: incorrect bpf_call opcode
> -#libbpf_open_file test_l4lb_noinline.o
> +# Load a program with BPF-to-BPF calls
> +libbpf_open_file test_l4lb_noinline.o
>  
> -# TODO: fix test_xdp_meta.c to load with libbpf
> -# [warning] libbpf: test_xdp_meta.o doesn't provide kernel version
> -#libbpf_open_file test_xdp_meta.o
> +libbpf_open_file test_xdp_meta.o
>  
> -# TODO: fix libbpf to handle .eh_frame
> -# [warning] libbpf: relocation failed: no section(10)
> -#libbpf_open_file ../../../../samples/bpf/tracex3_kern.o
> +# Load a program compiled without the "-target bpf" flag
> +libbpf_open_file test_xdp.o
>  
>  # Success
>  exit 0
> diff --git a/tools/testing/selftests/bpf/test_xdp_meta.c 
> b/tools/testing/selftests/bpf/test_xdp_meta.c
> index 8d0182650653..2f42de66e2bb 100644
> --- a/tools/testing/selftests/bpf/test_xdp_meta.c
> +++ 

Re: [RFC PATCH v2 08/10] selftests: conditionally enable XDP support in udpgso_bench_rx

2018-10-21 Thread Willem de Bruijn
On Fri, Oct 19, 2018 at 10:31 AM Paolo Abeni  wrote:
>
> XDP support will be used by a later patch to test the GRO path
> in a net namespace, leveraging the veth XDP implementation.
> To avoid breaking existing setup, XDP support is conditionally
> enabled and build only if llc is locally available.
>
> Signed-off-by: Paolo Abeni 
> ---
> diff --git a/tools/testing/selftests/net/Makefile 
> b/tools/testing/selftests/net/Makefile
> index 256d82d5fa87..176459b7c4d6 100644
> --- a/tools/testing/selftests/net/Makefile
> +++ b/tools/testing/selftests/net/Makefile
> @@ -16,8 +16,77 @@ TEST_GEN_PROGS = reuseport_bpf reuseport_bpf_cpu 
> reuseport_bpf_numa
>  TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls
>
>  KSFT_KHDR_INSTALL := 1
> +
> +# Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on 
> cmdline:
> +#  make samples/bpf/ LLC=~/git/llvm/build/bin/llc 
> CLANG=~/git/llvm/build/bin/clang
> +LLC ?= llc
> +CLANG ?= clang
> +LLVM_OBJCOPY ?= llvm-objcopy
> +BTF_PAHOLE ?= pahole
> +HAS_LLC := $(shell which $(LLC) 2>/dev/null)
> +
> +# conditional enable testes requiring llc
> +ifneq (, $(HAS_LLC))
> +TEST_GEN_FILES += xdp_dummy.o
> +endif
> +
>  include ../lib.mk
>
> +ifneq (, $(HAS_LLC))
> +
> +# Detect that we're cross compiling and use the cross compiler
> +ifdef CROSS_COMPILE
> +CLANG_ARCH_ARGS = -target $(ARCH)
> +endif
> +
> +PROBE := $(shell $(LLC) -march=bpf -mcpu=probe -filetype=null /dev/null 2>&1)
> +
> +# Let newer LLVM versions transparently probe the kernel for availability
> +# of full BPF instruction set.
> +ifeq ($(PROBE),)
> +  CPU ?= probe
> +else
> +  CPU ?= generic
> +endif
> +
> +SRC_PATH := $(abspath ../../../..)
> +LIB_PATH := $(SRC_PATH)/tools/lib
> +XDP_CFLAGS := -D SUPPORT_XDP=1 -I$(LIB_PATH)
> +LIBBPF = $(LIB_PATH)/bpf/libbpf.a
> +BTF_LLC_PROBE := $(shell $(LLC) -march=bpf -mattr=help 2>&1 | grep dwarfris)
> +BTF_PAHOLE_PROBE := $(shell $(BTF_PAHOLE) --help 2>&1 | grep BTF)
> +BTF_OBJCOPY_PROBE := $(shell $(LLVM_OBJCOPY) --help 2>&1 | grep -i 
> 'usage.*llvm')
> +CLANG_SYS_INCLUDES := $(shell $(CLANG) -v -E - &1 \
> +| sed -n '/<...> search starts here:/,/End of search list./{ s| 
> \(/.*\)|-idirafter \1|p }')
> +CLANG_FLAGS = -I. -I$(SRC_PATH)/include -I../bpf/ \
> + $(CLANG_SYS_INCLUDES) -Wno-compare-distinct-pointer-types
> +
> +ifneq ($(and $(BTF_LLC_PROBE),$(BTF_PAHOLE_PROBE),$(BTF_OBJCOPY_PROBE)),)
> +   CLANG_CFLAGS += -g
> +   LLC_FLAGS += -mattr=dwarfris
> +   DWARF2BTF = y
> +endif
> +
> +$(LIBBPF): FORCE
> +# Fix up variables inherited from Kbuild that tools/ build system won't like
> +   $(MAKE) -C $(dir $@) RM='rm -rf' LDFLAGS= srctree=$(SRC_PATH) O= 
> $(nodir $@)
> +

This is a lot of XDP specific code. Not for this patchset, per se, but
would be nice if we can reuse the logic in selftests/bpf for all this.

> --- a/tools/testing/selftests/net/udpgso_bench_rx.c
> @@ -227,6 +234,13 @@ static void parse_opts(int argc, char **argv)
> cfg_verify = true;
> cfg_read_all = true;
> break;
> +#ifdef SUPPORT_XDP
> +   case 'x':
> +   cfg_xdp_iface = if_nametoindex(optarg);
> +   if (!cfg_xdp_iface)
> +   error(1, errno, "unknown interface %s", 
> optarg);
> +   break;
> +#endif

nit: needs to be added to getopt string in this patch.


Re: [RFC PATCH v2 10/10] selftests: add functionals test for UDP GRO

2018-10-21 Thread Willem de Bruijn
On Fri, Oct 19, 2018 at 10:31 AM Paolo Abeni  wrote:
>
> Extends the existing udp programs to allow checking for proper
> GRO aggregation/GSO size, and run the tests via a shell script, using
> a veth pair with XDP program attached to trigger the GRO code path.
>
> Signed-off-by: Paolo Abeni 
> ---
>  tools/testing/selftests/net/Makefile  |   2 +-
>  tools/testing/selftests/net/udpgro.sh | 144 ++
>  tools/testing/selftests/net/udpgro_bench.sh   |   8 +-
>  tools/testing/selftests/net/udpgso_bench.sh   |   2 +-
>  tools/testing/selftests/net/udpgso_bench_rx.c | 125 +--
>  tools/testing/selftests/net/udpgso_bench_tx.c |  22 ++-
>  6 files changed, 281 insertions(+), 22 deletions(-)
>  create mode 100755 tools/testing/selftests/net/udpgro.sh
>

> diff --git a/tools/testing/selftests/net/udpgro.sh 
> b/tools/testing/selftests/net/udpgro.sh

> +   run_test "no GRO chk cmsg" "${ipv4_args} -M 10 -s 1400" "-4 -n 10 -l 
> 1400 -S -1"
> +   run_test "no GRO chk cmsg" "${ipv6_args} -M 10 -s 1400" "-n 10 -l 
> 1400 -S -1"

Why is the expected segment size -1 in these two?

> diff --git a/tools/testing/selftests/net/udpgso_bench_tx.c 
> b/tools/testing/selftests/net/udpgso_bench_tx.c
>  static void usage(const char *filepath)
>  {
> -   error(1, 0, "Usage: %s [-46cmStuz] [-C cpu] [-D dst ip] [-l secs] [-p 
> port] [-s sendsize]",
> +   error(1, 0, "Usage: %s [-46cmtuz] [-C cpu] [-D dst ip] [-l secs] [-m 
> messagenr] [-p port] [-s sendsize] [-S gsosize]",
> filepath);

missing -M


Re: [RFC PATCH v2 07/10] selftests: add GRO support to udp bench rx program

2018-10-21 Thread Willem de Bruijn
On Fri, Oct 19, 2018 at 10:31 AM Paolo Abeni  wrote:
>
> And fix a couple of buglets (port option processing,
> clean termination on SIGINT). This is preparatory work
> for GRO tests.
>
> Signed-off-by: Paolo Abeni 
> ---
>  tools/testing/selftests/net/udpgso_bench_rx.c | 37 +++
>  1 file changed, 30 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/net/udpgso_bench_rx.c 
> b/tools/testing/selftests/net/udpgso_bench_rx.c

> @@ -167,10 +177,10 @@ static void do_verify_udp(const char *data, int len)
>  /* Flush all outstanding datagrams. Verify first few bytes of each. */
>  static void do_flush_udp(int fd)
>  {
> -   static char rbuf[ETH_DATA_LEN];
> +   static char rbuf[65535];

we can use ETH_MAX_MTU.


Re: [RFC PATCH v2 06/10] udp: cope with UDP GRO packet misdirection

2018-10-21 Thread Willem de Bruijn
On Fri, Oct 19, 2018 at 10:31 AM Paolo Abeni  wrote:
>
> In some scenarios, the GRO engine can assemble an UDP GRO packet
> that ultimately lands on a non GRO-enabled socket.
> This patch tries to address the issue explicitly checking for the UDP
> socket features before enqueuing the packet, and eventually segmenting
> the unexpected GRO packet, as needed.
>
> We must also cope with re-insertion requests: after segmentation the
> UDP code calls the helper introduced by the previous patches, as needed.
>
> Signed-off-by: Paolo Abeni 
> ---

> +static inline struct sk_buff *udp_rcv_segment(struct sock *sk,
> + struct sk_buff *skb)
> +{
> +   struct sk_buff *segs;
> +
> +   /* the GSO CB lays after the UDP one, no need to save and restore any
> +* CB fragment, just initialize it
> +*/
> +   segs = __skb_gso_segment(skb, NETIF_F_SG, false);
> +   if (unlikely(IS_ERR(segs)))
> +   kfree_skb(skb);
> +   else if (segs)
> +   consume_skb(skb);
> +   return segs;
> +}
> +
> +
> +void ip_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int 
> proto);
> +
> +static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
> +{
> +   struct sk_buff *next, *segs;
> +   int ret;
> +
> +   if (likely(!udp_unexpected_gso(sk, skb)))
> +   return udp_queue_rcv_one_skb(sk, skb);
> +
> +   BUILD_BUG_ON(sizeof(struct udp_skb_cb) > SKB_SGO_CB_OFFSET);
> +   __skb_push(skb, -skb_mac_offset(skb));
> +   segs = udp_rcv_segment(sk, skb);
> +   for (skb = segs; skb; skb = next) {

need to check IS_ERR(segs) again?
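
The concern can be shown in userspace. The sketch below uses stand-ins for
ERR_PTR()/IS_ERR() and a toy list node instead of struct sk_buff; err_ptr(),
is_err(), struct seg and count_segments() are illustrative names, not kernel
API. An error pointer is non-NULL, so a plain "for (skb = segs; skb; ...)"
loop would enter its body and dereference it.

```c
#include <assert.h>
#include <errno.h>  /* ENOMEM and friends, for callers of err_ptr() */
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-in for ERR_PTR(): encode a negative errno as a pointer. */
static void *err_ptr(long err)
{
	assert(err < 0 && err >= -MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Userspace stand-in for IS_ERR(): true for the top MAX_ERRNO addresses. */
static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Toy list node standing in for struct sk_buff. */
struct seg {
	struct seg *next;
	int len;
};

/*
 * Walk a segment list the way the quoted loop does, but bail out first
 * when the segmentation helper reported a failure.
 */
static int count_segments(struct seg *segs)
{
	struct seg *s;
	int n = 0;

	if (is_err(segs))
		return -1;
	for (s = segs; s; s = s->next)
		n++;
	return n;
}
```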


Re: [RFC PATCH v2 03/10] udp: add support for UDP_GRO cmsg

2018-10-21 Thread Willem de Bruijn
On Fri, Oct 19, 2018 at 10:30 AM Paolo Abeni  wrote:
>
> When UDP GRO is enabled, the UDP_GRO cmsg will carry the ingress
> datagram size. User-space can use such info to compute the original
> packets layout.
>
> Signed-off-by: Paolo Abeni 
> ---
> CHECK: should we use a separate setsockopt to explicitly enable
> gso_size cmsg reception? So that user space can enable UDP_GRO and
> fetch cmsg without forcefully receiving GRO related info.

A user can avoid the message by not passing control data. Though in
most practical cases it seems unsafe to do so, anyway, as the path MTU
can be lower than the expected device MTU.
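
For reference, a receiver could pull the segment size out of the control
data roughly as follows. This is only a sketch: gro_segment_size() is a
hypothetical helper, the UDP_GRO value 104 is the one proposed in patch 02
of this series, and the payload is assumed to be a single int.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SOL_UDP
#define SOL_UDP 17   /* same value as IPPROTO_UDP */
#endif
#ifndef UDP_GRO
#define UDP_GRO 104  /* value proposed in this series */
#endif

/*
 * Return the segment size carried in a UDP_GRO control message, or -1 if
 * the message carries none (for example because the caller passed no
 * control buffer to recvmsg()).
 */
static int gro_segment_size(struct msghdr *msg)
{
	struct cmsghdr *cm;
	int size;

	assert(msg != NULL);

	for (cm = CMSG_FIRSTHDR(msg); cm; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level != SOL_UDP || cm->cmsg_type != UDP_GRO)
			continue;
		if (cm->cmsg_len < CMSG_LEN(sizeof(size)))
			continue; /* too short to hold an int */
		memcpy(&size, CMSG_DATA(cm), sizeof(size));
		return size;
	}
	return -1;
}
```

Calling recvmsg() with msg_control set to NULL skips the cmsg entirely,
which is the opt-out described above.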

> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 3c277378814f..2331ac9de954 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1714,6 +1714,10 @@ int udp_recvmsg(struct sock *sk, struct msghdr *msg, 
> size_t len, int noblock,
> memset(sin->sin_zero, 0, sizeof(sin->sin_zero));
> *addr_len = sizeof(*sin);
> }
> +
> +   if (udp_sk(sk)->gro_enabled)
> +   udp_cmsg_recv(msg, sk, skb);
> +

Perhaps we can avoid adding a branch by setting a bit in
inet->cmsg_flags for gso_size to let the below branch handle the cmsg
processing.

I'd still set that as part of the UDP_GRO setsockopt. Though if you
insist it could be a value 2 instead of 1, effectively allowing the
above mentioned opt-out.

> if (inet->cmsg_flags)
> ip_cmsg_recv_offset(msg, sk, skb, sizeof(struct udphdr), off);
>


Re: [RFC PATCH v2 02/10] udp: implement GRO for plain UDP sockets.

2018-10-21 Thread Willem de Bruijn
On Fri, Oct 19, 2018 at 10:30 AM Paolo Abeni  wrote:
>
> This is the RX counterpart of commit bec1f6f69736 ("udp: generate gso
> with UDP_SEGMENT"). When UDP_GRO is enabled, such socket is also
> eligible for GRO in the rx path: UDP segments directed to such socket
> are assembled into a larger GSO_UDP_L4 packet.
>
> The core UDP GRO support is enabled with setsockopt(UDP_GRO).
>
> Initial benchmark numbers:
>
> Before:
> udp rx:   1079 MB/s   769065 calls/s
>
> After:
> udp rx:   1466 MB/s24877 calls/s
>
>
> This change introduces a side effect in respect to UDP tunnels:
> after a UDP tunnel creation, now the kernel performs a lookup per ingress
> UDP packet, while before such lookup happened only if the ingress packet
> carried a valid internal header csum.
>
> v1 -> v2:
>  - use a new option to enable UDP GRO
>  - use static keys to protect the UDP GRO socket lookup
>
> Signed-off-by: Paolo Abeni 
> ---
>  include/linux/udp.h  |   3 +-
>  include/uapi/linux/udp.h |   1 +
>  net/ipv4/udp.c   |   7 +++
>  net/ipv4/udp_offload.c   | 109 +++
>  net/ipv6/udp_offload.c   |   6 +--
>  5 files changed, 98 insertions(+), 28 deletions(-)
>
> diff --git a/include/linux/udp.h b/include/linux/udp.h
> index a4dafff407fb..f613b329852e 100644
> --- a/include/linux/udp.h
> +++ b/include/linux/udp.h
> @@ -50,11 +50,12 @@ struct udp_sock {
> __u8 encap_type;/* Is this an Encapsulation socket? */
> unsigned charno_check6_tx:1,/* Send zero UDP6 checksums on TX? */
>  no_check6_rx:1,/* Allow zero UDP6 checksums on RX? */
> -encap_enabled:1; /* This socket enabled encap
> +encap_enabled:1, /* This socket enabled encap
>* processing; UDP tunnels and
>* different encapsulation layer set
>* this
>*/
> +gro_enabled:1; /* Can accept GRO packets */
>
> /*
>  * Following member retains the information to create a UDP header
>  * when the socket is uncorked.
> diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h
> index 09502de447f5..30baccb6c9c4 100644
> --- a/include/uapi/linux/udp.h
> +++ b/include/uapi/linux/udp.h
> @@ -33,6 +33,7 @@ struct udphdr {
>  #define UDP_NO_CHECK6_TX 101   /* Disable sending checksum for UDP6X */
>  #define UDP_NO_CHECK6_RX 102   /* Disable accpeting checksum for UDP6 */
>  #define UDP_SEGMENT103 /* Set GSO segmentation size */
> +#define UDP_GRO104 /* This socket can receive UDP GRO 
> packets */
>
>  /* UDP encapsulation types */
>  #define UDP_ENCAP_ESPINUDP_NON_IKE 1 /* draft-ietf-ipsec-nat-t-ike-00/01 
> */
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 9fcb5374e166..3c277378814f 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -115,6 +115,7 @@
>  #include "udp_impl.h"
>  #include 
>  #include 
> +#include 
>
>  struct udp_table udp_table __read_mostly;
>  EXPORT_SYMBOL(udp_table);
> @@ -2459,6 +2460,12 @@ int udp_lib_setsockopt(struct sock *sk, int level, int 
> optname,
> up->gso_size = val;
> break;
>
> +   case UDP_GRO:
> +   if (valbool)
> +   udp_tunnel_encap_enable(sk->sk_socket);
> +   up->gro_enabled = valbool;

The socket lock is not held here, so multiple updates to
up->gro_enabled and the up->encap_enabled and the static branch can
race. Syzkaller is adept at generating those.

> +#define UDO_GRO_CNT_MAX 64
> +static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
> +  struct sk_buff *skb)
> +{
> +   struct udphdr *uh = udp_hdr(skb);
> +   struct sk_buff *pp = NULL;
> +   struct udphdr *uh2;
> +   struct sk_buff *p;
> +
> +   /* requires non zero csum, for simmetry with GSO */

symmetry


Re: [RFC PATCH v2 00/10] udp: implement GRO support

2018-10-21 Thread Willem de Bruijn
On Fri, Oct 19, 2018 at 10:30 AM Paolo Abeni  wrote:
>
> This series implements GRO support for UDP sockets, as the RX counterpart
> of commit bec1f6f69736 ("udp: generate gso with UDP_SEGMENT").
> The core functionality is implemented by the second patch, introducing a new
> sockopt to enable UDP_GRO, while patch 3 implements support for passing the
> segment size to the user space via a new cmsg.
> UDP GRO performs a socket lookup for each ingress packets and aggregate 
> datagram
> directed to UDP GRO enabled sockets with constant l4 tuple.
>
> UDP GRO packets can land on non GRO-enabled sockets, e.g. due to iptables NAT
> rules, and that could potentially confuse existing applications.

Good catch.

> The solution adopted here is to de-segment the GRO packet before enqueuing
> as needed. Since we must cope with packet reinsertion after de-segmentation,
> the relevant code is factored-out in ipv4 and ipv6 specific helpers and 
> exposed
> to UDP usage.
>
> While the current code can probably be improved, this safeguard ,implemented 
> in
> the patches 4-7, allows future enachements to enable UDP GSO offload on more
> virtual devices eventually even on forwarded packets.
>
> The last 4 for patches implement some performance and functional self-tests,
> re-using the existing udpgso infrastructure. The problematic scenario 
> described
> above is explicitly tested.

This looks awesome! Impressive testing, too.

A few comments in the individual patches, mostly minor.


bpf-next is CLOSED

2018-10-21 Thread Daniel Borkmann
Given the merge window is opening shortly, please only submit bug
fixes to bpf tree at this time, thank you!


pull-request: bpf-next 2018-10-21

2018-10-21 Thread Daniel Borkmann
Hi David,

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Implement two new kinds of BPF maps, that is, queue and stack
   map along with new peek, push and pop operations, from Mauricio.

2) Add support for MSG_PEEK flag when redirecting into an ingress
   psock sk_msg queue, and add a new helper bpf_msg_push_data() for
   inserting data into the message, from John.

3) Allow for BPF programs of type BPF_PROG_TYPE_CGROUP_SKB to use
   direct packet access for __sk_buff, from Song.

4) Use more lightweight barriers for walking perf ring buffer for
   libbpf and perf tool as well. Also, various fixes and improvements
   from verifier side, from Daniel.

5) Add per-symbol visibility for DSO in libbpf and hide by default
   global symbols such as netlink related functions, from Andrey.

6) Two improvements to nfp's BPF offload to check vNIC capabilities
   in case prog is shared with multiple vNICs and to protect against
   mis-initializing atomic counters, from Jakub.

7) Fix for bpftool to use 4 context mode for the nfp disassembler,
   also from Jakub.

8) Fix a return value comparison in test_libbpf.sh and add several
   bpftool improvements in bash completion, documentation of bpf fs
   restrictions and batch mode summary print, from Quentin.

9) Fix a file resource leak in BPF selftest's load_kallsyms()
   helper, from Peng.

10) Fix an unused variable warning in map_lookup_and_delete_elem(),
    from Alexei.

11) Fix bpf_skb_adjust_room() signature in BPF UAPI helper doc,
    from Nicolas.

12) Add missing executables to .gitignore in BPF selftests, from Anders.

Please consider pulling these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

Thanks a lot!



The following changes since commit 2c59f06cc0442862d589c36bd2f29667f96c35e7:

  Merge branch 'net-Kernel-side-filtering-for-route-dumps' (2018-10-16 00:14:18 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git 

for you to fetch changes up to fe8e10b3adc071de05ca7af728ca1a4ac9aa:

  tools: bpftool: fix completion for "bpftool map update" (2018-10-21 20:45:14 +0200)


Alexei Starovoitov (6):
  Merge branch 'nfp-improve-bpf-offload'
  Merge branch 'queue_stack_maps'
  Merge branch 'improve_perf_barriers'
  Merge branch 'cg_skb_direct_pkt_access'
  bpf: remove unused variable
  Merge branch 'misc-improvements'

Anders Roxell (1):
  selftests/bpf: add missing executables to .gitignore

Andrey Ignatov (1):
  libbpf: Per-symbol visibility for DSO

Daniel Borkmann (11):
  bpf, tls: add tls header to tools infrastructure
  Merge branch 'bpf-sk-msg-peek'
  tools, perf: add and use optimized ring_buffer_{read_head, write_tail} helpers
  bpf, libbpf: use correct barriers in perf ring buffer walk
  Merge branch 'bpf-msg-push-data'
  ulp: remove uid and user_visible members
  bpf, verifier: fix register type dump in xadd and st
  bpf, verifier: reject xadd on flow key memory
  bpf, verifier: remove unneeded flow key in check_helper_mem_access
  bpf, verifier: avoid retpoline for map push/pop/peek operation
  bpf, libbpf: simplify and cleanup perf ring buffer walk

Jakub Kicinski (3):
  nfp: bpf: protect against mis-initializing atomic counters
  nfp: bpf: double check vNIC capabilities after object sharing
  tools: bpftool: use 4 context mode for the NFP disasm

John Fastabend (8):
  bpf: sockmap, fix skmsg recvmsg handler to track size correctly
  bpf: skmsg, improve sk_msg_used_element to work in cork context
  bpf: sockmap, support for msg_peek in sk_msg with redirect ingress
  bpf: sockmap, add msg_peek tests to test_sockmap
  bpf: skmsg, fix psock create on existing kcm/tls port
  bpf: sk_msg program helper bpf_msg_push_data
  bpf: libbpf support for msg_push_data
  bpf: test_sockmap add options to use msg_push_data

Mauricio Vasquez B (7):
  bpf: rename stack trace map operations
  bpf/syscall: allow key to be null in map functions
  bpf/verifier: add ARG_PTR_TO_UNINIT_MAP_VALUE
  bpf: add queue and stack maps
  bpf: add MAP_LOOKUP_AND_DELETE_ELEM syscall
  Sync uapi/bpf.h to tools/include
  selftests/bpf: add test cases for queue and stack maps

Nicolas Dichtel (1):
  bpf: fix doc of bpf_skb_adjust_room() in uapi

Peng Hao (1):
  selftests/bpf: fix file resource leak in load_kallsyms

Quentin Monnet (4):
  selftests/bpf: fix return value comparison for tests in test_libbpf.sh
  tools: bpftool: document restriction on '.' in names to pin in bpffs
  tools: bpftool: print nb of cmds to stdout (not stderr) for batch mode
  tools: bpftool: fix completion for "bpftool map update"

Song Liu (2):
  bpf: add 

Re: [PATCH iproute2-next 3/3] rdma: Add an option to rename IB device interface

2018-10-21 Thread David Ahern
On 10/18/18 5:51 AM, Leon Romanovsky wrote:
> From: Leon Romanovsky 
> 
> Enrich rdmatool with an option to rename IB devices,
> the command interface follows Iproute2 convention:
> "rdma dev set [OLD-DEVNAME] name NEW-DEVNAME"
> 
> Signed-off-by: Leon Romanovsky 
> ---
>  rdma/dev.c | 35 +++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/rdma/dev.c b/rdma/dev.c
> index e2eafe47..760b7fb3 100644
> --- a/rdma/dev.c
> +++ b/rdma/dev.c
> @@ -14,6 +14,7 @@
>  static int dev_help(struct rd *rd)
>  {
>   pr_out("Usage: %s dev show [DEV]\n", rd->filename);
> + pr_out("   %s dev set [DEV] name DEVNAME\n", rd->filename);
>   return 0;
>  }
>  
> @@ -240,17 +241,51 @@ static int dev_one_show(struct rd *rd)
>   return rd_exec_cmd(rd, cmds, "parameter");
>  }
>  
> +static int dev_set_name(struct rd *rd)
> +{
> + uint32_t seq;
> +
> + if (rd_no_arg(rd)) {
> + pr_err("Please provide device new name.\n");
> + return -EINVAL;
> + }

This is redundant with rd_exec_require_dev which is the required path to
get to this point.



Re: [PATCHv2 iproute2-next] ip/geneve: fix ttl inherit behavior

2018-10-21 Thread David Ahern
On 10/18/18 1:01 AM, Hangbin Liu wrote:
> Currently when we add geneve with "ttl inherit", we only set ttl to 0, which
> actually means "use whatever default value" instead of inheriting the inner
> protocol's ttl value.
> 
> To distinguish ttl inherit from ttl == 0, we added an attribute
> IFLA_GENEVE_TTL_INHERIT in kernel commit 52d0d404d39dd ("geneve: add ttl
> inherit support"). Now let's use "ttl inherit" to inherit the inner
> protocol's ttl, and use "ttl auto" to mean "use whatever default value",
> the same behavior as ttl == 0.
> 
> v2:
> 1) remove IFLA_GENEVE_TTL_INHERIT definition in if_link.h as it's already
>    updated.
> 2) Still use addattr8() so we can enable/disable ttl inherit, as Michal
>    suggested.
> 
> Reported-by: Jianlin Shi 
> Signed-off-by: Hangbin Liu 
> ---
>  ip/iplink_geneve.c | 20 +---
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 

please update the man page as well.



Re: [PATCH bpf-next] selftests/bpf: enable (uncomment) all tests in test_libbpf.sh

2018-10-21 Thread Quentin Monnet
2018-10-21 11:57 UTC+0200 ~ Jesper Dangaard Brouer 
> On Sat, 20 Oct 2018 23:00:24 +0100
> Quentin Monnet  wrote:
> 

[...]

>> --- a/tools/testing/selftests/bpf/test_libbpf.sh
>> +++ b/tools/testing/selftests/bpf/test_libbpf.sh
>> @@ -33,17 +33,11 @@ trap exit_handler 0 2 3 6 9
>>   
>>   libbpf_open_file test_l4lb.o
>>   
>> -# TODO: fix libbpf to load noinline functions
>> -# [warning] libbpf: incorrect bpf_call opcode
>> -#libbpf_open_file test_l4lb_noinline.o
>> +libbpf_open_file test_l4lb_noinline.o
>>   
>> -# TODO: fix test_xdp_meta.c to load with libbpf
>> -# [warning] libbpf: test_xdp_meta.o doesn't provide kernel version
>> -#libbpf_open_file test_xdp_meta.o
>> +libbpf_open_file test_xdp_meta.o
>>   
>> -# TODO: fix libbpf to handle .eh_frame
>> -# [warning] libbpf: relocation failed: no section(10)
>> -#libbpf_open_file ../../../../samples/bpf/tracex3_kern.o
>> +libbpf_open_file ../../../../samples/bpf/tracex3_kern.o
> 
> I don't like the ../../../../samples/bpf/ reference (even though I
> added this TODO), as the kselftests AFAIK support installing the
> selftests and then this test will fail.
> Maybe we can find another example kern.o file?
> (which isn't compiled with -target bpf)

Hi Jesper, yeah maybe making the test rely on something from samples/bpf
instead of just the selftests/bpf directory is not a good idea. But
there is no program compiled without the "-target bpf" flag in that
directory. What we could do is explicitly compile one without the flag
in the Makefile, as in the patch below, but I am not sure that doing so
is acceptable? Or should tests for libbpf have a directory of their own,
with another Makefile?

Another question regarding the test with test_xdp_meta.o: does the fix I
suggested (setting a version in the .c file) make sense, or did you
leave this test in order to check someday that libbpf can open
even programs that do not set a version? (In that case it still cannot
when the program type is not provided, and my fix in fact ruins
everything. :s)

Thanks,
Quentin

---
 tools/testing/selftests/bpf/Makefile| 10 ++
 tools/testing/selftests/bpf/test_libbpf.sh  | 14 +-
 tools/testing/selftests/bpf/test_xdp_meta.c |  2 ++
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index e39dfb4e7970..ecd79b7fb107 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -135,6 +135,16 @@ endif
 endif
 endif
 
+# Have one program compiled without "-target bpf" to test whether libbpf loads
+# it successfully
+$(OUTPUT)/test_xdp.o: test_xdp.c
+   $(CLANG) $(CLANG_FLAGS) \
+   -O2 -emit-llvm -c $< -o - | \
+   $(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@
+ifeq ($(DWARF2BTF),y)
+   $(BTF_PAHOLE) -J $@
+endif
+
 $(OUTPUT)/%.o: %.c
$(CLANG) $(CLANG_FLAGS) \
 -O2 -target bpf -emit-llvm -c $< -o - |  \
diff --git a/tools/testing/selftests/bpf/test_libbpf.sh b/tools/testing/selftests/bpf/test_libbpf.sh
index 156d89f1edcc..b45962a44243 100755
--- a/tools/testing/selftests/bpf/test_libbpf.sh
+++ b/tools/testing/selftests/bpf/test_libbpf.sh
@@ -33,17 +33,13 @@ trap exit_handler 0 2 3 6 9
 
 libbpf_open_file test_l4lb.o
 
-# TODO: fix libbpf to load noinline functions
-# [warning] libbpf: incorrect bpf_call opcode
-#libbpf_open_file test_l4lb_noinline.o
+# Load a program with BPF-to-BPF calls
+libbpf_open_file test_l4lb_noinline.o
 
-# TODO: fix test_xdp_meta.c to load with libbpf
-# [warning] libbpf: test_xdp_meta.o doesn't provide kernel version
-#libbpf_open_file test_xdp_meta.o
+libbpf_open_file test_xdp_meta.o
 
-# TODO: fix libbpf to handle .eh_frame
-# [warning] libbpf: relocation failed: no section(10)
-#libbpf_open_file ../../../../samples/bpf/tracex3_kern.o
+# Load a program compiled without the "-target bpf" flag
+libbpf_open_file test_xdp.o
 
 # Success
 exit 0
diff --git a/tools/testing/selftests/bpf/test_xdp_meta.c b/tools/testing/selftests/bpf/test_xdp_meta.c
index 8d0182650653..2f42de66e2bb 100644
--- a/tools/testing/selftests/bpf/test_xdp_meta.c
+++ b/tools/testing/selftests/bpf/test_xdp_meta.c
@@ -8,6 +8,8 @@
 #define round_up(x, y) ((((x) - 1) | __round_mask(x, y)) + 1)
 #define ctx_ptr(ctx, mem) (void *)(unsigned long)ctx->mem
 
+int _version SEC("version") = 1;
+
 SEC("t")
 int ing_cls(struct __sk_buff *ctx)
 {
-- 
2.7.4



[PATCH V1 net-next] net: ena: fix compilation error in xtensa architecture

2018-10-21 Thread akiyano
From: Arthur Kiyanovski 

linux/prefetch.h is never explicitly included in ena_com, although
functions from it, such as prefetchw(), are used throughout ena_com.
This is an inclusion bug, and we fix it here by explicitly including
linux/prefetch.h. The bug was exposed when the driver was compiled
for the xtensa architecture.

Fixes: 689b2bdaaa14 ("net: ena: add functions for handling Low Latency Queues in ena_com")
Fixes: 8c590f977638 ("ena: Fix Kconfig dependency on X86")
Signed-off-by: Arthur Kiyanovski 
---
 drivers/net/ethernet/amazon/ena/ena_com.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.h b/drivers/net/ethernet/amazon/ena/ena_com.h
index ae8b485..078d6f2 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.h
+++ b/drivers/net/ethernet/amazon/ena/ena_com.h
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include <linux/prefetch.h>
 #include 
 #include 
 #include 
-- 
2.7.4



Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-21 Thread Andre Tomt

On 20.10.2018 00:25, Eric Dumazet wrote:

On 10/19/2018 02:58 PM, Eric Dumazet wrote:

On 10/16/2018 06:00 AM, Eric Dumazet wrote:

On Mon, Oct 15, 2018 at 11:30 PM Andre Tomt  wrote:

I've seen similar on several systems with mlx4 cards when using 4.18.x -
that is hw csum failure followed by some backtrace.

Only seems to happen on systems dealing with quite a bit of UDP.



Strange, because mlx4 on IPv6+UDP should not use CHECKSUM_COMPLETE,
but CHECKSUM_UNNECESSARY

It would be nice to track this a bit further, maybe by providing the
full packet content.





As a matter of fact Dimitris found the issue in the patch and is working on a
fix involving csum_block_sub()

The problem comes from trimming an odd number of bytes.


More exactly, trimming bytes starting at an odd offset.


No hw csum failures here since I deployed Dimitris' fix on top of 4.18.16
32 hours ago.


Thanks
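
To illustrate why trimming at an odd offset matters, here is a small user-space sketch of the 16-bit one's-complement arithmetic involved. It is not the kernel's csum_block_sub() and not Dimitris' fix. All function names are hypothetical, and it only shows that a block summed on its own must be byte-swapped before it is subtracted when it started at an odd offset in the original buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t csum_fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * One's-complement sum of a buffer, treating p[0] as the high byte of the
 * first 16-bit word. The 32-bit accumulator is fine for packet-sized input.
 */
static uint16_t csum16(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(p != NULL || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (i < len)
		sum += (uint32_t)p[i] << 8;	/* odd trailing byte, zero padded */
	return csum_fold16(sum);
}

static uint16_t swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* One's-complement subtraction: a - b == a + ~b. */
static uint16_t csum_sub16(uint16_t a, uint16_t b)
{
	return csum_fold16((uint32_t)a + (uint16_t)~b);
}

/* Remove from 'total' the sum of a block that started at byte 'offset'. */
static uint16_t csum_block_sub16(uint16_t total, uint16_t block, size_t offset)
{
	if (offset & 1)
		block = swab16(block);	/* bytes sat in the other lane */
	return csum_sub16(total, block);
}
```

With the nine bytes 01..09 and a trim starting at offset 5, the swapped subtraction gives back the sum of the first five bytes, while the plain subtraction does not.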


[PATCH] net/mlx5: Allocate enough space for the FDB sub-namespaces

2018-10-21 Thread Or Gerlitz
From: Dan Carpenter 

FDB_MAX_CHAIN is three.  We wanted to allocate enough memory to hold four
structs but there are missing parentheses so we only allocate enough
memory for three structs and the first byte of the fourth one.

Fixes: 328edb499f99 ("net/mlx5: Split FDB fast path prio to multiple namespaces")
Signed-off-by: Dan Carpenter 
Reviewed-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 67ba4c9..9d73eb9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -2470,7 +2470,7 @@ static int init_fdb_root_ns(struct mlx5_flow_steering *steering)
return -ENOMEM;
 
steering->fdb_sub_ns = kzalloc(sizeof(steering->fdb_sub_ns) *
-  FDB_MAX_CHAIN + 1, GFP_KERNEL);
+  (FDB_MAX_CHAIN + 1), GFP_KERNEL);
if (!steering->fdb_sub_ns)
return -ENOMEM;
 
-- 
2.3.7
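
The precedence issue fixed above can be reproduced in isolation. The sketch below is a stand-alone illustration, not driver code. FDB_MAX_CHAIN uses the value from the commit message, while struct flow_ns and both function names are made up for the example.

```c
#include <stddef.h>

#define FDB_MAX_CHAIN 3	/* value given in the commit message */

struct flow_ns;	/* opaque stand-in for the driver's namespace type */

/* Size as written before the fix: '*' binds tighter than '+'. */
static size_t fdb_sub_ns_size_buggy(void)
{
	return sizeof(struct flow_ns *) * FDB_MAX_CHAIN + 1;
}

/* Size after the fix: room for FDB_MAX_CHAIN + 1 elements. */
static size_t fdb_sub_ns_size_fixed(void)
{
	return sizeof(struct flow_ns *) * (FDB_MAX_CHAIN + 1);
}
```

On a build with 8-byte pointers the first expression yields 25 bytes and the second 32. Allocators that take the element count and element size as separate arguments, such as kcalloc() in the kernel or calloc() in user space, avoid this class of mistake.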



Re: [PATCH] net/mlx5: allocate enough space in

2018-10-21 Thread Or Gerlitz
On Sat, Oct 20, 2018 at 11:37 PM Or Gerlitz  wrote:
> On Fri, Oct 19, 2018 at 11:08 PM Dan Carpenter  wrote:
> > FDB_MAX_CHAIN is 3.  We wanted to allocate enough memory to hold four
> > structs but there are missing parentheses so we only allocate enough
> > memory for three structs and the first byte of the fourth one.
>
> yeah, it seems we were wrong here and the fix is correct. At some
> point I saw KASAN complain but it was gone later; let me look. Thanks for
> pointing it out.

OK, here's the kasan note:

[  289.005141] BUG: KASAN: slab-out-of-bounds in
mlx5_init_fs+0x6a7/0x1176 [mlx5_core]
[  289.005244] Write of size 8 at addr 8806cfb70e58 by task modprobe/6186

my .config was like this w.r.t kasan:

CONFIG_KASAN_SHADOW_OFFSET=0xdc00
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_KASAN=y
# CONFIG_KASAN_EXTRA is not set
CONFIG_KASAN_OUTLINE=y
# CONFIG_KASAN_INLINE is not set
# CONFIG_TEST_KASAN is not set

whereas now, after I changed it to:

CONFIG_KASAN_SHADOW_OFFSET=0xdc00
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_KASAN=y
CONFIG_KASAN_EXTRA=y
# CONFIG_KASAN_OUTLINE is not set
CONFIG_KASAN_INLINE=y
# CONFIG_TEST_KASAN is not set

KASAN does spot the bug.

I will re-post your patch, this time to netdev since the original
commit is there and so should be the fix, thanks for reporting/fixing!

Or.


Re: [PATCH net-next] octeontx2-af: Remove set but not used variable 'block'

2018-10-21 Thread YueHaibing
On 2018/10/19 21:36, Sunil Kovvuri wrote:
> On Fri, Oct 19, 2018 at 6:11 PM YueHaibing  wrote:
>>
>> Fixes gcc '-Wunused-but-set-variable' warning:
>>
>> drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c: In function 'rvu_npa_init':
>> drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c:446:20: warning:
>>  variable 'block' set but not used [-Wunused-but-set-variable]
>>
>> It has never been used since its introduction in
>> commit 7a37245ef23f ("octeontx2-af: NPA block admin queue init")
>>
>> Signed-off-by: YueHaibing 
>> ---
>>  drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c | 3 ---
>>  1 file changed, 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
>> index 0e43a69..7531fdc 100644
>> --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
>> +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
>> @@ -443,15 +443,12 @@ static int npa_aq_init(struct rvu *rvu, struct rvu_block *block)
>>  int rvu_npa_init(struct rvu *rvu)
>>  {
>> struct rvu_hwinfo *hw = rvu->hw;
>> -   struct rvu_block *block;
>> int blkaddr, err;
>>
>> blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
>> if (blkaddr < 0)
>> return 0;
>>
>> -   block = &hw->block[blkaddr];
>> -
>> /* Initialize admin queue */
>> err = npa_aq_init(rvu, &hw->block[blkaddr]);
>> if (err)
>>
> 
> Thanks for the patch.
> Which GCC version do you use ?

gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)

> Before submitting patches I did test compiling specifically with these
> "make  arch=X86 -j8 -Werror=unused-function -Wunused-but-set-variable"
> but that didn't throw these warnings.
> 
> Thanks,
> Sunil.
> 
> .
> 



Re: [PATCH bpf-next] selftests/bpf: enable (uncomment) all tests in test_libbpf.sh

2018-10-21 Thread Jesper Dangaard Brouer
On Sat, 20 Oct 2018 23:00:24 +0100
Quentin Monnet  wrote:

> libbpf is now able to load successfully test_l4lb_noinline.o and
> samples/bpf/tracex3_kern.o, so we can uncomment related tests from
> test_libbpf.c and remove the associated "TODO"s.

Thanks for working on this, comments below.

> It is also trivial to fix test_xdp_noinline.o so that it provides a
> version and can be loaded. Fix it and uncomment this test as well.
> 
> For the record, the error message obtained with tracex3_kern.o was
> fixed by commit e3d91b0ca523 ("tools/libbpf: handle issues with bpf ELF
> objects containing .eh_frames")
> 
> I have not been able to reproduce the "libbpf: incorrect bpf_call
> opcode" error for test_l4lb_noinline.o, even with the version of libbpf
> present at the time when test_libbpf.sh and test_libbpf_open.c were
> created.
> 
> Cc: Jesper Dangaard Brouer 
> Signed-off-by: Quentin Monnet 
> ---
>  tools/testing/selftests/bpf/test_libbpf.sh  | 12 +++-
>  tools/testing/selftests/bpf/test_xdp_meta.c |  2 ++
>  2 files changed, 5 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/test_libbpf.sh b/tools/testing/selftests/bpf/test_libbpf.sh
> index 156d89f1edcc..a426f28163a5 100755
> --- a/tools/testing/selftests/bpf/test_libbpf.sh
> +++ b/tools/testing/selftests/bpf/test_libbpf.sh
> @@ -33,17 +33,11 @@ trap exit_handler 0 2 3 6 9
>  
>  libbpf_open_file test_l4lb.o
>  
> -# TODO: fix libbpf to load noinline functions
> -# [warning] libbpf: incorrect bpf_call opcode
> -#libbpf_open_file test_l4lb_noinline.o
> +libbpf_open_file test_l4lb_noinline.o
>  
> -# TODO: fix test_xdp_meta.c to load with libbpf
> -# [warning] libbpf: test_xdp_meta.o doesn't provide kernel version
> -#libbpf_open_file test_xdp_meta.o
> +libbpf_open_file test_xdp_meta.o
>  
> -# TODO: fix libbpf to handle .eh_frame
> -# [warning] libbpf: relocation failed: no section(10)
> -#libbpf_open_file ../../../../samples/bpf/tracex3_kern.o
> +libbpf_open_file ../../../../samples/bpf/tracex3_kern.o

I don't like the ../../../../samples/bpf/ reference (even though I
added this TODO), as the kselftests AFAIK support installing the
selftests and then this test will fail.
Maybe we can find another example kern.o file?
(which isn't compiled with -target bpf)

>  # Success
>  exit 0
> diff --git a/tools/testing/selftests/bpf/test_xdp_meta.c b/tools/testing/selftests/bpf/test_xdp_meta.c
> index 8d0182650653..2f42de66e2bb 100644
> --- a/tools/testing/selftests/bpf/test_xdp_meta.c
> +++ b/tools/testing/selftests/bpf/test_xdp_meta.c
> @@ -8,6 +8,8 @@
>  #define round_up(x, y) ((((x) - 1) | __round_mask(x, y)) + 1)
>  #define ctx_ptr(ctx, mem) (void *)(unsigned long)ctx->mem
>  
> +int _version SEC("version") = 1;
> +
>  SEC("t")
>  int ing_cls(struct __sk_buff *ctx)
>  {



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


r8169: changed rx buffer alignment requirement

2018-10-21 Thread Heiner Kallweit
Hi Eric,

when working on the r8169 driver I came across an old patch from you:
6f0333b8fde4 ("r8169: use 50% less ram for RX ring")

As part of this patch the alignment requirement for rx buffers was
silently changed from 8 to 16 bytes. Can you remember (well, after
eight years) why you did this? In the chip datasheet I find only
8 bytes mentioned as requirement.

Regards,
Heiner


Re: [PATCH] net: ethernet:fec: Consistently use SPEED_ prefix

2018-10-21 Thread Sergei Shtylyov

On 20.10.2018 23:48, Andrew Lunn wrote:


All other calls to phy_set_max_speed() use the SPEED_ prefix. Make the
FEC driver follow this common pattern. This makes no different to


   Difference.


generated code since SPEED_1000 is 1000, and SPEED_100 is 100.

Reported-by: Corentin Labbe 
Signed-off-by: Andrew Lunn 

[...]

MBR, Sergei



Re: [PATCH net-next] net: phy: phy_support_sym_pause: Clear Asym Pause

2018-10-21 Thread Sergei Shtylyov

Hello!

On 20.10.2018 23:41, Andrew Lunn wrote:


When indicating the MAC supports Symmetric Pause, clear the Asymmetric
Pause bit, which could of been already set is the PHY supports it.


   Could've been, s/is/if/.


Reported-by: Labbe Corentin 
Fixes: c306ad36184f ("net: ethernet: Add helper for MACs which support pause")
Signed-off-by: Andrew Lunn 

[...]

MBR, Sergei



Re: [PATCH net-next 0/3] sctp: add support for sk_reuseport

2018-10-21 Thread Xin Long
On Sun, Oct 21, 2018 at 1:43 PM Xin Long  wrote:
>
> sctp sk_reuseport allows multiple socks to listen on the same port and
> addresses, as long as these socks have the same uid. This works pretty
> much as TCP/UDP does, the only difference is that sctp is multi-homing
> and all the bind_addrs in these socks will have to match completely,
> otherwise listen() will return err.
>
> The below is when 5 sockets are listening on 172.16.254.254:6400 on a
> server, 26 sockets on a client connect to 172.16.254.254:6400 and each
> may be processed by a different socket on the server which is selected
> by hash(lport, pport, paddr) in reuseport_select_sock():
>
>  # ss --sctp -nn
>  State      Recv-Q  Send-Q  Local Address:Port         Peer Address:Port
>  LISTEN     0       10      172.16.254.254:6400        *:*
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.2.1:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.2.4:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.3.3:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.3.4:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.5.2:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.5.3:1234
>  LISTEN     0       10      172.16.254.254:6400        *:*
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.1.3:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.1.4:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.3.2:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.4.1:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.4.2:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.4.3:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.4.4:1234
>  LISTEN     0       10      172.16.254.254:6400        *:*
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.1.2:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.3.5:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.4.5:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.253.253:1234
>  LISTEN     0       10      172.16.254.254:6400        *:*
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.2.2:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.2.3:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.5.4:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.5.5:1234
>  LISTEN     0       10      172.16.254.254:6400        *:*
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.1.1:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.1.5:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.2.5:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.3.1:1234
>  `- ESTAB   0       0       172.16.254.254%eth1:6400   172.16.5.1:1234
Attached is the testcase based on sctp-tests.git.

>
> Xin Long (3):
>   sctp: do reuseport_select_sock in __sctp_rcv_lookup_endpoint
>   sctp: add sock_reuseport for the sock in __sctp_hash_endpoint
>   sctp: process sk_reuseport in sctp_get_port_local
>
>  include/net/sctp/sctp.h|   2 +-
>  include/net/sctp/structs.h |   6 ++-
>  net/core/sock_reuseport.c  |   1 +
>  net/sctp/bind_addr.c   |  28 ++
>  net/sctp/input.c   | 129 -
>  net/sctp/socket.c  |  49 +++--
>  6 files changed, 162 insertions(+), 53 deletions(-)
>
> --
> 2.1.0
>


reuseport.tar.gz
Description: GNU Zip compressed data
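
The cover letter says an incoming association is handed to one of the listening sockets chosen by hash(lport, pport, paddr). The sketch below shows how such a selection could work in principle. It is not the code from the series or from reuseport_select_sock(): the mixing function and all names are hypothetical, and the multiply-and-shift range mapping is simply one common way to avoid a modulo.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit mixer; the kernel uses its own hash functions. */
static uint32_t mix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bu;
	h ^= h >> 13;
	h *= 0xc2b2ae35u;
	h ^= h >> 16;
	return h;
}

/* Combine local port, peer port and peer IPv4 address into one hash. */
static uint32_t assoc_hash(uint16_t lport, uint16_t pport, uint32_t paddr)
{
	return mix32(paddr ^ mix32(((uint32_t)lport << 16) | pport));
}

/* Map a 32-bit hash onto [0, nsocks) without a division. */
static uint32_t pick_listener(uint32_t hash, uint32_t nsocks)
{
	assert(nsocks > 0);
	return (uint32_t)(((uint64_t)hash * nsocks) >> 32);
}
```

Because the inputs are fixed for the lifetime of an association, every packet of that association maps to the same index, which is what lets the 26 client connections in the listing spread across the 5 listeners and stay there.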


Re: [PATCH net-next] net: phy: phy_support_sym_pause: Clear Asym Pause

2018-10-21 Thread LABBE Corentin
On Sat, Oct 20, 2018 at 10:41:28PM +0200, Andrew Lunn wrote:
> When indicating the MAC supports Symmetric Pause, clear the Asymmetric
> Pause bit, which could of been already set is the PHY supports it.
> 
> Reported-by: Labbe Corentin 
> Fixes: c306ad36184f ("net: ethernet: Add helper for MACs which support pause")
> Signed-off-by: Andrew Lunn 
> ---
>  drivers/net/phy/phy_device.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 43cb08dcce81..ab33d1777132 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -1940,6 +1940,7 @@ EXPORT_SYMBOL(phy_remove_link_mode);
>   */
>  void phy_support_sym_pause(struct phy_device *phydev)
>  {
> + phydev->supported &= ~SUPPORTED_Asym_Pause;
>   phydev->supported |= SUPPORTED_Pause;
>   phydev->advertising = phydev->supported;
>  }
> -- 
> 2.19.0
> 

Thanks, it made my imx6q-sabrelite work again with next-20181019.
Tested-by: Corentin Labbe 

For completeness, this is the ethtool output which confirms it.
ethtool version 4.16
Settings for eth0:
	Supported ports: [ TP MII ]
	Supported link modes:   10baseT/Half 10baseT/Full
	                        100baseT/Half 100baseT/Full
	                        1000baseT/Full
	Supported pause frame use: Symmetric
	Supports auto-negotiation: Yes
	Supported FEC modes: Not reported
	Advertised link modes:  10baseT/Half 10baseT/Full
	                        100baseT/Half 100baseT/Full
	                        1000baseT/Full
	Advertised pause frame use: Symmetric
	Advertised auto-negotiation: Yes
	Advertised FEC modes: Not reported
	Link partner advertised link modes:  10baseT/Half 10baseT/Full
	                                     100baseT/Half 100baseT/Full
	                                     1000baseT/Half 1000baseT/Full
	Link partner advertised pause frame use: Symmetric Receive-only
	Link partner advertised auto-negotiation: Yes
	Link partner advertised FEC modes: Not reported
	Speed: 1000Mb/s
	Duplex: Full
	Port: MII
	PHYAD: 6
	Transceiver: internal
	Auto-negotiation: on
	Supports Wake-on: d
	Wake-on: d
	Link detected: yes
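
The effect of the one-line change can be shown outside the kernel with a small stand-alone model. Everything here is illustrative: struct demo_phy, the DEMO_ bit values and demo_support_sym_pause() are stand-ins, not the phylib definitions. Only the three statements in the function body mirror the patched helper.

```c
#include <stdint.h>

/* Stand-in bit positions for illustration; not taken from the patch. */
#define DEMO_SUPPORTED_PAUSE		(1u << 13)
#define DEMO_SUPPORTED_ASYM_PAUSE	(1u << 14)

struct demo_phy {
	uint32_t supported;
	uint32_t advertising;
};

/* Mirror of the patched helper: drop Asym Pause, then set Pause. */
static void demo_support_sym_pause(struct demo_phy *phy)
{
	phy->supported &= ~DEMO_SUPPORTED_ASYM_PAUSE;
	phy->supported |= DEMO_SUPPORTED_PAUSE;
	phy->advertising = phy->supported;
}
```

Starting from a PHY that already has the asymmetric bit set, the call leaves only the symmetric bit among the pause flags, which matches the "Symmetric" pause lines in the ethtool output above.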


Re: [PATCH bpf-next 0/3] tools: bpftool: bring minor fixes to bpftool

2018-10-21 Thread Alexei Starovoitov
On Sat, Oct 20, 2018 at 11:01:47PM +0100, Quentin Monnet wrote:
> Hi,
> These are three minor fixes for bpftool, its documentation and its bash
> completion function. Please refer to individual patches for details.

Applied, Thanks



Re: [PATCH bpf-next] selftests/bpf: fix return value comparison for tests in test_libbpf.sh

2018-10-21 Thread Alexei Starovoitov
On Sat, Oct 20, 2018 at 10:58:44PM +0100, Quentin Monnet wrote:
> The return value for each test in test_libbpf.sh is compared with
> 
> if (( $? == 0 )) ; then ...
> 
> This works well with bash, but not with dash, that /bin/sh is aliased to
> on some systems (such as Ubuntu).
> 
> Let's replace this comparison by something that works on both shells.
> 
> Signed-off-by: Quentin Monnet 
> Reviewed-by: Jakub Kicinski 

Applied, Thanks



Re: [PATCH bpf-next 0/6] Misc improvements and few minor fixes

2018-10-21 Thread Alexei Starovoitov
On Sun, Oct 21, 2018 at 02:09:22AM +0200, Daniel Borkmann wrote:
> Last batch of misc patches I had in queue: first one removes some left-over
> bits from ULP, second is a fix in the verifier where we wrongly use register
> number as type to fetch the string for the dump, third disables xadd on flow
> keys and subsequent one removes the flow key type from check_helper_mem_access()
> as they cannot be passed into any helper as of today. Next one lets map push,
> pop, peek avoid having to go through retpoline, and last one has a couple of
> minor fixes and cleanups for the ring buffer walk.

Applied, Thanks