Re: inconsistent lock state with usbnet/asix usb ethernet and xhci

2018-03-04 Thread Marek Szyprowski

Hi Oliver,

On 2018-02-27 17:07, Oliver Neukum wrote:
> On Tuesday, 27.02.2018 at 07:13 -0800, Eric Dumazet wrote:
>> On Tue, 2018-02-27 at 07:09 -0800, Eric Dumazet wrote:
>>> Note that for this one, it seems we also could perform stats updates in
>>> BH context, since skb is queued via defer_bh()
>>>
>>> But simplicity wins I guess.
>>
>> Thinking more about this, I am not sure we have any guarantee that TX
>> and RX can not run on multiple cpus.
>>
>> Using an unique syncp is not going to be safe, even if we make lockdep
>> happy enough with the local_irq save/restore.
>
> Unfortunately you are right. It is not guaranteed for some hardware.


Does it mean that the fix proposed by Eric is not the proper solution?

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
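
For readers following the syncp discussion above, here is a minimal userspace sketch of why a sequence-counter scheme wants exactly one writer per counter. It is not the kernel's u64_stats_sync code; the slot layout, the names (stat_slot, stats_update, stats_read) and the choice of two slots (one for RX, one for TX) are assumptions made only for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NWRITERS 2	/* assumption: slot 0 is used by RX, slot 1 by TX */

/* One sequence counter per writer, so two writers never share a counter. */
struct stat_slot {
	atomic_uint seq;		/* odd while an update is in flight */
	_Atomic uint64_t packets;
	_Atomic uint64_t bytes;
};

static struct stat_slot slots[NWRITERS];

/* Writer side: may only be called by the single owner of `slot`. */
static void stats_update(unsigned int slot, uint64_t len)
{
	struct stat_slot *s = &slots[slot];
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	/* Mark the update as in progress (sequence becomes odd). */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&s->packets,
		atomic_load_explicit(&s->packets, memory_order_relaxed) + 1,
		memory_order_relaxed);
	atomic_store_explicit(&s->bytes,
		atomic_load_explicit(&s->bytes, memory_order_relaxed) + len,
		memory_order_relaxed);

	/* Publish the update (sequence becomes even again). */
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retry a slot until an even, unchanged sequence is seen. */
static void stats_read(uint64_t *packets, uint64_t *bytes)
{
	*packets = 0;
	*bytes = 0;
	for (unsigned int i = 0; i < NWRITERS; i++) {
		struct stat_slot *s = &slots[i];
		unsigned int start, end;
		uint64_t p, b;

		do {
			start = atomic_load_explicit(&s->seq, memory_order_acquire);
			p = atomic_load_explicit(&s->packets, memory_order_relaxed);
			b = atomic_load_explicit(&s->bytes, memory_order_relaxed);
			atomic_thread_fence(memory_order_acquire);
			end = atomic_load_explicit(&s->seq, memory_order_relaxed);
		} while ((start & 1) || start != end);

		*packets += p;
		*bytes += b;
	}
}
```

If two contexts called stats_update() on the same slot at the same time, both could read the same sequence value and a reader might accept a half-written pair of counters. That is the hazard the thread is worried about when TX and RX share one syncp.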



Re: [PATCH v2 net-next] selftests: forwarding: Add support to create veth interfaces

2018-03-04 Thread Ido Schimmel
On Sun, Mar 04, 2018 at 05:37:47PM -0800, David Ahern wrote:
> For tests using veth interfaces, the test infrastructure can create
> the netdevs if they do not exist. Arguably this is a preferred approach
> since the tests require p$N and p$(N+1) to be pairs.
> 
> Signed-off-by: David Ahern 

Reviewed-by: Ido Schimmel 


Re: [Outreachy kernel] [PATCH v3] staging: ipx: Replace printk() with appropriate net_*macro_ratelimited()

2018-03-04 Thread Julia Lawall


On Mon, 5 Mar 2018, Arushi Singhal wrote:

> Replace printk having a log level with the appropriate
> net_*macro_ratelimited.
> It's better to use actual device name as a prefix in error messages.
>
> Signed-off-by: Arushi Singhal 
> ---
> changes in v2
> *In v1 printk was changed to pr_*macro(), which is used
> in kernel instead of calling printk() directly. And for drivers,
> dev_*macro() or net_*macro_ratelimited() should be used for calling
> printk() directly.
>
> changes in v3
> *Indentation is not changed, as line is exceeding 80 characters limit.

Put the v3 changes on top of the v2 changes, so that one can see
immediately what has changed most recently.

julia

>
>  drivers/staging/ipx/af_ipx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/ipx/af_ipx.c b/drivers/staging/ipx/af_ipx.c
> index d21a9d1..5ec6591 100644
> --- a/drivers/staging/ipx/af_ipx.c
> +++ b/drivers/staging/ipx/af_ipx.c
> @@ -744,7 +744,7 @@ static void ipxitf_discover_netnum(struct ipx_interface *intrfc,
>   intrfc->if_netnum = cb->ipx_source_net;
>   ipxitf_add_local_route(intrfc);
>   } else {
> - printk(KERN_WARNING "IPX: Network number collision "
> + net_warn_ratelimited("IPX: Network number collision "
>   "%lx\n%s %s and %s %s\n",
>   (unsigned long) ntohl(cb->ipx_source_net),
>   ipx_device_name(i),
> --
> 2.7.4
>


Re: linux-next: manual merge of the selinux tree with the net-next tree

2018-03-04 Thread Xin Long
On Mon, Mar 5, 2018 at 9:40 AM, Stephen Rothwell  wrote:
> Hi Paul,
>
> Today's linux-next merge of the selinux tree got a conflict in:
>
>   net/sctp/socket.c
>
> between several refactoring commits from the net-next tree and commit:
>
>   2277c7cd75e3 ("sctp: Add LSM hooks")
>
> from the selinux tree.
>
> I fixed it up (I think - see below) and can carry the fix as
The fixup is great!  the same as I mentioned in:
https://patchwork.ozlabs.org/patch/879898/
for net-next.git

> necessary. This is now fixed as far as linux-next is concerned, but any
> non trivial conflicts should be mentioned to your upstream maintainer
> when your tree is submitted for merging.  You may also want to consider
> cooperating with the maintainer of the conflicting tree to minimise any
> particularly complex conflicts.
[net-next,0/9] sctp: clean up sctp_sendmsg, this patchset was just applied
in net-next. So I just guess it might not yet be there when selinux tree was
being submitted.

>
> --
> Cheers,
> Stephen Rothwell
>
> diff --cc net/sctp/socket.c
> index 7fa76031bb08,73b34a6b5b09..
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@@ -1606,193 -1622,362 +1622,209 @@@ static int sctp_error(struct sock *sk,
>   static int sctp_msghdr_parse(const struct msghdr *msg,
>  struct sctp_cmsgs *cmsgs);
>
>  -static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
>  +static int sctp_sendmsg_parse(struct sock *sk, struct sctp_cmsgs *cmsgs,
>  +struct sctp_sndrcvinfo *srinfo,
>  +const struct msghdr *msg, size_t msg_len)
>   {
>  -  struct net *net = sock_net(sk);
>  -  struct sctp_sock *sp;
>  -  struct sctp_endpoint *ep;
>  -  struct sctp_association *new_asoc = NULL, *asoc = NULL;
>  -  struct sctp_transport *transport, *chunk_tp;
>  -  struct sctp_chunk *chunk;
>  -  union sctp_addr to;
>  -  struct sctp_af *af;
>  -  struct sockaddr *msg_name = NULL;
>  -  struct sctp_sndrcvinfo default_sinfo;
>  -  struct sctp_sndrcvinfo *sinfo;
>  -  struct sctp_initmsg *sinit;
>  -  sctp_assoc_t associd = 0;
>  -  struct sctp_cmsgs cmsgs = { NULL };
>  -  enum sctp_scope scope;
>  -  bool fill_sinfo_ttl = false, wait_connect = false;
>  -  struct sctp_datamsg *datamsg;
>  -  int msg_flags = msg->msg_flags;
>  -  __u16 sinfo_flags = 0;
>  -  long timeo;
>  +  __u16 sflags;
> int err;
>
>  -  err = 0;
>  -  sp = sctp_sk(sk);
>  -  ep = sp->ep;
>  -
>  -  pr_debug("%s: sk:%p, msg:%p, msg_len:%zu ep:%p\n", __func__, sk,
>  -   msg, msg_len, ep);
>  +  if (sctp_sstate(sk, LISTENING) && sctp_style(sk, TCP))
>  +  return -EPIPE;
>
>  -  /* We cannot send a message over a TCP-style listening socket. */
>  -  if (sctp_style(sk, TCP) && sctp_sstate(sk, LISTENING)) {
>  -  err = -EPIPE;
>  -  goto out_nounlock;
>  -  }
>  +  if (msg_len > sk->sk_sndbuf)
>  +  return -EMSGSIZE;
>
>  -  /* Parse out the SCTP CMSGs.  */
>  -  err = sctp_msghdr_parse(msg, &cmsgs);
>  +  memset(cmsgs, 0, sizeof(*cmsgs));
>  +  err = sctp_msghdr_parse(msg, cmsgs);
> if (err) {
> pr_debug("%s: msghdr parse err:%x\n", __func__, err);
>  -  goto out_nounlock;
>  +  return err;
> }
>
>  -  /* Fetch the destination address for this packet.  This
>  -   * address only selects the association--it is not necessarily
>  -   * the address we will send to.
>  -   * For a peeled-off socket, msg_name is ignored.
>  -   */
>  -  if (!sctp_style(sk, UDP_HIGH_BANDWIDTH) && msg->msg_name) {
>  -  int msg_namelen = msg->msg_namelen;
>  -
>  -  err = sctp_verify_addr(sk, (union sctp_addr *)msg->msg_name,
>  - msg_namelen);
>  -  if (err)
>  -  return err;
>  -
>  -  if (msg_namelen > sizeof(to))
>  -  msg_namelen = sizeof(to);
>  -  memcpy(&to, msg->msg_name, msg_namelen);
>  -  msg_name = msg->msg_name;
>  +  memset(srinfo, 0, sizeof(*srinfo));
>  +  if (cmsgs->srinfo) {
>  +  srinfo->sinfo_stream = cmsgs->srinfo->sinfo_stream;
>  +  srinfo->sinfo_flags = cmsgs->srinfo->sinfo_flags;
>  +  srinfo->sinfo_ppid = cmsgs->srinfo->sinfo_ppid;
>  +  srinfo->sinfo_context = cmsgs->srinfo->sinfo_context;
>  +  srinfo->sinfo_assoc_id = cmsgs->srinfo->sinfo_assoc_id;
>  +  srinfo->sinfo_timetolive = cmsgs->srinfo->sinfo_timetolive;
> }
>
>  -  sinit = cmsgs.init;
>  -  if (cmsgs.sinfo != NULL) {
>  -  memset(&default_sinfo, 0, sizeof(default_sinfo));
>  -  default_sinfo.sinfo_stream = cmsgs.sinfo->snd_sid;
>  -  

Re: [PATCH v3] staging: ipx: Replace printk() with appropriate net_*macro_ratelimited()

2018-03-04 Thread Greg KH
On Mon, Mar 05, 2018 at 09:47:40AM +0530, Arushi Singhal wrote:
> Replace printk having a log level with the appropriate
> net_*macro_ratelimited.
> It's better to use actual device name as a prefix in error messages.
> 
> Signed-off-by: Arushi Singhal 
> ---
> changes in v2
> *In v1 printk was changed to pr_*macro(), which is used
> in kernel instead of calling printk() directly. And for drivers,
> dev_*macro() or net_*macro_ratelimited() should be used for calling
> printk() directly.
> 
> changes in v3
> *Indentation is not changed, as line is exceeding 80 characters limit.
> 
>  drivers/staging/ipx/af_ipx.c | 2 +-

Did you read drivers/staging/ipx/TODO?

Please go do so.

sorry,

greg k-h


[PATCH] net/mlx4_en: fix potential use-after-free with dma_unmap_page

2018-03-04 Thread Sarah Newman
Take an additional reference to a page whenever it is placed
into the rx ring and put the page again after running
dma_unmap_page.

When swiotlb is in use, calling dma_unmap_page means that
the original page mapped with dma_map_page must still be valid,
as swiotlb will copy data from its internal cache back to the
originally requested DMA location.

When GRO is enabled, before this patch all references to the
original frag may be put and the page freed before dma_unmap_page
in mlx4_en_free_frag is called.

It is possible there is a path where the use-after-free occurs
even with GRO disabled, but this has not been observed so far.
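
The reference-counting argument above can be modelled in userspace. The sketch below is only an illustration under stated assumptions: model_page and its helpers are hypothetical names, not mlx4 or kernel APIs, and the "unmap" step simply records whether the page had already been freed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a DMA-mapped page; not the kernel's struct page. */
struct model_page {
	int refcount;
	bool ring_ref;		/* ring holds an extra reference until unmap */
	bool freed;
	bool unmap_after_free;	/* set when unmap runs on a freed page */
};

/* Drop one reference; the page is freed when the count reaches zero. */
static void model_put(struct model_page *p)
{
	if (--p->refcount == 0)
		p->freed = true;
}

/* Map a page that will be split into nfrags fragments, one ref per frag. */
static void model_alloc(struct model_page *p, int nfrags, bool ring_ref)
{
	p->refcount = nfrags + (ring_ref ? 1 : 0);
	p->ring_ref = ring_ref;
	p->freed = false;
	p->unmap_after_free = false;
}

/* Unmap the page; with swiotlb this step still touches the original page. */
static void model_unmap(struct model_page *p)
{
	if (p->freed)
		p->unmap_after_free = true;
	if (p->ring_ref)
		model_put(p);	/* release the ring's reference after unmap */
}
```

With ring_ref false, dropping every fragment reference before model_unmap() sets unmap_after_free, which corresponds to the failure described. With ring_ref true the page survives until the unmap has finished.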

The bug can be trivially detected by doing the following:

* Compile the kernel with DEBUG_PAGEALLOC
* Run the kernel as a Xen Dom0
* Leave GRO enabled on the interface
* Run a 10 second or more test with iperf over the interface.

This bug was likely introduced in
commit 4cce66cdd14a ("mlx4_en: map entire pages to increase throughput"),
first part of v3.6.

It was incidentally fixed in
commit 34db548bfb95 ("mlx4: add page recycling in receive path"),
first part of v4.12.

This version applies to the v4.9 series.

Signed-off-by: Sarah Newman 
Tested-by: Sarah Newman 
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c   | 39 +---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c   |  3 ++-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  1 +
 3 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index bcbb80f..d1fb087 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -80,10 +80,14 @@ static int mlx4_alloc_pages(struct mlx4_en_priv *priv,
page_alloc->page = page;
page_alloc->dma = dma;
page_alloc->page_offset = 0;
+   page_alloc->page_owner = true;
/* Not doing get_page() for each frag is a big win
 * on asymetric workloads. Note we can not use atomic_set().
 */
-   page_ref_add(page, page_alloc->page_size / frag_info->frag_stride - 1);
+   /* Since the page must be valid until after dma_unmap_page is called,
+* take an additional reference we would not have otherwise.
+*/
+   page_ref_add(page, page_alloc->page_size / frag_info->frag_stride);
return 0;
 }
 
@@ -105,9 +109,13 @@ static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
page_alloc[i].page_offset += frag_info->frag_stride;
 
if (page_alloc[i].page_offset + frag_info->frag_stride <=
-   ring_alloc[i].page_size)
-   continue;
-
+   ring_alloc[i].page_size) {
+   WARN_ON(!page_alloc[i].page);
+   WARN_ON(!page_alloc[i].page_owner);
+   if (likely(page_alloc[i].page &&
+  page_alloc[i].page_owner))
+   continue;
+   }
if (unlikely(mlx4_alloc_pages(priv, &page_alloc[i],
  frag_info, gfp)))
goto out;
@@ -131,7 +139,7 @@ static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
page = page_alloc[i].page;
/* Revert changes done by mlx4_alloc_pages */
page_ref_sub(page, page_alloc[i].page_size /
-  priv->frag_info[i].frag_stride - 1);
+  priv->frag_info[i].frag_stride);
put_page(page);
}
}
@@ -146,11 +154,13 @@ static void mlx4_en_free_frag(struct mlx4_en_priv *priv,
u32 next_frag_end = frags[i].page_offset + 2 * frag_info->frag_stride;
 
 
-   if (next_frag_end > frags[i].page_size)
+   if (next_frag_end > frags[i].page_size) {
dma_unmap_page(priv->ddev, frags[i].dma, frags[i].page_size,
   frag_info->dma_dir);
+   put_page(frags[i].page);
+   }
 
-   if (frags[i].page)
+   if (frags[i].page_owner)
put_page(frags[i].page);
 }
 
@@ -184,9 +194,10 @@ static int mlx4_en_init_allocator(struct mlx4_en_priv *priv,
page = page_alloc->page;
/* Revert changes done by mlx4_alloc_pages */
page_ref_sub(page, page_alloc->page_size /
-  priv->frag_info[i].frag_stride - 1);
+  priv->frag_info[i].frag_stride);
put_page(page);
page_alloc->page = NULL;
+   page_alloc->page_owner = false;
}
return -ENOMEM;
 }
@@ -206,12 +217,14 @@ static void mlx4_en_destroy_allocator(struct mlx4_en_priv *priv,
 
dma_unmap_page(priv->ddev, page_alloc->dma,

[PATCH v3] staging: ipx: Replace printk() with appropriate net_*macro_ratelimited()

2018-03-04 Thread Arushi Singhal
Replace printk having a log level with the appropriate
net_*macro_ratelimited.
It's better to use actual device name as a prefix in error messages.

Signed-off-by: Arushi Singhal 
---
changes in v2
*In v1 printk was changed to pr_*macro(), which is used
in kernel instead of calling printk() directly. And for drivers,
dev_*macro() or net_*macro_ratelimited() should be used for calling
printk() directly.

changes in v3
*Indentation is not changed, as line is exceeding 80 characters limit.

 drivers/staging/ipx/af_ipx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/ipx/af_ipx.c b/drivers/staging/ipx/af_ipx.c
index d21a9d1..5ec6591 100644
--- a/drivers/staging/ipx/af_ipx.c
+++ b/drivers/staging/ipx/af_ipx.c
@@ -744,7 +744,7 @@ static void ipxitf_discover_netnum(struct ipx_interface *intrfc,
intrfc->if_netnum = cb->ipx_source_net;
ipxitf_add_local_route(intrfc);
} else {
-   printk(KERN_WARNING "IPX: Network number collision "
+   net_warn_ratelimited("IPX: Network number collision "
"%lx\n%s %s and %s %s\n",
(unsigned long) ntohl(cb->ipx_source_net),
ipx_device_name(i),
-- 
2.7.4
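
As background for the printk() to net_warn_ratelimited() change above: the ratelimited macros only emit a message when a rate limiter allows it. The following is a simplified userspace model of an interval/burst limiter, written as a sketch. The struct and function names are hypothetical and the logic is deliberately smaller than the kernel's real limiter.

```c
#include <stdbool.h>

/* Hypothetical userspace model; field names are illustrative only. */
struct ratelimit {
	long interval;	/* window length, in arbitrary time units */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages let through in this window */
	int missed;	/* messages suppressed in this window */
};

/* Return true if a message may be emitted at time `now`. */
static bool ratelimit_allow(struct ratelimit *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		/* Window expired: start a new one and reset the counters. */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}
```

A caller would wrap its logging call in `if (ratelimit_allow(&rs, now))`, which is the shape the ratelimited macros hide. A repeated "network number collision" then cannot flood the log.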



Re: [RFC PATCH V1 00/12] audit: implement container id

2018-03-04 Thread Richard Guy Briggs
On 2018-03-04 16:55, Mimi Zohar wrote:
> On Thu, 2018-03-01 at 14:41 -0500, Richard Guy Briggs wrote:
> > Implement audit kernel container ID.
> > 
> > This patchset is a preliminary RFC based on the proposal document (V3)
> > posted:
> > https://www.redhat.com/archives/linux-audit/2018-January/msg00014.html
> > 
> > The first patch implements the proc fs write to set the audit container
> > ID of a process, emitting an AUDIT_CONTAINER record.
> > 
> > The second implements an auxiliary syscall record AUDIT_CONTAINER_INFO
> > if a container ID is present on a task.
> > 
> > The third adds filtering to the exit, exclude and user lists.
> > 
> > The 4th, implements reading the container ID from the proc filesystem
> > for debugging.  This isn't planned for upstream inclusion.
> > 
> > The 5th adds signal and ptrace support.
> > 
> > The 6th attempts to create a local audit context to be able to bind a
> > standalone record with the container ID record.
> > 
> > The 7th, 8th, 9th, 10th patches add container ID records to standalone
> > records.  Some of these may end up being syscall auxiliary records and
> > won't need this specific support since they'll be supported via
> > syscalls.
> > 
> > The 11th is a temporary workaround due to the AUDIT_CONTAINER records
> > not showing up as do AUDIT_LOGIN records.  I suspect this is due to its
> > range (1000 vs 1300), but the intent is to solve it.
> > 
> > The 12th adds debug information not intended for upstream for those
> > brave souls wanting to tinker with it in this early state.
> > 
> > Feedback please!
> 
> Which tree can this patch set be applied to?

git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git next

> Mimi
> 
> > Here's a quick and dirty test script:
> > echo 123455 > /proc/$$/containerid; echo $?
> > sleep 4&  
> > child=$!; sleep 1
> > echo 18446744073709551615 > /proc/$child/containerid; echo $?
> > echo 123456 > /proc/$child/containerid; echo $?
> > echo 123457 > /proc/$child/containerid; echo $?
> > sleep 1
> > ausearch -ts recent |grep " contid=18446744073709551615"; echo $?
> > ausearch -ts recent |grep " contid=123456"; echo $?
> > ausearch -ts recent |grep " contid=123457"; echo $?
> > echo self:$$ contid:$( cat /proc/$$/containerid)
> > echo child:$child contid:$( cat /proc/$child/containerid)
> > 
> > containerid=123458
> > key=tmpcontainerid
> > auditctl -a exit,always -F dir=/tmp -F perm=wa -F containerid=$containerid -F key=$key || echo failed to add containerid filter rule
> > bash -c "sleep 1; echo test > /tmp/$key"&
> > child=$!
> > echo $containerid > /proc/$child/containerid
> > sleep 2
> > rm -f /tmp/$key
> > ausearch -ts recent -k $key || echo failed to find CONTAINER_INFO record
> > auditctl -d exit,always -F dir=/tmp -F perm=wa -F containerid=$containerid -F key=$key || echo failed to delete containerid filter rule
> > 
> > See:
> > https://github.com/linux-audit/audit-kernel/issues/32
> > https://github.com/linux-audit/audit-userspace/issues/40
> > https://github.com/linux-audit/audit-testsuite/issues/64
> > 
> > Richard Guy Briggs (12):
> >   audit: add container id
> >   audit: log container info of syscalls
> >   audit: add containerid filtering
> >   audit: read container ID of a process
> >   audit: add containerid support for ptrace and signals
> >   audit: add support for non-syscall auxiliary records
> >   audit: add container aux record to watch/tree/mark
> >   audit: add containerid support for tty_audit
> >   audit: add containerid support for config/feature/user records
> >   audit: add containerid support for seccomp and anom_abend records
> >   debug audit: add container id
> >   debug! audit: add container id
> > 
> >  drivers/tty/tty_audit.c|   5 +-
> >  fs/proc/base.c |  63 +++
> >  include/linux/audit.h  |  36 +++
> >  include/linux/init_task.h  |   4 +-
> >  include/linux/sched.h  |   1 +
> >  include/uapi/linux/audit.h |   9 ++-
> >  kernel/audit.c |  74 +++---
> >  kernel/audit.h |   3 +
> >  kernel/audit_fsnotify.c|   5 +-
> >  kernel/audit_tree.c|   5 +-
> >  kernel/audit_watch.c   |  33 +-
> >  kernel/auditfilter.c   |  52 ++-
> >  kernel/auditsc.c   | 154 
> > +++--
> >  13 files changed, 408 insertions(+), 36 deletions(-)
> > 
> 

- RGB

--
Richard Guy Briggs 
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635


[GIT] Networking

2018-03-04 Thread David Miller

1) Use an appropriate TSQ pacing shift in mac80211, from Toke Høiland-Jørgensen.

2) Just like ipv4's ip_route_me_harder(), we have to use skb_to_full_sk
   in ip6_route_me_harder, from Eric Dumazet.

3) Fix several shutdown races and similar other problems in l2tp, from
   James Chapman.

4) Handle missing XDP flush properly in tuntap, for real this time.
   From Jason Wang.

5) Out-of-bounds access in powerpc ebpf tailcalls, from Daniel Borkmann.

6) Fix phy_resume() locking, from Andrew Lunn.

7) IFLA_MTU values are ignored on newlink for some tunnel types, fix
   from Xin Long.

8) Revert F-RTO middle box workarounds, they only handle one dimension
   of the problem.  From Yuchung Cheng.

9) Fix socket refcounting in RDS, from Ka-Cheong Poon.

10) Don't allow ppp unit registration to an unregistered channel, from
Guillaume Nault.

11) Various hv_netvsc fixes from Stephen Hemminger.

Please pull, thanks a lot!

The following changes since commit 9cb9c07d6b0c5fd97d83b8ab14d7e308ba4b612f:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2018-02-23 15:14:17 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 

for you to fetch changes up to a7f0fb1bfb66ded5d556d6723d691b77a7146b6f:

  Merge branch 'hv_netvsc-minor-fixes' (2018-03-04 22:18:21 -0500)


Andrew Lunn (1):
  net: phy: Restore phy_resume() locking assumption

Arkadi Sharshevsky (2):
  devlink: Compare to size_new in case of resource child validation
  devlink: Fix resource coverity errors

Arnd Bergmann (1):
  net: ipv4: avoid unused variable warning for sysctl

Bassem Boubaker (1):
  cdc_ether: flag the Cinterion PLS8 modem by gemalto as WWAN

Boris Pismenny (1):
  tls: Use correct sk->sk_prot for IPV6

Claudiu Manoil (1):
  gianfar: Fix Rx byte accounting for ndev stats

Daniel Axtens (4):
  net: rename skb_gso_validate_mtu -> skb_gso_validate_network_len
  net: sched: tbf: handle GSO_BY_FRAGS case in enqueue
  net: xfrm: use skb_gso_validate_network_len() to check gso sizes
  net: make skb_gso_*_seglen functions private

Daniel Borkmann (2):
  bpf: allow xadd only on aligned memory
  bpf, ppc64: fix out of bounds access in tail call

David S. Miller (14):
  Merge branch 'l2tp-fix-API-races-discovered-by-syzbot'
  ARM: orion5x: Revert commit 4904dbda41c8.
  Merge branch 'for-upstream' of git://git.kernel.org/.../bluetooth/bluetooth
  Merge branch 'tunnel-mtu-fixes'
  Merge branch 's390-qeth-fixes'
  Merge branch 'tcp-revert-a-F-RTO-extension-due-to-broken-middle-boxes'
  Merge branch 'net-smc-fixes'
  Merge branch 'mlxsw-fixes'
  Merge git://git.kernel.org/.../bpf/bpf
  Merge tag 'mac80211-for-davem-2018-03-02' of git://git.kernel.org/.../jberg/mac80211
  Merge git://git.kernel.org/.../pablo/nf
  Merge tag 'batadv-net-for-davem-20180302' of git://git.open-mesh.org/linux-merge
  Merge branch 'GSO_BY_FRAGS-correctness-improvements'
  Merge branch 'hv_netvsc-minor-fixes'

Davide Caratti (2):
  net/smc: fix NULL pointer dereference on sock_create_kern() error path
  tc-testing: skbmod: fix match value of ethertype

Denis Du (1):
  hdlc_ppp: carrier detect ok, don't turn off negotiation

Edward Cree (1):
  net: ethtool: don't ignore return from driver get_fecparam method

Emil Tantilov (1):
  ixgbe: fix crash in build_skb Rx code path

Eric Dumazet (4):
  netfilter: use skb_to_full_sk in ip6_route_me_harder
  test_bpf: add a schedule point
  r8152: fix tx packets accounting
  test_bpf: reduce MAX_TESTRUNS

Felix Fietkau (2):
  mac80211: drop frames with unexpected DS bits from fast-rx to slow path
  netfilter: nf_flow_table: fix checksum when handling DNAT

Florian Westphal (7):
  netfilter: ipt_CLUSTERIP: put config struct if we can't increment ct refcount
  netfilter: ipt_CLUSTERIP: put config instead of freeing it
  netfilter: ipv6: fix use-after-free Write in nf_nat_ipv6_manip_pkt
  netfilter: bridge: ebt_among: add missing match size checks
  netfilter: ebtables: convert BUG_ONs to WARN_ONs
  netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets
  netfilter: don't set F_IFACE on ipv6 fib lookups

Guillaume Nault (1):
  ppp: prevent unregistered channels from connecting to PPP units

Hans de Goede (1):
  Bluetooth: btusb: Use DMI matching for QCA reset_resume quirking

Ido Schimmel (3):
  bridge: Fix VLAN reference count problem
  mlxsw: spectrum: Treat IPv6 unregistered multicast as broadcast
  spectrum: Reference count VLAN entries

James Chapman (5):
  l2tp: don't use inet_shutdown on tunnel destroy
  l2tp: don't use inet_shutdown on ppp session destroy
  l2tp: fix races with tunnel socket close
  l2tp: fix race in pppol2tp_release with session object destroy
  l2tp: 

Re: [PATCH PATCH net v2 0/9] hv_netvsc: minor fixes

2018-03-04 Thread David Miller
From: Stephen Hemminger 
Date: Fri,  2 Mar 2018 13:49:00 -0800

> These are improvements to netvsc driver. They aren't functionality
> changes so not targeting net-next; and they are not show stopper
> bugs that need to go to stable either.
> 
> v2
>- drop the irq flags patch, defer it to net-next
>- split the multicast filter flag patch out
>- change propagate rx mode patch to handle startup of vf

Series applied, thanks Stephen.


Re: [PATCH net V2] virtio-net: re enable XDP_REDIRECT for mergeable buffer

2018-03-04 Thread David Miller
From: Jason Wang 
Date: Mon, 5 Mar 2018 10:43:41 +0800

> 
> 
> On 2018-03-05 07:38, David Miller wrote:
>> From: Jason Wang 
>> Date: Fri,  2 Mar 2018 17:29:14 +0800
>>
>>> XDP_REDIRECT support for mergeable buffer was removed since commit
>>> 7324f5399b06 ("virtio_net: disable XDP_REDIRECT in receive_mergeable()
>>> case"). This is because we don't reserve enough tailroom for struct
>>> skb_shared_info which breaks XDP assumption. So this patch fixes this
>>> by reserving enough tailroom and using fixed size of rx buffer.
>>>
>>> Signed-off-by: Jason Wang 
>>> ---
>>> Changes from V1:
>>> - do not add duplicated tracepoint when redirection fails
>> Applied to net-next, thanks Jason.
> 
> Hi David,
> 
> Consider the change is not large, any chance to make it for -net to
> keep XDP redirection work?

Ok, I'll apply this to 'net' too.
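
The patch description quoted above turns on a size calculation: a fixed-size rx buffer has to leave room at the end for struct skb_shared_info. A rough sketch of that arithmetic follows. The helper name and every number used with it (buffer size, headroom, structure size, alignment) are assumptions for illustration, not values taken from the virtio-net driver.

```c
/* Round x up to the next multiple of a (a must be non-zero). */
#define ALIGN_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/*
 * Hypothetical helper: bytes left for packet data in a fixed-size buffer
 * once headroom and an aligned tail area have been reserved.
 * Returns 0 if the buffer cannot hold both reservations.
 */
static unsigned int max_payload(unsigned int buf_size,
				unsigned int headroom,
				unsigned int shinfo_size,
				unsigned int align)
{
	unsigned int tailroom = ALIGN_UP(shinfo_size, align);

	if (buf_size < headroom + tailroom)
		return 0;
	return buf_size - headroom - tailroom;
}
```

For example, with an assumed 4096-byte buffer, 256 bytes of headroom and a 320-byte tail structure aligned to 64 bytes, max_payload(4096, 256, 320, 64) leaves 3520 bytes for data. A buffer sized without the tail reservation would let the packet overlap the area that the redirect path expects to be free.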


Re: [PATCH net V2] virtio-net: re enable XDP_REDIRECT for mergeable buffer

2018-03-04 Thread Jason Wang



On 2018-03-05 07:38, David Miller wrote:
> From: Jason Wang
> Date: Fri,  2 Mar 2018 17:29:14 +0800
>
>> XDP_REDIRECT support for mergeable buffer was removed since commit
>> 7324f5399b06 ("virtio_net: disable XDP_REDIRECT in receive_mergeable()
>> case"). This is because we don't reserve enough tailroom for struct
>> skb_shared_info which breaks XDP assumption. So this patch fixes this
>> by reserving enough tailroom and using fixed size of rx buffer.
>>
>> Signed-off-by: Jason Wang
>> ---
>> Changes from V1:
>> - do not add duplicated tracepoint when redirection fails
>
> Applied to net-next, thanks Jason.

Hi David,

Consider the change is not large, any chance to make it for -net to keep
XDP redirection work?

Thanks



Re: [PATCH net V2] virtio-net: re enable XDP_REDIRECT for mergeable buffer

2018-03-04 Thread Jason Wang



On 2018-03-03 01:36, Michael S. Tsirkin wrote:
> On Fri, Mar 02, 2018 at 05:29:14PM +0800, Jason Wang wrote:
>> XDP_REDIRECT support for mergeable buffer was removed since commit
>> 7324f5399b06 ("virtio_net: disable XDP_REDIRECT in receive_mergeable()
>> case"). This is because we don't reserve enough tailroom for struct
>> skb_shared_info which breaks XDP assumption. So this patch fixes this
>> by reserving enough tailroom and using fixed size of rx buffer.
>>
>> Signed-off-by: Jason Wang
>
> Acked-by: Michael S. Tsirkin
>
> I think the next incremental step is to look at splitting
> out fast path XDP processing to a separate set of functions.

Let me try (probably after 1.1 stuffs).

Thanks


Re: [PATCH net V2] virtio-net: re enable XDP_REDIRECT for mergeable buffer

2018-03-04 Thread Jason Wang



On 2018-03-03 00:07, Jesper Dangaard Brouer wrote:
> On Fri,  2 Mar 2018 17:29:14 +0800
> Jason Wang wrote:
>
>> XDP_REDIRECT support for mergeable buffer was removed since commit
>> 7324f5399b06 ("virtio_net: disable XDP_REDIRECT in receive_mergeable()
>> case"). This is because we don't reserve enough tailroom for struct
>> skb_shared_info which breaks XDP assumption. So this patch fixes this
>> by reserving enough tailroom and using fixed size of rx buffer.
>>
>> Signed-off-by: Jason Wang
>> ---
>> Changes from V1:
>> - do not add duplicated tracepoint when redirection fails
>
> Acked-by: Jesper Dangaard Brouer
>
> I gave it a quick spin on my testlab, and cpumap seems to
> work/not-crash now (if I managed to turn back config to
> receive_mergeable() correctly ;-)).

Thanks for the testing and reviewing.



Re: [PATCH v2] net: ethernet: Drop unnecessary continue

2018-03-04 Thread Jakub Kicinski
On Sat, 3 Mar 2018 21:44:56 +0530, Arushi Singhal wrote:
> diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
> index 15fa47f..5cd4f3f 100644
> --- a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
> +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
> @@ -258,10 +258,8 @@ nfp_net_pf_alloc_vnics(struct nfp_pf *pf, void __iomem *ctrl_bar,
>   ctrl_bar += NFP_PF_CSR_SLICE_SIZE;
>  
>   /* Kill the vNIC if app init marked it as invalid */
> - if (nn->port && nn->port->type == NFP_PORT_INVALID) {
> + if (nn->port && nn->port->type == NFP_PORT_INVALID)
>   nfp_net_pf_free_vnic(pf, nn);
> - continue;
> - }

This is an error handling path so the continue makes sense here to
indicate the processing can't ever fall through if more statements are
ever added to the loop.  But OK.

>   }
>  
>   if (list_empty(&pf->vnics))
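
The review comment above, that an explicit continue documents "nothing below may run for this entry", can be shown with a small stand-alone loop. This is a hypothetical sketch; struct item and configure_all() are made-up names and are not part of the nfp driver.

```c
#include <stddef.h>

/* Hypothetical element type standing in for a per-port object. */
struct item {
	int valid;
	int configured;
};

/* Configure every valid item and return how many were configured. */
static int configure_all(struct item *items, size_t n)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		if (!items[i].valid) {
			/*
			 * Error path: the entry is unusable, so skip the rest
			 * of the loop body even if statements are added later.
			 */
			continue;
		}

		/* Work added after the check stays guarded by the continue. */
		items[i].configured = 1;
		count++;
	}
	return count;
}
```

Dropping the continue is harmless only while the check is the last statement in the loop body. Once later work is appended, the invalid entry would fall through into it.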


linux-next: manual merge of the selinux tree with the net-next tree

2018-03-04 Thread Stephen Rothwell
Hi Paul,

Today's linux-next merge of the selinux tree got a conflict in:

  net/sctp/socket.c

between several refactoring commits from the net-next tree and commit:

  2277c7cd75e3 ("sctp: Add LSM hooks")

from the selinux tree.

I fixed it up (I think - see below) and can carry the fix as
necessary. This is now fixed as far as linux-next is concerned, but any
non trivial conflicts should be mentioned to your upstream maintainer
when your tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc net/sctp/socket.c
index 7fa76031bb08,73b34a6b5b09..
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@@ -1606,193 -1622,362 +1622,209 @@@ static int sctp_error(struct sock *sk, 
  static int sctp_msghdr_parse(const struct msghdr *msg,
 struct sctp_cmsgs *cmsgs);
  
 -static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
 +static int sctp_sendmsg_parse(struct sock *sk, struct sctp_cmsgs *cmsgs,
 +struct sctp_sndrcvinfo *srinfo,
 +const struct msghdr *msg, size_t msg_len)
  {
 -  struct net *net = sock_net(sk);
 -  struct sctp_sock *sp;
 -  struct sctp_endpoint *ep;
 -  struct sctp_association *new_asoc = NULL, *asoc = NULL;
 -  struct sctp_transport *transport, *chunk_tp;
 -  struct sctp_chunk *chunk;
 -  union sctp_addr to;
 -  struct sctp_af *af;
 -  struct sockaddr *msg_name = NULL;
 -  struct sctp_sndrcvinfo default_sinfo;
 -  struct sctp_sndrcvinfo *sinfo;
 -  struct sctp_initmsg *sinit;
 -  sctp_assoc_t associd = 0;
 -  struct sctp_cmsgs cmsgs = { NULL };
 -  enum sctp_scope scope;
 -  bool fill_sinfo_ttl = false, wait_connect = false;
 -  struct sctp_datamsg *datamsg;
 -  int msg_flags = msg->msg_flags;
 -  __u16 sinfo_flags = 0;
 -  long timeo;
 +  __u16 sflags;
int err;
  
 -  err = 0;
 -  sp = sctp_sk(sk);
 -  ep = sp->ep;
 -
 -  pr_debug("%s: sk:%p, msg:%p, msg_len:%zu ep:%p\n", __func__, sk,
 -   msg, msg_len, ep);
 +  if (sctp_sstate(sk, LISTENING) && sctp_style(sk, TCP))
 +  return -EPIPE;
  
 -  /* We cannot send a message over a TCP-style listening socket. */
 -  if (sctp_style(sk, TCP) && sctp_sstate(sk, LISTENING)) {
 -  err = -EPIPE;
 -  goto out_nounlock;
 -  }
 +  if (msg_len > sk->sk_sndbuf)
 +  return -EMSGSIZE;
  
 -  /* Parse out the SCTP CMSGs.  */
 -  err = sctp_msghdr_parse(msg, &cmsgs);
 +  memset(cmsgs, 0, sizeof(*cmsgs));
 +  err = sctp_msghdr_parse(msg, cmsgs);
if (err) {
pr_debug("%s: msghdr parse err:%x\n", __func__, err);
 -  goto out_nounlock;
 +  return err;
}
  
 -  /* Fetch the destination address for this packet.  This
 -   * address only selects the association--it is not necessarily
 -   * the address we will send to.
 -   * For a peeled-off socket, msg_name is ignored.
 -   */
 -  if (!sctp_style(sk, UDP_HIGH_BANDWIDTH) && msg->msg_name) {
 -  int msg_namelen = msg->msg_namelen;
 -
 -  err = sctp_verify_addr(sk, (union sctp_addr *)msg->msg_name,
 - msg_namelen);
 -  if (err)
 -  return err;
 -
 -  if (msg_namelen > sizeof(to))
 -  msg_namelen = sizeof(to);
 -  memcpy(&to, msg->msg_name, msg_namelen);
 -  msg_name = msg->msg_name;
 +  memset(srinfo, 0, sizeof(*srinfo));
 +  if (cmsgs->srinfo) {
 +  srinfo->sinfo_stream = cmsgs->srinfo->sinfo_stream;
 +  srinfo->sinfo_flags = cmsgs->srinfo->sinfo_flags;
 +  srinfo->sinfo_ppid = cmsgs->srinfo->sinfo_ppid;
 +  srinfo->sinfo_context = cmsgs->srinfo->sinfo_context;
 +  srinfo->sinfo_assoc_id = cmsgs->srinfo->sinfo_assoc_id;
 +  srinfo->sinfo_timetolive = cmsgs->srinfo->sinfo_timetolive;
}
  
 -  sinit = cmsgs.init;
 -  if (cmsgs.sinfo != NULL) {
 -  memset(&default_sinfo, 0, sizeof(default_sinfo));
 -  default_sinfo.sinfo_stream = cmsgs.sinfo->snd_sid;
 -  default_sinfo.sinfo_flags = cmsgs.sinfo->snd_flags;
 -  default_sinfo.sinfo_ppid = cmsgs.sinfo->snd_ppid;
 -  default_sinfo.sinfo_context = cmsgs.sinfo->snd_context;
 -  default_sinfo.sinfo_assoc_id = cmsgs.sinfo->snd_assoc_id;
 -
 -  sinfo = &default_sinfo;
 -  fill_sinfo_ttl = true;
 -  } else {
 -  sinfo = cmsgs.srinfo;
 -  }
 -  /* Did the user specify SNDINFO/SNDRCVINFO? */
 -  if (sinfo) {
 -  sinfo_flags = sinfo->sinfo_flags;
 -  associd = sinfo->sinfo_assoc_id;
 +   

[PATCH v2 net-next] selftests: forwarding: Add suppport to create veth interfaces

2018-03-04 Thread David Ahern
For tests using veth interfaces, the test infrastructure can create
the netdevs if they do not exist. Arguably this is a preferred approach
since the tests require p$N and p$(N+1) to be pairs.

Signed-off-by: David Ahern 
---
v2
- local on j declaration and line wrap

 .../net/forwarding/forwarding.config.sample|  5 
 tools/testing/selftests/net/forwarding/lib.sh  | 35 ++
 2 files changed, 40 insertions(+)

diff --git a/tools/testing/selftests/net/forwarding/forwarding.config.sample b/tools/testing/selftests/net/forwarding/forwarding.config.sample
index ab235c124f20..df54c9eb5100 100644
--- a/tools/testing/selftests/net/forwarding/forwarding.config.sample
+++ b/tools/testing/selftests/net/forwarding/forwarding.config.sample
@@ -14,6 +14,11 @@ NETIFS[p6]=veth5
 NETIFS[p7]=veth6
 NETIFS[p8]=veth7
 
+NETIF_TYPE=veth
+
+# only virtual interfaces (veth) can be created by test infra
+#NETIF_CREATE=yes
+
 ##
 # Defines
 
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index d0af52109360..273511ef2b43 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -76,6 +76,41 @@ done
 ##
 # Network interfaces configuration
 
+create_netif_veth()
+{
+   local i
+
+   for i in $(eval echo {1..$NUM_NETIFS}); do
+   local j=$((i+1))
+
+   ip link show dev ${NETIFS[p$i]} &> /dev/null
+   if [[ $? -ne 0 ]]; then
+   ip link add ${NETIFS[p$i]} type veth \
+   peer name ${NETIFS[p$j]}
+   if [[ $? -ne 0 ]]; then
+   echo "Failed to create netif"
+   exit 1
+   fi
+   fi
+   i=$j
+   done
+}
+
+create_netif()
+{
+   case "$NETIF_TYPE" in
+   veth) create_netif_veth
+ ;;
+   *) echo "Can not create interfaces of type \'$NETIF_TYPE\'"
+  exit 1
+  ;;
+   esac
+}
+
+if [[ "$NETIF_CREATE" = "yes" ]]; then
+   create_netif
+fi
+
 for i in $(eval echo {1..$NUM_NETIFS}); do
ip link show dev ${NETIFS[p$i]} &> /dev/null
if [[ $? -ne 0 ]]; then
-- 
2.11.0
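
The pairing the patch relies on (p$N and p$(N+1) forming one veth pair) can be sketched as a dry run. This is an illustration only, not part of the patch: list_veth_pairs is a hypothetical helper, and the veth$((N-1)) device names are an assumption that mirrors forwarding.config.sample. The sketch walks the ports two at a time. Note that in a shell for-in loop the loop variable is reassigned from the word list on every pass, so an assignment such as i=$j at the end of the body does not by itself skip the next port.

```sh
#!/bin/sh
# Hypothetical helper: print the veth pairs that would be created
# for a given number of ports, without touching the system.
list_veth_pairs()
{
	num_netifs=$1
	i=1

	# Advance by two so each pass handles one (p$N, p$(N+1)) pair.
	while [ "$i" -lt "$num_netifs" ]; do
		j=$((i + 1))
		# Assumed naming: port p$N maps to device veth$((N-1)).
		echo "p$i=veth$((i - 1)) <-> p$j=veth$i"
		i=$((i + 2))
	done
}

list_veth_pairs 4
```

With four ports this prints one line per pair, first p1/p2 and then p3/p4.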



Re: [PATCH net-next] selftests: forwarding: Add suppport to create veth interfaces

2018-03-04 Thread David Ahern
On 3/4/18 1:14 AM, Ido Schimmel wrote:
> On Fri, Mar 02, 2018 at 08:45:53AM -0800, David Ahern wrote:
>> For tests using veth interfaces, the test infrastructure can create
>> the netdevs if they do not exist. Arguably this is a preferred approach
>> since the tests require p$N and p$(N+1) to be pairs.
>>
>> Signed-off-by: David Ahern 
> 
> [...]
> 
>> diff --git a/tools/testing/selftests/net/forwarding/lib.sh 
>> b/tools/testing/selftests/net/forwarding/lib.sh
>> index d0af52109360..2ce98c6a8c25 100644
>> --- a/tools/testing/selftests/net/forwarding/lib.sh
>> +++ b/tools/testing/selftests/net/forwarding/lib.sh
>> @@ -76,6 +76,39 @@ done
>>  
>> ##
>>  # Network interfaces configuration
>>  
>> +create_netif_veth()
>> +{
>> +local i
>> +
>> +for i in $(eval echo {1..$NUM_NETIFS}); do
>> +j=$((i+1))
> 
> local j=$((i+1)) and drop a line.

not sure how it drops a line but added the 'local' for j since it was
missing.


> 
>> +ip link show dev ${NETIFS[p$i]} &> /dev/null
>> +if [[ $? -ne 0 ]]; then
>> +ip link add ${NETIFS[p$i]} type veth peer name 
>> ${NETIFS[p$j]}
> 
> Need to break this one. FWIW, I have this in my config:

going for readability over strict line lengths. Wrapped in v2.


> 
> $ cat ~/.vim/after/ftplugin/sh.vim
> ...
> highlight OverLength ctermbg=red ctermfg=white
> match OverLength /\%81v.\+/
> 
> Cool patch! Tested on my machine.
> 
>> +if [[ $? -ne 0 ]]; then
>> +echo "Failed to create netif"
>> +exit 1
>> +fi
>> +fi
>> +i=$j
>> +done
>> +}
>> +
>> +create_netif()
>> +{
>> +case "$NETIF_TYPE" in
>> +veth) create_netif_veth
>> +  ;;
>> +*) echo "Can not create interfaces of type \'$NETIF_TYPE\'"
>> +   exit 1
>> +   ;;
>> +esac
>> +}
>> +
>> +if [[ "$NETIF_CREATE" = "yes" ]]; then
>> +create_netif
>> +fi
>> +
>>  for i in $(eval echo {1..$NUM_NETIFS}); do
>>  ip link show dev ${NETIFS[p$i]} &> /dev/null
>>  if [[ $? -ne 0 ]]; then
>> -- 
>> 2.11.0
>>



Re: [PATCH net] ipv6: Reflect MTU changes on PMTU of exceptions for MTU-less routes

2018-03-04 Thread David Ahern
On 3/4/18 4:12 PM, Stefano Brivio wrote:
> On Sat, 3 Mar 2018 12:22:36 +0100
> Stefano Brivio  wrote:
> 
>>> And please codify the above expectation as a test under
>>> tools/testing/selftests/net  
>>
>> And this, along with v2.
> 
> On a second thought: I start thinking it doesn't make much sense,
> especially given the current context of self-tests, to explicitly test
> this, because it's a rather particular corner case.
> 
> I think it would make more sense to introduce generic tests first.
> About, say, PMTU, or route exceptions, but not "tunnel causes route
> exception and administrative change doesn't affect PMTU".
> 

I would argue corner cases in particular should be documented.

From the commit message it seems like you took the time to create a test
setup using network namespaces. Throw those commands into a shell script
-- tools/testing/selftests/net/mtu.sh. It can evolve from there.
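
As a starting point, such a script might look like the skeleton below. This is a hedged sketch, not the setup from the commit message: the namespace names, device names and the run helper are all hypothetical placeholders, and the actual PMTU checks are left out. Commands are only printed unless DRY_RUN=0 is set, so the sketch can be read and exercised without root privileges.

```sh
#!/bin/sh
# Hypothetical skeleton for tools/testing/selftests/net/mtu.sh.
# All names below are placeholders chosen for illustration.

NS_A="mtu-ns-a"
NS_B="mtu-ns-b"

# Print the command instead of executing it unless DRY_RUN=0 is set.
run()
{
	if [ "${DRY_RUN:-1}" = "1" ]; then
		echo "$*"
	else
		"$@"
	fi
}

# Create two namespaces joined by one veth pair.
setup()
{
	run ip netns add "$NS_A"
	run ip netns add "$NS_B"
	run ip link add veth_a netns "$NS_A" type veth \
		peer name veth_b netns "$NS_B"
	run ip -n "$NS_A" link set veth_a up
	run ip -n "$NS_B" link set veth_b up
}

# Deleting the namespaces also removes the veth devices inside them.
cleanup()
{
	run ip netns del "$NS_A"
	run ip netns del "$NS_B"
}
```

Individual test cases would then sit between setup and cleanup and could be added one at a time.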


[PATCH net-next v2] net/ncsi: Add generic netlink family

2018-03-04 Thread Samuel Mendoza-Jonas
Add a generic netlink family for NCSI. This supports three commands;
NCSI_CMD_PKG_INFO which returns information on packages and their
associated channels, NCSI_CMD_SET_INTERFACE which allows a specific
package or package/channel combination to be set as the preferred
choice, and NCSI_CMD_CLEAR_INTERFACE which clears any preferred setting.

Signed-off-by: Samuel Mendoza-Jonas 
---
v2: Add a separate NCSI_CMD_CLEAR_INTERFACE command instead of allowing
missing attributes in NCSI_CMD_SET_INTERFACE.

 include/uapi/linux/ncsi.h | 115 +
 net/ncsi/Makefile |   2 +-
 net/ncsi/internal.h   |   3 +
 net/ncsi/ncsi-manage.c|  30 +++-
 net/ncsi/ncsi-netlink.c   | 421 ++
 net/ncsi/ncsi-netlink.h   |  20 +++
 6 files changed, 586 insertions(+), 5 deletions(-)
 create mode 100644 include/uapi/linux/ncsi.h
 create mode 100644 net/ncsi/ncsi-netlink.c
 create mode 100644 net/ncsi/ncsi-netlink.h

diff --git a/include/uapi/linux/ncsi.h b/include/uapi/linux/ncsi.h
new file mode 100644
index ..4c292ecbb748
--- /dev/null
+++ b/include/uapi/linux/ncsi.h
@@ -0,0 +1,115 @@
+/*
+ * Copyright Samuel Mendoza-Jonas, IBM Corporation 2018.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef __UAPI_NCSI_NETLINK_H__
+#define __UAPI_NCSI_NETLINK_H__
+
+/**
+ * enum ncsi_nl_commands - supported NCSI commands
+ *
+ * @NCSI_CMD_UNSPEC: unspecified command to catch errors
+ * @NCSI_CMD_PKG_INFO: list package and channel attributes. Requires
+ * NCSI_ATTR_IFINDEX. If NCSI_ATTR_PACKAGE_ID is specified returns the
+ * specific package and its channels - otherwise a dump request returns
+ * all packages and their associated channels.
+ * @NCSI_CMD_SET_INTERFACE: set preferred package and channel combination.
+ * Requires NCSI_ATTR_IFINDEX and the preferred NCSI_ATTR_PACKAGE_ID and
+ * optionally the preferred NCSI_ATTR_CHANNEL_ID.
+ * @NCSI_CMD_CLEAR_INTERFACE: clear any preferred package/channel combination.
+ * Requires NCSI_ATTR_IFINDEX.
+ * @NCSI_CMD_MAX: highest command number
+ */
+enum ncsi_nl_commands {
+   NCSI_CMD_UNSPEC,
+   NCSI_CMD_PKG_INFO,
+   NCSI_CMD_SET_INTERFACE,
+   NCSI_CMD_CLEAR_INTERFACE,
+
+   __NCSI_CMD_AFTER_LAST,
+   NCSI_CMD_MAX = __NCSI_CMD_AFTER_LAST - 1
+};
+
+/**
+ * enum ncsi_nl_attrs - General NCSI netlink attributes
+ *
+ * @NCSI_ATTR_UNSPEC: unspecified attributes to catch errors
+ * @NCSI_ATTR_IFINDEX: ifindex of network device using NCSI
+ * @NCSI_ATTR_PACKAGE_LIST: nested array of NCSI_PKG_ATTR attributes
+ * @NCSI_ATTR_PACKAGE_ID: package ID
+ * @NCSI_ATTR_CHANNEL_ID: channel ID
+ * @NCSI_ATTR_MAX: highest attribute number
+ */
+enum ncsi_nl_attrs {
+   NCSI_ATTR_UNSPEC,
+   NCSI_ATTR_IFINDEX,
+   NCSI_ATTR_PACKAGE_LIST,
+   NCSI_ATTR_PACKAGE_ID,
+   NCSI_ATTR_CHANNEL_ID,
+
+   __NCSI_ATTR_AFTER_LAST,
+   NCSI_ATTR_MAX = __NCSI_ATTR_AFTER_LAST - 1
+};
+
+/**
+ * enum ncsi_nl_pkg_attrs - NCSI netlink package-specific attributes
+ *
+ * @NCSI_PKG_ATTR_UNSPEC: unspecified attributes to catch errors
+ * @NCSI_PKG_ATTR: nested array of package attributes
+ * @NCSI_PKG_ATTR_ID: package ID
+ * @NCSI_PKG_ATTR_FORCED: flag signifying a package has been set as preferred
+ * @NCSI_PKG_ATTR_CHANNEL_LIST: nested array of NCSI_CHANNEL_ATTR attributes
+ * @NCSI_PKG_ATTR_MAX: highest attribute number
+ */
+enum ncsi_nl_pkg_attrs {
+   NCSI_PKG_ATTR_UNSPEC,
+   NCSI_PKG_ATTR,
+   NCSI_PKG_ATTR_ID,
+   NCSI_PKG_ATTR_FORCED,
+   NCSI_PKG_ATTR_CHANNEL_LIST,
+
+   __NCSI_PKG_ATTR_AFTER_LAST,
+   NCSI_PKG_ATTR_MAX = __NCSI_PKG_ATTR_AFTER_LAST - 1
+};
+
+/**
+ * enum ncsi_nl_channel_attrs - NCSI netlink channel-specific attributes
+ *
+ * @NCSI_CHANNEL_ATTR_UNSPEC: unspecified attributes to catch errors
+ * @NCSI_CHANNEL_ATTR: nested array of channel attributes
+ * @NCSI_CHANNEL_ATTR_ID: channel ID
+ * @NCSI_CHANNEL_ATTR_VERSION_MAJOR: channel major version number
+ * @NCSI_CHANNEL_ATTR_VERSION_MINOR: channel minor version number
+ * @NCSI_CHANNEL_ATTR_VERSION_STR: channel version string
+ * @NCSI_CHANNEL_ATTR_LINK_STATE: channel link state flags
+ * @NCSI_CHANNEL_ATTR_ACTIVE: channels with this flag are in
+ * NCSI_CHANNEL_ACTIVE state
+ * @NCSI_CHANNEL_ATTR_FORCED: flag signifying a channel has been set as
+ * preferred
+ * @NCSI_CHANNEL_ATTR_VLAN_LIST: nested array of NCSI_CHANNEL_ATTR_VLAN_IDs
+ * @NCSI_CHANNEL_ATTR_VLAN_ID: VLAN ID being filtered on this channel
+ * @NCSI_CHANNEL_ATTR_MAX: highest attribute number
+ */
+enum ncsi_nl_channel_attrs {
+   NCSI_CHANNEL_ATTR_UNSPEC,
+   NCSI_CHANNEL_ATTR,
+   NCSI_CHANNEL_ATTR_ID,
+   NCSI_CHANNEL_ATTR_VERSION_MAJOR,
+   

Re: [PATCH 0/5] pull request for net-next: batman-adv 2018-03-02

2018-03-04 Thread David Miller
From: Simon Wunderlich 
Date: Fri,  2 Mar 2018 18:57:40 +0100

> here is a little cleanup pull request of batman-adv to go into net-next.
> 
> Please pull or let me know of any problem!

Pulled, thanks Simon.


Re: [PATCH net] ppp: prevent unregistered channels from connecting to PPP units

2018-03-04 Thread David Miller
From: Guillaume Nault 
Date: Fri, 2 Mar 2018 18:41:16 +0100

> PPP units don't hold any reference on the channels connected to it.
> It is the channel's responsibility to ensure that it disconnects from
> its unit before being destroyed.
> In practice, this is ensured by ppp_unregister_channel() disconnecting
> the channel from the unit before dropping a reference on the channel.
> 
> However, it is possible for an unregistered channel to connect to a PPP
> unit: register a channel with ppp_register_net_channel(), attach a
> /dev/ppp file to it with ioctl(PPPIOCATTCHAN), unregister the channel
> with ppp_unregister_channel() and finally connect the /dev/ppp file to
> a PPP unit with ioctl(PPPIOCCONNECT).
> 
> Once in this situation, the channel is only held by the /dev/ppp file,
> which can be released at any time and free the channel without letting
> the parent PPP unit know. Then the ppp structure ends up with dangling
> pointers in its ->channels list.
> 
> Prevent this scenario by forbidding unregistered channels from
> connecting to PPP units. This maintains the code logic by keeping
> ppp_unregister_channel() responsible for disconnecting the channel if
> necessary and avoids modification on the reference counting mechanism.
> 
> This issue seems to predate git history (successfully reproduced on
> Linux 2.6.26 and earlier PPP commits are unrelated).
> 
> Signed-off-by: Guillaume Nault 

Applied and queued up for -stable, thank you.


Re: [PATCH net-next] ipvlan: forbid vlan devices on top of ipvlan

2018-03-04 Thread David Miller
From: Paolo Abeni 
Date: Fri,  2 Mar 2018 16:03:32 +0100

> Currently we allow the creation of 8021q devices on top of
> ipvlan, but such devices are nonfunctional, as the underlying
> ipvlan rx_handler hook can't match the relevant traffic.
> 
> Be explicit and forbid the creation of such nonfunctional devices.
> 
> Signed-off-by: Paolo Abeni 

Applied.


Re: [PATCH net] tc-testing: skbmod: fix match value of ethertype

2018-03-04 Thread David Miller
From: Davide Caratti 
Date: Fri,  2 Mar 2018 14:44:39 +0100

> iproute2 print_skbmod() prints the configured ethertype using format 0x%X:
> therefore, test 9aa8 systematically fails, because it configures action #4
> using ethertype 0x0031, and expects 0x0031 when it reads it back. Changing
> the expected value to 0x31 lets the test result 'not ok' become 'ok'.
> 
> tested with:
>  # ./tdc.py -e 9aa8
>  Test 9aa8: Get a single skbmod action from a list
>  All test results:
> 
>  1..1
>  ok 1 9aa8 Get a single skbmod action from a list
> 
> Fixes: cf797ac49b94 ("tc-testing: Add test cases for police and skbmod")
> Signed-off-by: Davide Caratti 

Applied, thanks Davide.
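
The mismatch described above can be reproduced with the shell's own printf, which follows the same conversion rules for %X. This is only an illustration of the formatting behaviour, not code from iproute2 or tdc; fmt_ethertype is a hypothetical helper name.

```sh
#!/bin/sh
# Hypothetical helper: format a value the way a 0x%X conversion does.
# %X prints no leading zeros, so 0x0031 comes back as 0x31.
fmt_ethertype()
{
	printf '0x%X\n' "$1"
}

fmt_ethertype 0x0031
```

A test that compares the output against the literal string 0x0031 therefore cannot match, which is why the expected value was changed to 0x31.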


Re: [PATCH net V2] virtio-net: re enable XDP_REDIRECT for mergeable buffer

2018-03-04 Thread David Miller
From: Jason Wang 
Date: Fri,  2 Mar 2018 17:29:14 +0800

> XDP_REDIRECT support for mergeable buffer was removed since commit
> 7324f5399b06 ("virtio_net: disable XDP_REDIRECT in receive_mergeable()
> case"). This is because we don't reserve enough tailroom for struct
> skb_shared_info which breaks XDP assumption. So this patch fixes this
> by reserving enough tailroom and using fixed size of rx buffer.
> 
> Signed-off-by: Jason Wang 
> ---
> Changes from V1:
> - do not add duplicated tracepoint when redirection fails

Applied to net-next, thanks Jason.


Re: [PATCH net-next] selftests: rtnetlink: remove testns on test fail

2018-03-04 Thread David Miller
From: Prashant Bhole 
Date: Fri,  2 Mar 2018 11:22:20 +0900

> This patch removes testns after test failure so that next test can
> continue with clean ns
> 
> Signed-off-by: Prashant Bhole 

Applied.


Re: [PATCHv2 net-next 0/2] gre: add sequence number for collect md mode.

2018-03-04 Thread David Miller
From: William Tu 
Date: Thu,  1 Mar 2018 13:49:56 -0800

> Currently GRE sequence number can only be used in native tunnel mode.
> The first patch adds sequence number support for gre collect
> metadata mode, and the second patch tests it using BPF.
> 
> RFC2890 defines GRE sequence number to be specific to the traffic
> flow identified by the key.  However, this patch does not implement
> per-key seqno.  The sequence number is shared in the same tunnel
> device. That is, different tunnel keys using the same collect_md
> tunnel share single sequence number.
> 
> A new BPF uapi tunnel flag 'BPF_F_SEQ_NUMBER' is added.
> --
> v1->v2:
>   rename BPF_F_GRE_SEQ to BPF_F_SEQ_NUMBER suggested by Daniel

Series applied, thank you.


Re: [PATCH net-next 0/6] enic update

2018-03-04 Thread David Miller
From: Govindarajulu Varadarajan 
Date: Thu,  1 Mar 2018 11:07:18 -0800

> This series adds support for IPv6 vxlan offload and UDP rss along with a
> bug fix in filling the rq ring.

Applied, thank you.


Re: [PATCH] net: amd8111e: remove redundant assignment to 'tx_index'

2018-03-04 Thread David Miller
From: Colin King 
Date: Thu,  1 Mar 2018 16:42:40 +

> From: Colin Ian King 
> 
> The variable tx_index is being initialized with a value that is never
> read and re-assigned a little later, hence the initialization is redundant
> and can be removed.
> 
> Cleans up clang warning:
> drivers/net/ethernet/amd/amd8111e.c:652:6: warning: Value stored to
> 'tx_index' during its initialization is never read
> 
> Signed-off-by: Colin Ian King 

Applied to net-next.


Re: [PATCH v4 1/2] r8169: Dereference MMIO address immediately before use

2018-03-04 Thread David Miller

Applied.


Re: [PATCH v4 2/2] r8169: switch to device-managed functions in probe (part 2)

2018-03-04 Thread David Miller
From: Andy Shevchenko 
Date: Thu,  1 Mar 2018 13:27:35 +0200

> This is a follow up to the commit
> 
>   4c45d24a759d ("r8169: switch to device-managed functions in probe")
> 
> to move towards managed resources even more.
> 
> Cc: Heiner Kallweit 
> Signed-off-by: Andy Shevchenko 

Applied.


Re: [patch net] mlxsw: spectrum_switchdev: Check success of FDB add operation

2018-03-04 Thread David Miller
From: Jiri Pirko 
Date: Thu,  1 Mar 2018 11:37:05 +0100

> From: Shalom Toledo 
> 
> Until now, we assumed that in case of error when adding FDB entries, the
> write operation will fail, but this is not the case. Instead, we need to
> check that the number of entries reported in the response is equal to
> the number of entries specified in the request.
> 
> Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
> Reported-by: Ido Schimmel 
> Signed-off-by: Shalom Toledo 
> Reviewed-by: Ido Schimmel 
> Signed-off-by: Jiri Pirko 

Applied and queued up for -stable.


Re: [PATCH net] ipv6: Reflect MTU changes on PMTU of exceptions for MTU-less routes

2018-03-04 Thread Stefano Brivio
On Sat, 3 Mar 2018 12:22:36 +0100
Stefano Brivio  wrote:

> > And please codify the above expectation as a test under
> > tools/testing/selftests/net  
> 
> And this, along with v2.

On a second thought: I start thinking it doesn't make much sense,
especially given the current context of self-tests, to explicitly test
this, because it's a rather particular corner case.

I think it would make more sense to introduce generic tests first.
About, say, PMTU, or route exceptions, but not "tunnel causes route
exception and administrative change doesn't affect PMTU".

-- 
Stefano


Re: [PATCH net-next v2] cxgb4vf: Forcefully link up virtual interfaces

2018-03-04 Thread David Miller
From: Ganesh Goudar 
Date: Thu,  1 Mar 2018 15:01:04 +0530

> From: Arjun Vynipadath 
> 
> The Virtual Interfaces are connected to an internal switch on the chip
> which allows VIs attached to the same port to talk to each other even
> when the port link is down.  As a result, we generally want to always
> report a VI's link as being "up".
> 
> Based on the original work by: Casey Leedom 
> Signed-off-by: Arjun Vynipadath 
> Signed-off-by: Ganesh Goudar 
> ---
> V2: Doing force_link_up unconditionally

Applied.


Re: [PATCH][net-next] net: phy: Fix spelling mistake: "advertisment"-> "advertisement"

2018-03-04 Thread David Miller
From: Colin King 
Date: Thu,  1 Mar 2018 10:23:03 +

> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in comments and error message text.
> 
> Signed-off-by: Colin Ian King 

Applied.


Re: [PATCH v2 0/4] GSO_BY_FRAGS correctness improvements

2018-03-04 Thread David Miller
From: Daniel Axtens 
Date: Thu,  1 Mar 2018 17:13:36 +1100

> As requested [1], I went through and had a look at users of gso_size to
> see if there were things that need to be fixed to consider
> GSO_BY_FRAGS, and I have tried to improve our helper functions to deal
> with this case.
 ...

Series applied, thanks Daniel.


linux-next: manual merge of the net-next tree with the net tree

2018-03-04 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  net/ipv6/netfilter/nft_fib_ipv6.c

between commit:

  47b7e7f82802 ("netfilter: don't set F_IFACE on ipv6 fib lookups")

from the net tree and commit:

  b75cc8f90f07 ("net/ipv6: Pass skb to route lookup")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc net/ipv6/netfilter/nft_fib_ipv6.c
index 62fc84d7bdff,3230b3d7b11b..
--- a/net/ipv6/netfilter/nft_fib_ipv6.c
+++ b/net/ipv6/netfilter/nft_fib_ipv6.c
@@@ -180,7 -180,9 +180,8 @@@ void nft_fib6_eval(const struct nft_exp
}
  
*dest = 0;
-   rt = (void *)ip6_route_lookup(nft_net(pkt), &fl6, lookup_flags);
 - again:
+   rt = (void *)ip6_route_lookup(nft_net(pkt), &fl6, pkt->skb,
+ lookup_flags);
if (rt->dst.error)
goto put_rt_err;
  




Re: [PATCH v4 2/2] virtio_net: Extend virtio to use VF datapath when available

2018-03-04 Thread Alexander Duyck
On Sun, Mar 4, 2018 at 10:50 AM, Jiri Pirko  wrote:
> Sun, Mar 04, 2018 at 07:24:12PM CET, alexander.du...@gmail.com wrote:
>>On Sat, Mar 3, 2018 at 11:13 PM, Jiri Pirko  wrote:
>>> Sun, Mar 04, 2018 at 01:26:53AM CET, alexander.du...@gmail.com wrote:
>>>> On Sat, Mar 3, 2018 at 1:25 PM, Jiri Pirko  wrote:
>>>>> Sat, Mar 03, 2018 at 07:04:57PM CET, alexander.du...@gmail.com wrote:
>>>>>>On Sat, Mar 3, 2018 at 3:31 AM, Jiri Pirko  wrote:
>>>>>>> Fri, Mar 02, 2018 at 08:42:47PM CET, m...@redhat.com wrote:
>>>>>>>> On Fri, Mar 02, 2018 at 05:20:17PM +0100, Jiri Pirko wrote:
>>>>>>>>> >Yeah, this code essentially calls out the "shareable" code with a
>>>>>>>>> >comment at the start and end of the section what defines the
>>>>>>>>> >virtio_bypass functionality. It would just be a matter of mostly
>>>>>>>>> >cutting and pasting to put it into a separate driver module.
>>>>>>>>>
>>>>>>>>> Please put it there and unite the use of it with netvsc.
>>>>>>>>
>>>>>>>> Surely, adding this to other drivers (e.g. might this be handy for xen
>>>>>>>> too?) can be left for a separate patchset. Let's get one device merged
>>>>>>>> first.
>>>>>>>
>>>>>>> Why? Let's do the generic infra alongside with the driver. I see no good
>>>>>>> reason to rush into merging driver and only later, if ever, to convert
>>>>>>> it to generic solution. On contrary. That would lead into multiple
>>>>>>> approaches and different behaviours in multiple drivers. That is plain
>>>>>>> wrong.
>>>>>>
>>>>>>If nothing else it doesn't hurt to do this in one driver in a generic
>>>>>>way, and once it has been proven to address all the needs of that one
>>>>>>driver we can then start moving other drivers to it. The current
>>>>>>solution is quite generic, that was my contribution to this patch set
>>>>>>as I didn't like how invasive it was being to virtio and thought it
>>>>>>would be best to keep this as minimally invasive as possible.
>>>>>>
>>>>>>My preference would be to give this a release or two in virtio to
>>>>>>mature before we start pushing it onto other drivers. It shouldn't
>>>>>>take much to cut/paste this into a new driver file once we decide it
>>>>>>is time to start extending it out to other drivers.
>>>>>
>>>>> I'm not talking about cut/paste and in fact that is what I'm worried
>>>>> about. I'm talking about common code in net/core/ or somewhere that
>>>>> would take care of this in-driver bonding. Each driver, like virtio_net,
>>>>> netvsc would just register some ops to it and the core would do all
>>>>> logic. I believe it is essential take this approach from the start.
>>>>
>>>> Sorry, I didn't mean cut/paste into another driver, I meant to make it
>>>> a driver of its own. My thought was to eventually create a shared/core
>>>> driver module that is then used by the other drivers.
>>>>
>>>> My concern right now is that Stephen has indicated he doesn't want
>>>> this approach taken with netvsc, and most of the community doesn't
>>>
>>> IIUC, he only does not like the extra netdev. Is there anything else?
>>
>>Nope that is pretty much it. It doesn't seem like a big deal for
>>virtio, but for netvsc it is significant since they don't have any
>>"backup" bit feature differentiation, so they would likely be stuck
>>with 2 netdevs even in their basic setup.
>
> Okay. If that is a strict "no-go" for netvsc, this should be
> just a flag passed down to the in-driver bond code.

Are you serious? We might as well just do a per-driver bond then if
that is what you want. Once you go back to the "2 netdev" model for
this the bond becomes tightly woven into the driver and becomes a
separate beast entirely. At that point sharing kind of goes out the
window since you have to be tightly coupled into all of the per-driver
ops. I would argue there is no way to do the "2 netdev" model
generically. It is kind of inherent to the "2 netdev" model in the
first place since you can't have a third driver pop up so now
everything is pulled into the paravirtual interface unless you invert
everything and require the netvsc driver to provide the driver with a
set of function pointers allowing it to call back into it. In addition
you suddenly have to deal with all the qdisc and Tx queue locking
mess. So the 3 netdev model let the driver be lockless and run with no
queue disc. Are you telling us you expect our solution to run in both
modes or are you pushing the qdisc overhead and Tx queue locking into
the 3 netdev model?

What it ultimately comes down to is how do you create a new netdev
without exposing a new netdev? In the 3 netdev model this all makes
sense as we can leave the paravirtual interface intact. Now you are
telling us that based on a flag we either have to embed ourselves into
the paravirtual interface without exposing our operations, or we have
to embed the paravirtual interface into our device without letting it
be visible. The sheer overhead of that will end up more than doubling
the 

Re: [RFC PATCH V1 00/12] audit: implement container id

2018-03-04 Thread Mimi Zohar
On Thu, 2018-03-01 at 14:41 -0500, Richard Guy Briggs wrote:
> Implement audit kernel container ID.
> 
> This patchset is a preliminary RFC based on the proposal document (V3)
> posted:
>   https://www.redhat.com/archives/linux-audit/2018-January/msg00014.html
> 
> The first patch implements the proc fs write to set the audit container
> ID of a process, emitting an AUDIT_CONTAINER record.
> 
> The second implements an auxiliary syscall record AUDIT_CONTAINER_INFO
> if a container ID is present on a task.
> 
> The third adds filtering to the exit, exclude and user lists.
> 
> The 4th, implements reading the container ID from the proc filesystem
> for debugging.  This isn't planned for upstream inclusion.
> 
> The 5th adds signal and ptrace support.
> 
> The 6th attempts to create a local audit context to be able to bind a
> standalone record with the container ID record.
> 
> The 7th, 8th, 9th, 10th patches add container ID records to standalone
> records.  Some of these may end up being syscall auxiliary records and
> won't need this specific support since they'll be supported via
> syscalls.
> 
> The 11th is a temporary workaround due to the AUDIT_CONTAINER records
> not showing up as do AUDIT_LOGIN records.  I suspect this is due to its
> range (1000 vs 1300), but the intent is to solve it.
> 
> The 12th adds debug information not intended for upstream for those
> brave souls wanting to tinker with it in this early state.
> 
> Feedback please!

Which tree can this patch set be applied to?

Mimi

> Here's a quick and dirty test script:
> echo 123455 > /proc/$$/containerid; echo $?
> sleep 4&  
> child=$!; sleep 1
> echo 18446744073709551615 > /proc/$child/containerid; echo $?
> echo 123456 > /proc/$child/containerid; echo $?
> echo 123457 > /proc/$child/containerid; echo $?
> sleep 1
> ausearch -ts recent |grep " contid=18446744073709551615"; echo $?
> ausearch -ts recent |grep " contid=123456"; echo $?
> ausearch -ts recent |grep " contid=123457"; echo $?
> echo self:$$ contid:$( cat /proc/$$/containerid)
> echo child:$child contid:$( cat /proc/$child/containerid)
> 
> containerid=123458
> key=tmpcontainerid
> auditctl -a exit,always -F dir=/tmp -F perm=wa -F containerid=$containerid -F 
> key=$key || echo failed to add containerid filter rule
> bash -c "sleep 1; echo test > /tmp/$key"&
> child=$!
> echo $containerid > /proc/$child/containerid
> sleep 2
> rm -f /tmp/$key
> ausearch -ts recent -k $key || echo failed to find CONTAINER_INFO record
> auditctl -d exit,always -F dir=/tmp -F perm=wa -F containerid=$containerid -F 
> key=$key || echo failed to add containerid filter rule
> 
> See:
>   https://github.com/linux-audit/audit-kernel/issues/32
>   https://github.com/linux-audit/audit-userspace/issues/40
>   https://github.com/linux-audit/audit-testsuite/issues/64
> 
> Richard Guy Briggs (12):
>   audit: add container id
>   audit: log container info of syscalls
>   audit: add containerid filtering
>   audit: read container ID of a process
>   audit: add containerid support for ptrace and signals
>   audit: add support for non-syscall auxiliary records
>   audit: add container aux record to watch/tree/mark
>   audit: add containerid support for tty_audit
>   audit: add containerid support for config/feature/user records
>   audit: add containerid support for seccomp and anom_abend records
>   debug audit: add container id
>   debug! audit: add container id
> 
>  drivers/tty/tty_audit.c|   5 +-
>  fs/proc/base.c |  63 +++
>  include/linux/audit.h  |  36 +++
>  include/linux/init_task.h  |   4 +-
>  include/linux/sched.h  |   1 +
>  include/uapi/linux/audit.h |   9 ++-
>  kernel/audit.c |  74 +++---
>  kernel/audit.h |   3 +
>  kernel/audit_fsnotify.c|   5 +-
>  kernel/audit_tree.c|   5 +-
>  kernel/audit_watch.c   |  33 +-
>  kernel/auditfilter.c   |  52 ++-
>  kernel/auditsc.c   | 154 
> +++--
>  13 files changed, 408 insertions(+), 36 deletions(-)
> 



Re: [PATCH v4 2/2] virtio_net: Extend virtio to use VF datapath when available

2018-03-04 Thread Samudrala, Sridhar



On 3/4/2018 10:50 AM, Jiri Pirko wrote:
> Sun, Mar 04, 2018 at 07:24:12PM CET, alexander.du...@gmail.com wrote:
>> On Sat, Mar 3, 2018 at 11:13 PM, Jiri Pirko  wrote:
>>> Sun, Mar 04, 2018 at 01:26:53AM CET, alexander.du...@gmail.com wrote:
>>>> On Sat, Mar 3, 2018 at 1:25 PM, Jiri Pirko  wrote:
>>>>> Sat, Mar 03, 2018 at 07:04:57PM CET, alexander.du...@gmail.com wrote:
>>>>>> On Sat, Mar 3, 2018 at 3:31 AM, Jiri Pirko  wrote:
>>>>>>> Fri, Mar 02, 2018 at 08:42:47PM CET, m...@redhat.com wrote:
>>>>>>>> On Fri, Mar 02, 2018 at 05:20:17PM +0100, Jiri Pirko wrote:
>>>>>>>>>> Yeah, this code essentially calls out the "shareable" code with a
>>>>>>>>>> comment at the start and end of the section what defines the
>>>>>>>>>> virtio_bypass functionality. It would just be a matter of mostly
>>>>>>>>>> cutting and pasting to put it into a separate driver module.
>>>>>>>>>
>>>>>>>>> Please put it there and unite the use of it with netvsc.
>>>>>>>>
>>>>>>>> Surely, adding this to other drivers (e.g. might this be handy for xen
>>>>>>>> too?) can be left for a separate patchset. Let's get one device merged
>>>>>>>> first.
>>>>>>>
>>>>>>> Why? Let's do the generic infra alongside with the driver. I see no good
>>>>>>> reason to rush into merging driver and only later, if ever, to convert
>>>>>>> it to generic solution. On contrary. That would lead into multiple
>>>>>>> approaches and different behaviours in multiple drivers. That is plain
>>>>>>> wrong.
>>>>>>
>>>>>> If nothing else it doesn't hurt to do this in one driver in a generic
>>>>>> way, and once it has been proven to address all the needs of that one
>>>>>> driver we can then start moving other drivers to it. The current
>>>>>> solution is quite generic, that was my contribution to this patch set
>>>>>> as I didn't like how invasive it was being to virtio and thought it
>>>>>> would be best to keep this as minimally invasive as possible.
>>>>>>
>>>>>> My preference would be to give this a release or two in virtio to
>>>>>> mature before we start pushing it onto other drivers. It shouldn't
>>>>>> take much to cut/paste this into a new driver file once we decide it
>>>>>> is time to start extending it out to other drivers.
>>>>>
>>>>> I'm not talking about cut/paste and in fact that is what I'm worried
>>>>> about. I'm talking about common code in net/core/ or somewhere that
>>>>> would take care of this in-driver bonding. Each driver, like virtio_net,
>>>>> netvsc would just register some ops to it and the core would do all
>>>>> logic. I believe it is essential take this approach from the start.
>>>>
>>>> Sorry, I didn't mean cut/paste into another driver, I meant to make it
>>>> a driver of its own. My thought was to eventually create a shared/core
>>>> driver module that is then used by the other drivers.
>>>>
>>>> My concern right now is that Stephen has indicated he doesn't want
>>>> this approach taken with netvsc, and most of the community doesn't
>>>
>>> IIUC, he only does not like the extra netdev. Is there anything else?
>>
>> Nope that is pretty much it. It doesn't seem like a big deal for
>> virtio, but for netvsc it is significant since they don't have any
>> "backup" bit feature differentiation, so they would likely be stuck
>> with 2 netdevs even in their basic setup.
>
> Okay. If that is a strict "no-go" for netvsc, this should be
> just a flag passed down to the in-driver bond code.


This results in a 3 driver model (virtio/netvsc, vf & bypass) with 2 netdevs
created when bypass is based on netvsc and 3 netdevs created when the bypass
is based on virtio_net.

Unless we agree on a common netdev model between netvsc and virtio_net,
i am not sure if it is useful to commonize the code into a separate driver.

-Sridhar


Re: [Outreachy kernel] [PATCH v2] staging: Replace printk() with appropriate net_*macro_ratelimited()

2018-03-04 Thread Julia Lawall


On Mon, 5 Mar 2018, Arushi Singhal wrote:

> Replace printk having a log level with the appropriate
> net_*macro_ratelimited.

Why did you choose this function?

> It's better to use actual device name as a prefix in error messages.

What does this message relate to?

> Indentation is also changed, to fix the  checkpatch issue.

It would be better to not exceed 80 characters than to follow the
suggestion about the argument being to the right of the (.

julia


> Signed-off-by: Arushi Singhal 
> ---
> changes in v2
> *In previous version printk was changed to pr_*macro(), which is used
> in kernel instead of calling printk() directly. And for drivers,
> dev_*macro() or net_*macro_ratelimited() should be used for calling
> printk() directly.
>
>  drivers/staging/ipx/af_ipx.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/staging/ipx/af_ipx.c b/drivers/staging/ipx/af_ipx.c
> index d21a9d1..9a96962 100644
> --- a/drivers/staging/ipx/af_ipx.c
> +++ b/drivers/staging/ipx/af_ipx.c
> @@ -744,13 +744,13 @@ static void ipxitf_discover_netnum(struct ipx_interface *intrfc,
>   intrfc->if_netnum = cb->ipx_source_net;
>   ipxitf_add_local_route(intrfc);
>   } else {
> - printk(KERN_WARNING "IPX: Network number collision "
> - "%lx\n%s %s and %s %s\n",
> - (unsigned long) ntohl(cb->ipx_source_net),
> - ipx_device_name(i),
> - ipx_frame_name(i->if_dlink_type),
> - ipx_device_name(intrfc),
> - ipx_frame_name(intrfc->if_dlink_type));
> + net_warn_ratelimited("IPX: Network number collision "
> +  "%lx\n%s %s and %s %s\n",
> +  (unsigned long) ntohl(cb->ipx_source_net),
> +  ipx_device_name(i),
> +  ipx_frame_name(i->if_dlink_type),
> +  ipx_device_name(intrfc),
> +  ipx_frame_name(intrfc->if_dlink_type));
>   ipxitf_put(i);
>   }
>   }
> --
> 2.7.4
>


[PATCH v2] staging: Replace printk() with appropriate net_*macro_ratelimited()

2018-03-04 Thread Arushi Singhal
Replace printk having a log level with the appropriate
net_*macro_ratelimited.
It's better to use actual device name as a prefix in error messages.
Indentation is also changed, to fix the  checkpatch issue.

Signed-off-by: Arushi Singhal 
---
changes in v2
*In previous version printk was changed to pr_*macro(), which is used
in kernel instead of calling printk() directly. And for drivers,
dev_*macro() or net_*macro_ratelimited() should be used for calling
printk() directly.

 drivers/staging/ipx/af_ipx.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/ipx/af_ipx.c b/drivers/staging/ipx/af_ipx.c
index d21a9d1..9a96962 100644
--- a/drivers/staging/ipx/af_ipx.c
+++ b/drivers/staging/ipx/af_ipx.c
@@ -744,13 +744,13 @@ static void ipxitf_discover_netnum(struct ipx_interface *intrfc,
intrfc->if_netnum = cb->ipx_source_net;
ipxitf_add_local_route(intrfc);
} else {
-   printk(KERN_WARNING "IPX: Network number collision "
-   "%lx\n%s %s and %s %s\n",
-   (unsigned long) ntohl(cb->ipx_source_net),
-   ipx_device_name(i),
-   ipx_frame_name(i->if_dlink_type),
-   ipx_device_name(intrfc),
-   ipx_frame_name(intrfc->if_dlink_type));
+   net_warn_ratelimited("IPX: Network number collision "
+"%lx\n%s %s and %s %s\n",
+(unsigned long) ntohl(cb->ipx_source_net),
+ipx_device_name(i),
+ipx_frame_name(i->if_dlink_type),
+ipx_device_name(intrfc),
+ipx_frame_name(intrfc->if_dlink_type));
ipxitf_put(i);
}
}
-- 
2.7.4



Re: [crypto v8 04/12] chtls: structure and macro definiton

2018-03-04 Thread David Miller
From: Atul Gupta 
Date: Thu,  1 Mar 2018 11:19:35 +0530

> + __u8   reneg_to_write_rx;
> + __u8   protocol;

You should use "u8" rather than "__u8" except in UAPI headers
which this file is not.

Please audit your entire patch series for this issue.

Thank you.


[RFC,POC 1/3] bpfilter: add experimental IMR bpf translator

2018-03-04 Thread Florian Westphal
This is a basic intermediate representation to decouple
the ruleset representation (iptables, nftables) from the
ebpf translation.

The IMR currently assumes that translation will always be
into ebpf, its pseudo-registers map 1:1 to ebpf ones.

Objects implemented at the moment:
- relop (eq, ne only for now)
- immediate (32, 64 bit constants)
- payload, with relative addressing (mac header, network header, transport header)

This doesn't add a user; files will not even be compiled yet.

Signed-off-by: Florian Westphal 
---
 net/bpfilter/imr.c | 655 +
 net/bpfilter/imr.h |  78 +++
 2 files changed, 733 insertions(+)
 create mode 100644 net/bpfilter/imr.c
 create mode 100644 net/bpfilter/imr.h

diff --git a/net/bpfilter/imr.c b/net/bpfilter/imr.c
new file mode 100644
index ..09c557ea7c21
--- /dev/null
+++ b/net/bpfilter/imr.c
@@ -0,0 +1,655 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+typedef __u16 __bitwise __sum16; /* hack */
+#include 
+#include 
+
+#include "imr.h"
+#include "bpfilter_gen.h"
+
+#define EMIT(ctx, x)   \
+   do {\
+   if ((ctx)->len_cur + 1 > (ctx)->len_max)\
+   return -ENOMEM; \
+   (ctx)->img[(ctx)->len_cur++] = x;   \
+   } while (0)
+
+struct imr_object {
+   enum imr_obj_type type:8;
+   uint8_t len;
+
+   union {
+   struct {
+   union {
+   uint64_t value64;
+   uint32_t value32;
+   };
+   } immedate;
+   struct {
+   struct imr_object *left;
+   struct imr_object *right;
+   enum imr_relop op:8;
+   } relational;
+   struct {
+   uint16_t offset;
+   enum imr_payload_base base:8;
+   } payload;
+   struct {
+   enum imr_verdict verdict;
+   } verdict;
+   };
+};
+
+struct imr_state {
+   struct bpf_insn *img;
+   uint32_t len_cur;
+   uint32_t len_max;
+
+   struct imr_object *registers[IMR_REG_COUNT];
+   uint8_t regcount;
+
+   uint32_t num_objects;
+   struct imr_object **objects;
+};
+
+static int imr_jit_object(struct bpfilter_gen_ctx *ctx,
+ struct imr_state *, const struct imr_object *o);
+
+static void internal_error(const char *s)
+{
+   fprintf(stderr, "FIXME: internal error %s\n", s);
+   exit(1);
+}
+
+/* FIXME: consider len too (e.g. reserve 2 registers for len == 8) */
+static int imr_register_alloc(struct imr_state *s, uint32_t len)
+{
+   uint8_t reg = s->regcount;
+
+   if (s->regcount >= IMR_REG_COUNT)
+   return -1;
+
+   s->regcount++;
+
+   return reg;
+}
+
+static int imr_register_get(const struct imr_state *s, uint32_t len)
+{
+   if (len > sizeof(uint64_t))
+   internal_error(">64bit types not yet implemented");
+   if (s->regcount == 0)
+   internal_error("no registers in use");
+
+   return s->regcount - 1;
+}
+
+static int imr_to_bpf_reg(enum imr_reg_num n)
+{
+   /* currently maps 1:1 */
+   return (int)n;
+}
+
+static int bpf_reg_width(unsigned int len)
+{
+   switch (len) {
+   case sizeof(uint8_t): return BPF_B;
+   case sizeof(uint16_t): return BPF_H;
+   case sizeof(uint32_t): return BPF_W;
+   case sizeof(uint64_t): return BPF_DW;
+   default:
+   internal_error("reg size not supported");
+   }
+
+   return -EINVAL;
+}
+
+static void imr_register_release(struct imr_state *s)
+{
+   if (s->regcount == 0)
+   internal_error("regcount underflow");
+   s->regcount--;
+}
+
+void imr_register_store(struct imr_state *s, enum imr_reg_num reg, struct imr_object *o)
+{
+   s->registers[reg] = o;
+}
+
+struct imr_object *imr_register_load(const struct imr_state *s, enum imr_reg_num reg)
+{
+   return s->registers[reg];
+}
+
+struct imr_state *imr_state_alloc(void)
+{
+   struct imr_state *s = calloc(1, sizeof(*s));
+
+   return s;
+}
+
+void imr_state_free(struct imr_state *s)
+{
+   int i;
+
+   for (i = 0; i < s->num_objects; i++)
+   imr_object_free(s->objects[i]);
+
+   free(s);
+}
+
+struct imr_object *imr_object_alloc(enum imr_obj_type t)
+{
+   struct imr_object *o = calloc(1, sizeof(*o));
+
+   if (o)
+   o->type = t;
+
+   return o;
+}
+
+void imr_object_free(struct imr_object *o)
+{
+   switch (o->type) {
+   case IMR_OBJ_TYPE_VERDICT:
+   case IMR_OBJ_TYPE_IMMEDIATE:
+   case IMR_OBJ_TYPE_PAYLOAD:
+   break;
+   case IMR_OBJ_TYPE_RELATIONAL:
+   

[RFC,POC 2/3] bpfilter: add nftables jit proof-of-concept

2018-03-04 Thread Florian Westphal
This adds a nftables frontend for the IMR->BPF translator.

This doesn't work via UMH yet.

AFAIU it should be possible to get transparent ebpf translation for
nftables, similar to the bpfilter/iptables UMH.

However, at this time I think it's better to get the IMR "right".

nftjit.ko currently needs libnftnl/libmnl, but that's a convenience on
my end and not a "must have".

Signed-off-by: Florian Westphal 
---
 net/bpfilter/Makefile   |   7 +-
 net/bpfilter/nftables.c | 679 
 2 files changed, 685 insertions(+), 1 deletion(-)
 create mode 100644 net/bpfilter/nftables.c

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 5a85ef7d7a4d..a4064986dc2f 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -3,7 +3,12 @@
 # Makefile for the Linux BPFILTER layer.
 #
 
-hostprogs-y := bpfilter.ko
+hostprogs-y := nftjit.ko bpfilter.ko
 always := $(hostprogs-y)
 bpfilter.ko-objs := bpfilter.o tgts.o targets.o tables.o init.o ctor.o sockopt.o gen.o
+
+NFT_LIBS = -lnftnl
+nftjit.ko-objs := tgts.o targets.o tables.o init.o ctor.o gen.o nftables.o imr.o
+HOSTLOADLIBES_nftjit.ko = `pkg-config --libs libnftnl libmnl`
+
 HOSTCFLAGS += -I. -Itools/include/
diff --git a/net/bpfilter/nftables.c b/net/bpfilter/nftables.c
new file mode 100644
index ..5a756ccd03a1
--- /dev/null
+++ b/net/bpfilter/nftables.c
@@ -0,0 +1,679 @@
+/*
+ * based on previous code from:
+ *
+ * Copyright (c) 2013 Arturo Borrero Gonzalez 
+ * Copyright (c) 2013 Pablo Neira Ayuso 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "bpfilter_mod.h"
+#include "imr.h"
+
+/* Hack, we don't link bpfilter.o */
+extern long int syscall (long int __sysno, ...);
+
+int sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
+{
+   return syscall(321, cmd, attr, size);
+}
+
+static int seq;
+
+static void memory_allocation_error(void) { perror("allocation failed"); exit(1); }
+
+static int nft_reg_to_imr_reg(int nfreg)
+{
+   switch (nfreg) {
+   case NFT_REG_VERDICT:
+   return IMR_REG_0;
+   /* old register numbers, 4 128 bit registers. */
+   case NFT_REG_1:
+   return IMR_REG_4;
+   case NFT_REG_2:
+   return IMR_REG_6;
+   case NFT_REG_3:
+   return IMR_REG_8;
+   case NFT_REG_4:
+   break;
+   /* new register numbers, 16 32 bit registers, map to old ones */
+   case NFT_REG32_00:
+   return IMR_REG_4;
+   case NFT_REG32_01:
+   return IMR_REG_5;
+   case NFT_REG32_02:
+   return IMR_REG_6;
+   default:
+   return -1;
+   }
+   return -1;
+}
+
+static int netlink_parse_immediate(const struct nftnl_expr *nle, void *out)
+{
+   struct imr_state *state = out;
+   struct imr_object *o = NULL;
+
+   if (nftnl_expr_is_set(nle, NFTNL_EXPR_IMM_DATA)) {
+   uint32_t len;
+   int reg;
+
+   nftnl_expr_get(nle, NFTNL_EXPR_IMM_DATA, &len);
+
+   switch (len) {
+   case sizeof(uint32_t):
+   o = imr_object_alloc_imm32(nftnl_expr_get_u32(nle, NFTNL_EXPR_IMM_DATA));
+   break;
+   case sizeof(uint64_t):
+   o = imr_object_alloc_imm64(nftnl_expr_get_u64(nle, NFTNL_EXPR_IMM_DATA));
+   break;
+   default:
+   return -ENOTSUPP;
+   }
+   reg = nft_reg_to_imr_reg(nftnl_expr_get_u32(nle,
+NFTNL_EXPR_IMM_DREG));
+   if (reg < 0) {
+   imr_object_free(o);
+   return reg;
+   }
+
+   imr_register_store(state, reg, o);
+   return 0;
+   } else if (nftnl_expr_is_set(nle, NFTNL_EXPR_IMM_VERDICT)) {
+   uint32_t verdict;
+   int ret;
+
+   if (nftnl_expr_is_set(nle, NFTNL_EXPR_IMM_CHAIN))
+   return -ENOTSUPP;
+
+verdict = nftnl_expr_get_u32(nle, NFTNL_EXPR_IMM_VERDICT);
+
+   switch (verdict) {
+   case NF_ACCEPT:
+   o = imr_object_alloc_verdict(IMR_VERDICT_PASS);
+   break;
+   case NF_DROP:
+   o = imr_object_alloc_verdict(IMR_VERDICT_DROP);
+   break;
+   default:
+   fprintf(stderr, "Unhandled verdict %d\n", verdict);
+   o = 

[RFC,POC 3/3] bpfilter: switch bpfilter to iptables->IMR translation

2018-03-04 Thread Florian Westphal
Translate basic iptables rule blob to the IMR, then ask
IMR to translate to ebpf.

IMR is shared between nft and bpfilter translators.
iptables_gen_append() is the only relevant function here,
as it demonstrates simple 'source/destination matches x' test.

Signed-off-by: Florian Westphal 
---
 net/bpfilter/Makefile   |  2 +-
 net/bpfilter/bpfilter_gen.h | 15 +
 net/bpfilter/bpfilter_mod.h | 16 +-
 net/bpfilter/iptables.c | 76 +
 net/bpfilter/iptables.h |  4 +++
 net/bpfilter/sockopt.c  | 73 +--
 6 files changed, 154 insertions(+), 32 deletions(-)
 create mode 100644 net/bpfilter/bpfilter_gen.h
 create mode 100644 net/bpfilter/iptables.c
 create mode 100644 net/bpfilter/iptables.h

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index a4064986dc2f..21a8afb60b7c 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -5,7 +5,7 @@
 
 hostprogs-y := nftjit.ko bpfilter.ko
 always := $(hostprogs-y)
-bpfilter.ko-objs := bpfilter.o tgts.o targets.o tables.o init.o ctor.o sockopt.o gen.o
+bpfilter.ko-objs := bpfilter.o tgts.o targets.o tables.o init.o ctor.o sockopt.o gen.o iptables.o imr.o
 
 NFT_LIBS = -lnftnl
 nftjit.ko-objs := tgts.o targets.o tables.o init.o ctor.o gen.o nftables.o imr.o
diff --git a/net/bpfilter/bpfilter_gen.h b/net/bpfilter/bpfilter_gen.h
new file mode 100644
index ..71c6e8a73e24
--- /dev/null
+++ b/net/bpfilter/bpfilter_gen.h
@@ -0,0 +1,15 @@
+struct bpfilter_gen_ctx {
+   struct bpf_insn *img;
+   u32 len_cur;
+   u32 len_max;
+   u32 default_verdict;
+   int fd;
+   int ifindex;
+   bool offloaded;
+};
+
+int bpfilter_gen_init(struct bpfilter_gen_ctx *ctx);
+int bpfilter_gen_prologue(struct bpfilter_gen_ctx *ctx);
+int bpfilter_gen_epilogue(struct bpfilter_gen_ctx *ctx);
+int bpfilter_gen_commit(struct bpfilter_gen_ctx *ctx);
+void bpfilter_gen_destroy(struct bpfilter_gen_ctx *ctx);
diff --git a/net/bpfilter/bpfilter_mod.h b/net/bpfilter/bpfilter_mod.h
index b4209985efff..dc3a90df1788 100644
--- a/net/bpfilter/bpfilter_mod.h
+++ b/net/bpfilter/bpfilter_mod.h
@@ -4,6 +4,7 @@
 
 #include "include/uapi/linux/bpfilter.h"
 #include 
+#include "bpfilter_gen.h"
 
 struct bpfilter_table {
struct hlist_node   hash;
@@ -71,26 +72,11 @@ struct bpfilter_target {
u8  rev;
 };
 
-struct bpfilter_gen_ctx {
-   struct bpf_insn *img;
-   u32 len_cur;
-   u32 len_max;
-   u32 default_verdict;
-   int fd;
-   int ifindex;
-   bool offloaded;
-};
-
 union bpf_attr;
 int sys_bpf(int cmd, union bpf_attr *attr, unsigned int size);
 
-int bpfilter_gen_init(struct bpfilter_gen_ctx *ctx);
-int bpfilter_gen_prologue(struct bpfilter_gen_ctx *ctx);
-int bpfilter_gen_epilogue(struct bpfilter_gen_ctx *ctx);
 int bpfilter_gen_append(struct bpfilter_gen_ctx *ctx,
struct bpfilter_ipt_ip *ent, int verdict);
-int bpfilter_gen_commit(struct bpfilter_gen_ctx *ctx);
-void bpfilter_gen_destroy(struct bpfilter_gen_ctx *ctx);
 
 struct bpfilter_target *bpfilter_target_get_by_name(const char *name);
 void bpfilter_target_put(struct bpfilter_target *tgt);
diff --git a/net/bpfilter/iptables.c b/net/bpfilter/iptables.c
new file mode 100644
index ..055cfa8fbf21
--- /dev/null
+++ b/net/bpfilter/iptables.c
@@ -0,0 +1,76 @@
+#include 
+#include 
+
+typedef uint16_t __sum16; /* hack */
+#include 
+
+#include "bpfilter_mod.h"
+#include "iptables.h"
+#include "imr.h"
+
+static int check_entry(const struct bpfilter_ipt_ip *ent)
+{
+#define M_FF   "\xff\xff\xff\xff"
+   static const __u8 mask1[IFNAMSIZ] = M_FF M_FF M_FF M_FF;
+   static const __u8 mask0[IFNAMSIZ] = { };
+   int ones = strlen(ent->in_iface); ones += ones > 0;
+#undef M_FF
+   if (strlen(ent->out_iface) > 0)
+   return -ENOTSUPP;
+   if (memcmp(ent->in_iface_mask, mask1, ones) ||
+   memcmp(&ent->in_iface_mask[ones], mask0, sizeof(mask0) - ones))
+   return -ENOTSUPP;
+   if ((ent->src_mask != 0 && ent->src_mask != 0x) ||
+   (ent->dst_mask != 0 && ent->dst_mask != 0x))
+   return -ENOTSUPP;
+
+   return 0;
+}
+
+int iptables_gen_append(struct imr_state *state,
+   struct bpfilter_ipt_ip *ent, int verdict)
+{
+   struct imr_object *left, *right, *relop;
+   int ret;
+
+   ret = check_entry(ent);
+   if (ret < 0)
+   return ret;
+   if (ent->src_mask == 0 && ent->dst_mask == 0)
+   return 0;
+
+   imr_state_rule_begin(state);
+
+   if (ent->src_mask) {
+   left = 

[RFC,POC] iptables/nftables to epbf/xdp via common intermediate layer

2018-03-04 Thread Florian Westphal
These patches, which go on top of the 'bpfilter' RFC patches,
demonstrate an nftables to ebpf translation (done in userspace).
In order to not duplicate the ebpf code generation efforts, the rules

iptables -i lo -d 127.0.0.2 -j DROP
and
nft add rule ip filter input ip daddr 127.0.0.2 drop

are first translated to a common intermediate representation, and then
to ebpf, which attaches resulting prog to the XDP hook.

The IMR representation is identical in both cases, so both
rules result in the same ebpf program.

The IMR currently assumes that translation will always be to ebpf.
As per previous discussion it doesn't consider other targets, so
for instance IMR pseudo-registers map 1:1 to ebpf ones.

The IMR is also supposed to be generic enough to make it easy to convert
'frontend' formats (iptables rule blob, nftables netlink) to it, and
also extend it to cover ip rule, ovs or any other inputs in the future
without need for major changes to the IMR.

The IMR currently implements following basic operations:
 - Relational (equal, not equal)
 - immediates (32 and 64bit constants)
 - payload with relative addressing (mac, network, transport header)
 - verdict (pass, drop, next rule)
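
To make the four operation types concrete, here is a toy sketch of one rule
built from a payload load, an immediate, a relational operator and a verdict.
It is an assumption-laden illustration only: all toy_* names are hypothetical
and do not appear in the patches, and the rule is interpreted in userspace
here, whereas the patches translate the IMR to ebpf instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the patches use imr_* types and emit ebpf. */
enum toy_relop { TOY_EQ, TOY_NE };
enum toy_verdict { TOY_NEXT, TOY_PASS, TOY_DROP };

/*
 * One rule: compare the 32 bits found at `offset` from the network
 * header with an immediate and return `verdict` on a match.
 */
struct toy_rule {
	uint16_t offset;          /* payload: offset relative to network header */
	enum toy_relop op;        /* relational: eq or ne */
	uint32_t imm;             /* immediate, in the packet's byte order */
	enum toy_verdict verdict; /* verdict taken when the comparison holds */
};

static enum toy_verdict toy_eval(const struct toy_rule *r,
				 const uint8_t *nh, size_t len)
{
	uint32_t val;
	int match;

	/* Payload too short: the rule cannot match, fall through. */
	if ((size_t)r->offset + sizeof(val) > len)
		return TOY_NEXT;

	memcpy(&val, nh + r->offset, sizeof(val));
	match = (r->op == TOY_EQ) ? (val == r->imm) : (val != r->imm);

	return match ? r->verdict : TOY_NEXT;
}
```

With offset 16 (the IPv4 destination address), TOY_EQ and the bytes of
127.0.0.2 as the immediate, this corresponds roughly to the two example rules
above.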

It's still at an early stage, but I think it's good enough as
a proof-of-concept.

Known differences between nftjit.ko and bpfilter.ko:
nftjit.ko currently doesn't run transparently, but that's only
because I wanted to focus on the IMR and get the POC out of the door.

It should be possible to get it transparent via the bpfilter.ko approach.

Next steps for the IMR could be the addition of binary operations for
prefixes ("-d 192.168.0.1/24"); these are also needed e.g. for tcp flag
matching (-p tcp --syn in iptables) and so on.

I'd also be interested in whether XDP is seen as an appropriate
target hook.  AFAICS the XDP and the nftables ingress hook are similar
enough to consider just (re)using the XDP hook to jit the nftables ingress
hook.  The translator could check if the hook is unused, and return
early if some other program is already attached.

Comments welcome, especially wrt. IMR concept and what might be
next step(s) in moving forward.

The patches are also available via git at
https://git.breakpoint.cc/cgit/fw/net-next.git/log/?h=bpfilter7 .



[PATCH net] xfrm: Verify MAC header exists before overwriting eth_hdr(skb)->h_proto

2018-03-04 Thread yossiku
From: Yossi Kuperman 

Artem Savkov reported that commit 5efec5c655dd leads to packet loss under
an IPsec configuration. It appears that his setup consists of a TUN device,
which does not have a MAC header.

Make sure MAC header exists.

Note: TUN device sets a MAC header pointer, although it does not have one.
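
For illustration only, a userspace sketch of the guard follows. The mock_*
types are hypothetical, heavily simplified stand-ins for struct ethhdr and
struct sk_buff (they are not the kernel definitions); the point is just that
h_proto is written only when mac_len is non-zero.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct ethhdr. */
struct mock_ethhdr {
	uint8_t  h_dest[6];
	uint8_t  h_source[6];
	uint16_t h_proto;
};

/* Hypothetical, simplified stand-in for struct sk_buff. */
struct mock_skb {
	uint8_t  head[64];   /* packet buffer */
	uint16_t mac_header; /* offset of the MAC header within head */
	uint16_t mac_len;    /* 0 when there is no link-layer header */
	uint16_t protocol;   /* protocol of the inner packet */
};

/* Update h_proto only if a MAC header exists; returns 1 if it was written. */
static int mock_set_eth_proto(struct mock_skb *skb)
{
	struct mock_ethhdr hdr;

	if (!skb->mac_len)
		return 0;

	memcpy(&hdr, skb->head + skb->mac_header, sizeof(hdr));
	hdr.h_proto = skb->protocol;
	memcpy(skb->head + skb->mac_header, &hdr, sizeof(hdr));

	return 1;
}
```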

Fixes: 5efec5c655dd ("xfrm: Fix eth_hdr(skb)->h_proto to reflect inner IP version")
Reported-by: Artem Savkov 
Tested-by: Artem Savkov 
Signed-off-by: Yossi Kuperman 
---
 net/ipv4/xfrm4_mode_tunnel.c | 3 ++-
 net/ipv6/xfrm6_mode_tunnel.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/xfrm4_mode_tunnel.c b/net/ipv4/xfrm4_mode_tunnel.c
index 63faeee..2a9764b 100644
--- a/net/ipv4/xfrm4_mode_tunnel.c
+++ b/net/ipv4/xfrm4_mode_tunnel.c
@@ -92,7 +92,8 @@ static int xfrm4_mode_tunnel_input(struct xfrm_state *x, struct sk_buff *skb)
 
skb_reset_network_header(skb);
skb_mac_header_rebuild(skb);
-   eth_hdr(skb)->h_proto = skb->protocol;
+   if (skb->mac_len)
+   eth_hdr(skb)->h_proto = skb->protocol;
 
err = 0;
 
diff --git a/net/ipv6/xfrm6_mode_tunnel.c b/net/ipv6/xfrm6_mode_tunnel.c
index bb935a3..de1b0b8 100644
--- a/net/ipv6/xfrm6_mode_tunnel.c
+++ b/net/ipv6/xfrm6_mode_tunnel.c
@@ -92,7 +92,8 @@ static int xfrm6_mode_tunnel_input(struct xfrm_state *x, struct sk_buff *skb)
 
skb_reset_network_header(skb);
skb_mac_header_rebuild(skb);
-   eth_hdr(skb)->h_proto = skb->protocol;
+   if (skb->mac_len)
+   eth_hdr(skb)->h_proto = skb->protocol;
 
err = 0;
 
-- 
2.8.1



Re: [PATCH v4 2/2] virtio_net: Extend virtio to use VF datapath when available

2018-03-04 Thread Jiri Pirko
Sun, Mar 04, 2018 at 07:24:12PM CET, alexander.du...@gmail.com wrote:
>On Sat, Mar 3, 2018 at 11:13 PM, Jiri Pirko  wrote:
>> Sun, Mar 04, 2018 at 01:26:53AM CET, alexander.du...@gmail.com wrote:
>>>On Sat, Mar 3, 2018 at 1:25 PM, Jiri Pirko  wrote:
>>>> Sat, Mar 03, 2018 at 07:04:57PM CET, alexander.du...@gmail.com wrote:
>>>>>On Sat, Mar 3, 2018 at 3:31 AM, Jiri Pirko  wrote:
>>>>>> Fri, Mar 02, 2018 at 08:42:47PM CET, m...@redhat.com wrote:
>>>>>>>On Fri, Mar 02, 2018 at 05:20:17PM +0100, Jiri Pirko wrote:
>>>>>>>> >Yeah, this code essentially calls out the "shareable" code with a
>>>>>>>> >comment at the start and end of the section what defines the
>>>>>>>> >virtio_bypass functionality. It would just be a matter of mostly
>>>>>>>> >cutting and pasting to put it into a separate driver module.
>>>>>>>>
>>>>>>>> Please put it there and unite the use of it with netvsc.
>>>>>>>
>>>>>>>Surely, adding this to other drivers (e.g. might this be handy for xen
>>>>>>>too?) can be left for a separate patchset. Let's get one device merged
>>>>>>>first.
>>>>>>
>>>>>> Why? Let's do the generic infra alongside with the driver. I see no good
>>>>>> reason to rush into merging driver and only later, if ever, to convert
>>>>>> it to generic solution. On contrary. That would lead into multiple
>>>>>> approaches and different behavious in multiple drivers. That is plain
>>>>>> wrong.
>>>>>
>>>>>If nothing else it doesn't hurt to do this in one driver in a generic
>>>>>way, and once it has been proven to address all the needs of that one
>>>>>driver we can then start moving other drivers to it. The current
>>>>>solution is quite generic, that was my contribution to this patch set
>>>>>as I didn't like how invasive it was being to virtio and thought it
>>>>>would be best to keep this as minimally invasive as possible.
>>>>>
>>>>>My preference would be to give this a release or two in virtio to
>>>>>mature before we start pushing it onto other drivers. It shouldn't
>>>>>take much to cut/paste this into a new driver file once we decide it
>>>>>is time to start extending it out to other drivers.
>>>>
>>>> I'm not talking about cut/paste and in fact that is what I'm worried
>>>> about. I'm talking about common code in net/core/ or somewhere that
>>>> would take care of this in-driver bonding. Each driver, like virtio_net,
>>>> netvsc would just register some ops to it and the core would do all
>>>> logic. I believe it is essential take this approach from the start.
>>>
>>>Sorry, I didn't mean cut/paste into another driver, I meant to make it
>>>a driver of its own. My thought was to eventually create a shared/core
>>>driver module that is then used by the other drivers.
>>>
>>>My concern right now is that Stephen has indicated he doesn't want
>>>this approach taken with netvsc, and most of the community doesn't
>>
>> IIUC, he only does not like the extra netdev. Is there anything else?
>
>Nope that is pretty much it. It doesn't seem like a big deal for
>virtio, but for netvsc it is significant since they don't have any
>"backup" bit feature differentiation, so they would likely be stuck
>with 2 netdevs even in their basic setup.

Okay. If that is a strict "no-go" for netvsc, this should be
just a flag passed down to the in-driver bond code.


>
>>>want the netvsc approach applied to virtio. Until that impasse can be
>>>resolved there isn't much value in trying to split this up so it is
>>>available to other drivers. In addition I would imagine it would make
>>>it a pain for others to back-port into distros since it would break
>>>legacy netvsc driver behavior. Patches are always welcome. Once this
>>>is in you are free to try fighting to get this made into a generic
>>>module and applied to both drivers, but we have already spent close to
>>>3 months on this and it seems like there has been significantly more
>>
>> Alex, time is never a good argument for poor design and shortcuts.
>
>I'm not saying we should go with a poor design due to time. But
>expecting us to implement something where the maintainer of said
>driver has not agreed to is pointless, and I don't see it as a design

He just does not like the third netdev, not the fact that the code
for in-driver bonding would be shared.


>shortcut to implement something in one driver with the expectation
>that we will then make it core later once it has proven itself and has
>use elsewhere. In the meantime I would imagine it also makes it easier
>for things like backports and such for us to do it this way since we
>are only impacting one driver.

When you are working on upstream kernel, you should not care about
backports. That leads to poor design. Not an argument.


>
>You are telling us to do something that not everyone has agreed to.

Who did not?


>Currently we only have agreement from Michael on taking this code, as
>such we are working with virtio only for now. When the time comes that

If you do duplication of 

Re: [PATCH v2 net-next 5/5] net: dsa: mv88e6xxx: Get mv88e6352 SERDES statistics

2018-03-04 Thread Florian Fainelli


On 03/01/2018 07:10 PM, Andrew Lunn wrote:
>> +void mv88e6352_serdes_get_strings(struct mv88e6xxx_chip *chip,
>> +  int port, uint8_t *data)
>> +{
>> +struct mv88e6352_serdes_hw_stat *stat;
>> +int i;
>> +
>> +if (!mv88e6352_port_has_serdes(chip, port))
>> +return;
>> +
>> +for (i = 0; i < ARRAY_SIZE(mv88e6352_serdes_hw_stats); i++) {
>> +stat = &mv88e6352_serdes_hw_stats[i];
>> +memcpy(data + i * ETH_GSTRING_LEN, stat->string,
>> +   ETH_GSTRING_LEN);
> 
> This has the same problem as Florian just fixed, using memcpy instead
> of strncpy. I will spin a new version with this fixed.

This is fine actually: your strings are defined as arrays of
ETH_GSTRING_LEN characters, so while the memcpy() is a bit inefficient
and will typically copy a lot of NUL bytes, it won't cause
out-of-bounds accesses.
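
A minimal userspace sketch of that reasoning, assuming a stand-in constant
GSTRING_LEN for ETH_GSTRING_LEN and hypothetical stat names (not the ones
from the driver): because each name is stored in a full-sized array, copying
the full length never reads past the source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GSTRING_LEN 32 /* stand-in for ETH_GSTRING_LEN */

/* Hypothetical stat descriptor: the name lives in a fixed-size array. */
struct hw_stat {
	char string[GSTRING_LEN];
};

/* Unused trailing bytes of each array are zero-initialized by the compiler. */
static const struct hw_stat hw_stats[] = {
	{ "rx_error" },
	{ "prbs_error" },
};

#define NUM_STATS (sizeof(hw_stats) / sizeof(hw_stats[0]))

/*
 * Copy every name into consecutive GSTRING_LEN-sized slots of data.
 * The source is always GSTRING_LEN bytes long, so the full-length
 * memcpy() stays in bounds; it just copies the NUL padding as well.
 */
static void get_strings(unsigned char *data)
{
	size_t i;

	assert(data != NULL);

	for (i = 0; i < NUM_STATS; i++)
		memcpy(data + i * GSTRING_LEN, hw_stats[i].string, GSTRING_LEN);
}
```

The caller is assumed to provide a buffer of at least
NUM_STATS * GSTRING_LEN bytes.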
-- 
Florian


[PATCH] fsl/fman: avoid sleeping in atomic context while adding an address

2018-03-04 Thread Denis Kirjanov
__dev_mc_add grabs an address spinlock, so use GFP_ATOMIC in kmalloc to
avoid sleeping in atomic context.

/ # ifconfig eth0 inet 192.168.0.111
[   89.331622] BUG: sleeping function called from invalid context at mm/slab.h:420
[   89.339002] in_atomic(): 1, irqs_disabled(): 0, pid: 1035, name: ifconfig
[   89.345799] 2 locks held by ifconfig/1035:
[   89.349908]  #0:  (rtnl_mutex){+.+.}, at: [<(ptrval)>] devinet_ioctl+0xc0/0x8a0
[   89.357258]  #1:  (_xmit_ETHER){+...}, at: [<(ptrval)>] __dev_mc_add+0x28/0x80
[   89.364520] CPU: 1 PID: 1035 Comm: ifconfig Not tainted 4.16.0-rc3-dirty #8
[   89.371464] Call Trace:
[   89.373908] [e959db60] [c066f948] dump_stack+0xa4/0xfc (unreliable)
[   89.380177] [e959db80] [c00671d8] ___might_sleep+0x248/0x280
[   89.385833] [e959dba0] [c01aec34] kmem_cache_alloc_trace+0x174/0x320
[   89.392179] [e959dbd0] [c04ab920] dtsec_add_hash_mac_address+0x130/0x240
[   89.398874] [e959dc00] [c04a9d74] set_multi+0x174/0x1b0
[   89.404093] [e959dc30] [c04afb68] dpaa_set_rx_mode+0x68/0xe0
[   89.409745] [e959dc40] [c057baf8] __dev_mc_add+0x58/0x80
[   89.415052] [e959dc60] [c060fd64] igmp_group_added+0x164/0x190
[   89.420878] [e959dca0] [c060ffa8] ip_mc_inc_group+0x218/0x460
[   89.426617] [e959dce0] [c06120fc] ip_mc_up+0x3c/0x190
[   89.431662] [e959dd10] [c0607270] inetdev_event+0x250/0x620
[   89.437227] [e959dd50] [c005f190] notifier_call_chain+0x80/0xf0
[   89.443138] [e959dd80] [c0573a74] __dev_notify_flags+0x54/0xf0
[   89.448964] [e959dda0] [c05743f8] dev_change_flags+0x48/0x60
[   89.454615] [e959ddc0] [c0606744] devinet_ioctl+0x544/0x8a0
[   89.460180] [e959de10] [c060987c] inet_ioctl+0x9c/0x1f0
[   89.465400] [e959de80] [c05479a8] sock_ioctl+0x168/0x460
[   89.470708] [e959ded0] [c01cf3ec] do_vfs_ioctl+0xac/0x8c0
[   89.476099] [e959df20] [c01cfc40] SyS_ioctl+0x40/0xc0
[   89.481147] [e959df40] [c0011318] ret_from_syscall+0x0/0x3c
[   89.486715] --- interrupt: c01 at 0x1006943c
[   89.486715] LR = 0x100c45ec

Signed-off-by: Denis Kirjanov 
---
 drivers/net/ethernet/freescale/fman/fman_dtsec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fman/fman_dtsec.c b/drivers/net/ethernet/freescale/fman/fman_dtsec.c
index ea43b4974149..7af31ddd093f 100644
--- a/drivers/net/ethernet/freescale/fman/fman_dtsec.c
+++ b/drivers/net/ethernet/freescale/fman/fman_dtsec.c
@@ -1100,7 +1100,7 @@ int dtsec_add_hash_mac_address(struct fman_mac *dtsec, enet_addr_t *eth_addr)
set_bucket(dtsec->regs, bucket, true);
 
/* Create element to be added to the driver hash table */
-   hash_entry = kmalloc(sizeof(*hash_entry), GFP_KERNEL);
+   hash_entry = kmalloc(sizeof(*hash_entry), GFP_ATOMIC);
if (!hash_entry)
return -ENOMEM;
hash_entry->addr = addr;
-- 
2.13.6



[PATCH net-next 1/2] tcp: add send queue size stat in SCM_TIMESTAMPING_OPT_STATS

2018-03-04 Thread Priyaranjan Jha
This patch adds the TCP_NLA_SNDQ_SIZE stat to SCM_TIMESTAMPING_OPT_STATS.
It reports the number of bytes present in the send queue when the timestamp
is generated.
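
As a rough illustration of the arithmetic (a userspace sketch with a
hypothetical helper name, not code from this patch): the value is the
difference of two 32-bit sequence numbers, and unsigned subtraction keeps it
correct even when write_seq has wrapped past zero.

```c
#include <stdint.h>

/*
 * Hypothetical helper: bytes written by the application but not yet
 * acknowledged. Both arguments are 32-bit TCP sequence numbers, so the
 * subtraction is performed modulo 2^32.
 */
static uint32_t sndq_size(uint32_t write_seq, uint32_t snd_una)
{
	return write_seq - snd_una;
}
```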

Signed-off-by: Priyaranjan Jha 
Signed-off-by: Neal Cardwell 
Signed-off-by: Yuchung Cheng 
Signed-off-by: Soheil Hassas Yeganeh 
---
 include/uapi/linux/tcp.h | 1 +
 net/ipv4/tcp.c   | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index b4a4f64635fa..93bad2128ef6 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -241,6 +241,7 @@ enum {
TCP_NLA_MIN_RTT,/* minimum RTT */
TCP_NLA_RECUR_RETRANS,  /* Recurring retransmits for the current pkt */
TCP_NLA_DELIVERY_RATE_APP_LMT, /* delivery rate application limited ? */
+   TCP_NLA_SNDQ_SIZE,  /* Data (bytes) pending in send queue */
 
 };
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index a33539798bf6..162ba4227446 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3031,7 +3031,7 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk)
u32 rate;
 
stats = alloc_skb(7 * nla_total_size_64bit(sizeof(u64)) +
- 3 * nla_total_size(sizeof(u32)) +
+ 4 * nla_total_size(sizeof(u32)) +
  2 * nla_total_size(sizeof(u8)), GFP_ATOMIC);
if (!stats)
return NULL;
@@ -3061,6 +3061,8 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk)
 
nla_put_u8(stats, TCP_NLA_RECUR_RETRANS, inet_csk(sk)->icsk_retransmits);
nla_put_u8(stats, TCP_NLA_DELIVERY_RATE_APP_LMT, !!tp->rate_app_limited);
+
+   nla_put_u32(stats, TCP_NLA_SNDQ_SIZE, tp->write_seq - tp->snd_una);
return stats;
 }
 
-- 
2.16.2.395.g2e18187dfd-goog
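
The new attribute is computed as tp->write_seq - tp->snd_una, a difference of two 32-bit TCP sequence numbers. A minimal user-space sketch of that arithmetic follows; sndq_size() is a hypothetical helper, not a kernel function, and it only illustrates that unsigned subtraction stays correct when write_seq has wrapped past 2^32.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel): bytes written by the
 * application that the peer has not yet acknowledged. Both arguments are
 * 32-bit sequence numbers, so the subtraction is done modulo 2^32 and
 * gives the right distance even after write_seq has wrapped around.
 */
static uint32_t sndq_size(uint32_t write_seq, uint32_t snd_una)
{
	return write_seq - snd_una;
}
```

For example, with snd_una at 0xFFFFFFF0 and write_seq wrapped to 0x10, the helper returns 0x20 bytes.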



[PATCH net-next 2/2] tcp: add ca_state stat in SCM_TIMESTAMPING_OPT_STATS

2018-03-04 Thread Priyaranjan Jha
This patch adds the TCP_NLA_CA_STATE stat to SCM_TIMESTAMPING_OPT_STATS.
It reports the socket's ca_state when the timestamp is generated.

Signed-off-by: Priyaranjan Jha 
Signed-off-by: Neal Cardwell 
Signed-off-by: Yuchung Cheng 
Signed-off-by: Soheil Hassas Yeganeh 
---
 include/uapi/linux/tcp.h | 1 +
 net/ipv4/tcp.c   | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 93bad2128ef6..4c0ae0faf7ca 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -242,6 +242,7 @@ enum {
TCP_NLA_RECUR_RETRANS,  /* Recurring retransmits for the current pkt */
TCP_NLA_DELIVERY_RATE_APP_LMT, /* delivery rate application limited ? */
TCP_NLA_SNDQ_SIZE,  /* Data (bytes) pending in send queue */
+   TCP_NLA_CA_STATE,   /* ca_state of socket */
 
 };
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 162ba4227446..fb350f740f69 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3032,7 +3032,7 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk)
 
stats = alloc_skb(7 * nla_total_size_64bit(sizeof(u64)) +
  4 * nla_total_size(sizeof(u32)) +
- 2 * nla_total_size(sizeof(u8)), GFP_ATOMIC);
+ 3 * nla_total_size(sizeof(u8)), GFP_ATOMIC);
if (!stats)
return NULL;
 
@@ -3063,6 +3063,7 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk)
nla_put_u8(stats, TCP_NLA_DELIVERY_RATE_APP_LMT, !!tp->rate_app_limited);
 
nla_put_u32(stats, TCP_NLA_SNDQ_SIZE, tp->write_seq - tp->snd_una);
+   nla_put_u8(stats, TCP_NLA_CA_STATE, inet_csk(sk)->icsk_ca_state);
return stats;
 }
 
-- 
2.16.2.395.g2e18187dfd-goog
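
Both patches in this series bump a multiplier in the alloc_skb() size expression so that the new attribute fits. The sketch below restates the sizing helpers in user space to show how the total changes. It is written under assumptions: a 4-byte aligned attribute header, 4-byte attribute alignment, and an architecture where 64-bit attributes reserve room for one padding attribute. The function opt_stats_size() is hypothetical and exists only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed netlink attribute layout: 4-byte header, 4-byte alignment. */
#define NLA_ALIGNTO ((size_t)4)
#define NLA_ALIGN(len) (((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
#define NLA_HDRLEN ((size_t)4)

/* Header plus payload, rounded up to the attribute alignment. */
static size_t nla_total_size(size_t payload)
{
	return NLA_ALIGN(NLA_HDRLEN + payload);
}

/*
 * Assumption: no efficient unaligned access, so a 64-bit attribute also
 * reserves space for an empty padding attribute in front of it.
 */
static size_t nla_total_size_64bit(size_t payload)
{
	return NLA_ALIGN(NLA_HDRLEN + payload) + NLA_ALIGN(NLA_HDRLEN);
}

/* Hypothetical helper: buffer size for a given number of u64/u32/u8 stats. */
static size_t opt_stats_size(unsigned int n_u64, unsigned int n_u32,
			     unsigned int n_u8)
{
	return n_u64 * nla_total_size_64bit(sizeof(uint64_t)) +
	       n_u32 * nla_total_size(sizeof(uint32_t)) +
	       n_u8 * nla_total_size(sizeof(uint8_t));
}
```

Under these assumptions the buffer grows by 8 bytes per patch: a u32 attribute takes 8 bytes, and a u8 attribute also takes 8 bytes once its 5 bytes are rounded up.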



Re: [PATCH v2 net-next 0/5] Export SERDES stats via ethtool -S

2018-03-04 Thread David Miller
From: Andrew Lunn 
Date: Thu,  1 Mar 2018 02:02:26 +0100

> The mv88e6352 family has a SERDES interface which can be used for
> example to connect to SFF/SFP modules. This interface has a couple of
> statistics counters. Add support for including these counters in the
> output of ethtool -S.

Series applied, thanks Andrew.


Re: [PATCH 2/2] net: usb: asix88179_178a: de-duplicate code

2018-03-04 Thread David Miller
From: Alexander Kurz 
Date: Wed, 28 Feb 2018 21:27:39 +

> -static int ax88179_bind(struct usbnet *dev, struct usb_interface *intf)
> +static int ax88179_link_bind_or_reset(struct usbnet *dev, int do_reset)

"do_reset" is a boolean, therefore please use type 'bool' and true/false.

Thank you.


Re: [PATCH v2 net-next 1/1] tools: tc-testing: Add notap option

2018-03-04 Thread David Miller
From: "Brenda J. Butler" 
Date: Wed, 28 Feb 2018 15:36:19 -0500

> Add a command line arg to suppress tap output.  Handy in case
> all the tap output is being supplied by the plugins.
> 
> Signed-off-by: Brenda J. Butler 
> ---
> 
> v2:  Drop the first patch that changes the format
>  of the tap output.  The second "notap" patch is
>  reworked to apply cleanly without the first patch.

Applied.


Re: [PATCH v4 2/2] virtio_net: Extend virtio to use VF datapath when available

2018-03-04 Thread Alexander Duyck
On Sat, Mar 3, 2018 at 11:13 PM, Jiri Pirko  wrote:
> Sun, Mar 04, 2018 at 01:26:53AM CET, alexander.du...@gmail.com wrote:
>>On Sat, Mar 3, 2018 at 1:25 PM, Jiri Pirko  wrote:
>>> Sat, Mar 03, 2018 at 07:04:57PM CET, alexander.du...@gmail.com wrote:
On Sat, Mar 3, 2018 at 3:31 AM, Jiri Pirko  wrote:
> Fri, Mar 02, 2018 at 08:42:47PM CET, m...@redhat.com wrote:
>>On Fri, Mar 02, 2018 at 05:20:17PM +0100, Jiri Pirko wrote:
>>> >Yeah, this code essentially calls out the "shareable" code with a
>>> >comment at the start and end of the section what defines the
>>> >virtio_bypass functionality. It would just be a matter of mostly
>>> >cutting and pasting to put it into a separate driver module.
>>>
>>> Please put it there and unite the use of it with netvsc.
>>
>>Surely, adding this to other drivers (e.g. might this be handy for xen
>>too?) can be left for a separate patchset. Let's get one device merged
>>first.
>
> Why? Let's do the generic infra alongside the driver. I see no good
> reason to rush into merging the driver and only later, if ever, convert
> it to a generic solution. On the contrary, that would lead to multiple
> approaches and different behaviours in multiple drivers. That is plain
> wrong.

If nothing else it doesn't hurt to do this in one driver in a generic
way, and once it has been proven to address all the needs of that one
driver we can then start moving other drivers to it. The current
solution is quite generic, that was my contribution to this patch set
as I didn't like how invasive it was being to virtio and thought it
would be best to keep this as minimally invasive as possible.

My preference would be to give this a release or two in virtio to
mature before we start pushing it onto other drivers. It shouldn't
take much to cut/paste this into a new driver file once we decide it
is time to start extending it out to other drivers.
>>>
>>> I'm not talking about cut/paste and in fact that is what I'm worried
>>> about. I'm talking about common code in net/core/ or somewhere that
>>> would take care of this in-driver bonding. Each driver, like virtio_net,
>>> netvsc would just register some ops to it and the core would do all
>>> logic. I believe it is essential to take this approach from the start.
>>
>>Sorry, I didn't mean cut/paste into another driver, I meant to make it
>>a driver of its own. My thought was to eventually create a shared/core
>>driver module that is then used by the other drivers.
>>
>>My concern right now is that Stephen has indicated he doesn't want
>>this approach taken with netvsc, and most of the community doesn't
>
> IIUC, he only does not like the extra netdev. Is there anything else?

Nope that is pretty much it. It doesn't seem like a big deal for
virtio, but for netvsc it is significant since they don't have any
"backup" bit feature differentiation, so they would likely be stuck
with 2 netdevs even in their basic setup.

>>want the netvsc approach applied to virtio. Until that impasse can be
>>resolved there isn't much value in trying to split this up so it is
>>available to other drivers. In addition I would imagine it would make
>>it a pain for others to back-port into distros since it would break
>>legacy netvsc driver behavior. Patches are always welcome. Once this
>>is in you are free to try fighting to get this made into a generic
>>module and applied to both drivers, but we have already spent close to
>>3 months on this and it seems like there has been significantly more
>
> Alex, time is never a good argument for poor design and shortcuts.

I'm not saying we should go with a poor design due to time. But
expecting us to implement something that the maintainer of said
driver has not agreed to is pointless, and I don't see it as a design
shortcut to implement something in one driver with the expectation
that we will then make it core later once it has proven itself and has
use elsewhere. In the meantime I would imagine it also makes it easier
for things like backports and such for us to do it this way since we
are only impacting one driver.

You are telling us to do something that not everyone has agreed to.
Currently we only have agreement from Michael on taking this code, as
such we are working with virtio only for now. When the time comes that
we can get other maintainers, specifically Stephen, to agree to it
then we can cut/paste this code into a core file or into a module of
its own. Alternatively I suppose we could take this up to Dave if you
can't get Stephen to agree. If you can get Dave to say we need to
change netvsc then we will go ahead with it, but generally I prefer to
respect when the maintainer of something says they don't want us
modifying their code in some way.


Re: [PATCH v3 net-next 00/10] net/ipv6: Add support for path selection using hash of 5-tuple

2018-03-04 Thread David Miller
From: David Ahern 
Date: Fri,  2 Mar 2018 08:32:11 -0800

> Hardware supports multipath selection using the standard L4 5-tuple
> instead of just L3 and the flow label. In addition, some network
> operators prefer IPv6 path selection to use the 5-tuple. To that end,
> add support to IPv6 for multipath hash policy similar to
> bf4e0a3db97eb ("net: ipv4: add support for ECMP hash policy choice").
> The default is still L3 which covers source and destination addresses
> along with flow label and IPv6 protocol. This gives users a choice in
> hash algorithms if they believe L3 only and the IPv6 flow label are not
> sufficient for their use case.
> 
> A separate sysctl is added for IPv6, allowing IPv4 and IPv6 to use
> different algorithms if desired.
> 
> The first 3 patches modify the IPv4 variant so that at the end of the
> patch set the ipv4 and ipv6 implementations are direct parallels.
> 
> Patch 4 refactors the existing rt6_multipath_hash in preparation for
> adding the policy option.
> 
> Patch 5 renames the existing netevent to have IPv4 in the name so ipv4
> changes can be distinguished from IPv6 if the netevent handler cares.
> 
> Patch 6 adds the skb as an argument through the FIB lookup functions
> to the multipath selection. Needed for the forwarding case.
> 
> Patch 7 adds the L4 hash support.
> 
> Patch 8 adds the hook for the netevent to the spectrum driver to update
> the ASIC.
> 
> Patch 9 removes no longer used code.
> 
> Patch 10 adds a testcase for IPv6 multipath with L4 hash.
 ...

Series applied, nice work David.
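
The difference between the two hash policies described in the cover letter can be sketched in user space as follows. This is only an illustration under stated assumptions: struct flow6, mix(), multipath_hash() and select_path() are hypothetical names, FNV-1a stands in for whatever hash the kernel really uses, the modulo stands in for the kernel's nexthop selection, and the policy numbering (0 for L3, 1 for L4) is assumed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow description; field names are illustrative only. */
struct flow6 {
	uint8_t  saddr[16];
	uint8_t  daddr[16];
	uint32_t flowlabel;
	uint16_t sport;
	uint16_t dport;
	uint8_t  proto;
};

/* FNV-1a step, used here only as a simple stand-in hash. */
static uint32_t mix(uint32_t h, const void *data, size_t len)
{
	const uint8_t *p = data;

	while (len--) {
		h ^= *p++;
		h *= 16777619u;
	}
	return h;
}

/*
 * policy 0 (assumed numbering): L3, i.e. addresses, flow label, protocol.
 * policy 1 (assumed numbering): L4, i.e. the 5-tuple with ports.
 */
static uint32_t multipath_hash(const struct flow6 *f, int policy)
{
	uint32_t h = 2166136261u;

	h = mix(h, f->saddr, sizeof(f->saddr));
	h = mix(h, f->daddr, sizeof(f->daddr));
	h = mix(h, &f->proto, sizeof(f->proto));
	if (policy == 1) {
		h = mix(h, &f->sport, sizeof(f->sport));
		h = mix(h, &f->dport, sizeof(f->dport));
	} else {
		h = mix(h, &f->flowlabel, sizeof(f->flowlabel));
	}
	return h;
}

/* Map the hash onto one of n_paths equal-cost next hops (n_paths > 0). */
static unsigned int select_path(const struct flow6 *f, int policy,
				unsigned int n_paths)
{
	return multipath_hash(f, policy) % n_paths;
}
```

With the L3 policy, two connections between the same hosts that differ only in source port hash identically and so share a path; with the L4 policy the ports feed the hash, so such connections can be spread over different paths.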


Re: [PATCH net-next 0/9] sctp: clean up sctp_sendmsg

2018-03-04 Thread David Miller
From: Xin Long 
Date: Thu,  1 Mar 2018 23:05:09 +0800

> This cleanup mostly does three things:
> 
>  - extract some code into functions to make sendmsg more readable.
> 
>  - tidy up some code to avoid unnecessary checks.
> 
>  - adjust some logic so that it will be easier to add the send flags
>and cmsgs features that I will post after this.
> 
> To make it easy to review and to check that the code stays compatible with
> the previous behaviour, this patchset does it step by step in 9 patches.
> 
> NOTE:
> There will be a conflict when merging
> Commit 2277c7cd75e3 ("sctp: Add LSM hooks") from selinux tree,
> the solution is to:
> 
> 1. remove all the lines in [B]:
> 
> <<< HEAD
> [A]
> ===
> [B]
> >>> 2277c7c... sctp: Add LSM hooks
> 
> 2. and apply the following diff-output:
 ...

Series applied, thank you.

In particular, thanks for the merge resolution details.


Re: [PATCH iproute2] tc: fix parsing of the control action

2018-03-04 Thread Stephen Hemminger
On Fri,  2 Mar 2018 19:36:16 +0100
Davide Caratti  wrote:

> If the user didn't specify any control action, don't pop the command line
> arguments: otherwise, parsing of the next argument (typically the 'index'
> keyword) results in an error, causing the following 'tc-testing' failures:
> 
>  Test a6d6: Add skbedit action with index
>  Test 38f3: Delete skbedit action
>  Test a568: Add action with ife type
>  Test b983: Add action without ife type
>  Test 7d50: Add skbmod action to set destination mac
>  Test 9b29: Add skbmod action to set source mac
>  Test e93a: Delete an skbmod action
> 
> Also, add missing parse for 'ok' control action to m_police, to fix the
> following 'tc-testing' failure:
> 
>  Test 8dd5: Add police action with control ok
> 
> tested with:
>  # ./tdc.py
> 
> test results:
>  all tests ok using kernel 4.16-rc2, except 9aa8 "Get a single skbmod
>  action from a list" (which is failing also before this commit)
> 
> Fixes: 3572e01a090a ("tc: util: Don't call NEXT_ARG_FWD() in __parse_action_control()")
> Cc: Michal Privoznik 
> Cc: Wolfgang Bumiller 
> Signed-off-by: Davide Caratti 
> ---

Applied thanks.


Re: lnstat

2018-03-04 Thread Stephen Hemminger
On Sat, 3 Mar 2018 22:56:02 +0100
David Kaufmann  wrote:

> Hi!
> 
> `lnstat` segfaults (tested on Debian 9, CentOS 6+7, Fedora 27) if it is
> started as `lnstat -w 1`
> 
> according to gdb the crash is in `build_hdr_string` at lnstat.c:212
> 
> as it seems to be a useless value for the option anyway it might make
> sense to just handle a single "1" the same as if "0" was specified.
> `-w 0,1`, `-w 1,0`, `-w 1,1` and other variations do work.

Right, having a one-character width breaks the header building code.
We should probably just catch it in the option parsing.

> 
> All the best,
> Astra
> 
> PS: I did not find any other place to report this, if this is the wrong
> place please tell we where to post.

This is the right place.
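
One way the "catch it in the option parsing" idea could look is sketched below. This is not lnstat's code: parse_widths(), MAX_FIELDS and MIN_LONE_WIDTH are hypothetical, and the rule of mapping a lone too-small width to 0 simply follows the reporter's suggestion to treat a single "1" like "0".

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_FIELDS 8      /* hypothetical limit on the number of widths */
#define MIN_LONE_WIDTH 2  /* hypothetical smallest usable single width */

/*
 * Hypothetical parser for a "-w" style argument such as "12" or "8,0,10".
 * Returns the number of widths stored in widths[], or -1 on malformed
 * input. A single width below MIN_LONE_WIDTH is replaced by 0, which is
 * assumed to mean "use the default width".
 */
static int parse_widths(const char *arg, int *widths)
{
	const char *p = arg;
	int n = 0;

	while (*p) {
		char *end;
		long v;

		if (n == MAX_FIELDS)
			return -1;
		errno = 0;
		v = strtol(p, &end, 10);
		if (end == p || errno || v < 0 || v > 1024)
			return -1;
		widths[n++] = (int)v;
		if (*end == ',')
			p = end + 1;
		else if (*end == '\0')
			p = end;
		else
			return -1;
	}
	if (n == 1 && widths[0] < MIN_LONE_WIDTH)
		widths[0] = 0;
	return n;
}
```

With this rule, "1" is stored as width 0, while "1,0" and "12" are passed through unchanged, matching the combinations reported as working.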



Re: [PATCH v2 0/4] net: Use strlcpy() for ethtool::get_strings

2018-03-04 Thread Andrew Lunn
On Fri, Mar 02, 2018 at 03:08:35PM -0800, Florian Fainelli wrote:
> Hi all,
> 
> After turning on KASAN on one of my systems, I started getting lots of out of
> bounds errors while fetching a given port's statistics, and indeed using
> memcpy() is unsafe for copying strings which have not been declared as an array
> of ETH_GSTRING_LEN bytes, so let's use strlcpy() instead. This allows the best
> of both worlds: we still keep the efficient memory usage of variably sized
> strings, but we don't copy more than we need to.

Reviewed-by: Andrew Lunn 

Andrew
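
The out-of-bounds read described in the cover letter can be illustrated in user space. In the sketch below, bounded_copy() is a local stand-in with strlcpy()-like semantics and fill_strings() is a hypothetical helper; neither is the driver code from the series. ETH_GSTRING_LEN is taken to be 32.

```c
#include <stddef.h>
#include <string.h>

#define ETH_GSTRING_LEN 32

/*
 * Local stand-in with strlcpy()-like semantics: copy at most size - 1
 * bytes, always NUL-terminate when size > 0, return strlen(src).
 * It never reads past the terminating NUL of src, unlike a fixed
 * memcpy(dst, src, ETH_GSTRING_LEN) on a short string literal.
 */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/* Hypothetical helper: fill an ethtool-style table of fixed-size slots. */
static void fill_strings(char *data, const char *const *names, size_t count)
{
	for (size_t i = 0; i < count; i++)
		bounded_copy(data + i * ETH_GSTRING_LEN, names[i],
			     ETH_GSTRING_LEN);
}
```

The names can stay variably sized in memory; each one is truncated to 31 characters plus a NUL if it is too long for its slot.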


Re: [PATCH net v2 1/2] rhashtable: Fix rhlist duplicates insertion

2018-03-04 Thread Paul Blakey



On 04/03/2018 17:13, Mark Bloch wrote:



On 04/03/2018 15:26, Paul Blakey wrote:

When inserting duplicate objects (those with the same key),
current rhlist implementation messes up the chain pointers by
updating the bucket pointer instead of prev next pointer to the
newly inserted node. This causes missing elements on removal and
traversal.

Fix that by properly updating pprev pointer to point to
the correct rhash_head next pointer.

Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface')
Signed-off-by: Paul Blakey 
---
  include/linux/rhashtable.h | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index c9df252..668a21f 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -766,8 +766,10 @@ static inline void *__rhashtable_insert_fast(
if (!key ||
(params.obj_cmpfn ?
 params.obj_cmpfn(&arg, rht_obj(ht, head)) :
-rhashtable_compare(&arg, rht_obj(ht, head))))
+rhashtable_compare(&arg, rht_obj(ht, head)))) {
+   pprev = &head->next;


It seems rhashtable_lookup_one() might need the same fix.


yes was just about to send it!
it's in v3 with a test that shows it.




continue;
+   }
  
  		data = rht_obj(ht, head);
  



Mark



Re: dsa with 2 rgmii channels (1st is 2 switches & 2nd is phy)

2018-03-04 Thread Andrew Lunn
Hi Michael

> mdio {
> compatible = "cdns,macb-mdio";
> /*   reg = <0xe000b000 0x1000>; */
> /*   clocks = < 30>, < 30>, < 13>; */
> /*   clock-names = "pclk", "hclk", "tx_clk"; */
> #address-cells = <1>;
> #size-cells = <0>;
> status = "okay";
> switch0: switch@0 {
> compatible = "marvell,mv88e6352";

Please use marvell,mv88e6085. That is what the 6352 is compatible
with.

It would also be good to sort out your mixup between tabs and spaces.

> mdio {
> compatible = "cdns,macb-mdio";
> /*   reg = <0xe000c000 0x1000>; */
> /*   clocks = < 31>, < 31>, < 14>; */
> /*   clock-names = "pclk", "hclk", "tx_clk"; */
> #address-cells = <1>;
> #size-cells = <0>;
> status = "okay";
> ethernet_phy: ethernet-phy@0 {
> compatible = "marvell,mv88e1510";
> device_type = "ethernet-phy";
> reg = <0>;
> };

PHYs don't have compatible strings. It is not needed, you can read the
vendor and model from its registers.

   Andrew


[PATCH net v3 0/2] rhashtable: Fix rhltable duplicates insertion

2018-03-04 Thread Paul Blakey
In our mlx5 driver's fs_core.c, we use the rhltable interface to store
flow groups. We noticed that we sometimes get a warning that a flow group
isn't found at removal. This rare case occurs in one specific scenario:
inserting a flow group with the same match criteria as an existing one
(a duplicate) when the existing flow group's rhash_head is not first
on the relevant rhashtable bucket list.

The first patch fixes it, and the second one adds a test that shows
it is now working.

Paul.

v2 --> v3 changes:
* added missing fix in rhashtable_lookup_one code path as well.

v1 --> v2 changes:
* Changed commit messages to better reflect the change

Paul Blakey (2):
  rhashtable: Fix rhlist duplicates insertion
  test_rhashtable: add test case for rhltable with duplicate objects

 include/linux/rhashtable.h |   4 +-
 lib/rhashtable.c   |   4 +-
 lib/test_rhashtable.c  | 134 +
 3 files changed, 140 insertions(+), 2 deletions(-)

-- 
1.8.4.3



[PATCH net v3 2/2] test_rhashtable: add test case for rhltable with duplicate objects

2018-03-04 Thread Paul Blakey
Tries to insert duplicates in the middle of bucket's chain:
bucket 1:  [[val 21 (tid=1)]] -> [[ val 1 (tid=2),  val 1 (tid=0) ]]

Reuses tid to distinguish the elements insertion order.

Signed-off-by: Paul Blakey 
---
 lib/test_rhashtable.c | 134 ++
 1 file changed, 134 insertions(+)

diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index 76d3667..f4000c1 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -79,6 +79,21 @@ struct thread_data {
struct test_obj *objs;
 };
 
+static u32 my_hashfn(const void *data, u32 len, u32 seed)
+{
+   const struct test_obj_rhl *obj = data;
+
+   return (obj->value.id % 10) << RHT_HASH_RESERVED_SPACE;
+}
+
+static int my_cmpfn(struct rhashtable_compare_arg *arg, const void *obj)
+{
+   const struct test_obj_rhl *test_obj = obj;
+   const struct test_obj_val *val = arg->key;
+
+   return test_obj->value.id - val->id;
+}
+
 static struct rhashtable_params test_rht_params = {
.head_offset = offsetof(struct test_obj, node),
.key_offset = offsetof(struct test_obj, value),
@@ -87,6 +102,17 @@ struct thread_data {
.nulls_base = (3U << RHT_BASE_SHIFT),
 };
 
+static struct rhashtable_params test_rht_params_dup = {
+   .head_offset = offsetof(struct test_obj_rhl, list_node),
+   .key_offset = offsetof(struct test_obj_rhl, value),
+   .key_len = sizeof(struct test_obj_val),
+   .hashfn = jhash,
+   .obj_hashfn = my_hashfn,
+   .obj_cmpfn = my_cmpfn,
+   .nelem_hint = 128,
+   .automatic_shrinking = false,
+};
+
 static struct semaphore prestart_sem;
 static struct semaphore startup_sem = __SEMAPHORE_INITIALIZER(startup_sem, 0);
 
@@ -465,6 +491,112 @@ static int __init test_rhashtable_max(struct test_obj *array,
return err;
 }
 
+static unsigned int __init print_ht(struct rhltable *rhlt)
+{
+   struct rhashtable *ht;
+   const struct bucket_table *tbl;
+   char buff[512] = "";
+   unsigned int i, cnt = 0;
+
+   ht = &rhlt->ht;
+   tbl = rht_dereference(ht->tbl, ht);
+   for (i = 0; i < tbl->size; i++) {
+   struct rhash_head *pos, *next;
+   struct test_obj_rhl *p;
+
+   pos = rht_dereference(tbl->buckets[i], ht);
+   next = !rht_is_a_nulls(pos) ? rht_dereference(pos->next, ht) : NULL;
+
+   if (!rht_is_a_nulls(pos)) {
+   sprintf(buff, "%s\nbucket[%d] -> ", buff, i);
+   }
+
+   while (!rht_is_a_nulls(pos)) {
+   struct rhlist_head *list = container_of(pos, struct rhlist_head, rhead);
+   sprintf(buff, "%s[[", buff);
+   do {
+   pos = &list->rhead;
+   list = rht_dereference(list->next, ht);
+   p = rht_obj(ht, pos);
+
+   sprintf(buff, "%s val %d (tid=%d)%s", buff, p->value.id, p->value.tid,
+   list? ", " : " ");
+   cnt++;
+   } while (list);
+
+   pos = next,
+   next = !rht_is_a_nulls(pos) ?
+   rht_dereference(pos->next, ht) : NULL;
+
+   sprintf(buff, "%s]]%s", buff, !rht_is_a_nulls(pos) ? " -> " : "");
+   }
+   }
+   printk(KERN_ERR "\n ht: %s\n-\n", buff);
+
+   return cnt;
+}
+
+static int __init test_insert_dup(struct test_obj_rhl *rhl_test_objects,
+ int cnt, bool slow)
+{
+   struct rhltable rhlt;
+   unsigned int i, ret;
+   const char *key;
+   int err = 0;
+
+   err = rhltable_init(&rhlt, &test_rht_params_dup);
+   if (WARN_ON(err))
+   return err;
+
+   for (i = 0; i < cnt; i++) {
+   rhl_test_objects[i].value.tid = i;
+   key = rht_obj(&rhlt.ht, &rhl_test_objects[i].list_node.rhead);
+   key += test_rht_params_dup.key_offset;
+
+   if (slow) {
+   err = PTR_ERR(rhashtable_insert_slow(&rhlt.ht, key,
+   &rhl_test_objects[i].list_node.rhead));
+   if (err == -EAGAIN)
+   err = 0;
+   } else
+   err = rhltable_insert(&rhlt,
+ &rhl_test_objects[i].list_node,
+ test_rht_params_dup);
+   if (WARN(err, "error %d on element %d/%d (%s)\n", err, i, cnt, slow? "slow" : "fast"))
+   goto skip_print;
+   }
+
+   ret = print_ht(&rhlt);
+   WARN(ret != cnt, "missing rhltable elements (%d != %d, %s)\n", ret, cnt, slow? "slow" : "fast");
+
+skip_print:
+   rhltable_destroy(&rhlt);
+
+   return 0;
+}
+
+static int __init 

[PATCH net v3 1/2] rhashtable: Fix rhlist duplicates insertion

2018-03-04 Thread Paul Blakey
When inserting duplicate objects (those with the same key),
current rhlist implementation messes up the chain pointers by
updating the bucket pointer instead of prev next pointer to the
newly inserted node. This causes missing elements on removal and
traversal.

Fix that by properly updating pprev pointer to point to
the correct rhash_head next pointer.

Issue: 1241076
Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7
Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface')
Signed-off-by: Paul Blakey 
---
 include/linux/rhashtable.h | 4 +++-
 lib/rhashtable.c   | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index c9df252..668a21f 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -766,8 +766,10 @@ static inline void *__rhashtable_insert_fast(
if (!key ||
(params.obj_cmpfn ?
 params.obj_cmpfn(&arg, rht_obj(ht, head)) :
-rhashtable_compare(&arg, rht_obj(ht, head))))
+rhashtable_compare(&arg, rht_obj(ht, head)))) {
+   pprev = &head->next;
continue;
+   }
 
data = rht_obj(ht, head);
 
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 3825c30..47de025 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -506,8 +506,10 @@ static void *rhashtable_lookup_one(struct rhashtable *ht,
if (!key ||
(ht->p.obj_cmpfn ?
 ht->p.obj_cmpfn(&arg, rht_obj(ht, head)) :
-rhashtable_compare(&arg, rht_obj(ht, head))))
+rhashtable_compare(&arg, rht_obj(ht, head)))) {
+   pprev = &head->next;
continue;
+   }
 
if (!ht->rhlist)
return rht_obj(ht, head);
-- 
1.8.4.3
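
The effect of the missing pprev update can be reproduced with a small user-space model of one bucket chain. This is a simplified sketch, not the kernel implementation: struct node, insert() and count() are hypothetical, there is no RCU or resizing, and the advance_pprev flag exists only to switch between the old and the fixed behaviour.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an rhlist entry within a single bucket. */
struct node {
	int key;
	struct node *next; /* next distinct key in the bucket chain */
	struct node *dup;  /* older entries that share this key */
};

/*
 * Insert obj into the bucket. A duplicate takes over the chain position
 * of the entry it matches, so *pprev must be the link that points at that
 * entry. With advance_pprev == false, pprev stays at the bucket head while
 * non-matching entries are skipped, which models the reported bug.
 */
static void insert(struct node **bucket, struct node *obj, bool advance_pprev)
{
	struct node **pprev = bucket;
	struct node *head;

	obj->next = NULL;
	obj->dup = NULL;
	for (head = *bucket; head; head = head->next) {
		if (head->key != obj->key) {
			if (advance_pprev)
				pprev = &head->next;
			continue;
		}
		obj->dup = head;        /* chain the older duplicate */
		obj->next = head->next; /* inherit its chain position */
		*pprev = obj;
		return;
	}
	obj->next = *bucket;            /* new key: insert at bucket head */
	*bucket = obj;
}

/* Count every entry reachable from the bucket, duplicates included. */
static unsigned int count(const struct node *bucket)
{
	unsigned int n = 0;

	for (const struct node *h = bucket; h; h = h->next)
		for (const struct node *d = h; d; d = d->dup)
			n++;
	return n;
}
```

Inserting keys 1, 21 and 1 in that order, as the accompanying test case does, leaves three reachable entries when pprev is advanced. Without the update, the second insert of key 1 overwrites the bucket pointer and the entry for key 21 is no longer reachable.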



Re: [PATCH net v2 1/2] rhashtable: Fix rhlist duplicates insertion

2018-03-04 Thread Mark Bloch


On 04/03/2018 15:26, Paul Blakey wrote:
> When inserting duplicate objects (those with the same key),
> current rhlist implementation messes up the chain pointers by
> updating the bucket pointer instead of prev next pointer to the
> newly inserted node. This causes missing elements on removal and
> traversal.
> 
> Fix that by properly updating pprev pointer to point to
> the correct rhash_head next pointer.
> 
> Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface')
> Signed-off-by: Paul Blakey 
> ---
>  include/linux/rhashtable.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
> index c9df252..668a21f 100644
> --- a/include/linux/rhashtable.h
> +++ b/include/linux/rhashtable.h
> @@ -766,8 +766,10 @@ static inline void *__rhashtable_insert_fast(
>   if (!key ||
>   (params.obj_cmpfn ?
>params.obj_cmpfn(&arg, rht_obj(ht, head)) :
> -  rhashtable_compare(&arg, rht_obj(ht, head))))
> +  rhashtable_compare(&arg, rht_obj(ht, head)))) {
> + pprev = &head->next;

It seems rhashtable_lookup_one() might need the same fix.

>   continue;
> + }
>  
>   data = rht_obj(ht, head);
>  
> 

Mark


Re: [RFC PATCH V1 01/12] audit: add container id

2018-03-04 Thread Paul Moore
On Sat, Mar 3, 2018 at 4:19 AM, Serge E. Hallyn  wrote:
> On Thu, Mar 01, 2018 at 02:41:04PM -0500, Richard Guy Briggs wrote:
> ...
>> +static inline bool audit_containerid_set(struct task_struct *tsk)
>
> Hi Richard,
>
> the calls to audit_containerid_set() confused me.  Could you make it
> is_audit_containerid_set() or audit_containerid_isset()?

I haven't gone through the entire patchset yet, but I wanted to
quickly comment on this ... I really dislike the
function-names-as-sentences approach and would greatly prefer
audit_containerid_isset().

>> +{
>> + return audit_get_containerid(tsk) != INVALID_CID;
>> +}

-- 
paul moore
www.paul-moore.com


[PATCH net-next] selftests: Extend the tc action test for action mirror

2018-03-04 Thread Arkadi Sharshevsky
Currently the tc action test is used only to test mirred redirect
action. This patch extends it for mirred mirror.

Signed-off-by: Jiri Pirko 
Reviewed-by: Ido Schimmel 
Signed-off-by: Arkadi Sharshevsky 
---
 tools/testing/selftests/net/forwarding/tc_actions.sh | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/net/forwarding/tc_actions.sh b/tools/testing/selftests/net/forwarding/tc_actions.sh
index 8423431..bc09a36 100755
--- a/tools/testing/selftests/net/forwarding/tc_actions.sh
+++ b/tools/testing/selftests/net/forwarding/tc_actions.sh
@@ -45,8 +45,10 @@ switch_destroy()
simple_if_fini $swp1 192.0.2.2/24
 }
 
-mirred_egress_redirect_test()
+mirred_egress_test()
 {
+   local action=$1
+
RET=0
 
tc filter add dev $h2 ingress protocol ip pref 1 handle 101 flower \
@@ -59,19 +61,19 @@ mirred_egress_redirect_test()
check_fail $? "Matched without redirect rule inserted"
 
tc filter add dev $swp1 ingress protocol ip pref 1 handle 101 flower \
-   $tcflags dst_ip 192.0.2.2 action mirred egress redirect \
+   $tcflags dst_ip 192.0.2.2 action mirred egress $action \
dev $swp2
 
$MZ $h1 -c 1 -p 64 -a $h1mac -b $h2mac -A 192.0.2.1 -B 192.0.2.2 \
-t ip -q
 
tc_check_packets "dev $h2 ingress" 101 1
-   check_err $? "Did not match incoming redirected packet"
+   check_err $? "Did not match incoming $action packet"
 
tc filter del dev $swp1 ingress protocol ip pref 1 handle 101 flower
tc filter del dev $h2 ingress protocol ip pref 1 handle 101 flower
 
-   log_test "mirred egress redirect ($tcflags)"
+   log_test "mirred egress $action ($tcflags)"
 }
 
 gact_drop_and_ok_test()
@@ -180,7 +182,8 @@ setup_prepare
 setup_wait
 
 gact_drop_and_ok_test
-mirred_egress_redirect_test
+mirred_egress_test "redirect"
+mirred_egress_test "mirror"
 
 tc_offload_check
 if [[ $? -ne 0 ]]; then
@@ -188,7 +191,8 @@ if [[ $? -ne 0 ]]; then
 else
tcflags="skip_sw"
gact_drop_and_ok_test
-   mirred_egress_redirect_test
+   mirred_egress_test "redirect"
+   mirred_egress_test "mirror"
gact_trap_test
 fi
 
-- 
2.4.11



Re: [PATCH net v2 2/2] test_rhashtable: add test case for rhltable with duplicate objects

2018-03-04 Thread Herbert Xu
On Sun, Mar 04, 2018 at 03:26:49PM +0200, Paul Blakey wrote:
> Tries to insert duplicates in the middle of bucket's chain:
> bucket 1:  [[val 21 (tid=1)]] -> [[ val 1 (tid=2),  val 1 (tid=0) ]]
> 
> Reuses tid to distinguish the elements insertion order.
> 
> Signed-off-by: Paul Blakey 

Acked-by: Herbert Xu 
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH net v2 1/2] rhashtable: Fix rhlist duplicates insertion

2018-03-04 Thread Herbert Xu
On Sun, Mar 04, 2018 at 03:26:48PM +0200, Paul Blakey wrote:
> When inserting duplicate objects (those with the same key),
> current rhlist implementation messes up the chain pointers by
> updating the bucket pointer instead of prev next pointer to the
> newly inserted node. This causes missing elements on removal and
> traversal.
> 
> Fix that by properly updating pprev pointer to point to
> the correct rhash_head next pointer.
> 
> Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface')
> Signed-off-by: Paul Blakey 

Oops, replied to the wrong email.

Acked-by: Herbert Xu 

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH net 1/2] rhashtable: Fix rhltable duplicates insertion

2018-03-04 Thread Herbert Xu
On Sun, Mar 04, 2018 at 02:34:26PM +0200, Paul Blakey wrote:
> When inserting duplicate objects (those with the same key),
> current rhashtable implementation messes up the chain pointers by
> updating the bucket pointer instead of prev next pointer to the
> newly inserted node. This causes missing elements on removal and
> traversal.
> 
> Fix that by properly updating pprev pointer to point to
> the correct rhash_head next pointer.
> 
> Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface')
> Signed-off-by: Paul Blakey 

Ah I see, thanks for catching this!

Acked-by: Herbert Xu 

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] staging: ipx: Replace printk() with appropriate pr_*() macro

2018-03-04 Thread Greg KH
On Sun, Mar 04, 2018 at 02:29:35PM +0530, Arushi Singhal wrote:
> Using pr_<level>() is more concise than printk(KERN_<LEVEL>).
> Replace printks having a log level with the appropriate pr_*() macros.
> 
> Signed-off-by: Arushi Singhal 
> ---
>  drivers/staging/ipx/af_ipx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/ipx/af_ipx.c b/drivers/staging/ipx/af_ipx.c
> index d21a9d1..27f4461 100644
> --- a/drivers/staging/ipx/af_ipx.c
> +++ b/drivers/staging/ipx/af_ipx.c
> @@ -744,7 +744,7 @@ static void ipxitf_discover_netnum(struct ipx_interface *intrfc,
>   intrfc->if_netnum = cb->ipx_source_net;
>   ipxitf_add_local_route(intrfc);
>   } else {
> - printk(KERN_WARNING "IPX: Network number collision "
> + pr_warn("IPX: Network number collision "

It is a driver, so it would be best to use dev_warn() or even better
yet, net_warn().  Please try to make that change instead.

thanks,

greg k-h


Re: [PATCH net 1/2] rhashtable: Fix rhltable duplicates insertion

2018-03-04 Thread Paul Blakey



On 04/03/2018 14:57, Herbert Xu wrote:

On Sun, Mar 04, 2018 at 02:34:26PM +0200, Paul Blakey wrote:

When inserting duplicate objects (those with the same key),
current rhashtable implementation messes up the chain pointers by
updating the bucket pointer instead of prev next pointer to the
newly inserted node. This causes missing elements on removal and
traversal.

Fix that by properly updating pprev pointer to point to
the correct rhash_head next pointer.

Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface')
Signed-off-by: Paul Blakey 


Nack.  You must not insert objects with the same key through
rhashtable.  The reason is that we cannot reliably fetch all
of the objects with the same key during a resize.

If you need duplicate objects, you should use rhlist.

Cheers,




Hi, I meant the rhlist interface here, sent  v2.
Thanks,
Paul.


[PATCH net v2 0/2] rhlist: Fix rhltable duplicates insertion

2018-03-04 Thread Paul Blakey
In our mlx5 driver's fs_core.c, we use the rhltable interface to store
flow groups. We noticed that we sometimes get a warning that a flow group
isn't found at removal. This rare case occurs in one specific scenario:
inserting a flow group with the same match criteria as an existing one
(a duplicate) when the existing flow group's rhash_head is not first
on the relevant rhashtable bucket list.

The first patch fixes it, and the second one adds a test that shows
it is now working.

Paul.

v1 --> v2 changes:
* Changed commit messages to better reflect the change

Paul Blakey (2):
  rhashtable: Fix rhlist duplicates insertion
  test_rhashtable: add test case for rhltable with duplicate objects

 include/linux/rhashtable.h |   4 +-
 lib/test_rhashtable.c  | 121 +
 2 files changed, 124 insertions(+), 1 deletion(-)

-- 
1.8.4.3



[PATCH net v2 1/2] rhashtable: Fix rhlist duplicates insertion

2018-03-04 Thread Paul Blakey
When inserting duplicate objects (those with the same key), the
current rhlist implementation messes up the chain pointers by
updating the bucket pointer, instead of the previous element's next
pointer, to point to the newly inserted node. This causes missing
elements on removal and traversal.

Fix that by properly updating the pprev pointer to point to
the correct rhash_head next pointer.

Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface')
Signed-off-by: Paul Blakey 
---
 include/linux/rhashtable.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index c9df252..668a21f 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -766,8 +766,10 @@ static inline void *__rhashtable_insert_fast(
if (!key ||
(params.obj_cmpfn ?
 params.obj_cmpfn(&arg, rht_obj(ht, head)) :
-rhashtable_compare(&arg, rht_obj(ht, head))))
+rhashtable_compare(&arg, rht_obj(ht, head)))) {
+   pprev = &head->next;
continue;
+   }
 
data = rht_obj(ht, head);
 
-- 
1.8.4.3
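
As a rough userspace illustration of the pprev problem described in the
commit message above, the sketch below models a single bucket chain in
plain C. It is not the kernel code: struct node, chain_insert(),
chain_count() and the advance_pprev switch are hypothetical names made up
for this example, and RCU, locking and resizing are left out entirely.
With advance_pprev set, the walk keeps pprev pointing at the previous
element's next pointer, as the patch does; with it cleared, pprev stays
at the bucket head and earlier entries are dropped when a duplicate is
linked in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one chain entry (not the kernel's rhash_head). */
struct node {
	int key;
	struct node *next;	/* next distinct key in the bucket chain */
	struct node *dup;	/* next object sharing this key */
};

/*
 * Insert obj into one bucket chain. A duplicate takes the place of the
 * existing entry in the chain and hangs the old entry off obj->dup.
 * advance_pprev selects the patched (1) or the old (0) behaviour.
 */
static void chain_insert(struct node **bucket, struct node *obj,
			 int advance_pprev)
{
	struct node **pprev = bucket;
	struct node *head;

	assert(bucket && obj);
	obj->next = NULL;
	obj->dup = NULL;

	for (head = *bucket; head; head = head->next) {
		if (head->key != obj->key) {
			if (advance_pprev)
				pprev = &head->next;
			continue;
		}

		/* Same key: splice obj in where head used to be. */
		obj->dup = head;
		obj->next = head->next;
		*pprev = obj;
		return;
	}

	/* No entry with this key yet: put obj at the front of the chain. */
	obj->next = *bucket;
	*bucket = obj;
}

/* Count every object reachable from the bucket, duplicates included. */
static int chain_count(const struct node *bucket)
{
	int n = 0;

	for (const struct node *h = bucket; h; h = h->next)
		for (const struct node *d = h; d; d = d->dup)
			n++;
	return n;
}
```

Inserting keys 1, 21 and then 1 again with advance_pprev set leaves all
three objects reachable. With it cleared, the entry for 21 is lost, which
is the kind of missing element the test in patch 2/2 looks for.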



[PATCH net v2 2/2] test_rhashtable: add test case for rhltable with duplicate objects

2018-03-04 Thread Paul Blakey
Tries to insert duplicates in the middle of bucket's chain:
bucket 1:  [[val 21 (tid=1)]] -> [[ val 1 (tid=2),  val 1 (tid=0) ]]

Reuses tid to distinguish the elements insertion order.

Signed-off-by: Paul Blakey 
---
 lib/test_rhashtable.c | 121 ++
 1 file changed, 121 insertions(+)

diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index 76d3667..4a5f331 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -79,6 +79,21 @@ struct thread_data {
struct test_obj *objs;
 };
 
+static u32 my_hashfn(const void *data, u32 len, u32 seed)
+{
+   const struct test_obj_rhl *obj = data;
+
+   return (obj->value.id % 10) << RHT_HASH_RESERVED_SPACE;
+}
+
+static int my_cmpfn(struct rhashtable_compare_arg *arg, const void *obj)
+{
+   const struct test_obj_rhl *test_obj = obj;
+   const struct test_obj_val *val = arg->key;
+
+   return test_obj->value.id - val->id;
+}
+
 static struct rhashtable_params test_rht_params = {
.head_offset = offsetof(struct test_obj, node),
.key_offset = offsetof(struct test_obj, value),
@@ -87,6 +102,17 @@ struct thread_data {
.nulls_base = (3U << RHT_BASE_SHIFT),
 };
 
+static struct rhashtable_params test_rht_params_dup = {
+   .head_offset = offsetof(struct test_obj_rhl, list_node),
+   .key_offset = offsetof(struct test_obj_rhl, value),
+   .key_len = sizeof(struct test_obj_val),
+   .hashfn = jhash,
+   .obj_hashfn = my_hashfn,
+   .obj_cmpfn = my_cmpfn,
+   .nelem_hint = 128,
+   .automatic_shrinking = false,
+};
+
 static struct semaphore prestart_sem;
 static struct semaphore startup_sem = __SEMAPHORE_INITIALIZER(startup_sem, 0);
 
@@ -465,6 +491,99 @@ static int __init test_rhashtable_max(struct test_obj *array,
return err;
 }
 
+static unsigned int __init print_ht(struct rhltable *rhlt)
+{
+   struct rhashtable *ht;
+   const struct bucket_table *tbl;
+   char buff[512] = "";
+   unsigned int i, cnt = 0;
+
+   ht = &rhlt->ht;
+   tbl = rht_dereference(ht->tbl, ht);
+   for (i = 0; i < tbl->size; i++) {
+   struct rhash_head *pos, *next;
+   struct test_obj_rhl *p;
+
+   pos = rht_dereference(tbl->buckets[i], ht);
+   next = !rht_is_a_nulls(pos) ? rht_dereference(pos->next, ht) : NULL;
+
+   if (!rht_is_a_nulls(pos)) {
+   sprintf(buff, "%s\nbucket[%d] -> ", buff, i);
+   }
+
+   while (!rht_is_a_nulls(pos)) {
+   struct rhlist_head *list = container_of(pos, struct rhlist_head, rhead);
+   sprintf(buff, "%s[[", buff);
+   do {
+   pos = &list->rhead;
+   list = rht_dereference(list->next, ht);
+   p = rht_obj(ht, pos);
+
+   sprintf(buff, "%s val %d (tid=%d)%s", buff, p->value.id, p->value.tid,
+   list? ", " : " ");
+   cnt++;
+   } while (list);
+
+   pos = next,
+   next = !rht_is_a_nulls(pos) ?
+   rht_dereference(pos->next, ht) : NULL;
+
+   sprintf(buff, "%s]]%s", buff, !rht_is_a_nulls(pos) ? " -> " : "");
+   }
+   }
+   printk(KERN_ERR "\n ht: %s\n-\n", buff);
+
+   return cnt;
+}
+
+static int __init test_insert_dup(struct test_obj_rhl *rhl_test_objects,
+ int cnt)
+{
+   struct rhltable rhlt;
+   unsigned int i, ret;
+   int err;
+
+   err = rhltable_init(&rhlt, &test_rht_params_dup);
+   if (WARN_ON(err))
+   return err;
+
+   for (i = 0; i < cnt; i++) {
+   rhl_test_objects[i].value.tid = i;
+   err = rhltable_insert(&rhlt, &rhl_test_objects[i].list_node,
+ test_rht_params_dup);
+   if (WARN(err, "error %d on element %d\n", err, i))
+   goto skip_print;
+   }
+
+   ret = print_ht(&rhlt);
+   WARN(ret != cnt, "missing rhltable elements (%d != %d)\n", ret, cnt);
+
+skip_print:
+   rhltable_destroy(&rhlt);
+
+   return 0;
+}
+
+static int __init test_insert_duplicates_run(void)
+{
+   struct test_obj_rhl rhl_test_objects[3] = {};
+
+   pr_info("test inserting duplicates\n");
+
+   /* two different values that map to same bucket */
+   rhl_test_objects[0].value.id = 1;
+   rhl_test_objects[1].value.id = 21;
+
+   /* and another duplicate with same as [0] value
+* which will be second on the bucket list */
+   rhl_test_objects[2].value.id = rhl_test_objects[0].value.id;
+
+   test_insert_dup(rhl_test_objects, 2);
+   test_insert_dup(rhl_test_objects, 3);
+
+   return 0;
+}
+
 static int 

Re: [PATCH net 1/2] rhashtable: Fix rhltable duplicates insertion

2018-03-04 Thread Herbert Xu
On Sun, Mar 04, 2018 at 02:34:26PM +0200, Paul Blakey wrote:
> When inserting duplicate objects (those with the same key),
> current rhashtable implementation messes up the chain pointers by
> updating the bucket pointer instead of prev next pointer to the
> newly inserted node. This causes missing elements on removal and
> traversal.
> 
> Fix that by properly updating pprev pointer to point to
> the correct rhash_head next pointer.
> 
> Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface')
> Signed-off-by: Paul Blakey 

Nack.  You must not insert objects with the same key through
rhashtable.  The reason is that we cannot reliably fetch all
of the objects with the same key during a resize.

If you need duplicate objects, you should use rhlist.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH net 0/2] rhashtable: Fix rhltable duplicates insertion

2018-03-04 Thread Paul Blakey
In our mlx5 driver's fs_core.c, we use the rhltable interface to store
flow groups. We noticed that we sometimes get a warning that a flow group
isn't found at removal. This rare case occurs in one specific scenario:
insertion of a flow group with similar match criteria (a duplicate),
but only when the existing flow group's rhash_head is second (or otherwise
not first) on the relevant rhashtable bucket list.

The first patch fixes it, and the second one adds a test that shows
it is now working.

Paul Blakey (2):
  rhashtable: Fix rhltable duplicates insertion
  test_rhashtable: add test case for rhl_table with duplicate objects

 include/linux/rhashtable.h |   4 +-
 lib/test_rhashtable.c  | 121 +
 2 files changed, 124 insertions(+), 1 deletion(-)

-- 
1.8.4.3



[PATCH net 2/2] test_rhashtable: add test case for rhl_table with duplicate objects

2018-03-04 Thread Paul Blakey
Tries to insert duplicates in the middle of bucket's chain:
bucket 1:  [[val 21 (tid=1)]] -> [[ val 1 (tid=2),  val 1 (tid=0) ]]

Reuses tid to distinguish the elements insertion order.

Signed-off-by: Paul Blakey 
---
 lib/test_rhashtable.c | 121 ++
 1 file changed, 121 insertions(+)

diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index 76d3667..4a5f331 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -79,6 +79,21 @@ struct thread_data {
struct test_obj *objs;
 };
 
+static u32 my_hashfn(const void *data, u32 len, u32 seed)
+{
+   const struct test_obj_rhl *obj = data;
+
+   return (obj->value.id % 10) << RHT_HASH_RESERVED_SPACE;
+}
+
+static int my_cmpfn(struct rhashtable_compare_arg *arg, const void *obj)
+{
+   const struct test_obj_rhl *test_obj = obj;
+   const struct test_obj_val *val = arg->key;
+
+   return test_obj->value.id - val->id;
+}
+
 static struct rhashtable_params test_rht_params = {
.head_offset = offsetof(struct test_obj, node),
.key_offset = offsetof(struct test_obj, value),
@@ -87,6 +102,17 @@ struct thread_data {
.nulls_base = (3U << RHT_BASE_SHIFT),
 };
 
+static struct rhashtable_params test_rht_params_dup = {
+   .head_offset = offsetof(struct test_obj_rhl, list_node),
+   .key_offset = offsetof(struct test_obj_rhl, value),
+   .key_len = sizeof(struct test_obj_val),
+   .hashfn = jhash,
+   .obj_hashfn = my_hashfn,
+   .obj_cmpfn = my_cmpfn,
+   .nelem_hint = 128,
+   .automatic_shrinking = false,
+};
+
 static struct semaphore prestart_sem;
 static struct semaphore startup_sem = __SEMAPHORE_INITIALIZER(startup_sem, 0);
 
@@ -465,6 +491,99 @@ static int __init test_rhashtable_max(struct test_obj *array,
return err;
 }
 
+static unsigned int __init print_ht(struct rhltable *rhlt)
+{
+   struct rhashtable *ht;
+   const struct bucket_table *tbl;
+   char buff[512] = "";
+   unsigned int i, cnt = 0;
+
+   ht = &rhlt->ht;
+   tbl = rht_dereference(ht->tbl, ht);
+   for (i = 0; i < tbl->size; i++) {
+   struct rhash_head *pos, *next;
+   struct test_obj_rhl *p;
+
+   pos = rht_dereference(tbl->buckets[i], ht);
+   next = !rht_is_a_nulls(pos) ? rht_dereference(pos->next, ht) : NULL;
+
+   if (!rht_is_a_nulls(pos)) {
+   sprintf(buff, "%s\nbucket[%d] -> ", buff, i);
+   }
+
+   while (!rht_is_a_nulls(pos)) {
+   struct rhlist_head *list = container_of(pos, struct rhlist_head, rhead);
+   sprintf(buff, "%s[[", buff);
+   do {
+   pos = &list->rhead;
+   list = rht_dereference(list->next, ht);
+   p = rht_obj(ht, pos);
+
+   sprintf(buff, "%s val %d (tid=%d)%s", buff, p->value.id, p->value.tid,
+   list? ", " : " ");
+   cnt++;
+   } while (list);
+
+   pos = next,
+   next = !rht_is_a_nulls(pos) ?
+   rht_dereference(pos->next, ht) : NULL;
+
+   sprintf(buff, "%s]]%s", buff, !rht_is_a_nulls(pos) ? " -> " : "");
+   }
+   }
+   printk(KERN_ERR "\n ht: %s\n-\n", buff);
+
+   return cnt;
+}
+
+static int __init test_insert_dup(struct test_obj_rhl *rhl_test_objects,
+ int cnt)
+{
+   struct rhltable rhlt;
+   unsigned int i, ret;
+   int err;
+
+   err = rhltable_init(&rhlt, &test_rht_params_dup);
+   if (WARN_ON(err))
+   return err;
+
+   for (i = 0; i < cnt; i++) {
+   rhl_test_objects[i].value.tid = i;
+   err = rhltable_insert(&rhlt, &rhl_test_objects[i].list_node,
+ test_rht_params_dup);
+   if (WARN(err, "error %d on element %d\n", err, i))
+   goto skip_print;
+   }
+
+   ret = print_ht(&rhlt);
+   WARN(ret != cnt, "missing rhltable elements (%d != %d)\n", ret, cnt);
+
+skip_print:
+   rhltable_destroy(&rhlt);
+
+   return 0;
+}
+
+static int __init test_insert_duplicates_run(void)
+{
+   struct test_obj_rhl rhl_test_objects[3] = {};
+
+   pr_info("test inserting duplicates\n");
+
+   /* two different values that map to same bucket */
+   rhl_test_objects[0].value.id = 1;
+   rhl_test_objects[1].value.id = 21;
+
+   /* and another duplicate with same as [0] value
+* which will be second on the bucket list */
+   rhl_test_objects[2].value.id = rhl_test_objects[0].value.id;
+
+   test_insert_dup(rhl_test_objects, 2);
+   test_insert_dup(rhl_test_objects, 3);
+
+   return 0;
+}
+
 static int 

[PATCH net 1/2] rhashtable: Fix rhltable duplicates insertion

2018-03-04 Thread Paul Blakey
When inserting duplicate objects (those with the same key), the
current rhashtable implementation messes up the chain pointers by
updating the bucket pointer, instead of the previous element's next
pointer, to point to the newly inserted node. This causes missing
elements on removal and traversal.

Fix that by properly updating the pprev pointer to point to
the correct rhash_head next pointer.

Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface')
Signed-off-by: Paul Blakey 
---
 include/linux/rhashtable.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index c9df252..668a21f 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -766,8 +766,10 @@ static inline void *__rhashtable_insert_fast(
if (!key ||
(params.obj_cmpfn ?
 params.obj_cmpfn(&arg, rht_obj(ht, head)) :
-rhashtable_compare(&arg, rht_obj(ht, head))))
+rhashtable_compare(&arg, rht_obj(ht, head)))) {
+   pprev = &head->next;
continue;
+   }
 
data = rht_obj(ht, head);
 
-- 
1.8.4.3



[PATCH net-next] net: Make RX-FCS and LRO mutually exclusive

2018-03-04 Thread Gal Pressman
LRO and RX-FCS offloads cannot be enabled at the same time since it is
not clear what should happen to the FCS of each coalesced packet.
The FCS is not really part of the TCP payload, hence cannot be merged
into one big packet. On the other hand, providing one big LRO packet
with one FCS contradicts the RX-FCS feature goal.

Use the fix features mechanism in order to prevent intersection of the
features and drop LRO in case RX-FCS is requested.

Enabling RX-FCS while LRO is enabled will result in:
$ ethtool -K ens6 rx-fcs on
Actual changes:
large-receive-offload: off [requested on]
rx-fcs: on

Signed-off-by: Gal Pressman 
Reviewed-by: Tariq Toukan 
---
 net/core/dev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index c9d3058..1bc3792 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7542,6 +7542,12 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
}
}
 
+   /* LRO feature cannot be combined with RX-FCS */
+   if ((features & NETIF_F_LRO) && (features & NETIF_F_RXFCS)) {
+   netdev_dbg(dev, "Dropping LRO feature since RX-FCS is requested.\n");
+   features &= ~NETIF_F_LRO;
+   }
+
return features;
 }
 
-- 
2.7.4
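
A minimal standalone sketch of the rule this patch adds, assuming two
made-up feature bits in place of the real NETIF_F_LRO and NETIF_F_RXFCS
definitions. FEAT_LRO, FEAT_RXFCS and fix_features() are hypothetical
names for illustration only, not kernel API:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's NETIF_F_LRO / NETIF_F_RXFCS. */
#define FEAT_LRO	(UINT64_C(1) << 0)
#define FEAT_RXFCS	(UINT64_C(1) << 1)

/*
 * Resolve a requested feature set: if both LRO and RX-FCS are asked
 * for, RX-FCS wins and LRO is dropped. All other bits pass through.
 */
static uint64_t fix_features(uint64_t features)
{
	if ((features & FEAT_LRO) && (features & FEAT_RXFCS))
		features &= ~FEAT_LRO;

	/* Invariant after fixing: the two bits are never set together. */
	assert(!((features & FEAT_LRO) && (features & FEAT_RXFCS)));
	return features;
}
```

Requesting both bits returns RX-FCS alone, which corresponds to the
"large-receive-offload: off [requested on]" line in the ethtool output
above. Requesting either bit on its own is left untouched.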



RE: [PATCH net-next 5/5] net: mvpp2: jumbo frames support

2018-03-04 Thread Stefan Chulski
> > To perform the checksum in HW, the HW obviously has to work in
> > store-and-forward mode: store the whole frame in the TX FIFO and then
> > compute the checksum.
> > If the MTU is 1500B, everything is fine and all ports can do this.
> >
> > If the MTU is 9KB and a 9KB frame is transmitted, port 0 can still do HW
> > checksum, but ports 1 and 2 don't have enough FIFO for this.
> > So we cannot offload this feature and SW should perform the checksum.
> 
> So perhaps the real check should not be "port 0", but whether the MTU is
> higher or lower than the TX FIFO size assigned to the current port.
> This would express in much better way the reason why HW checksum can be
> used or not.

I really don't want to involve the MTU size here: for each packet we would
have to add to the MTU the overhead added by HW (offset, CRC, DSA tags, etc.).
I prefer to just check: port TX FIFO size is 10KB -> port can support HW
checksum offload.
Do you suggest keeping a shadow table with the ports' TX FIFO sizes for this?

Thanks,
Stefan.
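
For the sake of discussion, a shadow table along these lines might look
roughly like the sketch below. The FIFO sizes are the 10KB/3KB/3KB split
mentioned in this thread; tx_fifo_size[], port_can_hw_csum() and the
hw_overhead parameter are hypothetical names, not existing mvpp2 code,
and the real overhead value would have to come from the hardware
documentation.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PORTS 3

/* Per-port TX FIFO sizes in bytes, as discussed: 10KB, 3KB, 3KB. */
static const unsigned int tx_fifo_size[NUM_PORTS] = {
	10 * 1024, 3 * 1024, 3 * 1024,
};

/*
 * HW checksum needs store-and-forward, so the whole frame (MTU plus
 * whatever the hardware adds, passed in as hw_overhead) has to fit
 * in the port's TX FIFO.
 */
static bool port_can_hw_csum(unsigned int port, unsigned int mtu,
			     unsigned int hw_overhead)
{
	assert(port < NUM_PORTS);
	return mtu + hw_overhead <= tx_fifo_size[port];
}
```

With a 9KB MTU this returns true for port 0 and false for ports 1 and 2,
while a 1500B MTU passes on every port, matching the behaviour described
in the quoted text.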


RE: [PATCH net-next 3/5] net: mvpp2: use a data size of 10kB for Tx FIFO on port 0

2018-03-04 Thread Stefan Chulski


> -Original Message-
> From: Thomas Petazzoni [mailto:thomas.petazz...@bootlin.com]
> Sent: Sunday, March 04, 2018 11:25 AM
> To: Stefan Chulski 
> Cc: Antoine Tenart ; da...@davemloft.net;
> Yan Markman ; netdev@vger.kernel.org; linux-
> ker...@vger.kernel.org; maxime.chevall...@bootlin.com;
> gregory.clem...@bootlin.com; miquel.ray...@bootlin.com; Nadav Haklai
> ; m...@semihalf.com
> Subject: Re: [PATCH net-next 3/5] net: mvpp2: use a data size of 10kB for Tx
> FIFO on port 0
> 
> Hello,
> 
> On Sun, 4 Mar 2018 06:29:59 +, Stefan Chulski wrote:
> 
> > > Is there a reason to hardcode 10KB for port 0, and 3KB for the other
> > > ports ?
> > > Would there be use cases where the user may want different
> > > configurations ?
> >
> > Design requirement are 10KB TX FIFO for the 10Gb/sec and 2.5KB for the
> > 2.5Gb/sec.
> 
> What is a "design requirement" ? Is it a HW design limitation ?

We can call it a HW design limitation. In any case, to support 10Gb/sec a
port should have at least a 10KB TX FIFO.

> So, the limitation has nothing to do with CP110 really, it's just a
> limitation of PPv2.2, and mentioning CP110 in the comment doesn't make
> much sense, correct ?

I will change it.

Stefan.




Re: [PATCH net-next 5/5] net: mvpp2: jumbo frames support

2018-03-04 Thread Thomas Petazzoni
Hello,

On Sun, 4 Mar 2018 06:56:02 +, Stefan Chulski wrote:

> > > + if (port->pool_long->id == MVPP2_BM_JUMBO && port->id != 0) {  
> > 
> > Again, all over the place we hardcode the fact that Jumbo frames can only be
> > used on port 0. I know port 0 is the only one that can do 10G, but are there
> > possibly some use cases where you may want Jumbo frame on another port
> > ?
> > 
> > This all really feels very hardcoded to me.
> >   
> 
> All ports support Jumbo frames.
> But only port 0 can do TX HW checksum offload(due to TX FIFO size).
> 
> Packet processor 2.2 has only a 19KB TX FIFO.
> So the TX FIFO config code assigns 10KB to port 0, 3KB to port 1 and 3KB
> to port 2.

Yes, but I was also questioning whether hardcoding this configuration
was correct.

> To perform the checksum in HW, the HW obviously has to work in
> store-and-forward mode: store the whole frame in the TX FIFO and then
> compute the checksum.
> If the MTU is 1500B, everything is fine and all ports can do this.
>
> If the MTU is 9KB and a 9KB frame is transmitted, port 0 can still do HW
> checksum, but ports 1 and 2 don't have enough FIFO for this.
> So we cannot offload this feature and SW should perform the checksum.

So perhaps the real check should not be "port 0", but whether the MTU
is higher or lower than the TX FIFO size assigned to the current port.
This would express in much better way the reason why HW checksum can be
used or not.

> > > + /* 9704 == 9728 - 20 and rounding to 8 */
> > > + dev->max_mtu = MVPP2_BM_JUMBO_PKT_SIZE;  
> > 
> > Is this correct for all ports ? Shouldn't the maximum MTU be different
> > between port 0 (that supports Jumbo frames) and the other ports ?  
> 
> This is correct for all ports. All ports can support Jumbo frames.

OK. With your explanation above, I understand better.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
http://bootlin.com
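
The arithmetic in the quoted comment (9704 == 9728 - 20, rounded to 8)
can be checked with a few lines of C. This is only a sketch:
round_down_8() and jumbo_pkt_size() are hypothetical helpers, and since
the thread does not say what the 20 bytes stand for, they are simply
passed in as a parameter called reserved.

```c
#include <assert.h>

/* Round x down to the nearest multiple of 8. */
static unsigned int round_down_8(unsigned int x)
{
	return x & ~7u;
}

/*
 * Usable packet size for a buffer of buf bytes once reserved bytes
 * are set aside and the result is aligned down to 8 bytes.
 */
static unsigned int jumbo_pkt_size(unsigned int buf, unsigned int reserved)
{
	assert(buf >= reserved);
	return round_down_8(buf - reserved);
}
```

jumbo_pkt_size(9728, 20) gives 9708 rounded down to 9704, the value in
the comment.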


Re: [PATCH net-next 3/5] net: mvpp2: use a data size of 10kB for Tx FIFO on port 0

2018-03-04 Thread Thomas Petazzoni
Hello,

On Sun, 4 Mar 2018 06:29:59 +, Stefan Chulski wrote:

> > Is there a reason to hardcode 10KB for port 0, and 3KB for the other ports ?
> > Would there be use cases where the user may want different configurations
> > ?
> 
> Design requirement are 10KB TX FIFO for the 10Gb/sec and 2.5KB for the 
> 2.5Gb/sec.

What is a "design requirement" ? Is it a HW design limitation ?

> Since only port 0 support 10Gb/sec and ports 1&2 support up to 2.5Gb/sec.
> I don't see any reason to change this configurations.
> Also TX FIFO size could be set only during probe.
> 
> > It's just that it feels very "hardcoded" to enforce specifically those
> > numbers.
> >
> > Also, does it make sense to mention the CP110 here ? Is this 19 KB
> > limitation a limit of the PPv2.2 IP, or of the CP110 ?
> 
> PPv2.2 IP is part of 110 communication processor.

Thanks, I know this :-)

> The next communication processor will have a different packet processor
> or the next generation of PPv2.x.
> The limit is the PPv2.2 TX FIFO.

So, the limitation has nothing to do with CP110 really, it's just a
limitation of PPv2.2, and mentioning CP110 in the comment doesn't make
much sense, correct ?

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
http://bootlin.com


[PATCH] staging: ipx: Replace printk() with appropriate pr_*() macro

2018-03-04 Thread Arushi Singhal
Using pr_<level>() is more concise than printk(KERN_<LEVEL>).
Replace printks having a log level with the appropriate pr_*() macros.

Signed-off-by: Arushi Singhal 
---
 drivers/staging/ipx/af_ipx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/ipx/af_ipx.c b/drivers/staging/ipx/af_ipx.c
index d21a9d1..27f4461 100644
--- a/drivers/staging/ipx/af_ipx.c
+++ b/drivers/staging/ipx/af_ipx.c
@@ -744,7 +744,7 @@ static void ipxitf_discover_netnum(struct ipx_interface *intrfc,
intrfc->if_netnum = cb->ipx_source_net;
ipxitf_add_local_route(intrfc);
} else {
-   printk(KERN_WARNING "IPX: Network number collision "
+   pr_warn("IPX: Network number collision "
"%lx\n%s %s and %s %s\n",
(unsigned long) ntohl(cb->ipx_source_net),
ipx_device_name(i),
-- 
2.7.4



Re: [PATCH net-next] selftests: forwarding: Add suppport to create veth interfaces

2018-03-04 Thread Ido Schimmel
On Fri, Mar 02, 2018 at 08:45:53AM -0800, David Ahern wrote:
> For tests using veth interfaces, the test infrastructure can create
> the netdevs if they do not exist. Arguably this is a preferred approach
> since the tests require p$N and p$(N+1) to be pairs.
> 
> Signed-off-by: David Ahern 

[...]

> diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
> index d0af52109360..2ce98c6a8c25 100644
> --- a/tools/testing/selftests/net/forwarding/lib.sh
> +++ b/tools/testing/selftests/net/forwarding/lib.sh
> @@ -76,6 +76,39 @@ done
>  
> ##
>  # Network interfaces configuration
>  
> +create_netif_veth()
> +{
> + local i
> +
> + for i in $(eval echo {1..$NUM_NETIFS}); do
> + j=$((i+1))

local j=$((i+1)) and drop a line.

> + ip link show dev ${NETIFS[p$i]} &> /dev/null
> + if [[ $? -ne 0 ]]; then
> + ip link add ${NETIFS[p$i]} type veth peer name ${NETIFS[p$j]}

Need to break this one. FWIW, I have this in my config:

$ cat ~/.vim/after/ftplugin/sh.vim
...
highlight OverLength ctermbg=red ctermfg=white
match OverLength /\%81v.\+/

Cool patch! Tested on my machine.

> + if [[ $? -ne 0 ]]; then
> + echo "Failed to create netif"
> + exit 1
> + fi
> + fi
> + i=$j
> + done
> +}
> +
> +create_netif()
> +{
> + case "$NETIF_TYPE" in
> + veth) create_netif_veth
> +   ;;
> + *) echo "Can not create interfaces of type \'$NETIF_TYPE\'"
> +exit 1
> +;;
> + esac
> +}
> +
> +if [[ "$NETIF_CREATE" = "yes" ]]; then
> + create_netif
> +fi
> +
>  for i in $(eval echo {1..$NUM_NETIFS}); do
>   ip link show dev ${NETIFS[p$i]} &> /dev/null
>   if [[ $? -ne 0 ]]; then
> -- 
> 2.11.0
> 


Re: [PATCH] ravb: remove erroneous comment

2018-03-04 Thread Sergei Shtylyov

Hello!

On 3/4/2018 1:39 AM, Niklas Söderlund wrote:

> When addressing a review comment in a early version of the offending
> patch a comment where left in which should have been removed. Remove the

   s/where/was/?

> comment to keep it consistent with the code.
>
> Fixes: 75efa06f457bbed3 ("ravb: add support for changing MTU")
> Reported-by: Sergei Shtylyov
> Signed-off-by: Niklas Söderlund

Acked-by: Sergei Shtylyov

[...]

MBR, Sergei